-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: create roi selection #90
base: main
Are you sure you want to change the base?
Conversation
Tests now include inputs with np.nan values and values that are not 1.
This allows e.g. vertical regions to be split by row, by horizontal regions not be split by column.
1045834
to
170ceea
Compare
@DaniBodor @ajonkman This feature is now ready for review. It has become vastly more complicated than I first intended due to these factors:
These features have now been built in. I'm not entirely sure this API will work best, yet. I think it might be better to save the matrices to the object itself when calling |
TODO
|
np.outer(a, b) is the same as a[:, np.newaxis] @ b[np.newaxis, :], when a and b are 1D arrays.
np.newaxis is an alias of None, but more clear when reading the code.
This class has been moved to its own branch 115_functionallungspace_psomhorst
8e7a244
to
af172d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to look carefully at the tests, but thought I'd already open these comments/suggestions up anyway, as #119 has higher priority right now.
GridSelection allows for the creation a list of 2D arrays that can be used to divide a two- or | ||
higher-dimensional array into several regions structured in a grid. An instance of | ||
GridSelection contains information about how to subdivide an input matrix. Calling | ||
`find_grid(data)`, where data is a 2D array, results in a list of arrays with the same | ||
dimension as `data`, each representing a single region. Each resulting 2D array contains the | ||
value 0 for pixels that do not belong to the region, and the value 1 or any number between 0 | ||
and 1 for pixels that (partly) belong to the region. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of this is a rather generic explanation of what an ROI is/how we define it and handle it. Should that part of this docstring perhaps move to the docstring of the parent class?
also, see typo below:
GridSelection allows for the creation a list of 2D arrays that can be used to divide a two- or | |
higher-dimensional array into several regions structured in a grid. An instance of | |
GridSelection contains information about how to subdivide an input matrix. Calling | |
`find_grid(data)`, where data is a 2D array, results in a list of arrays with the same | |
dimension as `data`, each representing a single region. Each resulting 2D array contains the | |
value 0 for pixels that do not belong to the region, and the value 1 or any number between 0 | |
and 1 for pixels that (partly) belong to the region. | |
GridSelection allows for the creation of a list of 2D arrays that can be used to divide a two- or | |
higher-dimensional array into several regions structured in a grid. An instance of | |
GridSelection contains information about how to subdivide an input matrix. Calling | |
`find_grid(data)`, where data is a 2D array, results in a list of arrays with the same | |
dimension as `data`, each representing a single region. Each resulting 2D array contains the | |
value 0 for pixels that do not belong to the region, and the value 1 or any number between 0 | |
and 1 for pixels that (partly) belong to the region. |
If the number of rows or columns can not split evenly, a row or column can be split among two | ||
regions. This behaviour is controlled by `split_rows` and `split_columns`. | ||
|
||
If `split_rows` is `False` (default), rows will not be split between two groups. A warning will | ||
be shown stating regions don't contain equal numbers of rows. The regions towards the top will | ||
be larger. E.g., when a (5, 2) array is split in two vertical regions, the first region will | ||
contain the first three rows, and the second region the last two rows. | ||
|
||
If `split_rows` is `True`, e.g. a (5, 2) array that is split in two vertical regions, the first | ||
region will contain the first two rows and half of each pixel of the third row. The second | ||
region contains half of each pixel in the third row, and the last two rows. | ||
|
||
`split_columns` has the same effect on columns as `split_rows` has on rows. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be clearer to add this to "Args" or examples section of the docstring instead? Not sure if that is more or less clear, tbh.
Regions are ordered according to C indexing order. The `matrix_layout()` method provides a map | ||
showing how the regions are ordered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unclear what "C indexing order" is. Maybe refer to example below?
I also rephrased slightly to make it clearer that the method refers to the same thing.
Regions are ordered according to C indexing order. The `matrix_layout()` method provides a map | |
showing how the regions are ordered. | |
Regions are ordered according to C indexing order, and a map of this ordering can be produced | |
using the `matrix_layout()` method (see example below). |
>>> gs = GridSelection(3, 1, split_pixels=False) | ||
>>> matrices = gs.find_grid(pixel_map) | ||
>>> matrices[0] * pixel_map | ||
array([[1, 2, 3], | ||
[4, 5, 6], | ||
[0, 0, 0], | ||
[0, 0, 0], | ||
[0, 0, 0], | ||
[0, 0, 0]]) | ||
>>> gs.matrix_layout() | ||
array([[0], | ||
[1], | ||
[2]]) | ||
>>> gs2 = GridSelection(2, 2, split_pixels=True) | ||
>>> matrices2 = gs.find_grid(pixel_map) | ||
>>> gs2.matrix_layout() | ||
array([[0, 1], | ||
[2, 3]]) | ||
>>> matrices2[2] | ||
array([[0. , 0. , 0. ], | ||
[0. , 0. , 0. ], | ||
[0. , 0. , 0. ], | ||
[1. , 0.5, 0. ], | ||
[1. , 0.5, 0. ], | ||
[1. , 0.5, 0. ]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few suggestions (all collapsed into one below), see also 4f78ebb:
- Correct typo of
split_pixels
. - rename
matrices
->rois
- move matrix layout above the resulting map
- add one more example to each gs, for added clarity.
>>> gs = GridSelection(3, 1, split_pixels=False) | |
>>> matrices = gs.find_grid(pixel_map) | |
>>> matrices[0] * pixel_map | |
array([[1, 2, 3], | |
[4, 5, 6], | |
[0, 0, 0], | |
[0, 0, 0], | |
[0, 0, 0], | |
[0, 0, 0]]) | |
>>> gs.matrix_layout() | |
array([[0], | |
[1], | |
[2]]) | |
>>> gs2 = GridSelection(2, 2, split_pixels=True) | |
>>> matrices2 = gs.find_grid(pixel_map) | |
>>> gs2.matrix_layout() | |
array([[0, 1], | |
[2, 3]]) | |
>>> matrices2[2] | |
array([[0. , 0. , 0. ], | |
[0. , 0. , 0. ], | |
[0. , 0. , 0. ], | |
[1. , 0.5, 0. ], | |
[1. , 0.5, 0. ], | |
[1. , 0.5, 0. ]]) | |
>>> gs = GridSelection(3, 1, split_rows=False) | |
>>> rois = gs.find_grid(pixel_map) | |
>>> gs.matrix_layout() | |
array([[0], | |
[1], | |
[2]]) | |
>>> rois[0] * pixel_map | |
array([[1, 2, 3], | |
[4, 5, 6], | |
[0, 0, 0], | |
[0, 0, 0], | |
[0, 0, 0], | |
[0, 0, 0]]) | |
>>> rois[1] * pixel_map | |
array([[0, 0, 0], | |
[0, 0, 0], | |
[7, 8, 9], | |
[10, 11, 12], | |
[0, 0, 0], | |
[0, 0, 0]]) | |
>>> gs2 = GridSelection(2, 2, split_columns=True) | |
>>> rois2 = gs.find_grid(pixel_map) | |
>>> gs2.matrix_layout() | |
array([[0, 1], | |
[2, 3]]) | |
>>> rois2[2] | |
array([[0. , 0. , 0. ], | |
[0. , 0. , 0. ], | |
[0. , 0. , 0. ], | |
[1. , 0.5, 0. ], | |
[1. , 0.5, 0. ], | |
[1. , 0.5, 0. ]]) | |
>>> rois2[3] | |
array([[0. , 0. , 0. ], | |
[0. , 0. , 0. ], | |
[0. , 0. , 0. ], | |
[0. , 0.5, 1. ], | |
[0. , 0.5, 1. ], | |
[0. , 0.5, 1. ]]) |
def __post_init__(self): | ||
self._check_attribute_type("v_split", int) | ||
self._check_attribute_type("h_split", int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we so strict on using the correct types beyond giving type hints?
I don't see the point in raising an error if someone passes 2.0 or something. This could happen if some external package uses floats by default for whatever reason.
If we do want to remain strict, see 62bc8d4 for a suggestion of how to simplify coding this.
split_columns: Allows columns to be split over two regions. | ||
|
||
Examples: | ||
>>> pixel_map = array([[ 1, 2, 3], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
>>> pixel_map = array([[ 1, 2, 3], | |
>>> pixel_map = np.array([[ 1, 2, 3], |
class InvalidDivision(Exception): | ||
"""Raised when the data can't be divided into regions.""" | ||
|
||
|
||
class InvalidHorizontalDivision(InvalidDivision): | ||
"""Raised when the data can't be divided into horizontal regions.""" | ||
|
||
|
||
class InvalidVerticalDivision(InvalidDivision): | ||
"""Raised when the data can't be divided into vertical regions.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the added value of these over a ValueError
?
and what's the added value of having separate horizontal and vertical versions, over specifying this in the error message (same question for warnings below)?
split_rows: bool = False | ||
split_columns: bool = False | ||
ignore_nan_rows: bool = True | ||
ignore_nan_columns: bool = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to have a single argument for each of split and ignore, as I think that more often than not a user would want the same behavior for both.
We can set it so that it accepts either a bool or a 2-part tuple of bools (for horizontal and vertical).
split_rows: bool = False | |
split_columns: bool = False | |
ignore_nan_rows: bool = True | |
ignore_nan_columns: bool = True | |
split_pixels: bool | tuple[bool, bool] = False | |
ignore_nan_edges: bool | tuple[bool, bool] = True |
Also, what would happen and what do we want to happen if the center row/column of an array is nan?
def matrix_layout(self) -> NDArray: | ||
"""Returns a 2D array showing the layout of the matrices returned by | ||
`find_grid`.""" | ||
n_regions = self.v_split * self.h_split | ||
return np.reshape(np.arange(n_regions), (self.v_split, self.h_split)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be nice to make this a property rather than a function. See 33905ee for suggestion
Beginnings of ROI selection features