Imagestack.transform executes applied function on xarray objects instead of numpy arrays #1035
Codecov Report
@@ Coverage Diff @@
## master #1035 +/- ##
==========================================
+ Coverage 88.7% 88.74% +0.03%
==========================================
Files 164 164
Lines 6036 6048 +12
==========================================
+ Hits 5354 5367 +13
+ Misses 682 681 -1
Thanks, @ambrosejcarr! I am not very experienced with multiprocessing, so I think I should leave the approval to @shanaxel42. I did have a question though: in general, why do the filtering functions in all of the filters retain compatibility for |
Not all of them did, but this is by design: xarray wraps a numpy array and implements or delegates most of the numpy array functionality. Pretty cool right? :-) |
Sorry for being unclear. I was more specifically wondering if the fact that the type hint for |
Just to indicate compatibility. 👍 There is also the case that someone calls the private method with a numpy array. The only execution flow that I'm aware of in the codebase now involves xarray objects. |
Got it. Thanks! |
looks good, just a few comments
if np.any(data < 0):
data[array < 0] = 0
if np.any(array > 1):
data[data < 0] = 0
why did this have to change?
🐛 !
If data is an xarray, we need to convert it to a numpy array to run these comparisons because indexing works differently between the two objects: http://xarray.pydata.org/en/stable/indexing.html#indexing-rules
This function was just never properly tested with an xarray input. np.ndarray[xr.DataArray] does not work.
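A minimal sketch of the fix described above, written so that it runs with only numpy installed. `LabeledArray` is a hypothetical stand-in for `xr.DataArray` (both expose their data through the `__array__` hook), and `clip_to_unit_range` is an illustrative helper, not starfish's `preserve_float_range`. The point is only that converting with `np.asarray` first makes the boolean masks follow numpy indexing rules:

```python
import numpy as np


class LabeledArray:
    """Hypothetical stand-in for xr.DataArray: wraps values and dimension names."""

    def __init__(self, values, dims):
        self._values = np.asarray(values, dtype=float)
        self.dims = tuple(dims)

    def __array__(self, dtype=None, copy=None):
        # np.asarray() calls this hook to obtain the wrapped numpy array.
        if dtype is None:
            return self._values
        return self._values.astype(dtype)


def clip_to_unit_range(image):
    """Return a numpy copy of `image` with values clipped to [0, 1]."""
    # Convert first, so the masks below use numpy indexing rules whether
    # `image` is a plain numpy array or a labeled wrapper around one.
    data = np.asarray(image, dtype=float).copy()
    data[data < 0] = 0
    data[data > 1] = 1
    return data
```

Because the helper works on a copy, the caller's array (labeled or not) is left unmodified.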
dims = [
    name.value for (name, data) in
    sorted(AXES_DATA.items(), key=lambda kv: kv[1].order)
] + [Axes.Y.value, Axes.X.value]
we're just trying to maintain the current axes order here right? why not just dims = [dim for dim in self.xarray.dims]
Yes, but this is a static method so does not have access to self. Happy to simplify this if you can find an easier way!
ohhhh I missed that. hmm well you could make dims_order a param you pass into _processing_workflow, then do the above in transform()... might be a little more clear
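A rough sketch of that suggestion, under stated assumptions: the `Axes` values, the contents of `AXES_DATA`, and the `Stack` class below are illustrative stand-ins, not the definitions in starfish. The dims are computed once in `transform()` and handed to the static worker, which therefore no longer needs `self`:

```python
from enum import Enum
from typing import Dict, List, NamedTuple, Sequence


class Axes(Enum):
    # Assumed axis names, for illustration only.
    ROUND = "r"
    CH = "c"
    ZPLANE = "z"
    Y = "y"
    X = "x"


class AxisData(NamedTuple):
    order: int


# Assumed ordering of the non-spatial axes; deliberately listed out of order.
AXES_DATA: Dict[Axes, AxisData] = {
    Axes.ZPLANE: AxisData(order=2),
    Axes.ROUND: AxisData(order=0),
    Axes.CH: AxisData(order=1),
}


def dims_in_order() -> List[str]:
    """Non-spatial axes sorted by their declared order, then y and x."""
    ordered = sorted(AXES_DATA.items(), key=lambda kv: kv[1].order)
    return [axis.value for axis, _ in ordered] + [Axes.Y.value, Axes.X.value]


class Stack:
    """Hypothetical container standing in for the real image stack class."""

    def transform(self) -> List[str]:
        # Compute the dims where instance state is available, then pass them down.
        return self._processing_workflow(dims_order=dims_in_order())

    @staticmethod
    def _processing_workflow(dims_order: Sequence[str]) -> List[str]:
        # A real worker would build an xarray with these dims; here we echo them.
        return list(dims_order)
```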
# build and then slice the xarray to get the piece needed for this worker
This does not set the xarray coordinates correctly. I imagine that if you use crop-on-load, your labels will all be incorrect.
There are some places in the codebase where we need a numpy array instead of an xarray. Switch all instances where we do this to `np.asarray(...)`. Also, fixed type hints where appropriate. This should fix the master-breaking bug in #1035, but does not resolve the issue of (xarray) coordinates not being transferred to the constituent xarrays.
ImageStack.apply uses slice indices based on the labels, but the xarray passed to the worker process does not have the labels. This PR passes the dims and the coordinates of the original xarray to the worker process so it can reconstitute an identical xarray. Test plan: Added a test that creates an ImageStack with labeled indices, and runs apply on it. Without the code change, it fails. With the code change, it works. Fixes #1108, and the comment raised in #1035 (comment) :)
@kevinyamauchi pointed out in #987 and #983 that our apply method hampers our ability to write expressive `PipelineComponent`s because it returns numpy arrays instead of xarray DataArray objects.
This PR:
- Modifies `ImageStack.transform` so that methods used by `apply` can take advantage of `xarray` named axes. Because we are rebuilding an array from the `sharedmemory` object, this does not give us access to data stored in the coordinates. Thus, the returned arrays only guarantee that the Axes are in the correct order. If `is_volume` is True, xarrays generated by `ImageStack.transform` have dims = `(z, y, x)` and no coordinates. If `is_volume` is False, xarrays generated by `ImageStack.transform` have dims = `(y, x)` and no coordinates. These are set based on the information in `starfish.imagestack.dataorder.AXES_DATA`.
- Fixes a bug in `preserve_float_range`. That function has obviously never seen an `xarray` before. 😆
- Minor updates to `SpotFinder` methods to support receiving `xr.DataArray` objects.
- Updates to docstrings to reflect flexibility to receive `Union[xr.DataArray, np.ndarray]` images.