Use memoryviews and allow nogil in gradient search #455
Conversation
Codecov Report
```
@@           Coverage Diff           @@
##             main     #455   +/-   ##
=======================================
  Coverage   94.28%   94.28%
=======================================
  Files          69       69
  Lines       12388    12388
=======================================
  Hits        11680    11680
  Misses        708      708
```
One small comment, but otherwise if it compiles and works then 👍
Might be nice in the future to not have to do `np.full` everywhere, or, if they are intermediate arrays, to use C-level buffers/arrays, but that's in the future.
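For context, the `np.full` pattern mentioned above pre-allocates an output array filled with NaN so that any pixel the resampler never writes stays marked as invalid. A minimal sketch with NumPy (the sizes here are made-up placeholders, not values from the PR):

```python
import numpy as np

# Hypothetical sizes standing in for the gradient-search output shape.
z_size, y_size, x_size = 2, 4, 5

# Pre-allocate the output filled with NaN so untouched pixels stay invalid.
image = np.full((z_size, y_size, x_size), np.nan, dtype=np.float64)

print(image.shape)
print(bool(np.isnan(image).all()))
```

In Cython this array is then wrapped in a typed memoryview (as in the diff below), which gives fast, GIL-free element access while NumPy still owns the memory.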
```cython
image = np.full([z_size, y_size, x_size], np.nan, dtype=DTYPE)
cdef DTYPE_t [:, :, :] image_view = image
```
You may benefit from defining these as `[:, :, ::1]` when you know they are C contiguous. From my experience, this does require that every function this array is passed to also declares its arguments as `[:, :, ::1]`, which seems dumb to me, but anyway...
My understanding is that Cython can make better indexing and looping choices when it knows that everything is contiguous (makes sense).
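To illustrate what the `::1` declaration constrains: a Cython `[:, :, ::1]` memoryview only accepts C-contiguous buffers, and contiguity is easy to lose through views. A quick check with NumPy flags (plain Python, just to show the property Cython relies on):

```python
import numpy as np

a = np.zeros((4, 5, 6))
# Freshly allocated NumPy arrays are C-contiguous.
print(a.flags['C_CONTIGUOUS'])

# A transposed view is no longer C-contiguous, so a [:, :, ::1]
# memoryview would reject it at assignment time.
b = a.transpose(2, 1, 0)
print(b.flags['C_CONTIGUOUS'])

# np.ascontiguousarray makes a C-contiguous copy when needed.
c = np.ascontiguousarray(b)
print(c.flags['C_CONTIGUOUS'])
```

This is why every function in the call chain has to carry the same `[:, :, ::1]` declaration: the contiguity guarantee must hold end to end for Cython to emit the simpler indexing code.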
Do you have a plot of the differences? A Dask diagnostics plot, if you can.
It might be easier to see if you do multiple datasets at the same time, like running a Satpy Scene with multiple bands loaded and all being resampled at once. Actually yeah, if the nogil sections you have now are mostly run serially, then dask may be making it look like high CPU usage. You could also try passing ... or you forgot to recompile between tests.
I did recompile! Checking now with other settings.
Yeah, that CPU line definitely looks better. The amount that it drops down still makes me think the chunk size could be larger (if the files are a good size for it).
Chunk size is 1024 in the previous examples.
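For a sense of scale, a 1024×1024 chunk is quite small in memory, which supports the suggestion above that larger chunks might reduce scheduler overhead. A back-of-the-envelope calculation (float64 is an assumption for illustration; the actual dtype isn't stated here):

```python
# Memory footprint of a single 1024x1024 dask chunk of float64 data.
chunk_size = 1024
bytes_per_elem = 8  # float64
footprint = chunk_size * chunk_size * bytes_per_elem
print(footprint // 2**20, "MiB")  # 8 MiB per chunk
```

At 8 MiB per chunk, doubling the chunk size to 2048 would give 32 MiB chunks, trading per-task overhead for memory.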
I guess the flat 100% at the beginning is the Satpy file handler creation we've already discussed in pytroll/satpy#2186?
Yes, that looks like it. Far too long imo, we should look at that.
@mraspaud can you give more details on what your code and target AreaDefinition looked like for your last comment's plots? I'm wondering if we can come up with a case that shows a little more improvement.
@djhoese that was SEVIRI data full disk resampled onto a full earth mollweide:

```yaml
moll:
  description: moll
  projection:
    ellps: WGS84
    lon_0: 0.0
    proj: moll
    lat_0: 0.0
  shape:
    height: 4500
    width: 9000
  area_extent:
    lower_left_xy: [-18040095.696147293, -9020047.848073646]
    upper_right_xy: [18040095.696147293, 9020047.848073646]
    units: m
```

Then I tried with FCI but got my computer to crash...
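The area definition above implies a pixel size that can be derived from the extent and shape; interpreting it this way is our reading, but the numbers come straight from the YAML:

```python
# Pixel size implied by the mollweide area definition above
# (values copied from the YAML; units are metres).
width, height = 9000, 4500
x_min, y_min = -18040095.696147293, -9020047.848073646
x_max, y_max = 18040095.696147293, 9020047.848073646

x_res = (x_max - x_min) / width   # metres per pixel in x
y_res = (y_max - y_min) / height  # metres per pixel in y
print(x_res, y_res)
```

Both come out to roughly 4 km per pixel, i.e. comparable to the native SEVIRI full-disk resolution, so the target grid isn't dramatically over- or under-sampled.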
Trying it with ABI to a latlong projection (not the full data region), before this PR and after (plots omitted): I see better CPU usage and a run time faster by almost 30 seconds. This was generating a fully corrected true_color. Note: this is using my "pass through" profiling where I just compute the chunks and throw them away (no saving to disk).
Thanks for trying it out @djhoese! Should we merge?
This PR uses memoryviews and allows nogil in gradient search.