Speed up generation of pixel cluster masks overlay #546
@cliu72 would you mind testing this PR too for speed purposes? It may not be completely accurate yet, but I'd like to get a sense of the speed performance on a different machine.
For 12 1024x1024 FOVs:
Is that on the cluster or your laptop?
Google cloud, but baby instance (4 CPU, 32 GB memory)
Hmm. Currently in my docker, it's taking closer to 30 seconds for each 2048x2048 FOV. Could be some non-linear scaling with FOV size, or just laptop vs real computer issues.
Ok, I think we should restrict this function to create masks for just the provided FOVs, not all FOVs (I initially wanted to let the user visualize a different set of FOVs without needing to re-run this cell). In any case, FOV indexing is faster now too. @cliu72 @ngreenwald did you run into any correctness issues with the pixel masks generated? Indexing looked fine on my end, but I wanted to verify before requesting a review.
Didn't run into problems in my tests
…nalysis into pixel_mask_boost
@ngreenwald from benchmarking, we get a far more significant speedup using 1-D indexing. Here are the times for each FOV using tuple-based indexing on a 2-D array: These are the equivalent times for 1-D indexing into a flattened array, then reshaping back into a 2-D array:
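For reference, the two indexing strategies compared in the comment above can be sketched as follows. This is a minimal illustration rather than the code in this PR: the function names `fill_mask_2d` and `fill_mask_flat`, the `rows`/`cols`/`labels` arrays, and the `int16` dtype are all assumptions made for the example.

```python
import numpy as np


def fill_mask_2d(shape, rows, cols, labels):
    """Tuple-based indexing directly on the 2-D mask (hypothetical helper)."""
    mask = np.zeros(shape, dtype=np.int16)
    mask[rows, cols] = labels
    return mask


def fill_mask_flat(shape, rows, cols, labels):
    """1-D indexing into a flattened mask, then reshaping back to 2-D
    (hypothetical helper)."""
    mask = np.zeros(shape, dtype=np.int16)
    flat = mask.ravel()  # a view here, because np.zeros returns a C-contiguous array
    # Convert (row, col) pairs to offsets in the row-major flattened array.
    # np.ravel_multi_index((rows, cols), shape) computes the same offsets.
    flat_idx = rows * shape[1] + cols
    flat[flat_idx] = labels
    return flat.reshape(shape)
```

Both helpers should produce the same mask for the same inputs, so a quick equality check with `np.array_equal` is one way to verify correctness before comparing timings on real FOVs.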
What is the purpose of this PR?
Closes #535. generate_pixel_cluster_masks in data_utils.py runs very slowly and needs to be optimized.
How did you implement your changes
The main issue is the way the mask for each FOV is indexed. Currently, each FOV image is accessed in its original 2D format. Due to how 2D arrays are stored in memory, this leads to many cache misses, which add a significant bottleneck.
By flattening the array using .ravel and converting the coordinates so we can do 1D indexing, significant time is saved as 1D arrays are stored contiguously in memory. We get about a 3x speedup per FOV.
Remaining issues
If this implementation is still too slow, we should examine a different option. Unfortunately, the per-FOV iteration is a bottleneck we can't eliminate.