[python] Optimization of `ExperimentAxisQuery` `to_anndata` #3359

bkmartinjr · 2024-11-21T03:37:58Z

Apply the partitioning optimization and improved concurrency to `ExperimentAxisQuery.to_anndata`, similar to the approach used in #3328.

Other changes related to this:

- Removed the numba package dependency, which is no longer needed.
- Removed dead code in both `ExperimentAxisQuery` and the old CSR converter.
- Removed obsolete comments about resource handling (there are no more reference cycles now that we use the context thread pool).
- Fixed a typo in a constant value in outgest.
- Cleaned up the style of a small amount of code in `ExperimentAxisQuery._read`.
- Improved empty-matrix handling in `CompressedMatrix.from_soma`.

As part of testing I validated this on both the sparse-with-dups and sparse-without-dups readers via S3. There was a performance improvement on both.
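
For readers unfamiliar with the general idea, here is a minimal, pure-Python sketch of partitioned, concurrent COO-to-CSR conversion. It is not the code in this PR: the function names (`partition_ranges`, `build_partition`, `coo_to_csr`) are hypothetical, and it uses a plain `ThreadPoolExecutor` rather than the SOMA context thread pool. It only illustrates splitting the row axis into contiguous ranges, handling each range independently, and stitching the pieces together.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def partition_ranges(n_rows, n_parts):
    """Split [0, n_rows) into contiguous, near-equal half-open ranges."""
    step = max(1, -(-n_rows // n_parts))  # ceiling division, at least 1
    return [(lo, min(lo + step, n_rows)) for lo in range(0, n_rows, step)]


def build_partition(coo, lo, hi):
    """Handle one row range: sort its triples and count entries per row."""
    triples = sorted((r, c, v) for r, c, v in coo if lo <= r < hi)
    counts = [0] * (hi - lo)
    for r, _, _ in triples:
        counts[r - lo] += 1
    return counts, [c for _, c, _ in triples], [v for _, _, v in triples]


def coo_to_csr(coo, n_rows, n_parts=4):
    """Build CSR arrays (indptr, indices, data) from (row, col, value) triples."""
    ranges = partition_ranges(n_rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        # pool.map preserves input order, so partitions come back in row order.
        parts = list(pool.map(lambda rng: build_partition(coo, *rng), ranges))
    counts, indices, data = [], [], []
    for part_counts, part_indices, part_data in parts:
        counts += part_counts
        indices += part_indices
        data += part_data
    # indptr is the running total of per-row counts, starting at 0.
    indptr = [0] + list(accumulate(counts))
    return indptr, indices, data
```

In pure Python the threads mostly take turns because of the GIL; the pattern pays off when each partition's work is dominated by I/O or by native code that releases the GIL, which is the assumption behind this sketch.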

codecov · 2024-11-21T04:02:59Z

Codecov Report

Attention: Patch coverage is 94.33962% with 3 lines in your changes missing coverage. Please review.

Project coverage is 85.85%. Comparing base (ea3cd1a) to head (155fcad).
Report is 4 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3359      +/-   ##
==========================================
+ Coverage   85.64%   85.85%   +0.20%     
==========================================
  Files          57       56       -1     
  Lines        6201     6107      -94     
==========================================
- Hits         5311     5243      -68     
+ Misses        890      864      -26

| Flag | Coverage Δ |
|---|---|
| python | `85.85% <94.33%> (+0.20%)` ⬆️ |


| Components | Coverage Δ |
|---|---|
| python_api | `85.85% <94.33%> (+0.20%)` ⬆️ |
| libtiledbsoma | `∅ <ø> (∅)` |


johnkerl

This is super-cool @bkmartinjr ! Just some minor comments.

apis/python/src/tiledbsoma/_query.py

apis/python/src/tiledbsoma/io/outgest.py

johnkerl

🚢
Thanks @bkmartinjr !! :)

johnkerl · 2024-11-21T22:53:16Z

apis/python/src/tiledbsoma/_query.py

+        )
+
+    approx_X_shape = tuple(b - a + 1 for a, b in matrix.non_empty_domain())
+    # heuristically derived number (benchmarking). Thesis is that this is roughly 80% of a 1 GiB io buffer,
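
As a small worked example of the `approx_X_shape` line quoted above: assuming the non-empty domain is reported as one inclusive `(low, high)` pair per dimension, the extent along each axis is `high - low + 1`. The helper name `approx_shape` is hypothetical and the domain values are made up for illustration.

```python
def approx_shape(non_empty_domain):
    """Turn inclusive (low, high) bounds per dimension into an approximate shape."""
    return tuple(high - low + 1 for low, high in non_empty_domain)


# Rows 0..99 and columns 10..19, both bounds inclusive.
shape = approx_shape(((0, 99), (10, 19)))
print(shape)  # (100, 10)
```

The result is only approximate with respect to the logical matrix shape, because it measures the span that actually contains data rather than the declared shape.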


Thanks @bkmartinjr !

bkmartinjr added 2 commits November 21, 2024 03:34

optimization of ExperimentAxisQuery to_anndata

3a27b31

Merge branch 'main' into eaq-x-reader

b74d8c8

johnkerl changed the title ~~optimization of ExperimentAxisQuery to_anndata~~ [python] Optimization of ExperimentAxisQuery to_anndata Nov 21, 2024

Merge branch 'main' into eaq-x-reader

5682958

bkmartinjr marked this pull request as ready for review November 21, 2024 18:31

bkmartinjr requested review from ryan-williams and johnkerl November 21, 2024 18:31

johnkerl reviewed Nov 21, 2024

PR f/b

155fcad

johnkerl approved these changes Nov 21, 2024

bkmartinjr merged commit 4d7bff2 into main Nov 21, 2024
21 checks passed

bkmartinjr deleted the eaq-x-reader branch November 21, 2024 22:58
