Remove nth's linear search overhead in BCF reader #294

athos · 2023-12-05T02:17:15Z

While profiling the BCF reader, I found that most of the time was spent in calls to nth, which performs a linear search for each sample.

This PR removes that linear-search overhead and improves the performance of the BCF reader by making read-typed-value return a vector instead of a lazy sequence.
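To illustrate why the collection type matters here, the following is a minimal sketch, not the actual cljam code. `decode-lazy`, `decode-vec`, and `sum-by-index` are hypothetical names invented for this example; they only mimic a decoder that returns one value per sample and a caller that looks each sample up by index.

```clojure
;; Hypothetical stand-ins for a decoder that yields one value per sample.
;; These are NOT cljam's read-typed-value; they only mimic its return shape.
(defn decode-lazy [n]
  (map inc (range n)))   ; lazy sequence: nth has to walk from the head

(defn decode-vec [n]
  (mapv inc (range n)))  ; vector: nth is an indexed lookup

(defn sum-by-index
  "Looks up every element by index, as a per-sample loop might."
  [coll n]
  (loop [i 0, acc 0]
    (if (< i n)
      (recur (inc i) (+ acc (long (nth coll i))))
      acc)))

(let [n 5000]
  ;; Each nth on the lazy sequence is O(i), so the whole loop is quadratic.
  (time (sum-by-index (decode-lazy n) n))
  ;; Each nth on the vector is effectively constant time, so the loop is linear.
  (time (sum-by-index (decode-vec n) n)))
```

Both calls return the same sum; only the elapsed time should differ, and the gap is expected to widen as the number of samples grows.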

Here are the profiling results before and after the change:

[Profiling screenshots: before change | after change]

With this fix, the BCF reader is now roughly 7x faster than before:

(time
 (with-open [r (vcf/reader ".cavia/large.bcf")]
   (run! (constantly nil) (vcf/read-variants-randomly r {:chr "chr1" :end 30000000} {}))))

;; before change
"Elapsed time: 7973.314958 msecs"

;; after change
"Elapsed time: 1139.505833 msecs"

codecov · 2023-12-05T02:19:38Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d68c01d) 88.33% compared to head (50870bd) 88.75%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #294      +/-   ##
==========================================
+ Coverage   88.33%   88.75%   +0.42%     
==========================================
  Files          81       83       +2     
  Lines        7028     7255     +227     
  Branches      495      515      +20     
==========================================
+ Hits         6208     6439     +231     
+ Misses        325      324       -1     
+ Partials      495      492       -3

matsutomo81

Thank you for working on this!
LGTM👍

alumi

LGTM 👍 Thanks!

athos self-assigned this Dec 5, 2023

Remove nth's linear search overhead in bcf reader

50870bd

athos force-pushed the fix/bcf-nth-overhead branch from 0330ac4 to 50870bd December 5, 2023 03:00

athos changed the base branch from feature/lsb-revamp to master December 5, 2023 03:00

athos marked this pull request as ready for review December 5, 2023 03:03

athos requested review from alumi and a team as code owners December 5, 2023 03:03

athos requested review from matsutomo81 and removed request for a team December 5, 2023 03:03

athos assigned alumi and matsutomo81 Dec 5, 2023

matsutomo81 approved these changes Dec 7, 2023

alumi approved these changes Dec 8, 2023

alumi merged commit d6d1af8 into master Dec 8, 2023
17 checks passed

alumi deleted the fix/bcf-nth-overhead branch December 8, 2023 01:48
