Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract depth algorithm and improve performance. #91

Merged
merged 1 commit into from
Jul 28, 2017
Merged

Conversation

alumi
Copy link
Member

@alumi alumi commented Jul 28, 2017

Changes

  • Extracted and renamed cljam.algo.pileup.pileup/pileup function to cljam.algo.depth/lazy-depth.
  • Added eager-version cljam.algo.depth/depth optimized for performance.
  • Added some utility functions to manipulate region.

Benchmark

Computing depth for chr1:1-248956422 BAM file containing about 3500000 alignments in the region.

Environment
  • CPU
    • Corei7-6700K
    • 4.00GHz
    • 4 cores
    • 8 threads
  • JVM
    • HotSpot 1.8.0_131
    • MaxHeapSize 16861102080
  • criterium 0.4.4
(with-open [r (sam/reader "BAM_FILE")]
  (count (depth/FN r {:chr "chr1", :start 1, :end 248956422} {:n-threads 8, OPTS})))
Results
fn time
lazy-depth 83934ms
depth 7848ms
depth unchecked 6500ms

The eager version is >10x faster.

@codecov
Copy link

codecov bot commented Jul 28, 2017

Codecov Report

Merging #91 into master will increase coverage by 0.22%.
The diff coverage is 76.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #91      +/-   ##
==========================================
+ Coverage   82.71%   82.93%   +0.22%     
==========================================
  Files          60       60              
  Lines        4107     4161      +54     
  Branches      437      452      +15     
==========================================
+ Hits         3397     3451      +54     
+ Misses        273      258      -15     
- Partials      437      452      +15
Impacted Files Coverage Δ
src/cljam/algo/pileup.clj 100% <ø> (ø) ⬆️
src/cljam/tools/cli.clj 53.57% <50%> (-0.87%) ⬇️
src/cljam/algo/depth.clj 73.8% <73.8%> (ø)
src/cljam/util.clj 95.77% <96%> (+0.12%) ⬆️
src/cljam/io/sam.clj 100% <0%> (+2.22%) ⬆️
src/cljam/io/bam/reader.clj 76.66% <0%> (+2.66%) ⬆️
src/cljam/io/bam/decoder.clj 65.95% <0%> (+6.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c6b1e2c...42724df. Read the comment docs.

Copy link
Member

@totakke totakke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome improvement. Almost good. Please just remove unused vars.

[bam-reader {:keys [chr start end] :or {start 1 end Long/MAX_VALUE}}
& [{:keys [step unchecked? n-threads] :or {step default-step unchecked? false n-threads 1}}]]
{:pre [chr start end (pos? start) (pos? end) (<= start end)]}
(when-let [{:keys [len] :as r} (sam-util/ref-by-name (sam/read-refs bam-reader) chr)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r is never used.

"Returns a position of first alignment in left-right, or nil."
[bam-reader region]
(plp/first-pos bam-reader region))

(def ^:private default-pileup-option
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default-pileup-option is no longer needed.

@alumi
Copy link
Member Author

alumi commented Jul 28, 2017

Thank you for reviewing! I've just removed the unused vars.

@totakke totakke merged commit 12c5cf0 into master Jul 28, 2017
@totakke totakke deleted the feature/depth branch July 28, 2017 08:40
@totakke
Copy link
Member

totakke commented Jul 28, 2017

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants