Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add protocols for I/O APIs. #78

Merged
merged 1 commit into from
May 29, 2017
Merged

Add protocols for I/O APIs. #78

merged 1 commit into from
May 29, 2017

Conversation

alumi
Copy link
Member

@alumi alumi commented May 26, 2017

This PR includes breaking changes

Summary

This PR adds new protocols for I/O classes for the sake of

  • Abstraction of concrete implementation of file formats
  • Improving consistency of APIs
  • Convenience for REPL use

Changes

  • Add protocols to cljam.io
  • Modify readers/writers to conform to the protocols
  • Modify arguments to take region map like {:chr "chr1", :start 1, :end 3} and option as another map if required.
  • Add util function for manipulating region map.
  • Use unique aliases: cio for clojure.java.io and io for cljam.io
  • Replace vararg [& {:keys [...]}] with map [{:keys [...]}]
  • Arity overloadings default to call same fn with larger arity

Problems

read-in-region reads contents in certain region.
In current implementation, readers without index (SAM, VCF...) are still supporting the function by filter.
Even though they will warn about performance, I'm still thinking if we should disable this.

@codecov
Copy link

codecov bot commented May 26, 2017

Codecov Report

Merging #78 into master will decrease coverage by 0.68%.
The diff coverage is 87.19%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #78      +/-   ##
==========================================
- Coverage   82.76%   82.07%   -0.69%     
==========================================
  Files          59       59              
  Lines        3840     3922      +82     
  Branches      422      423       +1     
==========================================
+ Hits         3178     3219      +41     
- Misses        240      280      +40     
- Partials      422      423       +1
Impacted Files Coverage Δ
src/cljam/fasta.clj 100% <ø> (ø) ⬆️
src/cljam/bam/writer.clj 73.13% <ø> (ø) ⬆️
src/cljam/sam.clj 100% <ø> (ø) ⬆️
src/cljam/bam/reader.clj 71.96% <ø> (-12.88%) ⬇️
src/cljam/bam_index/writer.clj 74.87% <ø> (ø) ⬆️
src/cljam/bam_indexer.clj 100% <100%> (ø) ⬆️
src/cljam/bam_index/core.clj 70.17% <100%> (ø) ⬆️
src/cljam/bcf/writer.clj 86.7% <100%> (+0.07%) ⬆️
src/cljam/util/chromosome.clj 100% <100%> (ø) ⬆️
src/cljam/fasta/core.clj 100% <100%> (ø) ⬆️
... and 27 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7c8d955...60f08b0. Read the comment docs.

Copy link
Member

@totakke totakke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! It's better abstraction. I've added some comments about format. Please check them.

src/cljam/io.clj Outdated
(reader-path [this]
"Returns the file's absolute path.")
(read [this] [this option]
"Sequentially reads cotents of the file."))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cotents -> contents

src/cljam/io.clj Outdated
"Returns the file's absolute path."))

(defprotocol IRandomReader
(read-in-region [this {:keys [chr start end]}] [this {:keys [chr start end]} option]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use region consistently instead of {:keys [chr start end]} in protocol definition.


(defn divide-region
"Divides a region [start end] into several chunks with maximum length 'step'.
Returns a lazy sequence of vector."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent of the second line is incorrect.

(defn divide-region
  "Divides a region [start end] into several chunks with maximum length 'step'.
  Returns a lazy sequence of vector."
  ...)

is correct.

c.f. clojure-style-guide#align-docstring-lines

@totakke
Copy link
Member

totakke commented May 29, 2017

read-in-region reads contents in certain region.
In current implementation, readers without index (SAM, VCF...) are still supporting the function by filter.
Even though they will warn about performance, I'm still thinking if we should disable this.

Readers w/o index should not implement IRandomReader because they are not suitable for random access. I think the following possible solutions:

  1. Renaming IRandomReader to IRegionReader.
    • IRegionReader does not mention random access.
  2. Providing IRegionReader and IRandomRegionReader.
    • Readers with index implement IRandomRegionReader and reader w/o index implement IRegionReader. If so, users can detect them with satisfies?.
  3. Readers w/o index do not implement IRandomReader.

@totakke
Copy link
Member

totakke commented May 29, 2017

Oh, I forgot to add one comment. Please unstage change of README.md. I will fix README.md when releasing new version.

@alumi
Copy link
Member Author

alumi commented May 29, 2017

Thank you for your review!
I've fixed the docs and the arg-list.

For read-in-region, I've chosen the first option.

  1. Renaming IRandomReader to IRegionReader.
    IRegionReader does not mention random access.

Commits are rebased so that README.md is not changed now.

@totakke
Copy link
Member

totakke commented May 29, 2017

Thanks! Please wait merge until v0.3.1 release because of breaking changes.

@totakke totakke merged commit 7da4841 into master May 29, 2017
@totakke totakke deleted the feature/io-protocols branch May 29, 2017 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants