Modifying interleaved fastq format to be hadoop version independent. #289

Merged 1 commit on Jul 5, 2014


fnothaft
Member

@fnothaft fnothaft commented Jul 4, 2014

Final modifications. Removed the isSplitable override, because its method signature depends on the Hadoop version (specifically, the FileInputFormat API changes between Hadoop 2.2 and 2.3). We now inherit the base FileInputFormat implementation directly, which always returns true (i.e., the file can be split). As a consequence, we must disallow compressed interleaved FASTQ files. This restriction is acceptable because compressed files cannot be split, and the interleaved FASTQ format (our own ad hoc definition of a file format) exists only to make splitting simpler.
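For illustration, here is a minimal sketch of how the restriction described above could be enforced before a file reaches the input format. It is not the code from this pull request: the class `InterleavedFastqGuard`, its methods, the suffix list, and the example file names are all hypothetical, and it assumes compressed inputs can be recognized by file extension alone. A Hadoop-based version would more likely look up a codec for the path (for example via `CompressionCodecFactory`) instead of matching suffixes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of ADAM or Hadoop.
final class InterleavedFastqGuard {

    // Assumed list of suffixes that indicate a compressed file.
    private static final List<String> COMPRESSED_SUFFIXES =
        Arrays.asList(".gz", ".bz2", ".snappy", ".lz4", ".deflate");

    private InterleavedFastqGuard() {}

    // Returns true if the path ends with one of the assumed compression suffixes.
    static boolean looksCompressed(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        for (String suffix : COMPRESSED_SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    // Rejects compressed inputs up front, since the input format is assumed
    // to treat every file as splittable.
    static void requireUncompressed(String path) {
        if (looksCompressed(path)) {
            throw new IllegalArgumentException(
                "Compressed interleaved FASTQ is not supported: " + path);
        }
    }
}
```

Calling `InterleavedFastqGuard.requireUncompressed(path)` before submitting a job would fail fast on a compressed input, rather than letting a splittable-by-default input format hand out splits that cannot be decoded independently.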

Also, I noticed that the adam-core module still incorrectly depended on the old adam-format submodule. This was not causing tests to fail because Sonatype still contains a snapshot of adam-format. We should probably delete that snapshot...

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/69/

massie added a commit that referenced this pull request Jul 5, 2014
Modifying interleaved fastq format to be hadoop version independent.
@massie massie merged commit 40ccd19 into bigdatagenomics:master Jul 5, 2014
@massie
Member

massie commented Jul 5, 2014

Thanks, Frank!

3 participants