[ADAM-436] Optionally output original qualities to fastq #467

Merged: 1 commit, Nov 7, 2014

Conversation

ryan-williams
Member

Some improvements to fastq-writing flow / adam2fastq:

  • optionally write out original qualities or "recalibrated" ones
  • run adam2fastq through the single-or-paired-fastq interface based on the number of file arguments passed
  • fix a bug where the projection was leaving out needed fields
  • add an optional additional sanity check when writing paired-fastq
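
The second bullet (choosing the single or paired writer from the number of file arguments) can be sketched roughly as below. This is not ADAM's code: `choose_fastq_mode` and its return values are hypothetical names used only to illustrate the dispatch.

```python
from typing import List


def choose_fastq_mode(output_paths: List[str]) -> str:
    """Pick a writing mode from the number of output paths (hypothetical helper).

    One path  -> all reads go to a single FASTQ file.
    Two paths -> first-of-pair and second-of-pair reads go to separate files.
    """
    if len(output_paths) == 1:
        return "single"
    if len(output_paths) == 2:
        return "paired"
    raise ValueError(
        f"expected 1 or 2 output paths, got {len(output_paths)}"
    )
```

Calling `choose_fastq_mode(["reads.fq"])` selects the single-file path, while passing two paths selects the paired one; any other count is rejected up front rather than guessed at.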

@fnothaft
Member

fnothaft commented Nov 7, 2014

+1, LGTM. Squash?

@ryan-williams
Member Author

squashed

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/377/

Build result: FAILURE

GitHub pull request #467 of commit 889e653 automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-slave-01 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/467/merge^{commit} # timeout=10
Checking out Revision 28c877b288071db8a061148f71e6f5e4f161ef14 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 28c877b288071db8a061148f71e6f5e4f161ef14
 > git rev-list a7a2569f68f2981425bb14e2fffc1bdbd8024b8c # timeout=10
Triggering ADAM-prb » 1.0.4,centos
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
ADAM-prb » 1.0.4,centos completed with result FAILURE
ADAM-prb » 2.2.0,centos completed with result FAILURE
ADAM-prb » 2.3.0,centos completed with result FAILURE
Test FAILed.

@ryan-williams
Member Author

huh, could be real failures, let me look

@ryan-williams
Member Author

right, FastqRecordConverter puts fastq qualities into AlignmentRecord.qual, which is asymmetric with my change here, where the writer defaults to AlignmentRecord.origQual.

I guess I'll have the writing code try to write (origQual|qual) and fall back to writing the other if the first choice doesn't exist...

@fnothaft @massie any thoughts on whether origQual or qual inherently makes more sense on each end? what stage are the "recalibrated" qualities typically computed at?
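
The "try one field, fall back to the other" idea in the comment above could look something like this. It is a sketch under assumptions, not the code in this PR: `Read` is a hypothetical stand-in for AlignmentRecord, and `prefer_original` is an invented flag name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Read:
    # Hypothetical stand-in for AlignmentRecord; only the fields used here.
    name: str
    sequence: str
    qual: Optional[str] = None       # current qualities
    orig_qual: Optional[str] = None  # original qualities, if they were kept


def qualities_to_write(read: Read, prefer_original: bool = False) -> str:
    """Return the preferred quality string, falling back to the other field."""
    if prefer_original:
        first, second = read.orig_qual, read.qual
    else:
        first, second = read.qual, read.orig_qual
    chosen = first if first is not None else second
    if chosen is None:
        raise ValueError(f"read {read.name!r} has no quality string at all")
    return chosen
```

With this shape, a read that came straight from a fastq file (only `qual` set) still writes out correctly even when original qualities are requested.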

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/379/

Build result: FAILURE

GitHub pull request #467 of commit 4df9b44 automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-slave-01 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/467/merge^{commit} # timeout=10
Checking out Revision 887c78f7f44a265ef8e256958d8a2937feb6f931 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 887c78f7f44a265ef8e256958d8a2937feb6f931
 > git rev-list a7a2569f68f2981425bb14e2fffc1bdbd8024b8c # timeout=10
Triggering ADAM-prb » 1.0.4,centos
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
ADAM-prb » 1.0.4,centos completed with result FAILURE
ADAM-prb » 2.2.0,centos completed with result FAILURE
ADAM-prb » 2.3.0,centos completed with result FAILURE
Test FAILed.

@fnothaft
Member

fnothaft commented Nov 7, 2014

@ryan-williams I'd go with qual. origQual will only be populated if you've run BQSR before. There's not necessarily a clear reason to align reads, run BQSR, and then realign the reads, but I'm sure someone's done it before.

@ryan-williams
Member Author

thanks, that makes sense

- unify code-paths for single-/paired-fastq writing

  also use recalibrated qualities

- shorten `-validation` argument to adam2fastq

- add first/second-in-pair to adam2fastq projection

  these are necessary for outputting “/1”/“/2”!

- optional extra pair-checking to adam2fastq

- add extra check to fastq test suite

- whitespaces
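
To illustrate why the first/second-in-pair fields matter for the "/1" and "/2" suffixes, and what an extra pair check might verify, here is a rough sketch. It is not the code from this PR: `PairedRead`, `to_fastq`, and `find_unpaired_names` are hypothetical names, and the check shown (each name seen exactly once as first and once as second) is an assumption about what such a sanity check could do.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PairedRead:
    # Hypothetical record; field names are illustrative only.
    name: str
    sequence: str
    qual: str
    first_of_pair: bool
    second_of_pair: bool


def to_fastq(read: PairedRead) -> str:
    """Format one four-line FASTQ record, suffixing the name with /1 or /2."""
    if read.first_of_pair:
        suffix = "/1"
    elif read.second_of_pair:
        suffix = "/2"
    else:
        suffix = ""  # without the pair flags there is nothing to append
    return f"@{read.name}{suffix}\n{read.sequence}\n+\n{read.qual}\n"


def find_unpaired_names(reads: Iterable[PairedRead]) -> List[str]:
    """Return names that do not occur exactly once as first and once as second."""
    seen = defaultdict(list)
    for r in reads:
        seen[r.name].append((r.first_of_pair, r.second_of_pair))
    expected = [(False, True), (True, False)]
    return sorted(name for name, flags in seen.items() if sorted(flags) != expected)
```

If the pair flags are projected away, every read falls into the empty-suffix branch, which is the symptom the projection fix addresses.
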
@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/380/

Build result: FAILURE

GitHub pull request #467 of commit 2716937 automatically merged.
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-slave-01 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/467/merge^{commit} # timeout=10
Checking out Revision a34074082d54668273cacb5528d8aaed1b9c6ea8 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a34074082d54668273cacb5528d8aaed1b9c6ea8
 > git rev-list 887c78f7f44a265ef8e256958d8a2937feb6f931 # timeout=10
Triggering ADAM-prb » 1.0.4,centos
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
ADAM-prb » 1.0.4,centos completed with result FAILURE
ADAM-prb » 2.2.0,centos completed with result FAILURE
ADAM-prb » 2.3.0,centos completed with result FAILURE
Test FAILed.

@ryan-williams
Member Author

hrm, it seems like jenkins ran the previous SHA somehow? the line numbers it's giving don't even match up to the commit 2716937

@ryan-williams
Member Author

@fnothaft can you ask jenkins to retest this?

@fnothaft
Member

fnothaft commented Nov 7, 2014

Jenkins, retest this please.

@ryan-williams
Member Author

also, since you asked me to squash this earlier, I went ahead and wrote a script to squash my commits together that includes all the commit messages, bulleted and indented :)
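
The script itself isn't shown in this thread. As a sketch of just the formatting step, modeled on the bulleted, indented commit message earlier in the conversation, something like the following would work; `squash_message` is a hypothetical helper, and collecting the messages from git is left out.

```python
import textwrap
from typing import List


def squash_message(title: str, messages: List[str]) -> str:
    """Combine commit messages into one: a title, then one bullet per commit.

    Each commit's subject line becomes a "- " bullet; any body text is
    indented two spaces beneath it, separated by a blank line.
    """
    parts = [title, ""]
    for msg in messages:
        subject, _, body = msg.strip().partition("\n")
        parts.append(f"- {subject.strip()}")
        body = body.strip()
        if body:
            parts.append("")
            parts.append(textwrap.indent(body, "  "))
        parts.append("")
    return "\n".join(parts).rstrip() + "\n"
```

The result can be passed to `git commit -F` or pasted in as the message of the squashed commit.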

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/381/
Test PASSed.

@ryan-williams
Member Author

huzzah!

@ryan-williams
Member Author

weird that it ran the old code right after I pushed the new code...

@fnothaft
Member

fnothaft commented Nov 7, 2014

huzzah indeed!

> weird that it ran the old code right after I pushed the new code...

I think I kicked off the new build (by asking Jenkins) before you pushed the updated code.

fnothaft added a commit that referenced this pull request Nov 7, 2014
[ADAM-436] Optionally output original qualities to fastq
fnothaft merged commit 198e00d into bigdatagenomics:master Nov 7, 2014
@fnothaft
Member

fnothaft commented Nov 7, 2014

Merged! Thanks @ryan-williams.


3 participants