Add --sam-omit-prim-seq #458
@ch4rr0 Could you please review?
Hello Igor, I will take a look today.
I am ok with the changes. My only question is why do this in bowtie2 as opposed to an AWK script, for example?
The resulting output file is huge, putting a lot of strain on the IO system.
CC @wasade
@BenLangmead, thoughts?
Just a reminder....
I think this kind of straightforward postprocessing is best left to awk and similar tools. Otherwise we accumulate too many command-line options that make later changes trickier. I know that this is in tension with the fact that Bowtie had the |
Unfortunately, |
Correct
Hi @BenLangmead, this option is valuable to our efforts with Qiita (https://qiita.ucsd.edu/). Qiita right now houses .sam output from 50-100k metagenomic samples, which are typically mapped against a few databases. The volume of data overall is large, and reprocessing occurs periodically. We currently post-process to reduce storage burden, but it would be an appreciable runtime improvement to avoid the significant IO needed to stage .sam temporarily for filtering.
I appreciate your comments; I suggest awk or mawk or similar should be a good expedient, or feel free to use a fork with your change. We do not plan to integrate this feature into the master branch.
Thanks, @BenLangmead! We appreciate the follow-up, and all of the incredible work that has gone, and continues to go, into bowtie2!
Add --sam-omit-prim-seq, with the same semantics as --omit-sec-seq but operating on primary alignments.
Addresses #457
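
For reference, the post-processing route suggested in the conversation could look roughly like the sketch below. It is not code from this PR. It assumes that "the same semantics as --omit-sec-seq" means writing `*` into the SEQ and QUAL columns, and that a "primary" record is one with neither the secondary (0x100) nor the supplementary (0x800) FLAG bit set, so unmapped reads are stripped as well. The names `omit_primary_seq` and `filter_stream` are hypothetical.

```python
from typing import TextIO

# SAM FLAG bits marking a record as non-primary (secondary or supplementary).
NON_PRIMARY = 0x100 | 0x800


def omit_primary_seq(line: str) -> str:
    """Return a SAM line with SEQ and QUAL set to '*' if it is a primary record."""
    if line.startswith("@"):
        return line  # header lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return line  # not a full alignment record; leave it alone
    if not int(fields[1]) & NON_PRIMARY:
        fields[9] = "*"   # SEQ column
        fields[10] = "*"  # QUAL column
    return "\t".join(fields) + newline


def filter_stream(src: TextIO, dst: TextIO) -> None:
    """Stream SAM text from src to dst, one record at a time."""
    for line in src:
        dst.write(omit_primary_seq(line))
```

Calling `filter_stream(sys.stdin, sys.stdout)` from a small script and piping the aligner's SAM output through it would avoid staging the full file on disk, though the unfiltered records still pass through the pipe, which is the overhead the PR aims to remove.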