Support for DT tag to distinguish PCR and optical duplicates #38

Open
seboyden opened this issue Sep 27, 2018 · 6 comments

@seboyden

I'd like to request a feature analogous to the Picard MarkDuplicates TAGGING_POLICY option, where setting All will record the Duplicate Type (PCR or optical) in the optional DT tag, and OpticalOnly will only mark optical duplicates. It's often recommended to only mark optical duplicates on data from PCR-free library prep, which includes most WGS. Thanks!

@GregoryFaust
Owner

I agree that PCR-free WGS has become the norm, so I think this is a good suggestion. However, it requires samblaster to parse read IDs, which it does not currently do. I will strongly consider this feature for any upcoming major release of samblaster.
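To illustrate what parsing read IDs would involve, here is a minimal Python sketch, not samblaster code. It assumes colon-separated Illumina read names with either 7 fields (instrument:run:flowcell:lane:tile:x:y) or 5 fields (instrument:lane:tile:x:y), so the last four fields are always lane, tile, x, and y. The names `FlowCellLocation` and `parse_read_location` are hypothetical, and names with extra suffixes (such as `#0`) are simply rejected here.

```python
from typing import NamedTuple, Optional


class FlowCellLocation(NamedTuple):
    """Hypothetical container for where a cluster sits on the flow cell."""
    lane: int
    tile: int
    x: int
    y: int


def parse_read_location(qname: str) -> Optional[FlowCellLocation]:
    """Extract lane, tile, x, y from a colon-separated Illumina read name.

    Assumes 5-field or 7-field names; returns None for anything else,
    in which case the read cannot be classified as optical or not.
    """
    fields = qname.split(":")
    if len(fields) not in (5, 7):
        return None
    try:
        # In both supported layouts the last four fields are lane, tile, x, y.
        lane, tile, x, y = (int(f) for f in fields[-4:])
    except ValueError:
        return None
    return FlowCellLocation(lane, tile, x, y)
```

Reads whose names do not match would need a fallback, for example treating them as non-optical duplicates.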

@seboyden
Author

seboyden commented Oct 1, 2018 via email

@seboyden
Author

seboyden commented Jun 3, 2020

Any further consideration of adding optical duplicate marking?

@GregoryFaust
Owner

Yes, I have been thinking about how to do this, but it is difficult in a one-pass algorithm that samblaster must use to satisfy its primary usage scenario in a pipe. In particular, I have yet to imagine a solution that does not approximately double the amount of memory used by samblaster in order to keep track of the Illumina flow cell location for reads.
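For context on why the flow cell location must be kept per read, here is a rough Python sketch of the comparison step, under stated assumptions rather than as samblaster's design. When a later read matches an earlier read's alignment signature, the two locations are compared: same lane and tile, and x and y each within a pixel threshold, means optical. The threshold of 100 and the `LB`/`SQ` tag values follow Picard MarkDuplicates conventions. The function names are hypothetical, and the sketch assumes all reads come from a single flow cell.

```python
from typing import NamedTuple


class FlowCellLocation(NamedTuple):
    """Hypothetical per-read record that would have to be kept in memory."""
    lane: int
    tile: int
    x: int
    y: int


def is_optical_duplicate(a: FlowCellLocation, b: FlowCellLocation,
                         max_pixel_distance: int = 100) -> bool:
    """True if two duplicate reads sit close together on the same tile."""
    return (a.lane == b.lane
            and a.tile == b.tile
            and abs(a.x - b.x) <= max_pixel_distance
            and abs(a.y - b.y) <= max_pixel_distance)


def duplicate_type_tag(first_seen: FlowCellLocation,
                       duplicate: FlowCellLocation,
                       max_pixel_distance: int = 100) -> str:
    """Return the DT tag for a duplicate: SQ (optical) or LB (library/PCR)."""
    if is_optical_duplicate(first_seen, duplicate, max_pixel_distance):
        return "DT:Z:SQ"
    return "DT:Z:LB"
```

Because the first-seen read's location has to be stored alongside every signature until a possible duplicate arrives, the extra fields are where the additional memory would go in a one-pass design.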

@seboyden
Author

seboyden commented Jun 4, 2020

Thanks. I think 2X memory usage might be acceptable given that this would be optional, especially if the documentation/help warns about the increased memory.

@carsonhh

I've submitted a pull request with changes I made that would allow this. You should be able to add UMI support on top of that in just a few minutes.
