allow iterative cleaning #7

Open
bwprice opened this issue Jan 15, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@bwprice
Collaborator

bwprice commented Jan 15, 2025

Allow starting thresholds to be specified for the various options. After each run, the number of ambiguous bases is assessed, taking fragmentation into account (i.e. dropping too many reads fragments the alignment). If ambiguous bases remain, re-run with stricter thresholds and repeat until all contaminant reads are removed. The aim is to start with thresholds that won't remove too many sequences, then adjust as needed without manual tweaking.

The main thresholds to consider are the AT ratio and the outlier percentile. Less stringent starting values, such as AT > 0.2 and an outlier percentile of 95%, could be followed by 0.1 and 90%, respectively.

For example:
iteration 1 = AT of 0.2 lower and outlier 95%; if ambiguous bases remain, then:
iteration 2 = AT of 0.1 lower and outlier 90%; repeat
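
A rough sketch of how the loop could look, just to make the idea concrete. Everything named here is hypothetical and not part of the existing code: `Thresholds`, `iterative_clean`, the `clean_once` and `count_ambiguous` callables, and `min_kept_fraction`. Two assumptions are baked in: each iteration re-runs cleaning on the original reads rather than on the previous output, and fragmentation is approximated by the fraction of reads kept, which may well be too crude for the real check.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple, TypeVar

Read = TypeVar("Read")


@dataclass(frozen=True)
class Thresholds:
    """One step of the schedule (hypothetical field names)."""
    at_difference: float       # AT threshold, e.g. 0.2 then 0.1
    outlier_percentile: float  # outlier percentile, e.g. 95 then 90


# Schedule from the example above: least stringent first.
DEFAULT_SCHEDULE = (Thresholds(0.2, 95.0), Thresholds(0.1, 90.0))


def iterative_clean(
    reads: Sequence[Read],
    clean_once: Callable[[Sequence[Read], Thresholds], List[Read]],
    count_ambiguous: Callable[[Sequence[Read]], int],
    schedule: Sequence[Thresholds] = DEFAULT_SCHEDULE,
    min_kept_fraction: float = 0.5,
) -> Tuple[List[Read], List[Tuple[Thresholds, int]]]:
    """Re-run cleaning with stricter thresholds until no ambiguous bases remain.

    Returns the kept reads and a history of (thresholds, ambiguous bases left).
    """
    current = list(reads)
    history: List[Tuple[Thresholds, int]] = []
    for step in schedule:
        # Assumption: every iteration starts again from the original reads.
        candidate = clean_once(reads, step)
        # Crude fragmentation guard (assumption): stop before accepting a
        # result that drops too many reads, and keep the previous result.
        if reads and len(candidate) / len(reads) < min_kept_fraction:
            break
        current = candidate
        remaining = count_ambiguous(current)
        history.append((step, remaining))
        if remaining == 0:
            break  # nothing ambiguous left, no need to tighten further
    return current, history
```

The existing cleaning step and the ambiguous-base count would be passed in as the two callables, and the schedule could be exposed as a user option so the starting thresholds stay configurable.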

@bwprice bwprice added the enhancement New feature or request label Jan 15, 2025