allow iterative cleaning #7

bwprice · 2025-01-15T09:37:09Z

Allow starting thresholds to be specified for the various options. After each run, the number of ambiguous bases is assessed, taking fragmentation into account (i.e. dropping too many reads fragments the alignment). If ambiguous bases remain, re-run with more stringent thresholds and repeat until all contaminant reads are removed. The aim is to start with thresholds that won't remove too many sequences, then adjust as needed without manual tweaking.

The main thresholds to consider are the AT ratio and the outlier percentile. Less stringent starting values, such as AT > 0.2 and an outlier percentile of 95%, could be followed by 0.1 and 90% respectively.

For example:
iteration 1 = AT of 0.2 lower and outlier 95%; if ambiguous bases remain, then:
iteration 2 = AT of 0.1 lower and outlier 90%; repeat as needed
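
A rough sketch of how the loop could look, purely to illustrate the request. Every name here (`Thresholds`, `CleanResult`, `iterative_clean`, `run_clean`) is hypothetical and not part of the existing code. It also assumes that a run reporting fragmentation should stop the loop and fall back to the previous result, which is only one possible way to handle it.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Thresholds:
    at_threshold: float        # hypothetical name; exact AT semantics not defined here
    outlier_percentile: float  # e.g. 95.0, then 90.0


@dataclass(frozen=True)
class CleanResult:
    ambiguous_bases: int  # ambiguous bases remaining after the run
    fragmented: bool      # True if too many reads were dropped


def iterative_clean(
    run_clean: Callable[[Thresholds], CleanResult],
    schedule: Sequence[Thresholds],
) -> Optional[Tuple[Thresholds, CleanResult]]:
    """Try each threshold set in order, from least to most stringent.

    Stops at the first run with no ambiguous bases. If a run fragments
    the alignment, stop and keep the last non-fragmented result
    (an assumption, not a confirmed design).
    """
    best = None
    for thresholds in schedule:
        result = run_clean(thresholds)
        if result.fragmented:
            break
        best = (thresholds, result)
        if result.ambiguous_bases == 0:
            break
    return best


# Schedule taken from the example values in this issue.
SCHEDULE = [
    Thresholds(at_threshold=0.2, outlier_percentile=95.0),
    Thresholds(at_threshold=0.1, outlier_percentile=90.0),
]
```

`run_clean` stands in for whatever function performs a single cleaning pass; the schedule could equally be generated from a start value and a step size rather than listed explicitly.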

bwprice added the enhancement (New feature or request) label on Jan 15, 2025
