dedup deterministic - use a seed? #362

Closed
RichardCorbett opened this issue Aug 14, 2019 · 3 comments

Comments

@RichardCorbett

Hi there,

When I run this command multiple times:
{code}
umi_tools dedup -I myBam.bam --extract-umi-method=tag --umi-tag=RX --read-length -S myBam.umi_tools.bam
{code}

I get the same number of reads returned each time, but the exact reads returned each time are different. Is there a way to get the same reads with each run?

Thanks,
Richard

@IanSudbery
Member

To use a seed with umi-tools, use the --random-seed= option.

You might also need to set the PYTHONHASHSEED environment variable so that hashes are returned in the same order on every run, although I think the sorts should take care of that these days.
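For anyone wondering what PYTHONHASHSEED actually changes, here is a small sketch that is independent of umi_tools. It assumes a `python3` interpreter is on the PATH, and the string "ACGT" is just an illustrative stand-in for a UMI. With the variable fixed, two separate interpreter runs should report the same string hash, which is what keeps hash-dependent ordering stable between runs.

```shell
#!/usr/bin/env bash
# Assumption: python3 is installed and on the PATH.
# Fix the hash seed so str hashes are reproducible across interpreter runs.
export PYTHONHASHSEED=0

# Hash the same illustrative string in two separate Python processes.
first=$(python3 -c 'print(hash("ACGT"))')
second=$(python3 -c 'print(hash("ACGT"))')

if [ "$first" = "$second" ]; then
    echo "same hash"
else
    echo "different hash"
fi
```

Without the export, Python randomizes string hashes per process by default, so the two values would normally differ.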

This is how we run our tests:

#!/usr/bin/env bash
export PYTHONHASHSEED=0
umi_tools dedup --random-seed=123456789 .......
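One way to check that the seed is doing its job is to run the same command twice and compare the outputs. The sketch below reuses the options from the original question plus --random-seed. The helper `same_checksum` and the output names `run1.bam` and `run2.bam` are hypothetical, and the script only attempts the real runs if `umi_tools` and `myBam.bam` are present. It also assumes that identical runs produce byte-identical BAM files, which may not hold if the header records per-run details.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the two files have identical checksums.
same_checksum() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# Only attempt the real runs if umi_tools is installed and the input exists.
if command -v umi_tools >/dev/null 2>&1 && [ -f myBam.bam ]; then
    export PYTHONHASHSEED=0
    for i in 1 2; do
        # Same options as in the question, with a fixed random seed added.
        umi_tools dedup -I myBam.bam --extract-umi-method=tag --umi-tag=RX \
            --read-length --random-seed=123456789 -S "run${i}.bam"
    done
    if same_checksum run1.bam run2.bam; then
        echo "runs match"
    else
        echo "runs differ"
    fi
fi
```

If the files differ only in the header, comparing the records themselves (for example the output of `samtools view` on each file) would be a fairer test.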

@RichardCorbett
Author

Perfect. Thanks. Sorry I somehow missed this myself.

@IanSudbery
Member

I'm closing this. Please reopen if you need more advice.
