How to set the seed when subsampling reads #441

Open
Al-Murphy opened this issue May 18, 2024 · 1 comment

Comments

@Al-Murphy

I have been using atac.subsample_reads, but I would like to set a seed for this analysis so I can ensure different reads are subsampled across different runs. Is there a way to do this? I can't see it listed in the parameters.

Thanks!

@Al-Murphy
Author

Just as an update: I tried hard-coding a different seed in subsample_ta_se in encode_lib_genomic.py (my data is single-ended). However, the resulting pooled bigwig file was exactly the same as when only setting atac.subsample_reads. So it appears that changing the seed has no effect. I'm not sure where else the seed is being set; I can't see it anywhere in the atac.wdl code.

Updated code:

# bash-only
    cmd = 'zcat -f {} | '
    if non_mito:
        # cmd += 'awk \'{{if ($1!="'+mito_chr_name+'") print $0}}\' | '
        cmd += 'grep -v \'^'+mito_chr_name+'\\b\' | '
    if subsample > 0:
        cmd += 'shuf -n {} --random-source=<(openssl enc -aes-256-ctr '
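        # CHANGED: hard-coded seed (101) as the passphrase; the original line
        # below derived it from the byte count of the decompressed input.
        # cmd += '-pass pass:$(zcat -f {} | wc -c) -nosalt '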
        cmd += '-pass pass:101 -nosalt '
        cmd += '</dev/zero 2>/dev/null) | '
        cmd += 'gzip -nc > {}'
        cmd = cmd.format(
            ta,
            subsample,
            ta_subsampled)
    else:
        cmd += 'gzip -nc > {}'
        cmd = cmd.format(
            ta,
            ta_subsampled)
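
For anyone who wants to try the same thing without editing the passphrase by hand each time, here is a minimal sketch of the command builder with the seed pulled out as an argument. `build_subsample_cmd` and its `seed` parameter are hypothetical names for illustration and are not part of the pipeline. The fallback when `seed` is None is assumed from the commented-out line above (the byte count of the decompressed input as the passphrase).

```python
import shlex


def build_subsample_cmd(ta, ta_subsampled, subsample=0, seed=None,
                        non_mito=False, mito_chr_name='chrM'):
    """Build the bash pipeline string that subsamples a TAGALIGN file.

    Hypothetical helper for illustration only. `seed` is an assumed
    parameter; the pipeline itself does not expose one.
    """
    cmd = 'zcat -f {} | '.format(shlex.quote(ta))
    if non_mito:
        # Drop reads on the mitochondrial chromosome.
        cmd += "grep -v '^{}\\b' | ".format(mito_chr_name)
    if subsample > 0:
        if seed is None:
            # Fallback taken from the commented-out original line:
            # use the decompressed byte count as the passphrase.
            passphrase = '$(zcat -f {} | wc -c)'.format(shlex.quote(ta))
        else:
            passphrase = str(int(seed))
        # shuf reads its randomness from a deterministic AES-CTR stream,
        # so the same passphrase should select the same reads.
        cmd += ('shuf -n {} --random-source=<(openssl enc -aes-256-ctr '
                '-pass pass:{} -nosalt </dev/zero 2>/dev/null) | '
                ).format(int(subsample), passphrase)
    cmd += 'gzip -nc > {}'.format(shlex.quote(ta_subsampled))
    return cmd


if __name__ == '__main__':
    # Print the command for two seeds to confirm the seed reaches the shell.
    for s in (101, 202):
        print(build_subsample_cmd('in.tagAlign.gz', 'out.tagAlign.gz',
                                  subsample=1000, seed=s))
```

The `<( ... )` process substitution is a bash feature, so the resulting string has to be run under bash (hence the "bash-only" comment). Printing the command for two different seeds is a quick way to confirm that the seed actually ends up in the passphrase before comparing the subsampled outputs.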
