Distinct parameter not documented #20

Open
gallardoalba opened this issue Apr 29, 2021 · 1 comment
Comments

gallardoalba commented Apr 29, 2021

As far as I can tell, the distinct parameter, which is used to generate the high-frequency k-mer dataset required by winnowmap, is not documented anywhere.

  • Mapping ONT or PacBio-hifi WGS reads
meryl count k=15 output merylDB ref.fa
meryl print greater-than distinct=0.9998 merylDB > repetitive_k15.txt
winnowmap -W repetitive_k15.txt -ax map-ont ref.fa ont.fq.gz > output.sam 
@gallardoalba gallardoalba changed the title Distinct parameter Distinct parameter not documented Apr 29, 2021
dgordon562 commented

Does anyone know what the distinct parameter means? More specifically, what does
meryl print greater-than distinct=0.9998 merylDB
mean?
I believe that "meryl print greater-than 2" would print all k-mers that are present in the reads more than twice.
But what does distinct=0.9998 mean? Is it some sort of fraction, and if so, of what? A wild guess: print k-mers that occur once in the reads (i.e. are distinct), but only if such k-mers make up 0.9998 of all k-mers in the reads.
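
One reading that would fit the winnowmap use case (collecting only the most frequent k-mers) is that distinct=f is turned into a count threshold: the count at or below which a fraction f of the distinct k-mers fall. Under that reading, greater-than distinct=0.9998 would keep roughly the top 0.02% most frequent distinct k-mers. This is an assumption, not something confirmed in this thread or checked against the meryl source. A minimal Python sketch of that reading, with hypothetical helper names (count_kmers, distinct_threshold, greater_than_distinct) and made-up toy counts:

```python
import math
from collections import Counter


def count_kmers(seq, k):
    """Count every k-mer in seq (forward strand only, a simplification)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def distinct_threshold(counts, fraction):
    """ASSUMED meaning of distinct=fraction: the smallest count t such that
    at least `fraction` of the distinct k-mers have a count <= t."""
    ordered = sorted(counts.values())
    # Number of distinct k-mers that must sit at or below the threshold.
    need = max(1, math.ceil(fraction * len(ordered)))
    return ordered[need - 1]


def greater_than_distinct(counts, fraction):
    """Keep the k-mers whose count is strictly above the assumed threshold."""
    threshold = distinct_threshold(counts, fraction)
    return {kmer: c for kmer, c in counts.items() if c > threshold}


if __name__ == "__main__":
    # Made-up counts for eight distinct 3-mers; not real meryl output.
    toy = {"AAA": 1, "AAC": 1, "ACG": 2, "CGT": 2,
           "GTA": 3, "TAC": 3, "GGG": 7, "TTT": 50}
    print(distinct_threshold(toy, 0.75))
    print(greater_than_distinct(toy, 0.75))
```

With these toy counts, a fraction of 0.75 covers six of the eight distinct k-mers, which gives a threshold of 3, so only GGG and TTT are kept. If this reading is right, the value is a fraction of distinct k-mers rather than of all k-mer occurrences, which is different from the guess above.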
