Ran mash on a collection of reference 16S rRNA sequences (3.6Gb) and got this warning while sketching:
WARNING: For the k-mer size used (21), the random match probability (0.000726951) is above
the specified warning threshold (0.01) for the sequence "..rdp/current_Bacteria_unaligned.fa" of
size 18446744072614073515. Distances to this sequence may be underestimated as a result.
To meet the threshold of 0.01, a k-mer size of at least 20 is required.
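It seems the message should not have been displayed at all (the reported probability is below the threshold, and the k-mer size of 21 already exceeds the suggested minimum of 20), and/or an integer overflow happened when calculating the size of the reference.
My suspicion is that there may be an assumption that the reference FASTA will always contain a single sequence?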
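To back up the overflow suspicion, here is a quick sanity check on the reported size. It is only a sketch: it assumes (without having checked the Mash source) that the length was held in a signed 32-bit integer at some point and then widened to an unsigned 64-bit value for printing. The helper name `as_signed_32` is made up for illustration.

```python
def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low >= (1 << 31) else low


# Size printed in the warning above.
reported = 18446744072614073515

# Low 32 bits: the length a wrapped 32-bit counter would correspond to
# (assumption: the true length is below 2**32).
low32 = reported & 0xFFFFFFFF
print(low32)  # 3199489195

# The same bits read as a signed 32-bit value are negative.
signed = as_signed_32(reported)
print(signed)  # -1095478101

# Sign-extending that negative value to unsigned 64 bits gives back
# exactly the number printed in the warning.
print((signed + (1 << 64)) == reported)  # True
```

A length of 3,199,489,195 bases is plausible for a 3.6Gb FASTA file once headers and newlines are excluded, so the printed size looks like a sign-extended negative 32-bit value rather than a real length.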
I just got the same thing on a massive collection of S. aureus genomes:
WARNING: For the k-mer size used (21), the random match probability (0.000641033) is above the specified warning threshold (0.01) for the sequence "saureus" of size 18446744072235684419. Distances to this sequence may be underestimated as a result. To meet the threshold of 0.01, a k-mer size of at least 20 is required.
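For what it's worth, the probability in this warning can be reproduced from the low 32 bits of the printed size. The sketch below assumes the random match probability is n / (n + 4^k) for a sequence of length n over a 4-letter alphabet. That formula and the function name `random_match_probability` are my assumptions, not taken from the Mash source.

```python
def random_match_probability(length: int, k: int, alphabet_size: int = 4) -> float:
    """Assumed chance that a random k-mer occurs in a sequence of `length`."""
    kmer_space = alphabet_size ** k
    return length / (length + kmer_space)


# Size printed in the S. aureus warning above.
reported = 18446744072235684419

# Assumed true length: the low 32 bits of the printed size.
length = reported & 0xFFFFFFFF
print(length)  # 2821100099

p = random_match_probability(length, k=21)
print(f"{p:.6g}")

# Under these assumptions the probability is far below the 0.01 threshold,
# so the warning condition should not have been met.
print(p < 0.01)  # True
```

Under these assumptions the result agrees with the 0.000641033 printed in the warning, which suggests the probability itself was computed from the correct length and only the size and the threshold comparison are affected.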