Sourmash ANI estimate in some cases does not match manual computation, although using the same sketch signature #1
Comments
@bluegenes is this because of some checks of the sketch size or the likelihood of corner cases? Please note that I do not use the
Hi @mahmudhera, The I think the issue you're seeing is related to sourmash-bio/sourmash#2003, where we zero out the ANI when the sketch size estimation may be inaccurate. I've been noticing the same thing that you're seeing here -- this is happening quite often (see sourmash-bio/sourmash#2058 to see the original verbose output from these checks). Are we being too strict with size accuracy estimation checks?
Hi @bluegenes, I have rerun the script with the Therefore, just for the discrepancies in this repository (which can be seen here), I believe the issue is the hardcoded thresholds, not the size estimation being too stringent.
Running the script
python main.py inputs/ecoli.fasta --seed 0 --scalef 0.01
produces the results in the file ani_comparison_results. The results show a disagreement between the manual calculation of the point estimate and the sourmash estimate when the true ANI is <= 66%.
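For readers following along, here is a minimal sketch of the kind of manual point estimate being compared. It assumes the simple mutation model in which each base mutates independently, so the containment between two k-mer sets is roughly ANI^k. The function names and the k-mer size of 21 are illustrative assumptions; they are not taken from main.py or from the sourmash code base.

```python
def containment_to_ani(containment: float, ksize: int) -> float:
    """Point estimate of ANI from a containment index.

    Assumes the simple mutation model: a k-mer survives only if none of
    its k bases mutate, so containment ~= ANI ** ksize.
    """
    if not 0.0 <= containment <= 1.0:
        raise ValueError("containment must be in [0, 1]")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    return containment ** (1.0 / ksize)


def ani_to_containment(ani: float, ksize: int) -> float:
    """Inverse of the above: expected containment for a given ANI."""
    if not 0.0 <= ani <= 1.0:
        raise ValueError("ani must be in [0, 1]")
    if ksize < 1:
        raise ValueError("ksize must be a positive integer")
    return ani ** ksize


if __name__ == "__main__":
    ksize = 21  # assumed k-mer size, for illustration only
    for ani in (0.95, 0.80, 0.66):
        c = ani_to_containment(ani, ksize)
        print(f"ANI={ani:.2f} -> expected containment={c:.3e} "
              f"-> recovered ANI={containment_to_ani(c, ksize):.2f}")
```

Under this model the expected containment at low ANI is very small, so a downsampled sketch shares only a handful of hashes with the other genome. That is the regime where a threshold-based check could plausibly replace the point estimate with zero, which would be consistent with the disagreement reported above.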