[MRG] do not report untrusted jaccard ANI #2011

Merged: 5 commits, May 2, 2022

@bluegenes (Contributor) commented Apr 30, 2022

fixes #2004

This PR modifies jaccardANIResult so that it returns an ANI value only when the Jaccard error does not exceed the recommended threshold. As a result, no Jaccard ANI function (minhash, signature, compare, search, etc.) receives an ANI value that is not trusted.

I haven't enabled forcibly returning ANI in minhash or signature, because the distance is still returned: if you really want the untrustworthy ANI anyway, you can compute it as 1 - distance.
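
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `JaccardANISketch`, its field names, and the default threshold value are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JaccardANISketch:
    """Hypothetical stand-in for a Jaccard-based ANI result (not sourmash's class)."""

    dist: float                    # estimated distance, always reported
    jaccard_error: float           # error estimate for the Jaccard-derived value
    error_threshold: float = 1e-4  # placeholder threshold, assumed for illustration

    @property
    def trusted(self) -> bool:
        # "Does not exceed the threshold" means less than or equal to it.
        return self.jaccard_error <= self.error_threshold

    @property
    def ani(self) -> Optional[float]:
        # Report ANI only when the estimate is trusted; otherwise return None.
        return 1 - self.dist if self.trusted else None


ok = JaccardANISketch(dist=0.05, jaccard_error=1e-5)
bad = JaccardANISketch(dist=0.05, jaccard_error=1e-2)

print(ok.ani)        # an ANI value, since the error is within the threshold
print(bad.ani)       # None, since the error exceeds the threshold
print(1 - bad.dist)  # the untrusted ANI, recovered manually from the distance
```

Returning `None` rather than raising keeps callers such as compare or search free to decide how to display a missing value, while the distance stays available for anyone who wants the untrusted estimate.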

codecov bot commented Apr 30, 2022

Codecov Report

Merging #2011 (a06e43f) into latest (d425f6f) will increase coverage by 7.52%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           latest    #2011      +/-   ##
==========================================
+ Coverage   84.14%   91.67%   +7.52%     
==========================================
  Files         129       98      -31     
  Lines       15082    10807    -4275     
  Branches     2118     2119       +1     
==========================================
- Hits        12691     9907    -2784     
+ Misses       2095      604    -1491     
  Partials      296      296              
| Flag | Coverage Δ |
|---|---|
| python | 91.67% <100.00%> (+<0.01%) ⬆️ |
| rust | ? |


| Impacted Files | Coverage Δ |
|---|---|
| src/sourmash/search.py | 97.89% <ø> (-0.01%) ⬇️ |
| src/sourmash/distance_utils.py | 99.35% <100.00%> (+0.02%) ⬆️ |
| src/core/src/ffi/storage.rs | |
| src/core/src/errors.rs | |
| src/core/src/index/mod.rs | |
| src/core/src/index/linear.rs | |
| src/core/src/ffi/mod.rs | |
| src/core/src/sketch/nodegraph.rs | |
| src/core/src/ffi/index/mod.rs | |
| src/core/src/ffi/cmd/compute.rs | |

... and 23 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@bluegenes (Contributor, Author) commented

ready for review @ctb @sourmash-bio/devs

@bluegenes changed the title from "[WIP] do not report untrusted jaccard ANI" to "[MRG] do not report untrusted jaccard ANI" on May 2, 2022
@ctb merged commit adf35ea into latest on May 2, 2022
@ctb deleted the jaccard-ani-untrustworthy branch on May 2, 2022, 21:59
Successfully merging this pull request may close these issues.

What to do about untrustworthy jaccard --> ANI estimates?