add multipeek functionality to report equal gather matches #1615

Open
ctb opened this issue Jun 20, 2021 · 1 comment

ctb commented Jun 20, 2021

The big CounterGather refactor in #1489 left us with a fairly simple API for reporting gather matches - peek tells you what the next match could be, while consume pulls it off the counter.

However, there's some interest in reporting equivalent matches - see #1366 and #278. Unfortunately peek can't do this, as it returns only a single match at a time. I think we could add a function multipeek that returns all of the equivalently scoring matches.
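To make the idea concrete, here is a rough sketch of how multipeek could sit next to peek and consume. This is not the CounterGather code from #1489: the class `ToyCounter`, its attributes, and the return shapes are all hypothetical, and a match's score is assumed to be the size of its overlap with the query hashes that are still unassigned.

```python
class ToyCounter:
    """Hypothetical stand-in for a gather counter; NOT sourmash's CounterGather."""

    def __init__(self, query_hashes):
        self.remaining = set(query_hashes)  # query hashes not yet assigned
        self.matches = {}                   # match name -> set of hashes

    def add(self, name, hashes):
        self.matches[name] = set(hashes)

    def _scores(self):
        # Assumed score: overlap with the remaining query hashes.
        return {n: len(h & self.remaining) for n, h in self.matches.items()}

    def peek(self):
        """Return one best (name, score) pair, or None if nothing overlaps."""
        scores = self._scores()
        if not scores:
            return None
        best = max(scores, key=scores.get)
        return (best, scores[best]) if scores[best] else None

    def multipeek(self):
        """Return every (name, score) pair tied for the top score."""
        scores = self._scores()
        top = max(scores.values(), default=0)
        if top == 0:
            return []
        return sorted((n, s) for n, s in scores.items() if s == top)

    def consume(self, name):
        """Remove a match and subtract its hashes from the query."""
        self.remaining -= self.matches.pop(name)


counter = ToyCounter(range(10))
counter.add("A", [0, 1, 2, 3])
counter.add("B", [4, 5, 6, 7])
counter.add("C", [8])

tied = counter.multipeek()   # [("A", 4), ("B", 4)]
counter.consume("A")
after = counter.multipeek()  # [("B", 4)]
```

In this toy version peek still returns a single match, so existing callers would not need to change; multipeek only adds the tied alternatives.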

This does leave us with the ...interesting question of how to report them.

Other things to consider -

  • there are equivalent matches (which is just about reporting) and equivalently sized matches (which represent bifurcating paths in the gather algorithm's choices that would lead to different lower-rank matches). The former is easy to report with a '*' or something, like we did in sourmash gather. What do we do about the latter? Right now we are just reporting them at random. I don't know how often this happens, though. Maybe we need to study it.
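One possible way to tell the two cases apart, sketched under an assumption that is mine and not from the codebase: tied matches that cover exactly the same remaining query hashes are treated as "equivalent", while tied matches that cover different hashes of the same count are the bifurcating kind. The function name `group_tied_matches` and the plain-set inputs are hypothetical.

```python
from collections import defaultdict


def group_tied_matches(remaining, tied):
    """Group tied matches by which remaining query hashes they cover.

    remaining: set of unassigned query hashes
    tied: dict of match name -> set of hashes, all with the same overlap size
    Returns a sorted list of groups; each group is a sorted list of names.
    """
    groups = defaultdict(list)
    for name, hashes in tied.items():
        # Matches with an identical overlap land in the same group.
        groups[frozenset(hashes & remaining)].append(name)
    return sorted(sorted(names) for names in groups.values())


remaining = set(range(10))
tied = {
    "A": {0, 1, 2, 3},
    "A_dup": {0, 1, 2, 3, 99},  # same overlap as A: reporting-only case
    "B": {4, 5, 6, 7},          # same size, different hashes: a branch
}
groups = group_tied_matches(remaining, tied)
# [["A", "A_dup"], ["B"]]
```

Names within one group could share a '*' in the report. More than one group means the choice changes which hashes are left for lower-rank matches, which is the case that may need study.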

ctb commented Jun 20, 2021

(a suggestion that comes from #1366 is to break ties, maybe by choosing the one with the best max_containment?)
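A minimal sketch of that tie-break, assuming max containment is the overlap divided by the size of the smaller hash set, and assuming it is computed against the full query (using the remaining query instead would be just as plausible). The helpers below work on plain Python sets and are hypothetical; they are not sourmash functions.

```python
def max_containment(a, b):
    """Overlap divided by the size of the smaller set (0.0 if either is empty)."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


def break_tie(query, tied):
    """Pick the tied match with the best max containment against the query.

    Names are sorted first so any remaining tie resolves deterministically
    (alphabetically) rather than at random.
    """
    return max(sorted(tied), key=lambda name: max_containment(query, tied[name]))


query = set(range(10))
tied = {
    "big": {0, 1, 2, 3, 20, 21, 22, 23},  # overlap 4, max containment 4/8
    "small": {4, 5, 6, 7, 30},            # overlap 4, max containment 4/5
}
winner = break_tie(query, tied)  # "small"
```

This only picks a winner among equally sized matches; it does not say whether the alternatives should also be reported.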
