Allow multiple sketch types (ksize, DNA/prot, etc.) in a database? #732

Closed
ctb opened this issue Sep 11, 2019 · 1 comment

Comments

@ctb
Contributor

ctb commented Sep 11, 2019

In #556 (comment), @luizirber suggested:

Can you build a LinearIndex with 3 different k-sizes? I think it should be possible to add signatures with more than one sketch (in this case it would be one signature with 3 minhashes, one for each k), and then search/gather will select the compatible sketch to compare. In this case the compatibility-checking is internal, and not exposed to other parts of the code.

(I think this also provides a clean way to add new sketches, without having to check conditions around the codebase to make them work.)

and I wanted to open up a separate issue to discuss this.
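For concreteness, here is a rough sketch of the suggestion quoted above. It is a toy model in plain Python, not sourmash code: `Sketch`, `Signature`, and `ToyLinearIndex` are hypothetical names invented for illustration, and the "sketches" are plain sets of integers compared by Jaccard similarity. The only point is that the compatibility check lives inside `search` rather than in the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sketch:
    """Hypothetical stand-in for one sketch: a hash set plus its parameters."""
    ksize: int
    moltype: str
    hashes: frozenset

    def compatible(self, other):
        # Two sketches are comparable only if they share ksize and molecule type.
        return self.ksize == other.ksize and self.moltype == other.moltype

    def jaccard(self, other):
        union = self.hashes | other.hashes
        return len(self.hashes & other.hashes) / len(union) if union else 0.0


@dataclass
class Signature:
    """Hypothetical signature holding several sketches, e.g. one per ksize."""
    name: str
    sketches: list


class ToyLinearIndex:
    """Illustrative linear index; NOT the real sourmash LinearIndex."""

    def __init__(self):
        self.signatures = []

    def insert(self, sig):
        self.signatures.append(sig)

    def search(self, query, threshold=0.0):
        results = []
        for sig in self.signatures:
            for sk in sig.sketches:
                # Compatibility is checked here, hidden from the caller.
                if sk.compatible(query):
                    score = sk.jaccard(query)
                    if score >= threshold:
                        results.append((score, sig.name))
        return sorted(results, reverse=True)


index = ToyLinearIndex()
index.insert(Signature("a", [
    Sketch(21, "DNA", frozenset({1, 2, 3, 4})),
    Sketch(31, "DNA", frozenset({10, 11})),
]))
# "b" has no ksize=21 sketch, so a ksize=21 query silently skips it.
index.insert(Signature("b", [Sketch(31, "DNA", frozenset({10, 11, 12, 13}))]))

query = Sketch(21, "DNA", frozenset({1, 2, 3, 5}))
results = index.search(query)
```

In this toy setup the ksize=21 query only ever sees signature "a"; "b" is dropped without any error or warning, which is the behavior discussed below.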

Briefly, I think:

  • it's a fine idea to allow constructing and searching databases with multiple sketches per signature!
  • it could be a user experience disaster to actually support this from the command line.

I envision people getting really confused when they do a search and get wildly inconsistent results because they accidentally left out 50% of the ksize=21 signatures.
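One possible way to surface that situation is to report, per sketch type, what fraction of the signatures in a database actually carry it. The sketch below is an assumption about how such a check could look, not an existing sourmash feature: `sketch_coverage` and `partial_types` are hypothetical helpers, and a database is modeled as a list of `(name, [(ksize, moltype), ...])` pairs.

```python
from collections import Counter


def sketch_coverage(signatures):
    """Return {(ksize, moltype): fraction of signatures that have that sketch type}.

    `signatures` is an iterable of (name, [(ksize, moltype), ...]) pairs;
    this layout is a made-up stand-in for a real database.
    """
    total = 0
    counts = Counter()
    for _name, sketch_types in signatures:
        total += 1
        # set() so a signature listing the same type twice is counted once.
        counts.update(set(sketch_types))
    if total == 0:
        return {}
    return {stype: n / total for stype, n in counts.items()}


def partial_types(signatures):
    """Sketch types present in some, but not all, signatures."""
    coverage = sketch_coverage(signatures)
    return sorted(stype for stype, frac in coverage.items() if frac < 1.0)


# Hypothetical database: every signature has ksize=31, only half have ksize=21.
db = [
    ("genome1", [(21, "DNA"), (31, "DNA")]),
    ("genome2", [(21, "DNA"), (31, "DNA")]),
    ("genome3", [(31, "DNA")]),
    ("genome4", [(31, "DNA")]),
]

coverage = sketch_coverage(db)
incomplete = partial_types(db)
```

Here `incomplete` flags `(21, "DNA")` as only partially covered, so a ksize=21 search against this database could be preceded by a warning instead of quietly returning results from half the collection.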

Thoughts?

@ctb
Contributor Author

ctb commented May 8, 2021

Well, practically speaking, this is now working fine for a variety of our Index classes - Zip files, directories, pathlists, signature collections - and I haven't noticed any problems myself. Closing for now. 🤷

ctb closed this as completed May 8, 2021