discussion for why modulo hash / scaled signatures are awesome #606
Comments
(of course, this all needs to be balanced against the point that they can grow indefinitely :)
note Richard Durbin's modimizer, which uses similar concepts! https://github.com/richarddurbin/modimizer - the README is informative for this issue.
a more succinct way of putting the containment guarantees above is "Containment never decreases as you get more data" (which is nice for streaming, esp.)
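A minimal sketch of that streaming property, not sourmash's implementation: it assumes a "scaled" sketch keeps every k-mer hash below `MAX_HASH // scaled`, and it uses BLAKE2b as a stand-in hash. The names `scaled_sketch`, `containment`, and `streaming_containment` are hypothetical, chosen only for illustration.

```python
import hashlib

MAX_HASH = 2**64  # hash space of the 64-bit stand-in hash below


def hash_kmer(kmer):
    """Hash a k-mer to a 64-bit integer (BLAKE2b is an assumption, not sourmash's hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def scaled_sketch(seq, k=5, scaled=2):
    """Keep every k-mer hash that falls below MAX_HASH // scaled."""
    threshold = MAX_HASH // scaled
    hashes = (hash_kmer(seq[i:i + k]) for i in range(len(seq) - k + 1))
    return {h for h in hashes if h < threshold}


def containment(query, subject):
    """Fraction of the query sketch that is also present in the subject sketch."""
    if not query:
        return 0.0
    return len(query & subject) / len(query)


def streaming_containment(query, reads, k=5, scaled=2):
    """Containment of `query` in the accumulated sketch after each read arrives."""
    seen = set()
    values = []
    for read in reads:
        seen |= scaled_sketch(read, k, scaled)  # the accumulated sketch only grows
        values.append(containment(query, seen))
    return values
```

Because membership in the sketch depends only on a hash's value, the accumulated sketch can only gain hashes as reads arrive, so the list returned by `streaming_containment` is non-decreasing.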
note also that you can subtract and add scaled signatures, and filter them on abundance, and other things, without fear.
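One way to see why these operations are safe, as a rough sketch under the same assumptions (a hash is kept if and only if it falls below `MAX_HASH // scaled`; BLAKE2b stands in for the real hash). The helpers `scaled_counts`, `add`, `subtract`, and `filter_abundance` are hypothetical names, not sourmash API.

```python
import hashlib
from collections import Counter

MAX_HASH = 2**64  # hash space of the 64-bit stand-in hash below


def hash_kmer(kmer):
    """Hash a k-mer to a 64-bit integer (BLAKE2b is an assumption, not sourmash's hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def scaled_counts(kmers, scaled=2):
    """Abundance-tracking scaled sketch: hash -> count, for hashes below the threshold."""
    threshold = MAX_HASH // scaled
    counts = Counter()
    for kmer in kmers:
        h = hash_kmer(kmer)
        if h < threshold:
            counts[h] += 1
    return counts


def add(a, b):
    """Merge two sketches by summing abundances."""
    return a + b


def subtract(a, b):
    """Drop from `a` every hash that is present in `b`."""
    return Counter({h: c for h, c in a.items() if h not in b})


def filter_abundance(a, min_count):
    """Keep only hashes seen at least `min_count` times."""
    return Counter({h: c for h, c in a.items() if c >= min_count})
```

Since the keep-or-drop decision looks only at the hash value, each operation on the sketches gives the same result as applying the operation to the full k-mer collections first and sketching afterwards (ignoring hash collisions). A fixed-size bottom-`num` MinHash does not have this property in general.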
preprinted and available! see link in #823. |
From private conversations with @luizirber @bluegenes @halexand recently -- scaled signatures are different from MinHash because:
downsample
and discussion in API docs in [MRG] Update the API docs #596), and both could maybe be built simultaneously (see Improvement: allow both num and scaled to be set #538)
These properties need to be clearly laid out, discussed, evaluated empirically, and (ideally) described theoretically. cc @dkoslicki