Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Add FrozenMinHash #1508

Merged
merged 274 commits into from
May 15, 2021
Merged

[MRG] Add FrozenMinHash #1508

merged 274 commits into from
May 15, 2021

Conversation

ctb
Copy link
Contributor

@ctb ctb commented May 8, 2021

Adds a FrozenMinHash type that is returned by SourmashSignature.minhash.

Also:

  • adds a MinHash method, to_mutable(), that returns self for MinHash and a mutable copy for FrozenMinHash.
  • adds to_frozen() method that does the converse

Very few changes are required to make all the tests pass. The ease of doing this suggests to me that this is a Good Change.

Explores #1494.

@ctb ctb changed the title [EXP] Add ImmutableMinHash [EXP] Add FrozenMinHash May 8, 2021
@ctb
Copy link
Contributor Author

ctb commented May 8, 2021

Changed over to FrozenMinHash.

I think the main thing to add is some checks that .mutable() and .immutable() are working as we expect and in particular that .mutable() is making a copy of the FrozenMinHash. The logic is simple enough so I suspect they are, but it's always nice to pin it down.

Also want to think about whether MinHash.mutable() should make a copy or not. Right now it DOES.

@ctb
Copy link
Contributor Author

ctb commented May 12, 2021

Curious what you think, @luizirber - should I propose this for merge, or just leave it for now? I don't have a strong opinion other than that this contains the potential for significant optimization down the road, so baking it in now seems like a not bad idea (since it's really easy!)

@luizirber
Copy link
Member

I think we should go for a merge.

Also want to think about whether MinHash.mutable() should make a copy or not. Right now it DOES.

This tripped me a bit while reviewing the code, because places that had copy.copy(mh) now have mh.mutable() and it is not clear from the .mutable() name that a copy is being made...

This is not necessarily pythonic, but in Rust there are usually conventions around .to_* and .into_* methods to return something that creates a new one, or consume something and return it. For example, in this case we could have
.to_mutable() for transforming the current MinHash to a mutable version (with copy) and
.into_mutable() to take this MinHash and make it mutable (without copy, consuming the previous minhash).
Not sure if this is clearer, tho 😅

@ctb
Copy link
Contributor Author

ctb commented May 13, 2021

Also want to think about whether MinHash.mutable() should make a copy or not. Right now it DOES.

This tripped me a bit while reviewing the code, because places that had copy.copy(mh) now have mh.mutable() and it is not clear from the .mutable() name that a copy is being made...

yep

This is not necessarily pythonic, but in Rust there are usually conventions around .to_* and .into_* methods to return something that creates a new one, or consume something and return it. For example, in this case we could have
.to_mutable() for transforming the current MinHash to a mutable version (with copy) and
.into_mutable() to take this MinHash and make it mutable (without copy, consuming the previous minhash).
Not sure if this is clearer, tho 😅

right, I'm on board with the general idea. I will maybe tryto_mutable() and to_frozen(); I'm not sure into_mutable() makes sense, but maybe into_frozen() does (since you'd get good error messages). I don't really like changing the classes underneath by policy (independent of what my code actually does :).

@ctb ctb changed the title [EXP] Add FrozenMinHash [MRG] Add FrozenMinHash May 13, 2021
@ctb
Copy link
Contributor Author

ctb commented May 13, 2021

Ready for review, @luizirber @bluegenes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants