Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename: AtomicBloom to ConcurrentBloom #34483

Merged
merged 2 commits into from
Dec 21, 2023

Conversation

apfitzge
Copy link
Contributor

@apfitzge apfitzge commented Dec 15, 2023

Problem

  • contains, add operations are not atomic at the struct level. This should be reflected in the structure name.
  • The clear function takes a mutable reference for no reason
  • The clear function is named for_tests, but this is a valid operation which I want to use for filtering invalid payer keys, and clearing periodically w/o locking

Summary of Changes

  • Rename struct AtomicBloom to ConcurrentBloom

  • clear - Take immutable reference, rename to clear from clear_for_tests

  • All read/write operations are not atomic at the struct level, but only at the element level. Renaming the struct makes it clear that operations are not atomic.

Notes on clear
  • A concurrent write can be partially overwritten, since the operation(s) are not atomic at the structure-level, but at each group of 64 bits in the AtomicU64. This would lead to a key that was just inserted possibly returning false from contains if a concurrent clear was happening.
  • A concurrent read can only see partial bits, meaning the contains call is likely to return false if a concurrent clear is occurring.

Fixes #

Copy link

codecov bot commented Dec 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (e877eef) 81.8% compared to head (826e6d4) 81.8%.
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #34483   +/-   ##
=======================================
  Coverage    81.8%    81.8%           
=======================================
  Files         822      822           
  Lines      221504   221504           
=======================================
+ Hits       181289   181336   +47     
+ Misses      40215    40168   -47     

@t-nelson
Copy link
Contributor

i'm assuming the race is what earned this method the _for_tests suffix. if we're going to expose it generally i think it needs some indication of this particular operation being non-atomic.

@jstarry care to chime in? looks like you added the method in question in a2d251c

@apfitzge
Copy link
Contributor Author

i'm assuming the race is what earned this method the _for_tests suffix. if we're going to expose it generally i think it needs some indication of this particular operation being non-atomic.

@jstarry care to chime in? looks like you added the method in question in a2d251c

Yeah, maybe some better naming of the fn here. maybe non_atomic_clear and add doc-comments which detail the possible effects of it being non-atomic as I did in the PR description?

@jstarry
Copy link
Member

jstarry commented Dec 19, 2023

Yeah I believe @t-nelson is right about the reasoning behind the _for_tests suffix

@apfitzge
Copy link
Contributor Author

apfitzge commented Dec 19, 2023

Yeah I believe @t-nelson is right about the reasoning behind the _for_tests suffix

Understood on original reasoning. I think my reasoning makes sense on why I want this structure to be clearable, and I don't care about partial writes for my use case. So seems options going forward are:

  1. Make a new struct with essentially duplicate code
  2. Rename this data structure so it's clear not all operations are atomic, leave fn name as clear. maybe ConcurrentBloom?
  3. Rename fn to non_atomic_clear, add comments, leave struct name as AtomicBloom

I think 2 or 3 makes more sense, but lmk what you guys think.

@jstarry
Copy link
Member

jstarry commented Dec 19, 2023

  1. sounds fine to me as long as the pit-falls are described in the comments for the method

@apfitzge
Copy link
Contributor Author

  1. sounds fine to me as long as the pit-falls are described in the comments for the method

After sleeping on it, I think I'm leaning towards option 2 in renaming the struct to ConcurrentBloom.

The existing operations contains and add are already not atomic at the struct level, and only at the element level; i.e. concurrent contains while add is happening may see partially written key. So essentially the same thing I described w/ clear, it's atomic at the element level but not the struct level.

I pushed this in faf7cbd. If you feel strongly 3 was better option lmk.

@apfitzge apfitzge changed the title Rename: clear_for_tests -> clear Rename: AtomicBloom to ConcurrentBloom Dec 20, 2023
@t-nelson
Copy link
Contributor

renaming the struct to ConcurrentBloom

instead of renaming the struct, could add a ConcurrentReset trait and implement that on AtomicBloom so consumers have to jump through an extra hoop. paired with a descriptive name and doc comment, should steer most developers and reviewers in the right direction

@apfitzge
Copy link
Contributor Author

instead of renaming the struct, could add a ConcurrentReset trait and implement that on AtomicBloom so consumers have to jump through an extra hoop. paired with a descriptive name and doc comment, should steer most developers and reviewers in the right direction

Maybe I'm missing something. The existing unmarked fns for contains and add are already non-atomic. Why do we want to make it less convenient to use clear specifically?

@t-nelson
Copy link
Contributor

ah i think i was reviewing a stale commit. figured we had more consumers, but the diff isn't terrible. that crate is a dev-dependency for some other project, but looks like they only use Bloom. i suppose this is fine

@apfitzge apfitzge merged commit f0ff69b into solana-labs:master Dec 21, 2023
34 checks passed
@apfitzge apfitzge deleted the atomic_bloom_filter_clear branch December 21, 2023 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants