-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: use bucket mutex instead of global mutex for dynamic set #268
Conversation
cc @jdm @mrobinson if either of you are familiar with the guts of this crate and what the perf implications would be here (I could try and figure this out but it would take me a while; I don't have that much time rn) |
(Just curious, feel free to ignore - I have no Rust or general low-level programming expertise) I'm failing to imagine how this can have negative perf implications. |
FWIW, we adopted this strategy in Firefox's atom table (which is essentially the same thing as this crate), and noticed no ill effects (as well as significant performance gains for parallel modification). |
I read through the changes and they look sensible to me. I have one question: if the focus of this PR is about switching from a global mutex to a mutex per bucket, why the switch from what looks like a linked list implementation to a vector instead? I don't see a reason that that change is required for the mutex one; am I missing anything? |
It was because I had trouble with adding Mutex onto the linked list. Now you have brought this up, should I switch it to |
This Vec version seems flawed. I ran my application for a few dozen times and it occasionally reported false postives (not getting the correct string), but I can't find where the bug is. Let me switch this PR back to the slower but correct hand-rolled linked list version. |
@jdm Would you like to take another look? |
I don't understand what the "reported false positives" means. However, I did notice a subtle difference between your Vec implementation and the existing, so maybe this helps: If this can explain the issue you saw with the Vec version, I think you should feel comfortable using a VecDeque and using |
src/dynamic_set.rs
Outdated
hash, | ||
ref_count: AtomicIsize::new(1), | ||
string: string.into_boxed_str(), | ||
}); | ||
let ptr = NonNull::from(&mut *entry); | ||
self.buckets[bucket_index] = Some(entry); | ||
|
||
vec.push(entry); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing code adds the entry to the head of the linked list.
This opens the possibility for the next insert()
usage to again find()
the entry that is about to be removed, and we'll end up inserting another duplicate entry (i.e. an unintended 3rd, 4th etc.).
With the existing implementation, the next insert()
is guaranteed to find()
the new entry that was added (and increment its ref count) since it's at the front.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reducing the scope of this change! Would you mind squashing these commits?
This implementation uses bucket level mutex with linear probing.
73d5cfd
to
b473a4a
Compare
@jdm Done. |
@bors-servo r+ |
📌 Commit b473a4a has been approved by |
☀️ Test successful - checks-github |
Can you publish a new version, please? |
I've opened a PR to bump the version here: #270 |
Thank you! |
https://crates.io/crates/string-cache/0.8.5 is now available on crates.io. |
**Description:** I want to check the performance difference caused by servo/string-cache#268.
Revert trivial impl of Borrow<str> for Atom #266 caused a breaking change (see #271) in 0.8.5 (which was yanked). This PR fixes #271 / will allow publishing new versions (so #268 can make it out). I've just started learning Rust, haven't been able to 100% wrap my head around the issue. The breaking change manifesting in the "browserlist-rs" crate: ```rust error[E0283]: type annotations needed --> {.. snip ..}\browserslist-rs-0.12.3\src\data\caniuse.rs:91:35 | 91 | let chrome = CANIUSE_BROWSERS.get(&"chrome".into()).unwrap(); | ^^^ ---------------- type must be known at this point | | | cannot infer type of the type parameter `Q` declared on the associated function `get` | = note: multiple `impl`s satisfying `Atom<BrowserNameAtomStaticSet>: Borrow<_>` found in the following crates: `core`, `string_cache`: - impl<Static> Borrow<str> for Atom<Static> where Static: StaticAtomSet; - impl<T> Borrow<T> for T where T: ?Sized; note: required by a bound in `AHashMap::<K, V, S>::get` --> {.. snip ..}\ahash-0.7.6\src\hash_map.rs:81:12 | 81 | K: Borrow<Q>, | ^^^^^^^^^ required by this bound in `AHashMap::<K, V, S>::get` help: consider specifying the generic argument | 91 | let chrome = CANIUSE_BROWSERS.get::<Q>(&"chrome".into()).unwrap(); | +++++ ``` `CANIUSE_BROWSERS` is a ```rust AHashMap<BrowserNameAtom, BrowserStat> ``` where `BrowserNameAtom` is an Atom generated with `string_cache_codegen`. The signature of the `get` function: ```rust pub fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V> where K: Borrow<Q>, Q: Hash + Eq, { self.0.get(k) } ``` `K` is the generic key type for the hash map. This is the exact same signature as the std lib HashMap, to which it delegates. When I debug 0.8.4 usage locally, the `From` that gets used for the `into()` is ```rust impl<'a, Static: StaticAtomSet> From<&'a str> for Atom<Static> { #[inline] fn from(string_to_add: &str) -> Self { Atom::from(Cow::Borrowed(string_to_add)) } } ``` It isn't clear to me how the `Borrow<str>` impl that was added "competes" with the std lib's implementation here. It doesn't seem relevant?
Revert trivial impl of Borrow<str> for Atom #266 caused a breaking change (see #271) in 0.8.5 (which was yanked). This PR fixes #271 / will allow publishing new versions (so #268 can make it out). I've just started learning Rust, haven't been able to 100% wrap my head around the issue. The breaking change manifesting in the "browserlist-rs" crate: ```rust error[E0283]: type annotations needed --> {.. snip ..}\browserslist-rs-0.12.3\src\data\caniuse.rs:91:35 | 91 | let chrome = CANIUSE_BROWSERS.get(&"chrome".into()).unwrap(); | ^^^ ---------------- type must be known at this point | | | cannot infer type of the type parameter `Q` declared on the associated function `get` | = note: multiple `impl`s satisfying `Atom<BrowserNameAtomStaticSet>: Borrow<_>` found in the following crates: `core`, `string_cache`: - impl<Static> Borrow<str> for Atom<Static> where Static: StaticAtomSet; - impl<T> Borrow<T> for T where T: ?Sized; note: required by a bound in `AHashMap::<K, V, S>::get` --> {.. snip ..}\ahash-0.7.6\src\hash_map.rs:81:12 | 81 | K: Borrow<Q>, | ^^^^^^^^^ required by this bound in `AHashMap::<K, V, S>::get` help: consider specifying the generic argument | 91 | let chrome = CANIUSE_BROWSERS.get::<Q>(&"chrome".into()).unwrap(); | +++++ ``` `CANIUSE_BROWSERS` is a ```rust AHashMap<BrowserNameAtom, BrowserStat> ``` where `BrowserNameAtom` is an Atom generated with `string_cache_codegen`. The signature of the `get` function: ```rust pub fn get<Q: ?Sized>(&self, k: &Q) -> Option<&V> where K: Borrow<Q>, Q: Hash + Eq, { self.0.get(k) } ``` `K` is the generic key type for the hash map. This is the exact same signature as the std lib HashMap, to which it delegates. When I debug 0.8.4 usage locally, the `From` that gets used for the `into()` is ```rust impl<'a, Static: StaticAtomSet> From<&'a str> for Atom<Static> { #[inline] fn from(string_to_add: &str) -> Self { Atom::from(Cow::Borrowed(string_to_add)) } } ``` It isn't clear to me how the `Borrow<str>` impl that was added "competes" with the std lib's implementation here. It doesn't seem relevant?
fix: move debug_assert check Fix needed after #268 The fix is only for debug_assert correctness. There's no functional effect. i.e. version 0.8.6 might cause debug tests to fail, but it isn't a (functionally) breaking change. **Explanation** Moving the lock to be per-bucket changed the `DYNAMIC_SET`'s API so that it doesn't need to be locked (i.e. `DYANMIC_SET` is not wrapped with a `Mutex`). The `Atom`'s `fn drop` changed from ```rust fn drop(&mut self) { if self.tag() == DYNAMIC_TAG { let entry = self.unsafe_data.get() as *const Entry; if unsafe { &*entry }.ref_count.fetch_sub(1, SeqCst) == 1 { drop_slow(self) } } // Out of line to guide inlining. fn drop_slow<Static>(this: &mut Atom<Static>) { DYNAMIC_SET .lock() .remove(this.unsafe_data.get() as *mut Entry); } } ``` to ```rust impl<Static> Drop for Atom<Static> { #[inline] fn drop(&mut self) { if self.tag() == DYNAMIC_TAG { let entry = self.unsafe_data.get() as *const Entry; if unsafe { &*entry }.ref_count.fetch_sub(1, SeqCst) == 1 { drop_slow(self) } } // Out of line to guide inlining. fn drop_slow<Static>(this: &mut Atom<Static>) { DYNAMIC_SET.remove(this.unsafe_data.get() as *mut Entry); } } ``` (the `lock()` is gone) Now when we enter `DYNAMIC_SET.remove()`, there's no longer a lock guarantee. Until we lock the bucket, another thread could be in the middle of performing a `DYNAMIC_SET.insert()` for the same string. Therefore, the `debug_assert!(value.ref_count.load(SeqCst) == 0)` is premature - it needs to occur after we take the lock for the bucket. Caught at swc-project/swc#6980 in a [failed test](https://github.com/swc-project/swc/actions/runs/4326536698/jobs/7554083939).
Background:
I'm working a concurrent program using swc to parse files in parallel. swc uses
string-cache
.I have found the global mutex in the dynamic set being the major performance bottleneck for our application. Using instruments as the profiler, we can see that
string-cache
contributes to 11.4% (9.6s) of the time:By switch to a a bucket level mutex (this PR),
string-cache
drops to 1.6% (1.04s) of the time:For this PR, can I kindly ask for a code review even if this PR cannot be merged due to any kind of circumstances, so I can safely patch my application with a fork, or help me polish this so everyone using swc can benefit from this PR.
Thank you all in advance. 🍻