Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Replace DashMap with a locked AHashMap #785

Merged
merged 3 commits into from
Jan 18, 2022

Conversation

olix0r
Copy link
Contributor

@olix0r olix0r commented Jan 15, 2022

Motivation

The kube_runtime::reflector module pulls in dashmap, a concurrent
hashmap implementation. This implementation necessarily uses unsafe,
which can lead to safety/correctness/security issues (like
RUSTSEC-2022-02).

Solution

This change replaces dashmap with a RwLock<AHashMap<...>> using
parking_lot. parking_lot currently has greater than 10x the usage
(according to https://lib.rs) and is already used by existing
dependencies.

dashmap purports to be higher bandwidth and lower latency than a
simple parking_lot lock, but most kubernetes controllers should not
operate at the scale where these differences factor into application
performance. On the other hand, all users may be impacted by correctness
issues.

AHash is used to reduce contention incurred by hash latency. AHash
claims to be ~10x faster than the default hasher, while preserving DoS
resistance.

This change does not impact the public API in any way and it should not
impact users.

Fixes #781

@olix0r olix0r force-pushed the ver/cache-no-dash branch 2 times, most recently from 1f02e15 to 2e82d6c Compare January 15, 2022 20:30
The `kube_runtime::reflector` module pulls in `dashmap`, a concurrent
hashmap implementation. This implementation necessarily uses `unsafe`,
which can lead to safety/correctness/security issues (like
[RUSTSEC-2022-02][sec]).

This change replaces `dashmap` with a `RwLock<HashMap<...>>` using
`parking_lot`.  `parking_lot` currently has greater than 10x the usage
(according to https://lib.rs) and is already used by existing
dependencies.

`dashmap` purports to be higher bandwidth and lower latency than a
simple `parking_lot` lock, but most kubernetes controllers should not
operate at the scale where these differences factor into application
performance. On the other hand, all users may be impact by correctness
issues.

This change does not impact the public API in any way and it should not
impact users.

Fixes kube-rs#781

[sec]: https://rustsec.org/advisories/RUSTSEC-2022-0002.html

Signed-off-by: Oliver Gould <ver@buoyant.io>
@codecov-commenter

This comment has been minimized.

olix0r added a commit to olix0r/kube-rs that referenced this pull request Jan 16, 2022
`kube_runtime`'s `reflector` and `controller` clones the entire resource
data structure on each read. This may incur many allocations and add
latency to reads.

This change modifies the `kube_runtime::reflector` to store resources in
an atomic, reference-counted `Arc` so that clones do not incur
allocation. The goal of this change is to amortize costs so that these
allocations are only incurred once--at update-time--making reads
cheaper.

This changes the public APIs of `kube_runtime`'s `controller`,
`reflector`, and `finalizer` modules.

Related to kube-rs#785

Signed-off-by: Oliver Gould <ver@buoyant.io>
olix0r added a commit to olix0r/kube-rs that referenced this pull request Jan 16, 2022
`kube_runtime`'s `reflector` and `controller` clones the entire resource
data structure on each read. This may incur many allocations and add
latency to reads.

This change modifies the `kube_runtime::reflector` to store resources in
an atomic, reference-counted `Arc` so that clones do not incur
allocation. The goal of this change is to amortize costs so that these
allocations are only incurred once--at update-time--making reads
cheaper.

This changes the public APIs of `kube_runtime`'s `controller`,
`reflector`, and `finalizer` modules.

Related to kube-rs#785

Signed-off-by: Oliver Gould <ver@buoyant.io>
Signed-off-by: Oliver Gould <ver@buoyant.io>
Signed-off-by: Oliver Gould <ver@buoyant.io>
@olix0r olix0r changed the title runtime: Replace DashMap with a locked HashMap runtime: Replace DashMap with a locked AHashMap Jan 16, 2022
Comment on lines +67 to +69
.map(|obj| (ObjectRef::from_obj_with(obj, self.dyntype.clone()), obj.clone()))
.collect::<AHashMap<_, _>>();
*self.store.write() = new_objs;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that the old version would hash every resource 3x more. this should be notably faster when dealing with large sets of resources.

olix0r added a commit to linkerd/linkerd2 that referenced this pull request Jan 16, 2022
While looking into replacing `dashmap` in `kube-rs`
(kube-rs/kube#785), I realized that we're probably better off using
[`ahash`][ahash] in our indexing code. AHash pruports to the be "the
fastest DoS-resistant hash currently available in Rust." According to
<https://lib.rs/crates/ahash>, it has substantial usage.

This change should help to minimize lock contention (i.e. by speeding up
hashing while a lock is held). In general, this looks to be a superior
default to `std::collections::HashMap`.

Signed-off-by: Oliver Gould <ver@buoyant.io>

[ahash]: https://github.com/tkaitchuck/aHash/blob/e77cab8c1e15bfc9f54dfd28bd8820c2a7bb27c4/compare/readme.md
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Jan 16, 2022
While looking into replacing `dashmap` in `kube-rs`
(kube-rs/kube#785), I realized that we're probably better off using
[`ahash`][ahash] in our indexing code. AHash pruports to the be "the
fastest DoS-resistant hash currently available in Rust." According to
<https://lib.rs/crates/ahash>, it has substantial usage.

This change should help to minimize lock contention (i.e. by speeding up
hashing while a lock is held). In general, this looks to be a superior
default to `std::collections::HashMap`.

Signed-off-by: Oliver Gould <ver@buoyant.io>

[ahash]: https://github.com/tkaitchuck/aHash/blob/e77cab8c1e15bfc9f54dfd28bd8820c2a7bb27c4/compare/readme.md
@clux
Copy link
Member

clux commented Jan 16, 2022

Thanks a lot for this implementation. I think this looks great. Agree with rationale for picking it over dashmap, it's also used by tokio:

I will probably have some time to test this out on a decent sized reflector next week to see some performance comparisons - as it would be good to have some base numbers. But am 👍 on this.

@clux
Copy link
Member

clux commented Jan 17, 2022

Did a quick benchmark on a tool that is iterating over a bunch of objects and is trying to cross-reference owning pods with a reflector store in a cluster with ~1200 pods with a --release build. This is to check how fast it can do the read side of the store, because the store write side is bounded by kuberenetes api update/patch/replace calls (and those generally happen fairly infrequently much less frequently than how fast we can loop-read - at least not fast enough worth superoptimizing for).

pseudocode of benchmark:

cache = pod reflector
for p in api.list<PrometheusRule> {
  state = cache.state() // expensive step - time of this bench dominated by this
  // look for a pod in state with matching annotations to the prometheusrule (iterate over `state` - cheap-ish)
}

this only really checks the expensiveness of cache.state() - but hopefully this is sufficiently indicative of reads anyway it's using .iter() behind the scenes.

0.66.0 with current dashmap:

real	3m40s - 3m55s
user	3m20s - 3m35s
sys	0m0.9s - 0m0.4s

this branch with parking_lot:

real	3m49s - 3m52s 
user	3m27s - 3m33s
sys	0m0.4s - 0m0.5s

results of 2 runs. to me this looks like it does not make any meaningful difference on the read side.

@clux clux added the changelog-change changelog change category for prs label Jan 17, 2022
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Jan 17, 2022
While looking into replacing `dashmap` in `kube-rs`
(kube-rs/kube#785), I realized that we're probably better off using
[`ahash`][ahash] in our indexing code. AHash pruports to the be "the
fastest DoS-resistant hash currently available in Rust." According to
<https://lib.rs/crates/ahash>, it has substantial usage.

This change should help to minimize lock contention (i.e. by speeding up
hashing while a lock is held). In general, this looks to be a superior
default to `std::collections::HashMap`.

Signed-off-by: Oliver Gould <ver@buoyant.io>

[ahash]: https://github.com/tkaitchuck/aHash/blob/e77cab8c1e15bfc9f54dfd28bd8820c2a7bb27c4/compare/readme.md
@clux clux added this to the 0.67.0 milestone Jan 18, 2022
@clux clux merged commit f66dd90 into kube-rs:master Jan 18, 2022
@nightkr
Copy link
Member

nightkr commented Jan 18, 2022

FWIW I don't believe either this or #446 are breaking? #446 is purely additive, and this is a transparent backend change.

@clux
Copy link
Member

clux commented Jan 18, 2022

You mean regarding the changelog-change label? It's not reserved for only breaking changes - and i agree this isn't. But it's also not adding any new functionality.

@nightkr
Copy link
Member

nightkr commented Jan 18, 2022

I was thinking more about the 0.67.0 milestone.

@nightkr
Copy link
Member

nightkr commented Jan 18, 2022

And if changelog/change isn't only for breaking changes then what's the plan for keeping track of breakage?

@clux
Copy link
Member

clux commented Jan 18, 2022

The milestone I'm just using as an experimental way to group everything together (rather than having to construct a compare url) - it doesn't imply anything in particular atm.

@clux
Copy link
Member

clux commented Jan 18, 2022

if changelog/change isn't only for breaking changes then what's the plan for keeping track of breakage?

well, most changelog-changes would be breaking changes at some levels. few people here would argue that new dependencies are breaking, but we've put BREAKING on tiny changes before. pretty much everything we change can break something in one way or another.

it probably makes sense to mark especially breaking things worthy of highlighting in the manually written notes in CHANGELOG.md under UNRELEASED (which gets prepended to the release)?

@olix0r olix0r mentioned this pull request Jan 18, 2022
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Mar 31, 2022
While looking into replacing `dashmap` in `kube-rs`
(kube-rs/kube#785), I realized that we're probably better off using
[`ahash`][ahash] in our indexing code. AHash pruports to the be "the
fastest DoS-resistant hash currently available in Rust." According to
<https://lib.rs/crates/ahash>, it has substantial usage.

This change should help to minimize lock contention (i.e. by speeding up
hashing while a lock is held). In general, this looks to be a superior
default to `std::collections::HashMap`.

Signed-off-by: Oliver Gould <ver@buoyant.io>

[ahash]: https://github.com/tkaitchuck/aHash/blob/e77cab8c1e15bfc9f54dfd28bd8820c2a7bb27c4/compare/readme.md

(cherry picked from commit d2a0fa0)
Signed-off-by: Oliver Gould <ver@buoyant.io>
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Mar 31, 2022
While looking into replacing `dashmap` in `kube-rs`
(kube-rs/kube#785), I realized that we're probably better off using
[`ahash`][ahash] in our indexing code. AHash pruports to the be "the
fastest DoS-resistant hash currently available in Rust." According to
<https://lib.rs/crates/ahash>, it has substantial usage.

This change should help to minimize lock contention (i.e. by speeding up
hashing while a lock is held). In general, this looks to be a superior
default to `std::collections::HashMap`.

Signed-off-by: Oliver Gould <ver@buoyant.io>

[ahash]: https://github.com/tkaitchuck/aHash/blob/e77cab8c1e15bfc9f54dfd28bd8820c2a7bb27c4/compare/readme.md

(cherry picked from commit d2a0fa0)
Signed-off-by: Oliver Gould <ver@buoyant.io>
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Apr 7, 2022
While looking into replacing `dashmap` in `kube-rs`
(kube-rs/kube#785), I realized that we're probably better off using
[`ahash`][ahash] in our indexing code. AHash pruports to the be "the
fastest DoS-resistant hash currently available in Rust." According to
<https://lib.rs/crates/ahash>, it has substantial usage.

This change should help to minimize lock contention (i.e. by speeding up
hashing while a lock is held). In general, this looks to be a superior
default to `std::collections::HashMap`.

Signed-off-by: Oliver Gould <ver@buoyant.io>

[ahash]: https://github.com/tkaitchuck/aHash/blob/e77cab8c1e15bfc9f54dfd28bd8820c2a7bb27c4/compare/readme.md

(cherry picked from commit d2a0fa0)
Signed-off-by: Oliver Gould <ver@buoyant.io>
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Apr 8, 2022
While looking into replacing `dashmap` in `kube-rs`
(kube-rs/kube#785), I realized that we're probably better off using
[`ahash`][ahash] in our indexing code. AHash pruports to the be "the
fastest DoS-resistant hash currently available in Rust." According to
<https://lib.rs/crates/ahash>, it has substantial usage.

This change should help to minimize lock contention (i.e. by speeding up
hashing while a lock is held). In general, this looks to be a superior
default to `std::collections::HashMap`.

Signed-off-by: Oliver Gould <ver@buoyant.io>

[ahash]: https://github.com/tkaitchuck/aHash/blob/e77cab8c1e15bfc9f54dfd28bd8820c2a7bb27c4/compare/readme.md

(cherry picked from commit d2a0fa0)
Signed-off-by: Oliver Gould <ver@buoyant.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog-change changelog change category for prs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

runtime: Don't require dashmap to use watcher
4 participants