[DNM] kv: prototype async Raft log writes #87050

See cockroachdb#17500.

This commit polishes and flushes a prototype I've had lying around that demonstrates the async Raft log appends component of cockroachdb#17500. I'm not actively planning to productionize this, but it sounds like we may work on this project in v23.1, so this prototype might help. It also demonstrates the kind of performance wins we can expect to see on write-heavy workloads. To this point, we had only [demonstrated the potential speedup](cockroachdb#17500 (comment)) in a simulated environment with [rafttoy](https://github.com/nvanbenschoten/rafttoy).

Half of the change here is to `etcd/raft` itself, which needs to be adapted to support asynchronous log writes. These changes are presented in nvanbenschoten/etcd@1d1fa32.

The other half of the change is extracting a Raft log writer component that handles the process of asynchronously appending to a collection of Raft logs and notifying individual replicas about the eventual durability of these writes. This component is pretty basic and should probably be entirely rewritten, but it gets the job done for the prototype.

The Raft log writer reveals an interesting dynamic where concurrency at this level actually hurts performance because it leads to concurrent calls to sync Pebble's WAL, which is less performant than having a single caller because Pebble only exposes a synchronous Sync API and coalesces all Sync requests onto a single thread. An async Pebble Sync API would be valuable here. See the comment in `NewWriter` for more details.

### Benchmarks

```
name                          old ops/s    new ops/s    delta
kv0/enc=false/nodes=3/cpu=32   36.4k ± 5%   46.5k ± 5%  +27.64%  (p=0.000 n=10+10)

name                          old avg(ms)  new avg(ms)  delta
kv0/enc=false/nodes=3/cpu=32    5.26 ± 3%    4.14 ± 6%  -21.33%  (p=0.000 n=8+10)

name                          old p99(ms)  new p99(ms)  delta
kv0/enc=false/nodes=3/cpu=32    10.9 ± 8%     9.1 ±10%  -15.94%  (p=0.000 n=10+10)
```

These are compelling results. I haven't pushed on this enough to know whether there's actually a throughput win here, or whether the fixed concurrency and reduced average latency are just making it look like there is. `kv0bench` should help answer that question.
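For readers who want a concrete picture of the log writer idea, here is a minimal sketch in Go. It is not the code from this PR. It assumes a single writer goroutine that drains queued appends from many replicas, issues one blocking sync per batch, and then tells each replica which index is now durable. Every name here (`Writer`, `Append`, `syncFn`, `onDurable`, and this particular `NewWriter` signature) is hypothetical, and `syncFn` merely stands in for a blocking WAL sync.

```go
package main

import "fmt"

// appendReq is a hypothetical request: the replica for rangeID has handed
// entries up to lastIndex to the writer.
type appendReq struct {
	rangeID   int
	lastIndex uint64
}

// Writer is an illustrative stand-in for an async Raft log writer.
// It is not the prototype's implementation.
type Writer struct {
	reqs      chan appendReq
	syncFn    func()                          // assumed: a blocking WAL sync
	onDurable func(rangeID int, index uint64) // assumed: per-replica notification
	done      chan struct{}
}

// NewWriter starts the single goroutine that performs all syncs.
func NewWriter(syncFn func(), onDurable func(int, uint64)) *Writer {
	w := &Writer{
		reqs:      make(chan appendReq, 128),
		syncFn:    syncFn,
		onDurable: onDurable,
		done:      make(chan struct{}),
	}
	go w.loop()
	return w
}

// Append enqueues a write and returns without waiting for the sync.
// It only blocks if the queue is full.
func (w *Writer) Append(rangeID int, lastIndex uint64) {
	w.reqs <- appendReq{rangeID: rangeID, lastIndex: lastIndex}
}

// Close stops accepting appends and waits for queued ones to become durable.
func (w *Writer) Close() {
	close(w.reqs)
	<-w.done
}

func (w *Writer) loop() {
	defer close(w.done)
	for first := range w.reqs {
		// Track the highest index seen per range in this batch.
		batch := map[int]uint64{first.rangeID: first.lastIndex}
	drain:
		for {
			select {
			case r, ok := <-w.reqs:
				if !ok {
					break drain // queue closed and empty
				}
				if r.lastIndex > batch[r.rangeID] {
					batch[r.rangeID] = r.lastIndex
				}
			default:
				break drain // nothing else queued right now
			}
		}
		w.syncFn() // one sync covers the whole batch
		for id, idx := range batch {
			w.onDurable(id, idx)
		}
	}
}

func main() {
	durable := map[int]uint64{}
	syncs := 0
	// Both callbacks run only on the writer goroutine, and Close waits for
	// that goroutine to exit, so reading the results afterwards is safe.
	w := NewWriter(
		func() { syncs++ },
		func(id int, idx uint64) { durable[id] = idx },
	)
	for i := uint64(1); i <= 3; i++ {
		w.Append(1, i)
		w.Append(2, i*10)
	}
	w.Close()
	fmt.Println("durable:", durable[1], durable[2], "syncs:", syncs)
}
```

Funnelling every sync through one goroutine mirrors the observation above that a single caller of a synchronous Sync API does better than many concurrent callers. The number of syncs in the sketch depends on scheduling, but it never exceeds the number of appends.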


Commits on Aug 29, 2022