[DNM] kv: prototype async Raft log writes #87050

Commits on Aug 29, 2022

  1. [DNM] kv: prototype async Raft log writes

    See cockroachdb#17500.
    
    This commit polishes and flushes a prototype I've had lying around that
    demonstrates the async Raft log appends component of cockroachdb#17500. I'm not actively
    planning to productionize this, but it sounds like we may work on this project
    in v23.1, so this prototype might help. It also demonstrates the kind of
    performance wins we can expect to see on write-heavy workloads. To this point,
    we had only [demonstrated the potential speedup](cockroachdb#17500 (comment))
    in a simulated environment with [rafttoy](https://github.com/nvanbenschoten/rafttoy).
    
    Half of the change here is to `etcd/raft` itself, which needs to be adapted to support
    asynchronous log writes. These changes are presented in nvanbenschoten/etcd@1d1fa32.
    
    The other half of the change is extracting a Raft log writer component that
    handles the process of asynchronously appending to a collection of Raft logs and
    notifying individual replicas about the eventual durability of these writes.
    This component is pretty basic and should probably be entirely rewritten, but it
    gets the job done for the prototype.
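
    As a rough illustration of the shape such a component could take (this is not
    the prototype's code), the sketch below has a single goroutine drain queued
    appends, issue one sync per batch, and then report the highest durable index
    per range. Every name here (`LogWriter`, `appendReq`, `syncFn`, `onDurable`)
    is hypothetical, and `syncFn` is only a stand-in for a durable WAL sync.

    ```go
    package main

    import "sync"

    // appendReq is a hypothetical request: entries up to index were handed to
    // the writer for the given range.
    type appendReq struct {
    	rangeID int
    	index   uint64
    }

    // LogWriter is a hypothetical stand-in for the Raft log writer component.
    type LogWriter struct {
    	reqs      chan appendReq
    	syncFn    func()                          // stand-in for a blocking, durable sync
    	onDurable func(rangeID int, index uint64) // notifies a replica of durability
    	wg        sync.WaitGroup
    }

    // NewLogWriter starts the single writer goroutine.
    func NewLogWriter(syncFn func(), onDurable func(int, uint64)) *LogWriter {
    	w := &LogWriter{
    		reqs:      make(chan appendReq, 64),
    		syncFn:    syncFn,
    		onDurable: onDurable,
    	}
    	w.wg.Add(1)
    	go w.run()
    	return w
    }

    // Append enqueues a write and returns without waiting for durability.
    func (w *LogWriter) Append(rangeID int, index uint64) {
    	w.reqs <- appendReq{rangeID: rangeID, index: index}
    }

    // Close stops accepting appends, drains the queue, and waits for the writer.
    func (w *LogWriter) Close() {
    	close(w.reqs)
    	w.wg.Wait()
    }

    func (w *LogWriter) run() {
    	defer w.wg.Done()
    	for req := range w.reqs {
    		// Collect everything already queued into one batch, keeping only
    		// the highest index seen per range.
    		batch := map[int]uint64{req.rangeID: req.index}
    	drain:
    		for {
    			select {
    			case r, ok := <-w.reqs:
    				if !ok {
    					break drain
    				}
    				if r.index > batch[r.rangeID] {
    					batch[r.rangeID] = r.index
    				}
    			default:
    				break drain
    			}
    		}
    		// One sync covers the whole batch; only then notify replicas.
    		w.syncFn()
    		for id, idx := range batch {
    			w.onDurable(id, idx)
    		}
    	}
    }
    ```

    A caller would invoke `Append` from the Raft processing path and treat the
    `onDurable` callback as the signal that entries up to that index are stable.
    The callback may fire more than once per range, always from the writer
    goroutine.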
    
    The Raft log writer reveals an interesting dynamic: concurrency at this
    level actually hurts performance because it leads to concurrent calls to sync
    Pebble's WAL. That is less performant than having a single caller, because
    Pebble only exposes a synchronous Sync API and coalesces all Sync requests
    onto a single thread. An async Pebble Sync API would be valuable here.
    See the comment in NewWriter for more details.
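
    To make the idea of an async sync API concrete, here is a minimal sketch of
    a callback-based wrapper around a blocking sync function. It does not use or
    describe Pebble's actual API: `asyncSyncer`, `SyncAsync`, and `blockingSync`
    are hypothetical names. Callers register a callback and return immediately,
    while a single goroutine performs one blocking sync for all callbacks pending
    at that moment.

    ```go
    package main

    import "sync"

    // asyncSyncer is a hypothetical wrapper that exposes a callback-based sync
    // on top of a blocking one. It is not Pebble's API.
    type asyncSyncer struct {
    	mu      sync.Mutex
    	cond    *sync.Cond
    	pending []func()
    	closed  bool
    	done    chan struct{}
    }

    func newAsyncSyncer(blockingSync func()) *asyncSyncer {
    	s := &asyncSyncer{done: make(chan struct{})}
    	s.cond = sync.NewCond(&s.mu)
    	go s.loop(blockingSync)
    	return s
    }

    // SyncAsync registers cb to run after the next completed sync and returns
    // immediately. Callbacks registered after Close are never run.
    func (s *asyncSyncer) SyncAsync(cb func()) {
    	s.mu.Lock()
    	s.pending = append(s.pending, cb)
    	s.mu.Unlock()
    	s.cond.Signal()
    }

    // Close lets the loop flush whatever is pending and then waits for it to exit.
    func (s *asyncSyncer) Close() {
    	s.mu.Lock()
    	s.closed = true
    	s.mu.Unlock()
    	s.cond.Signal()
    	<-s.done
    }

    func (s *asyncSyncer) loop(blockingSync func()) {
    	defer close(s.done)
    	for {
    		s.mu.Lock()
    		for len(s.pending) == 0 && !s.closed {
    			s.cond.Wait()
    		}
    		if len(s.pending) == 0 { // closed and fully drained
    			s.mu.Unlock()
    			return
    		}
    		cbs := s.pending
    		s.pending = nil
    		s.mu.Unlock()

    		// A single blocking sync covers every callback taken above.
    		blockingSync()
    		for _, cb := range cbs {
    			cb()
    		}
    	}
    }
    ```

    With this shape there is only ever one caller of the blocking sync, which is
    the single-caller arrangement the paragraph above describes as faster.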
    
    ### Benchmarks
    
    ```
    name                          old ops/s    new ops/s    delta
    kv0/enc=false/nodes=3/cpu=32   36.4k ± 5%   46.5k ± 5%  +27.64%  (p=0.000 n=10+10)
    
    name                          old avg(ms)  new avg(ms)  delta
    kv0/enc=false/nodes=3/cpu=32    5.26 ± 3%    4.14 ± 6%  -21.33%  (p=0.000 n=8+10)
    
    name                          old p99(ms)  new p99(ms)  delta
    kv0/enc=false/nodes=3/cpu=32    10.9 ± 8%     9.1 ±10%  -15.94%  (p=0.000 n=10+10)
    ```
    
    These are compelling results. I haven't pushed on this enough to know whether
    there's actually a throughput win here, or whether the fixed concurrency and
    reduced average latency are just making it look like there is. `kv0bench`
    should help answer that question.
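
    The concern can be checked roughly with Little's law, which relates the
    number of in-flight requests N, throughput X, and average latency R in a
    steady-state closed-loop system. The numbers below are the rounded means
    from the table above. The workload's concurrency setting is not stated
    here, so N is inferred rather than known.

    ```latex
    N = X \cdot R

    N_{\text{old}} \approx 36.4 \times 10^{3}\,\text{s}^{-1} \times 5.26 \times 10^{-3}\,\text{s} \approx 191.5

    N_{\text{new}} \approx 46.5 \times 10^{3}\,\text{s}^{-1} \times 4.14 \times 10^{-3}\,\text{s} \approx 192.5
    ```

    Both runs imply nearly the same number of in-flight requests, which is what
    a fixed-concurrency workload would produce. Under that assumption the
    ops/s gain follows directly from the lower average latency and does not by
    itself show a higher throughput ceiling.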
    nvanbenschoten committed Aug 29, 2022