libroach: bump up rocksdb backpressure limits
System-critical writes in Cockroach, like node-liveness, just cannot be
slow or they will fail. This means that if these rocksdb back-pressure
slowdowns ever kick in, they usually do not gradually slow traffic
until the system reaches some stable throughput equilibrium as intended,
but rather cause liveness to fail and result in sudden unavailability
-- the opposite of what they were intended to do.

Thus we are probably better off letting the metrics these limits were
intended to protect -- like read-amplification or compaction debt --
stray further into unhealthy territory than we are back-pressuring and
hastening our demise: slower reads due to elevated read-amp are still
better than no reads due to node-liveness failures (and indeed, slower
reads may serve as their own backpressure, since we usually need to read
in order to write).

Release note: None
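
For reference, the four RocksDB knobs involved are plain fields on
rocksdb::Options. Below is a minimal C++ sketch of the raised settings; the
helper function name is hypothetical, and the real change lives in
DBMakeOptions in c-deps/libroach/options.cc, shown in the diff further down.

#include <cstdint>

#include <rocksdb/options.h>

// Sketch only: RocksDB begins throttling writes once L0 holds
// level0_slowdown_writes_trigger files or estimated compaction debt reaches
// soft_pending_compaction_bytes_limit, and stops writes entirely at
// level0_stop_writes_trigger / hard_pending_compaction_bytes_limit.
rocksdb::Options MakeBackpressureOptions() {
  rocksdb::Options options;
  const uint64_t kGiB = 1073741824ull;  // 2^30 bytes

  options.level0_slowdown_writes_trigger = 500;               // was 200
  options.level0_stop_writes_trigger = 1000;                  // was 400
  options.soft_pending_compaction_bytes_limit = 2048 * kGiB;  // was 256 GiB
  options.hard_pending_compaction_bytes_limit = 4098 * kGiB;  // was 512 GiB
  return options;
}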
dt committed Oct 18, 2019
1 parent 96bcd7a commit a1b68ff
Showing 1 changed file with 13 additions and 9 deletions.
c-deps/libroach/options.cc
@@ -253,22 +253,26 @@ rocksdb::Options DBMakeOptions(DBOptions db_opts) {
   // slowdowns to writes.
   // TODO(dt): if/when we dynamically tune for bulk-ingestion, we
   // could leave this at 20 and only raise it during ingest jobs.
-  options.level0_slowdown_writes_trigger = 200;
+  options.level0_slowdown_writes_trigger = 500;
   // Maximum number of L0 files. Writes are stopped at this
   // point. This is set significantly higher than
   // level0_slowdown_writes_trigger to avoid completely blocking
   // writes.
   // TODO(dt): if/when we dynamically tune for bulk-ingestion, we
   // could leave this at 30 and only raise it during ingest.
-  options.level0_stop_writes_trigger = 400;
+  options.level0_stop_writes_trigger = 1000;
   // Maximum estimated pending compaction bytes before slowing writes.
-  // Default is 64gb but that can be hit during bulk-ingestion since it
-  // is based on assumptions about relative level sizes that do not hold
-  // during bulk-ingestion.
-  // TODO(dt): if/when we dynamically tune for bulk-ingestion, we
-  // could leave these as-is and only raise / disable them during ingest.
-  options.soft_pending_compaction_bytes_limit = 256 * 1073741824ull;
-  options.hard_pending_compaction_bytes_limit = 512 * 1073741824ull;
+  // Default is 64gb but that can be hit easily during bulk-ingestion since it
+  // is based on assumptions about relative level sizes that do not hold when
+  // adding data directly. Additionally, some system-critical writes in
+  // cockroach (node-liveness) just cannot be slow or they will fail and cause
+  // unavailability, so back-pressuring may *cause* unavailability instead of
+  // gracefully slowing to some stable equilibrium to avoid it. As such, we want
+  // these set very high so we are very unlikely to hit them.
+  // TODO(dt): if/when we dynamically tune for bulk-ingestion, we could leave
+  // these as-is and only raise / disable them during ingest.
+  options.soft_pending_compaction_bytes_limit = 2048 * 1073741824ull;
+  options.hard_pending_compaction_bytes_limit = 4098 * 1073741824ull;
   // Flush write buffers to L0 as soon as they are full. A higher
   // value could be beneficial if there are duplicate records in each
   // of the individual write buffers, but perf testing hasn't shown
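
For scale: 1073741824ull is 2^30 bytes, i.e. 1 GiB, so the pending-compaction
limits move from 256 GiB / 512 GiB to 2048 GiB (2 TiB) soft and 4098 GiB (just
over 4 TiB) hard, while the L0 file-count triggers move from 200 / 400 to
500 / 1000.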
