
mv disable recovery log

Matthew Von-Maszewski edited this page Mar 6, 2017 · 3 revisions

Status

  • merged to master -
  • code complete - February 28, 2017
  • development started - February 28, 2017

History / Context

THIS IS A DANGEROUS OPTION THAT CAN LEAD TO DATA LOSS! Use carefully.

Google's leveldb places the data from every user Write() operation into both a sorted memory buffer and a disk-based recovery log. Once the buffer is full, leveldb writes the sorted memory buffer to a level 0 .sst table file and then erases the recovery log. leveldb does not write the sorted memory buffer to disk upon a user's Close() request. Instead, it relies on the recovery log to repopulate the sorted memory buffer upon the next database Open(). The recovery log is therefore essential to the durability of leveldb.
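The write path described above can be illustrated with a toy model. This is a minimal sketch, not leveldb code: `ToyDisk`, `ToyDB`, the buffer limit, and the helper functions are all hypothetical names invented for illustration, and the "disk" is just an in-memory struct that outlives the database object.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for persistent storage; it outlives ToyDB objects.
struct ToyDisk {
    std::vector<std::pair<std::string, std::string>> recovery_log;
    std::map<std::string, std::string> sst;  // all table files merged, for brevity
};

// Hypothetical model of the default (recovery log enabled) behavior.
class ToyDB {
public:
    // Open(): replay the recovery log to repopulate the memory buffer.
    explicit ToyDB(ToyDisk& disk, std::size_t buffer_limit = 4)
        : disk_(disk), buffer_limit_(buffer_limit) {
        for (const auto& rec : disk_.recovery_log) mem_[rec.first] = rec.second;
    }

    // Write(): recovery log first, then the sorted memory buffer.
    void Write(const std::string& key, const std::string& value) {
        disk_.recovery_log.emplace_back(key, value);
        mem_[key] = value;
        if (mem_.size() >= buffer_limit_) Flush();
    }

    bool Get(const std::string& key, std::string* value) const {
        auto it = mem_.find(key);
        if (it != mem_.end()) { *value = it->second; return true; }
        auto jt = disk_.sst.find(key);
        if (jt != disk_.sst.end()) { *value = jt->second; return true; }
        return false;
    }

    // No destructor: Close() deliberately does NOT flush the memory buffer.

private:
    // Buffer full: write it to the table file, then erase the recovery log.
    void Flush() {
        for (const auto& kv : mem_) disk_.sst[kv.first] = kv.second;
        mem_.clear();
        disk_.recovery_log.clear();
    }

    ToyDisk& disk_;
    std::size_t buffer_limit_;
    std::map<std::string, std::string> mem_;
};

// True if a value that only reached the log is visible after close / open.
bool SurvivesReopen() {
    ToyDisk disk;
    { ToyDB db(disk); db.Write("k", "v"); }  // closed with an unflushed buffer
    ToyDB reopened(disk);
    std::string v;
    return reopened.Get("k", &v) && v == "v";
}

// True if filling the buffer moves the data to the table file and empties the log.
bool LogErasedAfterFlush() {
    ToyDisk disk;
    ToyDB db(disk, 2);
    db.Write("a", "1");
    db.Write("b", "2");  // buffer limit reached, triggers Flush()
    return disk.recovery_log.empty() && disk.sst.size() == 2;
}
```

In this model the only thing that carries an unflushed write across a close / open sequence is the replay loop in the constructor, which is why removing the log removes durability.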

The recovery log reduces performance since the server must write data both to the recovery log and later to the .sst table file. There are situations where losing data that was never written to a recovery log is acceptable. One such situation is with Riak TS's query buffers (Riak Time Series). The query buffer code creates a temporary leveldb database. The data written to the database is only valid until the user reads all the query results, or the server shuts down (or dies). The query buffer's content is not expected, or even desired, to exist across database close / open sequences. Therefore disabling the recovery log is a performance gain without added risk for the query buffer use case.

Branch description

include/leveldb/options.h & util/options.cc

This feature is enabled when the leveldb::Options::disable_recovery_log field is set to "true". The default is "false". The Options structure and its Dump() routine now support the field.

db/db_impl.cc

leveldb's shutdown / Close() is an implied action that occurs when the user deletes (destructs) the leveldb::DB database object. The DBImpl::~DBImpl() routine is the actual destructor called by internal code. The routine is now aware of the new disable_recovery_log flag. If the flag is set, the routine initiates a write of any pending write buffer contents to a .sst table file. Note: this implies the data is not lost when there is an orderly shutdown.

The DBImpl::Write() function typically writes new user data to the recovery log first, then adds the new data to the internal sorted memory buffer. The recovery log write is now skipped when the disable_recovery_log flag is true.

The DBImpl::NewRecoveryLog() routine now skips creating a new recovery log file when the disable_recovery_log flag is set.
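The three db_impl.cc changes can be sketched together in a toy model. This is an illustration under stated assumptions, not the branch's code: `ToyOptions`, `ToyDisk`, `ToyDB`, `SimulateCrash()`, and the two helper functions are hypothetical, and only the field name `disable_recovery_log` comes from the branch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical options struct; only the field name mirrors the real one.
struct ToyOptions { bool disable_recovery_log = false; };

// Hypothetical stand-in for persistent storage.
struct ToyDisk {
    std::vector<std::pair<std::string, std::string>> recovery_log;
    std::map<std::string, std::string> sst;
    int log_files_created = 0;
};

class ToyDB {
public:
    ToyDB(const ToyOptions& options, ToyDisk& disk)
        : options_(options), disk_(disk) {
        for (const auto& rec : disk_.recovery_log) mem_[rec.first] = rec.second;
        NewRecoveryLog();
    }

    // Destructor: with the flag set, flush the pending buffer to the table file.
    ~ToyDB() { if (options_.disable_recovery_log) Flush(); }

    // Write(): the recovery log write is skipped when the flag is set.
    void Write(const std::string& key, const std::string& value) {
        if (!options_.disable_recovery_log) disk_.recovery_log.emplace_back(key, value);
        mem_[key] = value;
    }

    bool Get(const std::string& key, std::string* value) const {
        auto it = mem_.find(key);
        if (it != mem_.end()) { *value = it->second; return true; }
        auto jt = disk_.sst.find(key);
        if (jt != disk_.sst.end()) { *value = jt->second; return true; }
        return false;
    }

    // Models a process death: the memory buffer vanishes before any flush.
    void SimulateCrash() { mem_.clear(); }

private:
    // No recovery log file is created when the flag is set.
    void NewRecoveryLog() {
        if (!options_.disable_recovery_log) ++disk_.log_files_created;
    }

    void Flush() {
        for (const auto& kv : mem_) disk_.sst[kv.first] = kv.second;
        mem_.clear();
        disk_.recovery_log.clear();
    }

    ToyOptions options_;
    ToyDisk& disk_;
    std::map<std::string, std::string> mem_;
};

// Write one key, shut down (orderly or by crash), reopen, and look for the key.
bool KeyVisibleAfterReopen(bool disable_log, bool crash) {
    ToyOptions options;
    options.disable_recovery_log = disable_log;
    ToyDisk disk;
    {
        ToyDB db(options, disk);
        db.Write("k", "v");
        if (crash) db.SimulateCrash();
    }
    ToyDB reopened(options, disk);
    std::string v;
    return reopened.Get("k", &v) && v == "v";
}
```

In this model, `KeyVisibleAfterReopen(true, false)` is true because the destructor flushes the buffer, while `KeyVisibleAfterReopen(true, true)` is false: with no recovery log there is nothing to replay after a crash, which is the data loss the warning at the top of this page refers to.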

db/db_test.cc

This existing Google unit test has thorough coverage of open and close scenarios. This branch adds an additional pass through all of the scenarios with the disable_recovery_log flag set. This unit test was essential in getting the new code correct.
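The idea of an extra pass over every scenario can be sketched generically. This is not how db_test.cc is written; `ToyOptions`, `Scenario`, and `RunAllPasses()` are hypothetical names, and the sketch only shows the shape of running each scenario once with the flag clear and once with it set.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical options struct; only the field name mirrors the real one.
struct ToyOptions { bool disable_recovery_log = false; };

// A scenario receives the options for the current pass and reports success.
using Scenario = std::function<bool(const ToyOptions&)>;

// Runs every scenario twice (flag clear, then flag set); returns the failure count.
int RunAllPasses(const std::vector<Scenario>& scenarios) {
    int failures = 0;
    for (bool flag : {false, true}) {
        ToyOptions options;
        options.disable_recovery_log = flag;
        for (const auto& scenario : scenarios) {
            if (!scenario(options)) ++failures;
        }
    }
    return failures;
}
```

A scenario that quietly depends on log replay fails only in the second pass, which is how a doubled pass can expose code paths that still assume the recovery log exists.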
