
PP Bigsets Changes

Paul A. Place edited this page Jul 13, 2016 · 6 revisions

Overview

The following changes have been made to branches of leveldb and eleveldb to support the CRDT bigsets prototype. See Russell's design document for more information about CRDT bigsets.

leveldb:

  • Branch: rdb/bigset/comparator

    Note that this branch originally contained the bigset-specific comparator class, but that code now resides in eleveldb.

  • Logging: Exposed the leveldb Logger class to eleveldb by adding the GetLogger() method to the DB class in include/leveldb/db.h. This is a virtual method that defaults to returning NULL; the implementation in the DBImpl class (in db/db_impl.h) returns the options_.info_log data member.

    Note that the Logger object is typically used by passing it to the leveldb::Log() function, defined in util/env.cc.

    Matthew and I discussed updating leveldb::Log() to write to the syslog if the Logger* is NULL, which may be the case during leveldb initialization or termination.
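The GetLogger() arrangement described above can be sketched as follows. This is a minimal, self-contained illustration, not the actual leveldb code: the `sketch` namespace, the `StringLogger` class, and the `info_log_` member are stand-ins invented for the example, and the syslog fallback that was discussed is only marked with a comment.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Stand-in types for illustration only; the real classes live in leveldb
// (include/leveldb/env.h, include/leveldb/db.h, db/db_impl.h).
namespace sketch {

class Logger {
public:
    virtual ~Logger() {}
    virtual void Logv(const char* format, va_list ap) = 0;
};

// Hypothetical logger that collects messages in memory so they are easy to inspect.
class StringLogger : public Logger {
public:
    std::string text;
    void Logv(const char* format, va_list ap) override {
        char buf[256];
        std::vsnprintf(buf, sizeof(buf), format, ap);
        text += buf;
        text += '\n';
    }
};

class DB {
public:
    virtual ~DB() {}
    // Default implementation: no logger is available.
    virtual Logger* GetLogger() const { return nullptr; }
};

class DBImpl : public DB {
public:
    explicit DBImpl(Logger* info_log) : info_log_(info_log) {}
    Logger* GetLogger() const override { return info_log_; }
private:
    Logger* info_log_;  // plays the role of options_.info_log
};

// Same shape as leveldb::Log(): a null logger is silently ignored.
void Log(Logger* info_log, const char* format, ...) {
    if (info_log != nullptr) {
        va_list ap;
        va_start(ap, format);
        info_log->Logv(format, ap);
        va_end(ap);
    }
    // A syslog fallback for the null case would go here (not implemented).
}

}  // namespace sketch
```

A caller in eleveldb would then write something like `sketch::Log(db->GetLogger(), "fold started for %s", name)`, and the call is safe even when GetLogger() returns a null pointer.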

eleveldb:

  • Branch: pp-bigset-streaming-fold-per-elem-ctx (based on the streaming fold code introduced by Engel and used by Riak TS)

  • Updates to existing code in c_src:

    • workitems.h and workitems.cc: Most of the real eleveldb work for bigsets is done in the RangeScanTask::operator() method, which is the method called by a worker thread in the eleveldb thread pool to perform a client task (in this case, a streaming fold). If the client sets the "this is a bigset" option, the RangeScanTask constructor allocates a BigsetAccumulator object. In the RangeScanTask::operator() method, each record read from leveldb is added to the accumulator by calling the BigsetAccumulator::AddRecord() method.

      Note that elements for a bigset are guaranteed to be contiguous in leveldb, due to the way the custom comparator sorts bigset elements. This means that the current streaming fold is done as soon as it hits the end-key for the bigset or encounters a key with a different set name.

      Note also that a particular bigset element may consist of multiple associated records in leveldb. That is why the RangeScanTask::operator() method adds each record from leveldb to the accumulator until the accumulator reports that the last record for a particular element has been processed. When BigsetAccumulator::RecordReady() returns true, RangeScanTask::operator() gets the final version of the bigset element from the accumulator and adds it to the buffer of records returned to the client.
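The accumulate-until-ready loop described above can be sketched roughly as follows. Everything here is a simplified assumption made for illustration: the `Record` and `Element` types, the integer payload, the `Finish()` method, and the `ScanSet()` function are hypothetical, and a sorted vector stands in for the leveldb iterator. Only the method names AddRecord(), RecordReady(), and GetCurrentElement() come from the real BigsetAccumulator, and the real class may decide readiness differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical record layout: one element may span several records.
struct Record {
    std::string setName;
    std::string element;
    int count;
};

struct Element {
    std::string name;
    int total = 0;
};

class Accumulator {
public:
    Accumulator() : hasPending_(false), ready_(false) {}

    // Assumption: an element is complete once a record for a different
    // element arrives.
    void AddRecord(const Record& r) {
        if (hasPending_ && r.element != pending_.name) {
            Finish();
        }
        if (!hasPending_) {
            pending_.name = r.element;
            pending_.total = 0;
            hasPending_ = true;
        }
        pending_.total += r.count;
    }

    // Marks the pending element (if any) as complete.
    void Finish() {
        if (hasPending_) {
            current_ = pending_;
            ready_ = true;
            hasPending_ = false;
        }
    }

    bool RecordReady() const { return ready_; }

    Element GetCurrentElement() {
        ready_ = false;
        return current_;
    }

private:
    Element pending_;
    Element current_;
    bool hasPending_;
    bool ready_;
};

// Scans one set out of records that are sorted by set name, then element.
std::vector<Element> ScanSet(const std::vector<Record>& sorted,
                             const std::string& setName) {
    std::vector<Element> out;
    Accumulator acc;
    std::size_t i = 0;
    // Stand-in for seeking the iterator to the first key of the set.
    while (i < sorted.size() && sorted[i].setName < setName) ++i;
    // Contiguity means the fold stops at the first key of another set.
    for (; i < sorted.size() && sorted[i].setName == setName; ++i) {
        acc.AddRecord(sorted[i]);
        if (acc.RecordReady()) out.push_back(acc.GetCurrentElement());
    }
    // Flush the last element, which no later record will push out.
    acc.Finish();
    if (acc.RecordReady()) out.push_back(acc.GetCurrentElement());
    return out;
}

}  // namespace sketch
```

The final flush after the loop mirrors the situation in which the fold stops before another record arrives to complete the last element.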

  • Additions in c_src:

    • BigsetAccumulator.h and BigsetAccumulator.cc: The BigsetAccumulator class provides methods for processing bigset records stored in leveldb. Since a bigset element consists of one or more records in leveldb, the records read from leveldb must be accumulated until all the records for a given bigset element have been read and can be aggregated into a single bigset element record. When BigsetAccumulator::RecordReady() returns true, the caller must then call BigsetAccumulator::GetCurrentElement() to retrieve the current bigset element. The different record types that BigsetAccumulator can process are defined in the BigsetKey class.

      The BigsetAccumulator class provides some additional methods to support range queries. These are necessary because the streaming fold code in the RangeScanTask::operator() method must first read the clock for the bigset and then advance the leveldb iterator to the first record of the range. Additionally, the range query may stop before reading the end-key for the bigset, so the RangeScanTask::operator() method needs logic to ensure that the last bigset element in the range is properly accumulated and added to the output buffer returned to the client.

    • BigsetClock.h and BigsetClock.cc: Declares and implements all the classes necessary to represent a bigset clock in C++. See the comment block before BigsetClock::ValueToBigsetClock() in BigsetClock.cc for a description of the bigset clock's serialization format.

    • BigsetComparator.h and BigsetComparator.cc: Russell added the bigset-specific comparator class leveldb::BSComparator in leveldb/util/comparator.cc. I moved this class to eleveldb and renamed it BigsetComparator. The comment block before the class declaration provides a good description of the comparator's functionality.

    • BigsetKey.h and BigsetKey.cc: The BigsetKey class assists with parsing a bigset key stored in leveldb. See the comment block before the BigsetKey constructor in BigsetKey.cc for information about the serialized format of a bigset key.

  • New stuff:

    • util directory: The util directory contains classes and functions used in the code mentioned above. Note that these files can (and possibly should) be moved into a separate library of Basho C/C++ utility code.

      • buffer.h: Declares the Buffer<> template class, which provides basic buffer management using a built-in buffer whose size is specified by the template parameter. This class is useful when the required size rarely exceeds a known bound: the built-in buffer is large enough most of the time, and the buffer can still grow when more space is needed.

      • erlangUtils.h and erlangUtils.cc: Declares a few helper functions for interacting with serialized Erlang data.

      • stringUtils.h and stringUtils.cc: Declares a few helper functions for working with std::string. This includes functions for trimming whitespace (or more generally, trimming any desired characters from the ends of a std::string), formatting an integer as a human-readable number with thousands separators, and formatting a size in bytes as a human-readable string (e.g., format 987 as "987 bytes" and format 303841 as "296.72 KB (303,841 bytes)"). The trim functions are useful when parsing text input, and the formatting functions are useful in creating nicely formatted log messages.

        Note that these functions could be generalized as template functions if necessary, for example, to handle integers of different sizes, and to work with std::basic_string<> instead of std::string.
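As a rough illustration of the helpers described above, the sketch below reproduces the two formatting examples and a basic trim. The function names `Trim`, `FormatWithThousands`, and `FormatSizeInBytes` are hypothetical, as is the choice of 1024-based units up to TB; the actual signatures in stringUtils.h may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

namespace sketch {

// Removes any of the characters in `chars` from both ends of `s`.
std::string Trim(const std::string& s, const std::string& chars = " \t\r\n") {
    std::size_t first = s.find_first_not_of(chars);
    if (first == std::string::npos) return "";
    std::size_t last = s.find_last_not_of(chars);
    return s.substr(first, last - first + 1);
}

// Formats an integer with comma thousands separators.
std::string FormatWithThousands(std::uint64_t value) {
    std::string digits = std::to_string(value);
    // Insert a comma before each group of three digits, working from the right.
    for (int pos = static_cast<int>(digits.size()) - 3; pos > 0; pos -= 3) {
        digits.insert(static_cast<std::size_t>(pos), ",");
    }
    return digits;
}

// Formats a byte count; values of 1024 or more also get a scaled form.
std::string FormatSizeInBytes(std::uint64_t bytes) {
    std::string plain = FormatWithThousands(bytes) + " bytes";
    if (bytes < 1024) return plain;

    static const char* const units[] = {"KB", "MB", "GB", "TB"};
    const std::size_t unitCount = sizeof(units) / sizeof(units[0]);
    double scaled = static_cast<double>(bytes) / 1024.0;
    std::size_t unit = 0;
    while (scaled >= 1024.0 && unit + 1 < unitCount) {
        scaled /= 1024.0;
        ++unit;
    }
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.2f %s", scaled, units[unit]);
    return std::string(buf) + " (" + plain + ")";
}

}  // namespace sketch
```

With these definitions, `FormatSizeInBytes(987)` yields "987 bytes" and `FormatSizeInBytes(303841)` yields "296.72 KB (303,841 bytes)", matching the examples given above.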

      • utils.h: This is the main header file for this utility library.

      • Unit tests in test subdirectory: Currently we have unit tests for the Buffer<> class and for the string utility functions.
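The idea behind Buffer<> (a built-in buffer that covers the common case, with growth when needed) can be sketched as below. This is an assumption-laden illustration rather than the class in buffer.h: the method names `Data`, `Size`, `EnsureSize`, and `UsingBuiltIn` are hypothetical, and std::vector is used for the heap storage purely for brevity.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

namespace sketch {

// Hypothetical small-buffer class: storage starts in a fixed-size member
// array and moves to the heap only if more space is requested.
template <std::size_t BuiltInSize>
class Buffer {
public:
    Buffer() : builtIn_(), size_(BuiltInSize) {}

    char* Data() { return heap_.empty() ? builtIn_ : heap_.data(); }
    std::size_t Size() const { return size_; }
    bool UsingBuiltIn() const { return heap_.empty(); }

    // Guarantees at least `needed` bytes; existing contents are preserved.
    void EnsureSize(std::size_t needed) {
        if (needed <= size_) return;  // common case: no allocation
        std::vector<char> bigger(needed);
        std::memcpy(bigger.data(), Data(), size_);
        heap_.swap(bigger);
        size_ = needed;
    }

private:
    char builtIn_[BuiltInSize];
    std::vector<char> heap_;
    std::size_t size_;
};

}  // namespace sketch
```

A caller that usually needs a few hundred bytes could declare `sketch::Buffer<512> buf;` and call `buf.EnsureSize(n)` before each use; an allocation happens only when `n` exceeds the current size.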

    • test directory:

      • bigsetClock_test.cc: unit tests for the BigsetClock class and its associated helper classes.

      • bigsetClockValidationTool.cc: this is a command line utility that I wrote early in the process of figuring out how to parse a bigset clock from its Erlang serialized format. It is unlikely to be useful at this point, although it might provide some historical insight into the code in BigsetClock.h and BigsetClock.cc.
