- Based on
Bitcask
principles - High performance
- Insertion rate bottleneck is the disk I/O saturation
- Lookup throughput benefits from a scalable concurrent hashtable and a built-in memory cache
- Crash friendliness because architectured as a log-structured file systems, only the non-disk-flushed data are lost
- Ability to handle datasets much larger than RAM without degradation: only the keys reside in memory
- Easy software integration: copying 1 header file is enough, no external dependencies
- Ease of backup and restore: backuping 1 flat directory is enough
- Support of indexation using parts of the keys
- Support of entry lifetime
Litecask is:
- an efficient embedded database as a building block for your application
- a lean library with an opinionated set of features, to maximize control and limit entropy
- tracked internal performance (hashtable, allocator, cache, startup time...)
- consistency by using ASAN, TSAN, clang-format, clang-tidy, and custom tests based on
doctest
- an enhanced implementation of
Bitcask
, additionally featuring:- a built-in cache
- indexing capabilities
- a C++17 single-header file, including:
- a
TLSF
heap-based memory allocator - a concurrent hashtable with optimistic locking (inspired by
memC3
and this analysis) - a lock-free RW-lock (aka shared mutex) reducing both false sharing and kernel access
- a memory cache with segmented LRU (inspired by
memcached
)
- a
Litecask is not:
- a remote database
- it misses some layers: a high performance network part (asio, evpp...) with a client-server communication protocol
- a billion entries database. Indeed, a fundament of
Bitcask
is to keep the key directory in memory.- 100 millions entries is however in its range, depending on average key size and available RAM
- value sizes do not matter as they are not kept in memory, cache excepted
- scaling horizontally would imply making the database remote and sharded
- 100 millions entries is however in its range, depending on average key size and available RAM
A simple example inserting and retrieving a value is shown below:
// example.cpp . Place the file litecask.h is in the same folder
// Build with: 'c++ --std=c++17 example.cpp -o example' (Linux) or 'cl.exe /std:c++17 /EHsc example.cpp' (Windows)
#include "litecask.h"
int main(int argc, char** argv)
{
litecask::Datastore store;
store.open("/tmp/my_temp_db");
// Store an entry
std::vector<uint8_t> value{1,2,3,4,5,6,7,8};
store.put("my key identifier", value);
// Retrieve the entry
std::vector<uint8_t> retrievedValue;
store.get("my key identifier", retrievedValue);
assert(retrievedValue==value);
store.close();
}
Performance are highly dependent on the hardware (CPU quantity, CPU caches, memory bandwidth, disk speed, OS...).
The results in this section correspond to a laptop with 16 CPUs (i7-11800H @2.30GHz) on Linux, obtained with the built-in benchmarks (matplotlib is required to draw the graphs in the optional last command):
mkdir build
cd build
cmake ..
make -j $(nproc)
./bin/litecask_test benchmark -ll
../ci/benchmark -n
Result for a 1 million entries database, 8 bytes keys, values in cache and Zipf-1.0 access distribution.
The memory throughput graph on the right is deduced from the left graph and the size of the value.
The lower rate when writing is due to disk I/O bottleneck, highlighted by the asymptote when value size grows.
Result for a 1 million entries database, 8 bytes keys, 256 bytes values, values in cache and Zipf-1.0 access distribution.
The read access scales well with the thread quantity thanks to the concurrent hashtable and shared lock implementations.
The full write does not scale with multi-threading due to the one-writer constraint (log-structured design).
Measures for a 30 million entries datastore with 8 bytes keys and 16 bytes value size:
Characteristic | Performance |
---|---|
Total startup load time | 3.975 seconds (regardless of value sizes) |
Total Used memory | 2064 MB (regardless of value sizes) |
Startup load rate | 7.5 million entries / second |
RAM / entry | 61 bytes + the key size + 2 bytes per index (global averaged overhead) |
Disk size / entry | 16 bytes + the key size + 2 bytes per index + the value size |
Effect of deferred write
Deferred write is simply writing in an intermediate memory buffer to avoid costly kernel disk write calls. This buffer is flushed on disk only if full or non empty after a configurable timeout.
In case of process disruption, a drawback is the potential loss of these not-yet-on-disk entries.
Conditions: monothread, key size 8 bytes, value size 8 bytes, write buffer size 100KB
Write kind | Performance |
---|---|
Deferred sync | 4.170 Mop/s |
Forced sync | 1.139 Mop/s |
Effect of value cache
A cache keeps the value of some entries in memory so that their access does not require a costly disk read. The strategy of selecting which value to keep or not is implementation dependent.
Conditions: monothread, key size 8 bytes, value size 256 bytes, 1 million entries, zipf-1.0 access pattern
Cache size percentage | Performance | Cache hit |
---|---|---|
0% | 2.594 Mop/s | 0.0% |
25% | 6.132 Mop/s | 96.0% |
50% | 6.167 Mop/s | 96.2% |
90% | 6.226 Mop/s | 96.8% |
100% | 7.217 Mop/s | 100% |
Note: the cache effect is even bigger effect with multithreaded access.
By design and without any upper layer library, a key-value store has limited capabilities to scan and query:
- Log Structured Hash tables scan by peering at each entry.
- Log Structured Merge trees internally sort keys, allowing key-range based scans.
Litecask, under the first category, proposes a different approach: use parts of the key as indexes.
A high level view of this internal behavior could be the usage of a dedicated hash table that takes an array of bytes as input and returns a set of unique keys as output.
Let's consider the following entry with a text-based key:
store.put("UJohn Doe/CUS/TTax document/0001", value);
The text key can be visually split into:
UJohn Doe
from byte 0, length 9CUS
from byte 10, length 3TTax document
from byte 14, length 13
These chunks of keys can be used as an index by upgrading the previous insertion command as follows:
store.put("UJohn Doe/CUS/TTax document/0001", value, { {0,9}, {10, 3}, {14, 13} });
Thanks to this lightweight indexes, it is now possible to query for the user UJohn Doe
, the country CUS
, or type TTax document
entries. Or an intersection of these.
std::vector<std::vector<uint8_t>> matchingKeys;
// Query for user
store.query("UJohn Doe", matchingKeys);
assert(matchingKeys.size()==1);
// Query for country
store.query("CUS", matchingKeys);
assert(matchingKeys.size()==1);
// Query for user AND country (implicit AND. OR can be performed by additional queries and removing duplicates)
store.query({std::string("UJohn Doe"), std::string("CUS")}, matchingKeys);
assert(matchingKeys.size()==1);
Note: in this example, the prefixes U
, C
and T
prevent mixing "columns" in case of same content.
The separating '/' is purely for human readability.
Datastore::Datastore(...)
- Datastore instance creation
Datastore::Datastore(size_t cacheBytes = 256 * 1024 * 1024);
Parameter name | Description |
---|---|
cacheBytes |
Defines the maximum memory usage of the value cache in bytes. Default is 256 MB. Once this size is reached, values are evicted to allow insertion of new ones. Value cache greatly improves read performance by avoiding disk access. |
Status Datastore::open(...)
- Datastore opening
Status Datastore::open(std::filesystem::path dbDirectoryPath, bool doCreateIfNotExist = true);
Parameter name | Description |
---|---|
dbDirectoryPath |
The path of the litecask datastore |
doCreateIfNotExist |
Boolean to decide to create a non existing datastore (default) |
Return code | Comment |
---|---|
Status::Ok |
The datastore was successfully opened |
Status::StoreAlreadyOpen |
The datastore instance is already in opened state |
Status::CannotOpenStore |
The provided path does not correspond to a litecask store |
Status::StoreAlreadyInUse |
The datastore files are already in use by another process |
Status::BadDiskAccess |
The datastore cannot be opened due to file access issues |
Status Datastore::close(...)
- Datastore closing
Status Datastore::close();
Return code | Comment |
---|---|
Status::Ok |
The datastore was successfully opened |
Status::StoreNotOpen |
The datastore instance was not in opened state |
void Datastore::sync(...)
- Flush the write buffer
void Datastore::sync();
This synchronization of the written data to the disk is performed:
- When this function
sync()
is called explicitely - When an entry is inserted with the
forceDiskSync
flag set totrue
- When the write buffer is full
- Automatically with a configurable period
Note: this synchronization is at the application level, protecting against loss when the application crashes. If a sudden shutdown of the machine occurs, the content of non-written OS disk cache may still be lost.
Status Datastore::put(...)
- Entry insertion
// Key and value as pointer plus size
Status Datastore::put(const void* key, size_t keySize,
const void* value, size_t valueSize,
const std::vector<KeyIndex>& keyIndexes = {},
uint32_t ttlSec = 0,
bool forceDiskSync = false);
// Variant 1: key as vector
Status Datastore::put(const std::vector<uint8_t>& key,
const void* value, size_t valueSize,
const std::vector<KeyIndex>& keyIndexes = {},
uint32_t ttlSec = 0,
bool forceDiskSync = false);
// Variant 2: key as string. Note that the null termination is not part of the key
Status Datastore::put(const std::string& key,
const void* value, size_t valueSize,
const std::vector<KeyIndex>& keyIndexes = {},
uint32_t ttlSec = 0,
bool forceDiskSync = false);
// Variant 3: key as vector and value as vector
Status Datastore::put(const std::vector<uint8_t>& key,
const std::vector<uint8_t>& value,
const std::vector<KeyIndex>& keyIndexes = {},
uint32_t ttlSec = 0,
bool forceDiskSync = false);
// Variant 4: key as string and value as vector
Status Datastore::put(const std::string& key,
const std::vector<uint8_t>& value,
const std::vector<KeyIndex>& keyIndexes = {},
uint32_t ttlSec = 0,
bool forceDiskSync = false);
// This structure defines a part of the key [start index; size[ to use as an index.
// An array of key indexes MUST be sorted by increasing startIdx, then size if startIdx are equal.
// Except this constraint on sorting, key indexes can overlap.
struct KeyIndex {
uint8_t startIdx; // Relative to the start of the key
uint8_t size;
};
Parameter name | Description |
---|---|
key |
The pointer or the structure of the key |
keySize |
In case of key as a pointer, the size of the key in bytes. Maximum accepted size is 65534 bytes |
value |
The pointer or the structure of the value |
valueSize |
In case of value as a pointer, the size of the value in bytes |
keyIndexes |
An array of KeyIndex structures which defines the parts of the key to use as an index. Default is no index |
ttlSec |
The 'Time To Live' of the entry in second. Default is zero which means no lifetime limit |
forceDiskSync |
Boolean to force the write on disk of the full write buffer after this entry. Default is false. Note that it covers just the application cache, not the OS one. |
Return code | Comment |
---|---|
Status::Ok |
The entry was successfully stored |
Status::BadKeySize |
The key size is bigger than 65535 |
Status::InconsistentKeyIndex |
The quantity of key index is bigger than 64, or the index address bytes outside of the key |
Status::UnorderedKeyIndex |
The index are not in order |
Status::BadValueSize |
The value size is bigger than 4294901760 bytes |
Status::StoreNotOpen |
The datastore is not open |
Status::OutOfMemory |
The system is running our of memory |
Status Datastore::remove(...)
- Entry removal
// Key pointer plus size
Status Datastore::remove(const void* key, size_t keySize,
bool forceDiskSync = false);
// Variant 1: key as vector
Status Datastore::remove(const std::vector<uint8_t>& key,
bool forceDiskSync = false);
// Variant 2: key as string
Status Datastore::remove(const std::string& key,
bool forceDiskSync = false);
Parameter name | Description |
---|---|
key |
The pointer or the structure of the key |
keySize |
In case of key as a pointer, the size of the key in bytes. Maximum accepted size is 65534 bytes |
forceDiskSync |
Boolean to force the write on disk of the full write buffer after this entry removal. Default is false. Note that it covers just the application cache, not the OS one. |
Return code | Comment |
---|---|
Status::Ok |
The entry was successfully stored |
Status::BadKeySize |
The key size is bigger than 65535 |
Status::StoreNotOpen |
The datastore is not open |
Status::EntryNotFound |
The key was not found in the datastore |
Status Datastore::get(...)
- Entry retrieval
// Key pointer and size
Status Datastore::get(const void* key, size_t keySize,
std::vector<uint8_t>& value);
// Variant 1: key as vector
Status Datastore::get(const std::vector<uint8_t>& key,
std::vector<uint8_t>& value);
// Variant 2: key as string
Status Datastore::get(const std::string& key,
std::vector<uint8_t>& value);
Parameter name | Description |
---|---|
key |
The pointer or the structure of the key |
keySize |
In case of key as a pointer, the size of the key in bytes. Maximum accepted size is 65534 bytes |
value |
The output array structure for the retrieved value |
Return code | Comment |
---|---|
Status::Ok |
The entry was successfully retrieved |
Status::BadKeySize |
The key size is bigger than 65535 |
Status::StoreNotOpen |
The datastore is not open |
Status::EntryNotFound |
The key was not found in the datastore |
Status::EntryCorrupted |
The entry was retrieved from disk and the checksum is incorrect (i.e. corrupted entry) |
Status Datastore::query(...)
- Entries query based on index
// Query variant 1: single key part as vector
Status query(const std::vector<uint8_t>& keyPart,
std::vector<std::vector<uint8_t>>& matchingKeys);
// Query variant 2: single key part as string
Status query(const std::string& keyPart,
std::vector<std::vector<uint8_t>>& matchingKeys);
// Query variant 3: multiple key part as vector of vector
Status query(const std::vector<std::vector<uint8_t>>& keyParts,
std::vector<std::vector<uint8_t>>& matchingKeys);
// Query variant 4: key part as vector of string
Status query(const std::vector<std::string>& keyParts,
std::vector<std::vector<uint8_t>>& matchingKeys);
// Query variant 5: single key part as vector, with arena allocator for output array of keys
Status query(const std::vector<uint8_t>& keyPart,
std::vector<QueryResult>& arenaMatchingKeys,
ArenaAllocator& allocator);
// Query variant 6: single key part as string, with arena allocator for output array of keys
Status query(const std::string& keyPart,
std::vector<QueryResult>& arenaMatchingKeys,
ArenaAllocator& allocator);
// Query variant 7: multiple key parts as vector, with arena allocator for output array of keys
Status query(const std::vector<std::vector<uint8_t>>& keyParts,
std::vector<QueryResult>& arenaMatchingKeys,
ArenaAllocator& allocator);
// Query variant 8: multiple key parts as string, with arena allocator for output array of keys
Status query(const std::vector<std::string>& keyParts,
std::vector<QueryResult>& arenaMatchingKeys,
ArenaAllocator& allocator);
// This structure defines a 'query result' by providing a memory span, when using an arena allocator API.
struct QueryResult {
uint8_t* ptr;
uint16_t size;
};
// 'minAllocChunkBytes' is the performed allocation size if the requested amount is smaller than this value.
// For efficiency reasons, it should to be several orders of magnitude larger than the typical allocation size
class ArenaAllocator {
ArenaAllocator(size_t minAllocChunkBytes = 1024 * 1024);
uint8_t* allocate(size_t bytes); // Used by the query API
size_t getAllocatedBytes() const;
void reset(); // Free all allocations, but keeps the memory chunks internally. Invalidates QueryResult content.
};
Parameter name | Description |
---|---|
keyPart |
A string or array of bytes to use as a single index to query |
keyParts |
An array of string or array of bytes to use as a multiple index to query. The resulting keys shall match all indexes ('AND' operation) |
matchingKeys |
The output array of array of bytes holding the matching keys |
allocator |
An arena allocator which improves efficiency of large queries by batching allocations Used only with the arena allocator APIs and in association with the QueryResult structure |
Return code | Comment |
---|---|
Status::Ok |
The entry was successfully retrieved |
Status::BadKeySize |
One of the provided key chunk has a size bigger than 65535 |
Status Datastore::setConfig(...)
- Set datastore configuration
Status Datastore::setConfig(const Config& config);
struct Config {
// General store parameters
// ========================
// 'dataFileMaxBytes' defines the maximum byte size of a data file before switching to a new one.
// It implicitely limits the maximum size of the database as there can be at most 65535 data files.
// Bigger data files make the total size bigger (up to 65535*4GB = 281 TiB)
// Smaller data files make the merge time shorter
uint32_t dataFileMaxBytes = 100'000'000;
// 'mergeCyclePeriodMs' defines the merge period for the database, in milliseconds.
// This merge cycle first checks if the 'merge' process is needed. If positive, the eligible data files
// are selected and compacted into defragmented and smaller files which eventually replace the old ones.
uint32_t mergeCyclePeriodMs = 60'000;
// 'upkeepCyclePeriodMs' defines the upkeep period for the internal structures, in milliseconds.
// It copes mainly with the cache eviction and the KeyDir resizing. This latter does not wait the end of
// the cycle and start working immediately
uint32_t upkeepCyclePeriodMs = 1000;
// 'writeBufferFlushPeriodMs' defines the maximum time for the write buffer to be flushed on disk.
// This limits the amount of data that can be lost in case of sudden interruption of the program, while
// avoiding costly disk access at each write operation.
// Note that the effective period is the maximum between upkeepCyclePeriodMs and writeBufferFlushPeriodMs.
// Note also that the "put" API offers to force-flush directly on disk (with a performance cost).
uint32_t writeBufferFlushPeriodMs = 5000;
// 'upkeepKeyDirBatchSize' defines the quantity of KeyDir entries to update in a row.
// This includes both KeyDir resizing and data file compaction mechanisms.
// A higher quantity of entries will make the transition finish earlier, at the price of higher spikes of
// latency on entry write or update. A too low value could paradoxically induce a forced resizing of the
// remaining part of the KeyDir if the next resize arrives before the end of the previous one.
uint32_t upkeepKeyDirBatchSize = 100'000;
// 'upkeepValueCacheBatchSize' defines the quantity of cached value entries to update in a row in the LRU.
// A higher quantity of entries will make the background task finish earlier, at the price of higher spikes of
// latency on entry write or update. A too low value could paradoxically induce a forced task to clean and
// evict cached values at inserting time.
uint32_t upkeepValueCacheBatchSize = 10000;
// 'valueCacheTargetMemoryLoadPercentage' configures the target load for the cache, so that the remaining free
// space ensures a performant insertion in the cache. The eviction required to meet this target load is deferred
// in a background task. Too low a value wastes cache memory, too high a value prevent the insertion a new entry
// in the cache because of lack of free space.
uint32_t valueCacheTargetMemoryLoadPercentage = 90;
// Merge Triggers
// ==============
// They determine the conditions under which merging will be invoked. They fall into two basic categories:
// 'mergeTriggerDataFileFragmentationPercentage' describes the percentage of dead keys to total keys in
// a file that will trigger merging.
// Increasing this value will cause merging to occur less often.
uint32_t mergeTriggerDataFileFragmentationPercentage = 60;
// 'mergeTriggerDataFileDeadByteThreshold' describes how much data stored for dead keys in a single file triggers
// merging. Increasing the value causes merging to occur less often, whereas decreasing the value causes merging
// to happen more often.
uint32_t mergeTriggerDataFileDeadByteThreshold = 51'200'000;
// Merge data file selection
// =========================
// These parameters determine which files will be selected for inclusion in a merge operation.
// 'mergeSelectDataFileFragmentationPercentage' describes which percentage of dead keys to total keys in a file
// causes it to be included in the merge.
// Note: this value shall be less than the corresponding trigger threshold.
uint32_t mergeSelectDataFileFragmentationPercentage = 40;
// 'mergeSelectDataFileDeadByteThreshold' describes which ratio the minimum amount of data occupied by dead keys
// in a file to cause it to be included in the merge.
// Note: this value shall be less than the corresponding trigger threshold.
uint32_t mergeSelectDataFileDeadByteThreshold = 12'800'000;
// 'mergeSelectDataFileSmallSizeTheshold' describes the minimum size below which a file is included in the merge.
// The purpose is to reduce the quantity of small data files to keep open file quantity low.
uint32_t mergeSelectDataFileSmallSizeTheshold = 10'000'000;
};
Parameter name | Description |
---|---|
config |
The configuration structure |
Return code | Comment |
---|---|
Status::Ok |
The entry was successfully retrieved |
Status::BadParameterValue |
A parameter has a bad value. Check the logs for details (warning) |
Status::InconsistentParameterValues |
Some parameter values are incompatible. Check the logs for details (warning) |
Config Datastore::getConfig(...)
- Get datastore configuration
Config Datastore::getConfig() const;
Return value | Comment |
---|---|
config |
The current configuration structure |
Status Datastore::setWriteBufferBytes(...)
- Write buffer configuration
Status Datastore::setWriteBufferBytes(uint32_t writeBufferBytes);
Parameter name | Description |
---|---|
writeBufferBytes |
The size of the write buffer in bytes. Default is 100 KB. Note: above tens of typical entry size, increasing further this value should not have a big impact on performances |
Return code | Comment |
---|---|
Status::Ok |
The entry was successfully retrieved |
bool Datastore::setLogLevel(...)
- Set the logging level
bool Datastore::setLogLevel(LogLevel level);
enum class LogLevel { Debug = 0, Info = 1, Warn = 2, Error = 3, Fatal = 4, None = 5 };
Parameter name | Description |
---|---|
level |
The minimum log level. Logs with a strictly lower level are filtered out. Default is LogLevel::Info |
Return code | Comment |
---|---|
boolean |
True if the provided log level is within the defined range |
void Datastore::setLogHandler(...)
- Override the default log handler
void Datastore::setLogHandler(const std::function<void(LogLevel, const char*, bool)>& logHandler);
Parameter name | Description |
---|---|
logHandler |
The new log handler function. The provided parameters are: - the log level (after filtering) - the message string - a boolean notifying the termination of the logging process if true |
The default logger simply writes in rolling files litecask<N>.log
at the root of the datastore folder.
To remove dead entries, a merge/compaction of the data files is performed automatically in background.
The API in this section is not required for proper function of the datastore.
It is however possible to force manually such process to target for instance a period where the load is low.
Such compaction is not systematic nor global, please refer to the configuration structure for further details.
bool Datastore::requestMerge(...)
- Explicit merge/compaction request
bool Datastore::requestMerge();
Return code | Comment |
---|---|
boolean |
Returns true if the request has been properly processed. If a merge/compaction process is on-going, the returned value is false |
bool Datastore::isMergeOnGoing(...)
- Get state of the merge/compaction request
bool Datastore::isMergeOnGoing() const;
Return code | Comment |
---|---|
boolean |
Returns true if a merge/compaction is on-going |
uint64_t Datastore::getEstimatedUsedMemoryBytes(...)
- Get datastore global used memory
uint64_t Datastore::getEstimatedUsedMemoryBytes(bool withCache = false) const;
Parameter name | Description |
---|---|
withCache |
Boolean to select the inclusion of the value cache used memory. Default is false |
Return value | Comment |
---|---|
uint64_t |
The estimated byte quantity used by the datastore |
uint64_t Datastore::getValueCacheMaxAllocatableBytes(...)
- Get configured value cache size
uint64_t Datastore::getValueCacheMaxAllocatableBytes() const;
Return value | Comment |
---|---|
uint64_t |
Returns the configured value cache size in bytes |
uint64_t Datastore::getValueCacheAllocatedBytes(...)
- Get used value cache size
uint64_t Datastore::getValueCacheAllocatedBytes() const;
Return value | Comment |
---|---|
uint64_t |
Returns the currently allocated value cache size in bytes |
const DatastoreCounters& Datastore::getCounters(...)
- Get datastore counters
const DatastoreCounters& Datastore::getCounters() const;
// All fields are monotonic counters
struct DatastoreCounters {
// API calls
std::atomic<uint64_t> openCallQty;
std::atomic<uint64_t> openCallFailedQty;
std::atomic<uint64_t> closeCallQty;
std::atomic<uint64_t> closeCallFailedQty;
std::atomic<uint64_t> putCallQty;
std::atomic<uint64_t> putCallFailedQty;
std::atomic<uint64_t> removeCallQty;
std::atomic<uint64_t> removeCallNotFoundQty;
std::atomic<uint64_t> removeCallFailedQty;
std::atomic<uint64_t> getCallQty;
std::atomic<uint64_t> getCallNotFoundQty;
std::atomic<uint64_t> getCallCorruptedQty;
std::atomic<uint64_t> getCallFailedQty;
std::atomic<uint64_t> getWriteBufferHitQty;
std::atomic<uint64_t> getCacheHitQty;
std::atomic<uint64_t> queryCallQty;
std::atomic<uint64_t> queryCallFailedQty;
// Data files
std::atomic<uint64_t> dataFileCreationQty;
std::atomic<uint64_t> dataFileMaxQty;
std::atomic<uint64_t> activeDataFileSwitchQty;
// Index
std::atomic<uint64_t> indexArrayCleaningQty;
std::atomic<uint64_t> indexArrayCleanedEntries;
// Maintenance (merge / compaction)
std::atomic<uint64_t> mergeCycleQty;
std::atomic<uint64_t> mergeCycleWithMergeQty;
std::atomic<uint64_t> mergeGainedDataFileQty;
std::atomic<uint64_t> mergeGainedBytes;
std::atomic<uint64_t> hintFileCreatedQty;
};
Return value | Comment |
---|---|
DatastoreCounters |
A constant reference on the internal datastore counters structure |
const ValueCacheCounters& Datastore::getValueCacheCounters(...)
- Get value cache counters
const ValueCacheCounters& Datastore::getValueCacheCounters() const;
// All fields are monotonic counters, except 'currentInCacheValueQty'
struct ValueCacheCounters {
std::atomic<uint64_t> insertCallQty;
std::atomic<uint64_t> getCallQty;
std::atomic<uint64_t> removeCallQty;
std::atomic<uint32_t> currentInCacheValueQty; // Not a monotonic counter
std::atomic<uint64_t> hitQty;
std::atomic<uint64_t> missQty;
std::atomic<uint64_t> evictedQty;
};
Return value | Comment |
---|---|
ValueCacheCounters |
A constant reference on the internal value cache counters structure |
DataFileStats Datastore::getFileStats(...)
- Get data file statistics
DataFileStats Datastore::getFileStats() const
// Aggregated statistics over all data files
struct DataFileStats {
uint64_t fileQty;
uint64_t entries; // Includes all entries, including tomb and dead
uint64_t entryBytes;
uint64_t tombBytes;
uint64_t tombEntries; // A 'tomb' entry marks an entry deletion
uint64_t deadBytes;
uint64_t deadEntries; // An obsolete entry superseded by a newer one
};
Return value | Comment |
---|---|
DataFileStats |
A data file statistics structure |
Supported OS:
- Linux
- Windows
Note: in our tests, performance on Windows are lower than on Linux.
Description | Limit |
---|---|
Maximum key size | 65534 bytes |
Maximum entry qty | System memory dependent Approximate cost per entry is: key size + 60 bytes |
Maximum value size | 4294901760 (0xFFFF0000) bytes |
Maximum datastore size | File system dependent - Data file handles: maximum 65535 data files - Disk space: each data file can be up to 4 GB, for a total of 256 TB |
Maximum index quantity per entry | 64 |
Indexable part of the key | First 256 bytes (storage efficiency reasons) |
Also this project is young, feedback is welcome!
Litecask source code is available under the MIT license
Associated components:
- Hash function:
Wyhash
- Selected for its good non-cryptographic properties, speed and small code size
- Released in the public domain
- Test framework:
doctest
- Released under the MIT license