add README
ckampfe committed May 23, 2024
1 parent bb075a4 commit ac931f4
Showing 1 changed file with 42 additions and 1 deletion.
43 changes: 42 additions & 1 deletion README.md
@@ -1 +1,42 @@
# b2

An implementation of [Bitcask](https://riak.com/assets/bitcask-intro.pdf) as a Rust library.

## what can it do

Bitcask presents a very simple KV store API, so that is what this does, too.
See the Bitcask paper to understand why Bitcask's particular conception of a KV store is unique and interesting.

This is the public API:

```rust
pub async fn new(db_directory: &Path, options: Options) -> Result<Self>
pub async fn get<V: Serialize + DeserializeOwned + Send>(&self, key: &K) -> Result<Option<V>>
pub async fn insert<V: Serialize + DeserializeOwned + Send>(&self, k: K, v: V) -> Result<()>
pub async fn remove(&self, k: K) -> Result<()>
pub async fn contains_key(&self, k: &K) -> bool
pub async fn merge(&self) -> Result<()>
pub async fn flush(&self) -> Result<()>
```
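For readers who have not read the paper, here is a minimal in-memory sketch of the general Bitcask idea behind these methods: an append-only log plus an in-memory index (the "keydir") that points at the latest value for each key. `MiniBitcask`, `Pointer`, and `log_len` are hypothetical names made up for illustration. This is not b2's code; it is synchronous, it keeps the log in a `Vec<u8>` instead of files, and it skips tombstone records.

```rust
use std::collections::HashMap;

/// Where a live value sits inside the log (offset and length in bytes).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Pointer {
    offset: usize,
    len: usize,
}

/// Hypothetical model: `log` stands in for the append-only data file(s).
#[derive(Default)]
pub struct MiniBitcask {
    log: Vec<u8>,
    keydir: HashMap<String, Pointer>,
}

impl MiniBitcask {
    /// Appends the value to the log and points the keydir at the new copy.
    /// Older copies of the same key stay in the log as garbage until `merge`.
    pub fn insert(&mut self, key: &str, value: &[u8]) {
        let offset = self.log.len();
        self.log.extend_from_slice(value);
        self.keydir
            .insert(key.to_string(), Pointer { offset, len: value.len() });
    }

    /// One index lookup, then one slice of the log.
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        let p = self.keydir.get(key)?;
        Some(&self.log[p.offset..p.offset + p.len])
    }

    /// Simplification: real Bitcask also appends a tombstone record.
    /// Here we only drop the keydir entry.
    pub fn remove(&mut self, key: &str) {
        self.keydir.remove(key);
    }

    pub fn contains_key(&self, key: &str) -> bool {
        self.keydir.contains_key(key)
    }

    /// Rewrites the log so it holds only live values, then repoints the keydir.
    pub fn merge(&mut self) {
        let mut new_log = Vec::new();
        for p in self.keydir.values_mut() {
            let offset = new_log.len();
            new_log.extend_from_slice(&self.log[p.offset..p.offset + p.len]);
            p.offset = offset;
        }
        self.log = new_log;
    }

    /// Total bytes currently held in the log, live or dead.
    pub fn log_len(&self) -> usize {
        self.log.len()
    }
}
```

In this model, overwriting a key grows the log, and only `merge` shrinks it again.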

For a given database, keys must all be the same type (e.g., all `String`, or any other type that implements `Serialize` and `DeserializeOwned`), but values can vary arbitrarily, as long as they can also be serialized and deserialized. See the tests for examples of this. This could change at some point, who knows.
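To make the "one key type, many value types" idea concrete, here is a small std-only sketch. The `Codec` trait is a hypothetical stand-in for serde's `Serialize` + `DeserializeOwned`, and `ByteStore` is a made-up in-memory type, not b2's implementation. The point is only that the store holds bytes and the caller picks the value type at `get` time.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stand-in for serde's `Serialize` + `DeserializeOwned`.
pub trait Codec: Sized {
    fn encode(&self) -> Vec<u8>;
    fn decode(bytes: &[u8]) -> Option<Self>;
}

impl Codec for String {
    fn encode(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }
    fn decode(bytes: &[u8]) -> Option<Self> {
        String::from_utf8(bytes.to_vec()).ok()
    }
}

impl Codec for u64 {
    fn encode(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }
    fn decode(bytes: &[u8]) -> Option<Self> {
        // A u64 is exactly 8 bytes; anything else cannot be one.
        if bytes.len() != 8 {
            return None;
        }
        let mut buf = [0u8; 8];
        buf.copy_from_slice(bytes);
        Some(u64::from_le_bytes(buf))
    }
}

/// Keys share one type `K`; values are stored as opaque bytes.
pub struct ByteStore<K> {
    map: HashMap<K, Vec<u8>>,
}

impl<K: Hash + Eq> ByteStore<K> {
    pub fn new() -> Self {
        Self { map: HashMap::new() }
    }

    /// Any value type works, as long as it can be encoded.
    pub fn insert<V: Codec>(&mut self, k: K, v: V) {
        self.map.insert(k, v.encode());
    }

    /// The caller chooses `V`; decoding returns `None` if the bytes do not fit.
    pub fn get<V: Codec>(&self, k: &K) -> Option<V> {
        self.map.get(k).and_then(|bytes| V::decode(bytes))
    }
}
```

In this sketch, asking for the wrong type at `get` may return `None` or a wrong-looking value, depending on the bytes. How b2 behaves in that case depends on its serialization format, which is not covered here.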

In terms of concurrency, right now this uses a `tokio::sync::RwLock`, so at any moment there can be `(N readers) XOR (1 writer)`. Given Bitcask's model, it is possible to relax this to `(N readers) AND (1 writer)`, and I might do that in the future.
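A minimal sketch of the `(N readers) XOR (1 writer)` rule. It uses `std::sync::RwLock` instead of `tokio::sync::RwLock` so that it runs without an async runtime; the locking rule is the same, but the `Store` type and `read_from_many_threads` helper are hypothetical and not taken from b2.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for a store guarded by a single lock.
#[derive(Default)]
pub struct Store {
    inner: RwLock<HashMap<String, u64>>,
}

impl Store {
    /// Takes the shared (read) lock: any number of readers may hold it at once.
    pub fn get(&self, key: &str) -> Option<u64> {
        self.inner.read().unwrap().get(key).copied()
    }

    /// Takes the exclusive (write) lock: waits until no reader or writer holds it.
    pub fn insert(&self, key: &str, value: u64) {
        self.inner.write().unwrap().insert(key.to_string(), value);
    }
}

/// Spawns `n` threads that each read the same key through the shared lock.
pub fn read_from_many_threads(
    store: Arc<Store>,
    key: &'static str,
    n: usize,
) -> Vec<Option<u64>> {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || store.get(key))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With this shape, a long-running `insert` stalls every `get`, which is the weakness the relaxed `(N readers) AND (1 writer)` model would address.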

By default it flushes every write to disk. This is slow, but leads to predictable read-after-write semantics. You can relax this (and increase write throughput) by changing an option.
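The tradeoff can be sketched with a plain append-only file. Everything here is an assumption for illustration: `FlushPolicy` and `AppendLog` are hypothetical names (b2's actual option is not named in this README), the code is synchronous, and "flush" is modeled as `File::sync_all`, which may or may not match what b2 does internally.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical policy; not b2's real option type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FlushPolicy {
    /// Sync after every append: slower, but each write is on disk when `append` returns.
    EveryWrite,
    /// Only sync when the caller asks: faster, but recent writes may still be in OS buffers.
    Manual,
}

pub struct AppendLog {
    file: File,
    policy: FlushPolicy,
}

impl AppendLog {
    /// Opens (or creates) the file in append mode.
    pub fn open(path: &Path, policy: FlushPolicy) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { file, policy })
    }

    /// Appends bytes, syncing immediately if the policy says so.
    pub fn append(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.file.write_all(bytes)?;
        if self.policy == FlushPolicy::EveryWrite {
            self.flush()?;
        }
        Ok(())
    }

    /// Asks the OS to push file data and metadata to the storage device.
    pub fn flush(&mut self) -> io::Result<()> {
        self.file.sync_all()
    }
}
```

Under `FlushPolicy::Manual`, the caller would batch several `append` calls and then call `flush` once, trading durability of the most recent writes for throughput.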

## is it any good? should I use it?

Probably not! From what I can tell, it is API-complete with respect to the Bitcask paper, but that does not mean it functions correctly. It is undertested. It uses a simple `tokio::sync::RwLock` internally, so its concurrency story is weak. There are probably a bunch of other problems with it. Nonetheless, it is a tiny amount of code in comparison to other database systems, so you can probably understand what it does just by reading the source.

## why

I have known about Bitcask for a while, and I wanted to learn it by building a working implementation.

## todo

- [ ] better testing (in general)
- [ ] better testing (around merging, specifically)
- [ ] allow concurrent reading and writing (relax RwLock)
- [ ] clean up merging code
- [ ] clean up datamodel around records/entrypointers/mergepointers
