Skip to content

Commit

Permalink
[docs] Add more details about bootstrapping process to a readme (#3203)
Browse files Browse the repository at this point in the history
  • Loading branch information
vpranckaitis authored Feb 24, 2021
1 parent 568e70a commit 812d585
Showing 1 changed file with 74 additions and 0 deletions.
74 changes: 74 additions & 0 deletions src/dbnode/storage/bootstrap/bootstrapper/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,74 @@

The collection of bootstrappers comprise the task executed when bootstrapping a node.

## A high level overview of bootstrapping process

During the bootstrapping process, the list of configured bootstrappers is iterated, invoking the bootstrap routine of each of them. Each bootstrapper is given time ranges for which data needs to be bootstrapped. It acknowledges, retrieves, processes and/or persists the data that is available to it, and returns a subset of requested time ranges that it was unable to fulfill.

If there are any errors or unfulfilled time ranges after going through all the configured bootstrappers, the process is retried.

See pseudo-code below for more details.

```python
bootstrappers = {filesystem, commitlog, peer, uninitialized}

while true:
namespaces = getOwnedNamespaces()
shards = filterNonBootstrapped(getOwnedShards()) # NOTE: peer bootstrapper takes INITIALIZING shards

# shard time ranges which needs to be bootstrapped
persistRange, memoryRange =
timeRangesFromRetentionPeriodToCurrentBlock(namespaces, shards)

bootstrapResult = newBootstrapResult()
for r in {persistRange, memoryRange}:
remainingRanges = r
# iterate through the configured bootstrappers
for b in bootstrappers:
availableRanges = getAvailableRanges(b, namespaces, remainingRanges)
bootstrappedRanges = bootstrap(b, namespaces, availableRanges)
remainingRanges -= bootstrappedRanges

# record unfulfilled ranges
unfulfilledRanges = remainingRanges
updateResult(bootstrapResult, unfulfilledRanges)

bootstrapNamespaces(namespaces, bootstrapResults)
bootstrapShards(shards, bootstrapResult)

if hasNoErrors(bootstrapResult) and allRangesFulfilled(bootstrapResult):
break
```

## Bootstrappers

- `fs`: The filesystem bootstrapper, used to bootstrap as much data as possible from the local filesystem.
- `peers`: The peers bootstrapper, used to bootstrap any remaining data from peers. This is used for a full node join too.
- `commitlog`: The commit log bootstrapper, currently only used in the case that peers bootstrapping fails. Once the current block is being snapshotted frequently to disk it might be faster and make more sense to not actively use the peers bootstrapper and just use a combination of the filesystem bootstrapper and the minimal time range required from the commit log bootstrapper.
- *NOTE*: the commitlog bootstrapper is special cased in that it runs for the *entire* bootstrappable range per shard whereas other bootstrappers fill in the unfulfilled gaps as bootstrapping progresses.
- `uninitialized`: The uninitialized bootstrapper, used to bootstrap a node when the whole cluster is new and there are no peers to fetch the data from.

The bootstrappers satisfy the `bootstrap.Source` interface. `AvailableData()` and `AvailableIndex()` methods take shard time ranges that need to be bootstrapped and return which ranges (possibly a subset) can be fulfilled by this bootstrapper. `Read()` method does the bootstrapping of the available data and index ranges.

### Filesystem

- `AvailableData()`, `AvailableIndex()` - for each shard, reads info files (or cached data of info files) and converts block start times to available ranges
- `Read()` - reads data ranges from info files without reading the data, then builds index segments and either flushes them to disk or keeps in memory

### Commit log

- `AvailableData()`, `AvailableIndex()` - checks which shards have ever reached availability (i.e. is in `Available` or `Leaving` state) and returns the whole requested time range for those shards
- `Read()` - for each shard reads the most recent snapshot file, and then reads commit log and checks-out all the series belonging to the namespaces that are being bootstrapped

### Peer

- `AvailableData()`, `AvailableIndex()` - inspects the cluster topology and returns the whole requested time range for shards which have enough available replicas to satisfy consistency requirements
- `Read()` - fetches shard data from peers and either persists it or checks-out into memory. When fetching, block checksums are compared between peers: if they match, data is retrieved from one of the peers, otherwise data from multiple peers is merged. Then it builds index segments and either flushes them to disk or keeps in memory

### Uninitialized

- `AvailableData()`, `AvailableIndex()` - for each shard, inspects its status across peers. If the number of `Initializing` replicas is higher than `Leaving`, the shard is deemed _new_ and available to be bootstrapped by this bootstrapper
- `Read()` - for each shard that is new (as described above), respond that it was fulfilled

## Cache policies

Expand Down Expand Up @@ -47,3 +109,15 @@ For example, see the following sequences:
- Bootstrap (128 shards) // These are received form Node 2, it owns 256 now.
- Node 2:
- Node remove

### Node add

When a node is added to the cluster it is assigned shards that relieves load fairly from the existing nodes. The shards assigned to the new node will become _INITIALIZING_, the nodes then discover they need to be bootstrapped and will begin bootstrapping the data using all replicas available. The shards that will be removed from the existing nodes are marked as _LEAVING_.

### Node down

A node needs to be explicitly taken out of the cluster. If a node goes down and is unavailable the clients performing reads will be served an error from the replica for the shard range that the node owns. During this time it will rely on reads from other replicas to continue uninterrupted operation.

### Node remove

When a node is removed the shards it owns are assigned to existing nodes in the cluster. Remaining servers discover they are now in possession of shards that are _INITIALIZING_ and need to be bootstrapped and will begin bootstrapping the data using all replicas available.

0 comments on commit 812d585

Please sign in to comment.