docs: inter-block cache specification #14370

Merged
merged 9 commits into from
Jan 16, 2023
294 changes: 294 additions & 0 deletions docs/spec/store/interblock-cache.md
@@ -0,0 +1,294 @@
# Inter-block cache
- [Inter-block cache](#inter-block-cache)
  - [Synopsis](#synopsis)
  - [Overview and basic concepts](#overview-and-basic-concepts)
    - [Motivation](#motivation)
    - [Definitions](#definitions)
  - [System model and properties](#system-model-and-properties)
    - [Assumptions](#assumptions)
    - [Properties](#properties)
      - [Thread safety](#thread-safety)
      - [Crash recovery](#crash-recovery)
      - [Iteration](#iteration)
  - [Technical specification](#technical-specification)
    - [General design](#general-design)
    - [API](#api)
      - [CommitKVCacheManager](#commitkvcachemanager)
      - [CommitKVStoreCache](#commitkvstorecache)
    - [Implementation details](#implementation-details)
  - [History](#history)
  - [Copyright](#copyright)


## Synopsis

The inter-block cache is an in-memory cache storing (in-most-cases) immutable state that modules need to read in between blocks. When enabled, all sub-stores of a multi store, e.g., `rootmulti`, are wrapped.
Member

can we drop (in-most-cases)?

Suggested change
The inter-block cache is an in-memory cache storing (in-most-cases) immutable state that modules need to read in between blocks. When enabled, all sub-stores of a multi store, e.g., `rootmulti`, are wrapped.
The inter-block cache is an in-memory cache storing immutable state that modules need to read in between blocks. When enabled, all sub-stores of a multi store, e.g., `rootmulti`, are wrapped.

Contributor Author

@angbrav angbrav Jan 3, 2023

Actually, I wonder if there is any guarantee that the inter-block cache stores the immutable state. Thoughts on what enables this? Is it the ARC? If so, I will highlight this.

Contributor

It's not immutable. If a key is updated, the cache will be updated.

Contributor Author

@angbrav angbrav Jan 6, 2023

I know it is not immutable. My point is whether anything is in place so that, when the cache reaches its maximum capacity, immutable state is likely to remain cached. My guess is that the fact that the cache is an ARC may help with that: it tracks both frequency and recency of use. Thus, under the assumption that immutable state is more frequently queried, the ARC may help guarantee that this state is almost always cached, even when the maximum capacity is reached.

If we agree that the above is correct, I will highlight that it is important that the cache implementation is an ARC (or something similar that enables the above), instead of just discussing it as an implementation detail.

Also, I want to get rid of the word immutable, it is confusing: nothing is immutable per se, we only mean keys that are rarely updated.


## Overview and basic concepts

### Motivation

The goal of the inter-block cache is to give SDK modules fast access to data that is typically queried during the execution of every block. This is data that does not change often, e.g., configuration parameters. The inter-block cache wraps each `CommitKVStore` of a multi store such as `rootmulti` with a fixed-size, write-through cache. Caches are not cleared after a block is committed, as opposed to other caching layers such as `cachekv`.

### Definitions

`Store key` uniquely identifies a store.

`KVCache` is a `CommitKVStore` wrapped with a cache.

`Cache manager` is a key component of the inter-block cache responsible for maintaining a map from `store keys` to `KVCaches`.

## System model and properties

### Assumptions

This specification assumes that there exists a cache implementation accessible to the inter-block cache feature.

> The implementation uses an adaptive replacement cache (ARC), an enhancement over the standard least-recently-used (LRU) cache in that it tracks both frequency and recency of use.

The inter-block cache requires the cache implementation to provide methods to create a cache, add a key/value pair, remove a key/value pair and retrieve the value associated with a key. In this specification, we assume that a `Cache` feature offers this functionality through the following methods:

* `NewCache(size: int)` creates a new cache with `size` capacity and returns it.
Member

Suggested change
* `NewCache(size: int)` creates a new cache with `size` capacity and returns it.
* `NewCache(size int)` creates a new cache with `size` capacity and returns it.

nit: Cannot we use Go syntax everywhere?

* `Get(key: string)` attempts to retrieve a key/value pair from `Cache`. It returns `[value: []byte, success: bool]`. If `Cache` contains the key, `value` contains the associated value and `success=true`. Otherwise, `success=false` and `value` should be ignored.
* `Add(key: string, value: []byte)` inserts a key/value pair into the `Cache`.
* `Remove(key: string)` removes the key/value pair identified by `key` from `Cache`.
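
As a rough illustration, the `Cache` methods listed above could be captured by an interface like the one below. This is only a sketch under stated assumptions: the interface is named `SpecCache` to avoid clashing with the browser's global `Cache` type, `Uint8Array` stands in for `[]byte`, and `SimpleCache` is a hypothetical stand-in that evicts the oldest inserted key. It is not an ARC and not the cache used by the implementation.

```typescript
// Sketch of the assumed Cache API. Uint8Array stands in for Go's []byte.
interface SpecCache {
  Get(key: string): [Uint8Array | undefined, boolean];
  Add(key: string, value: Uint8Array): void;
  Remove(key: string): void;
}

// Hypothetical stand-in: evicts the oldest inserted key when full (FIFO).
class SimpleCache implements SpecCache {
  private entries = new Map<string, Uint8Array>();
  private readonly size: number;

  constructor(size: number) {
    this.size = size;
  }

  Get(key: string): [Uint8Array | undefined, boolean] {
    // The second element tells the caller whether the first one is meaningful.
    return [this.entries.get(key), this.entries.has(key)];
  }

  Add(key: string, value: Uint8Array): void {
    if (!this.entries.has(key) && this.entries.size >= this.size) {
      // Map preserves insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  Remove(key: string): void {
    this.entries.delete(key);
  }
}

function NewCache(size: number): SpecCache {
  return new SimpleCache(size);
}
```

Any eviction policy satisfies this interface; the specification only relies on the four methods.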

The specification also assumes that `CommitKVStore` offers the following API:
Member

Like here, why not simply display a Go interface?


* `Get(key: string)` attempts to retrieve a key/value pair from `CommitKVStore`.
* `Set(key: string, value: []byte)` inserts a key/value pair into the `CommitKVStore`.
* `Delete(key: string)` removes the key/value pair identified by `key` from `CommitKVStore`.
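
The subset of the `CommitKVStore` API assumed above can be sketched the same way. `MemStore` is a hypothetical in-memory stand-in used only to make the example runnable. It persists nothing and omits the rest of the store API, such as `Commit()`. `Uint8Array` again stands in for `[]byte`.

```typescript
// Sketch of the subset of the CommitKVStore API assumed by this specification.
interface CommitKVStore {
  Get(key: string): Uint8Array | undefined;
  Set(key: string, value: Uint8Array): void;
  Delete(key: string): void;
}

// Hypothetical in-memory stand-in; nothing is written to disk.
class MemStore implements CommitKVStore {
  private data = new Map<string, Uint8Array>();

  Get(key: string): Uint8Array | undefined {
    return this.data.get(key);
  }

  Set(key: string, value: Uint8Array): void {
    this.data.set(key, value);
  }

  Delete(key: string): void {
    this.data.delete(key);
  }
}
```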

> Ideally, both `Cache` and `CommitKVStore` should be specified in a different document and referenced here.

### Properties

#### Thread safety

Accessing the `cache manager` or a `KVCache` is not thread-safe: no method is guarded with a lock.
Note that this is true even if the cache implementation is thread-safe.

> For instance, assume that two `Set` operations are executed concurrently on the same key, each writing a different value. After both are executed, the cache and the underlying store may be inconsistent, each storing a different value under the same key.
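
The scenario above can be replayed deterministically. The sketch below assumes that `Set` consists of two separate steps (update the cache, then update the store) and uses two plain maps as hypothetical stand-ins for the cache and the store. Each individual step is atomic, yet one possible interleaving of two writers leaves the two layers disagreeing.

```typescript
// Two maps stand in for the cache and the underlying store.
const cache = new Map<string, string>();
const store = new Map<string, string>();

// Set is split into its two steps so that an interleaving can be replayed.
function setCacheStep(key: string, value: string): void {
  cache.set(key, value);
}

function setStoreStep(key: string, value: string): void {
  store.set(key, value);
}

// One possible interleaving of Set("k", "A") and Set("k", "B"):
setCacheStep("k", "A"); // writer A updates the cache
setCacheStep("k", "B"); // writer B updates the cache
setStoreStep("k", "B"); // writer B updates the store
setStoreStep("k", "A"); // writer A updates the store last

const cached = cache.get("k"); // "B"
const stored = store.get("k"); // "A"
console.log(cached === stored); // false: the layers are inconsistent
```

Guarding the two steps of `Set` with a single lock would rule out this interleaving; the specification leaves that to the caller.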

#### Crash recovery

The inter-block cache transparently delegates `Commit()` to its aggregate `CommitKVStore`. If the
aggregate `CommitKVStore` supports atomic writes and uses them to guarantee that the store is always in a consistent state on disk, the inter-block cache can be transparently moved to a consistent state when a failure occurs.

> Note that this is the case for `IAVLStore`, the preferred `CommitKVStore`. On commit, it calls `SaveVersion()` on the underlying `MutableTree`. `SaveVersion` writes to disk are atomic via batching. This means that only consistent versions of the store (the tree) are written to the disk. Thus, in case of a failure during a `SaveVersion` call, on recovery from disk, the version of the store will be consistent.

#### Iteration

Iteration is not supported.
Reviewer

In the implementation, since CommitKVStoreCache embeds a CommitKVStore, it does expose the CommitKVStore iteration API.

Since this is just a system model, I guess it's sound to underspecify here, but probably surprising to the reader?

Contributor Author

I've added a sentence. Please check.

Reviewer

Looks good, thx!


## Technical specification

### General design

The inter-block cache feature is composed of two components: `CommitKVStoreCacheManager` and `CommitKVStoreCache`.

`CommitKVStoreCacheManager` implements the cache manager. It maintains a mapping from a store key to a `KVCache`.

```typescript
interface CommitKVStoreCacheManager {
  cacheSize: uint
  caches: Map<string, CommitKVStore>
}
```

Member

ditto

`CommitKVStoreCache` implements a `KVStore`: a write-through cache that wraps a `CommitKVStore`. This means that deletes and writes always happen to both the cache and the underlying `CommitKVStore`. Reads, on the other hand, first hit the internal cache. On a cache miss, the read is delegated to the underlying `CommitKVStore` and the result is cached.

```typescript
interface CommitKVStoreCache {
  store: CommitKVStore
  cache: Cache
}
```
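
To make the write-through behavior concrete, here is a minimal runnable sketch. It assumes string values instead of `[]byte` and an unbounded map instead of a fixed-size cache. `CountingStore` and `WriteThroughStore` are hypothetical names introduced for illustration. The read counter shows which reads reach the underlying store.

```typescript
// Hypothetical stand-in for a CommitKVStore that counts reads.
class CountingStore {
  reads = 0;
  private data = new Map<string, string>();

  Get(key: string): string | undefined {
    this.reads++;
    return this.data.get(key);
  }

  Set(key: string, value: string): void {
    this.data.set(key, value);
  }

  Delete(key: string): void {
    this.data.delete(key);
  }
}

// Write-through wrapper: writes and deletes go to both layers,
// reads go to the cache first.
class WriteThroughStore {
  private cache = new Map<string, string | undefined>(); // unbounded in this sketch
  private readonly store: CountingStore;

  constructor(store: CountingStore) {
    this.store = store;
  }

  Get(key: string): string | undefined {
    if (this.cache.has(key)) {
      return this.cache.get(key); // cache hit
    }
    const value = this.store.Get(key); // cache miss: delegate and cache
    this.cache.set(key, value);
    return value;
  }

  Set(key: string, value: string): void {
    this.cache.set(key, value);
    this.store.Set(key, value);
  }

  Delete(key: string): void {
    this.cache.delete(key);
    this.store.Delete(key);
  }
}

const backing = new CountingStore();
const kv = new WriteThroughStore(backing);

kv.Set("params", "v1");
const hit = kv.Get("params"); // "v1", served from the cache
const readsAfterHit = backing.reads; // 0

kv.Delete("params");
const gone = kv.Get("params"); // undefined, read from the store
const readsAfterMiss = backing.reads; // 1
```

Because writes populate the cache, a value that was just written is served without touching the store. Only keys absent from the cache cost a store read.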

To enable the inter-block cache on `rootmulti`, one needs to instantiate a `CommitKVStoreCacheManager` and set it by calling `SetInterBlockCache()` before calling one of `LoadLatestVersion()`, `LoadLatestVersionAndUpgrade(...)`, `LoadVersionAndUpgrade(...)` and `LoadVersion(version)`.
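
The ordering requirement can be illustrated with a toy model. The sketch below assumes that sub-stores are wrapped at the moment they are loaded, so a manager set afterwards is never consulted. All class and method names are hypothetical and do not reproduce the `rootmulti` API.

```typescript
// All names below are hypothetical; this is not the rootmulti API.
type KV = Map<string, string>;

class ToyCacheManager {
  wrapped: string[] = [];

  getStoreCache(storeKey: string, store: KV): KV {
    // A real manager would return a caching wrapper around the store.
    this.wrapped.push(storeKey);
    return store;
  }
}

class ToyMultiStore {
  private manager: ToyCacheManager | undefined;
  private readonly storeKeys: string[];
  stores = new Map<string, KV>();

  constructor(storeKeys: string[]) {
    this.storeKeys = storeKeys;
  }

  setInterBlockCache(manager: ToyCacheManager): void {
    this.manager = manager;
  }

  // Sub-stores are created here, so the manager is only consulted at load time.
  loadLatestVersion(): void {
    for (const key of this.storeKeys) {
      const base: KV = new Map();
      this.stores.set(key, this.manager ? this.manager.getStoreCache(key, base) : base);
    }
  }
}

const early = new ToyCacheManager();
const ms = new ToyMultiStore(["bank", "staking"]);
ms.setInterBlockCache(early); // set before loading: both sub-stores are wrapped
ms.loadLatestVersion();

const late = new ToyCacheManager();
const ms2 = new ToyMultiStore(["bank"]);
ms2.loadLatestVersion();
ms2.setInterBlockCache(late); // too late: the sub-store was loaded unwrapped
```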

### API

#### CommitKVCacheManager

The method `NewCommitKVStoreCacheManager` creates a new cache manager and returns it.

| Name | Type | Description |
| ------------- | ---------|------- |
| size | integer | Determines the capacity of each `KVCache` maintained by the manager |

```typescript
function NewCommitKVStoreCacheManager(
  size: uint): CommitKVStoreCacheManager {

  manager = CommitKVStoreCacheManager{size, new Map<string, CommitKVStore>()}
  return manager
}
```

`GetStoreCache` returns a cache from the `CommitKVStoreCacheManager` for a given store key. If no cache exists for the store key, then one is created and set.

| Name | Type | Description |
| ------------- | ---------|------- |
| manager | `CommitKVStoreCacheManager` | The cache manager |
| storeKey | string | The store key of the store being retrieved |
| store | `CommitKVStore` | The store that is cached in case the manager does not have any in its map of caches |

```typescript
function GetStoreCache(
  manager: CommitKVStoreCacheManager,
  storeKey: string,
  store: CommitKVStore): CommitKVStore {

  if manager.caches.has(storeKey) {
    return manager.caches.get(storeKey)
  } else {
    // no cache exists for this store key: wrap the store and register it
    cache = NewCommitKVStoreCache(store, manager.cacheSize)
    manager.caches.set(storeKey, cache)
    return cache
  }
}
```

`Unwrap` returns the underlying CommitKVStore for a given store key.

| Name | Type | Description |
| ------------- | ---------|------- |
| manager | `CommitKVStoreCacheManager` | The cache manager |
| storeKey | string | The store key of the store being unwrapped |

```typescript
function Unwrap(
  manager: CommitKVStoreCacheManager,
  storeKey: string): CommitKVStore {

  if manager.caches.has(storeKey) {
    cache = manager.caches.get(storeKey)
    return cache.store
  } else {
    return nil
  }
}
```

`Reset` resets the manager's map of caches.

| Name | Type | Description |
| ------------- | ---------|------- |
| manager | `CommitKVStoreCacheManager` | The cache manager |

```typescript
function Reset(
  manager: CommitKVStoreCacheManager) {

  for (let storeKey of manager.caches.keys()) {
    manager.caches.delete(storeKey)
  }
}
```
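
The manager operations described in this section can be exercised together in a short runnable sketch. The types below are hypothetical simplifications: a store is reduced to a named object, and capacity handling is omitted. The point is only to show that `GetStoreCache` is idempotent per store key and that `Reset` drops every entry.

```typescript
// Hypothetical minimal types; capacity handling is omitted.
interface Store {
  label: string;
}

interface StoreCache {
  store: Store;
  cache: Map<string, string>;
}

class Manager {
  caches = new Map<string, StoreCache>();
  readonly cacheSize: number;

  constructor(cacheSize: number) {
    this.cacheSize = cacheSize;
  }

  getStoreCache(storeKey: string, store: Store): StoreCache {
    let kvCache = this.caches.get(storeKey);
    if (kvCache === undefined) {
      // No cache for this store key yet: wrap the store and register it.
      kvCache = { store, cache: new Map<string, string>() };
      this.caches.set(storeKey, kvCache);
    }
    return kvCache;
  }

  unwrap(storeKey: string): Store | undefined {
    const kvCache = this.caches.get(storeKey);
    return kvCache === undefined ? undefined : kvCache.store;
  }

  reset(): void {
    this.caches.clear();
  }
}

const mgr = new Manager(1000);
const bank: Store = { label: "bank" };

const firstCall = mgr.getStoreCache("bank", bank);
// A second call returns the registered cache; the store argument is ignored.
const secondCall = mgr.getStoreCache("bank", { label: "other" });
const sameCache = firstCall === secondCall; // true
const unwrapped = mgr.unwrap("bank") === bank; // true

mgr.reset();
const afterReset = mgr.unwrap("bank"); // undefined
```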

#### CommitKVStoreCache

`NewCommitKVStoreCache` creates a new `CommitKVStoreCache` and returns it.

| Name | Type | Description |
| ------------- | ---------|------- |
| store | CommitKVStore | The store to be cached |
| size | integer | Determines the capacity of the cache being created |

```typescript
function NewCommitKVStoreCache(
  store: CommitKVStore,
  size: uint): CommitKVStoreCache {

  KVCache = CommitKVStoreCache{store, NewCache(size)}
  return KVCache
}
```

`Get` retrieves a value by key. It first looks in the cache. If the key is not in the cache, the query is delegated to the underlying `CommitKVStore`. In the latter case, the key/value pair is cached. The method returns the value.

| Name | Type | Description |
| ------------- | ---------|------- |
| KVCache | `CommitKVStoreCache` | The `CommitKVStoreCache` from which the key/value pair is retrieved |
| key | string | Key of the key/value pair being retrieved |

```typescript
function Get(
  KVCache: CommitKVStoreCache,
  key: string): []byte {

  [valueCache, success] = KVCache.cache.Get(key)
  if success {
    // cache hit
    return valueCache
  } else {
    // cache miss
    valueStore = KVCache.store.Get(key)
    KVCache.cache.Add(key, valueStore)
    return valueStore
  }
}
```

`Set` inserts a key/value pair into both the write-through cache and the underlying `CommitKVStore`.

| Name | Type | Description |
| ------------- | ---------|------- |
| KVCache | `CommitKVStoreCache` | The `CommitKVStoreCache` into which the key/value pair is inserted |
| key | string | Key of the key/value pair being inserted |
| value | []byte | Value of the key/value pair being inserted |

```typescript
function Set(
  KVCache: CommitKVStoreCache,
  key: string,
  value: []byte) {

  KVCache.cache.Add(key, value)
  KVCache.store.Set(key, value)
}
```

`Delete` removes a key/value pair from both the write-through cache and the underlying `CommitKVStore`.

| Name | Type | Description |
| ------------- | ---------|------- |
| KVCache | `CommitKVStoreCache` | The `CommitKVStoreCache` from which the key/value pair is deleted |
| key | string | Key of the key/value pair being deleted |

```typescript
function Delete(
  KVCache: CommitKVStoreCache,
  key: string) {

  KVCache.cache.Remove(key)
  KVCache.store.Delete(key)
}
```

`CacheWrap` wraps a `CommitKVStoreCache` with another caching layer (`CacheKV`).

> It is unclear whether there is a use case for `CacheWrap`.

| Name | Type | Description |
| ------------- | ---------|------- |
| KVCache | `CommitKVStoreCache` | The `CommitKVStoreCache` being wrapped |

```typescript
function CacheWrap(
  KVCache: CommitKVStoreCache) {

  return CacheKV.NewStore(KVCache)
}
```

### Implementation details

The inter-block cache implementation uses a fixed-size adaptive replacement cache (ARC) as its cache. [The ARC implementation](https://github.com/hashicorp/golang-lru/blob/master/arc.go) is thread-safe. ARC is an enhancement over the standard LRU cache in that it tracks both frequency and recency of use. This prevents a burst of accesses to new entries from evicting frequently used older entries. ARC adds some tracking overhead compared to a standard LRU cache: computationally it is roughly `2x` the cost, and the extra memory overhead is linear in the size of the cache. The default cache size is `1000`.
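
The eviction problem that ARC is designed to mitigate can be seen with a plain LRU. The sketch below is a hypothetical minimal LRU built on `Map` insertion order. It is not ARC and not the `golang-lru` code. A frequently read key is evicted as soon as a burst of new keys fills the cache, because recency is the only signal the LRU tracks.

```typescript
// Plain LRU built on Map insertion order; NOT the ARC used by the implementation.
class LRU {
  private entries = new Map<string, string>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): string | undefined {
    const value = this.entries.get(key);
    if (value !== undefined) {
      // Re-insert to mark the key as most recently used.
      this.entries.delete(key);
      this.entries.set(key, value);
    }
    return value;
  }

  add(key: string, value: string): void {
    this.entries.delete(key);
    if (this.entries.size >= this.capacity) {
      // The first key in insertion order is the least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }
}

const lru = new LRU(3);
lru.add("params", "hot");
for (let i = 0; i < 10; i++) {
  lru.get("params"); // frequently used entry
}

// A burst of three new keys fills the cache.
lru.add("tx1", "a");
lru.add("tx2", "b");
lru.add("tx3", "c");

const hotSurvived = lru.has("params"); // false: recency alone evicted the hot key
```

A policy that also tracks frequency, as ARC does, has the information needed to keep `params` cached through such a burst.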

## History

Dec 20, 2022 - Initial draft finished and submitted as a PR

## Copyright

All content herein is licensed under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).