Add "Pivot Key" RFC Draft #29

fia0 · 2022-12-13T12:25:07Z

Title: Pivot Keys
Status: DRAFT

Summary/Motivation

This RFC adds an identifier which allows node-based access to structural
elements in the B-epsilon tree. This allows us to perform operations which
depend on the actual layout of the tree rather than on the semantic content
stored under individual keys. Furthermore, in combination with the recently
added migration policies, we gain the option to migrate single nodes and to
improve the reporting scheme by using identifiers that are more stable than
the previously used DiskOffsets.

Description

        p1      p2      p3
┌───────┬───────┬───────┬───────┐
│       │       │       │       │
│ Left  │ Right │ Right │ Right │
│ Outer │       │       │       │
└───┬───┴───┬───┴───┬───┴───┬───┘
    │       │       │       │
    ▼       ▼       ▼       ▼

This RFC proposes a new identification for nodes within the B-epsilon tree. We
design it to be as unintrusive as possible to the structure and current
implementation of the tree to avoid undesired side effects later on. This
includes reducing the dependency on state, as we show in the description here.
The basis of "PivotKey" are the pivot elements already present in the tree. We
use an enum indicating the position of the node relative to its parent.

type Pivot = CowBytes;

enum LocalPivotKey {
    LeftOuter(Pivot),
    Right(Pivot),
    Root,
}

The property we are piggy-backing on is that pivots are searchable and unique.
Furthermore, we can structurally define PivotKey directions, which keeps the
required memory space relatively low, as only a single pivot is required per
key. Finally, pivot keys are more persistent than normal keys in the
collection, as they only change when the structure of the tree changes, for
example on rebalancing. When that happens, however, any node-based algorithm
needs to reconsider its decisions anyway.
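The searchability argument can be illustrated for a single internal node. The following is a minimal sketch, not the actual implementation: `Vec<u8>` stands in for `CowBytes`, and `child_index` is a hypothetical helper that only resolves a key against one node's sorted pivots. A full lookup would additionally have to descend from the root to the node holding the pivot.

```rust
// Stand-in for the crate's `CowBytes`; an assumption made for this sketch.
type Pivot = Vec<u8>;

#[derive(Debug, Clone, PartialEq, Eq)]
enum LocalPivotKey {
    LeftOuter(Pivot),
    Right(Pivot),
    Root,
}

/// Hypothetical helper: returns the index of the child addressed by `key`
/// within one internal node whose pivots are sorted in ascending order.
/// `None` means the pivot is not present here, i.e. the key is outdated
/// or belongs to a different node.
fn child_index(pivots: &[Pivot], key: &LocalPivotKey) -> Option<usize> {
    match key {
        // The left outer child sits left of the first pivot.
        LocalPivotKey::LeftOuter(p) => (pivots.first() == Some(p)).then_some(0),
        // Every other child sits directly right of exactly one pivot.
        LocalPivotKey::Right(p) => pivots.binary_search(p).ok().map(|i| i + 1),
        // The root has no parent and is not addressed through pivots.
        LocalPivotKey::Root => None,
    }
}
```

Because pivots are unique and sorted, a binary search suffices, and a pivot that no longer exists yields `None` instead of a wrong child.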

To make the pivot key usable across all datasets in a database (which can
have overlapping key sets), we require additional information to direct the
pivot key to the correct dataset. This can be done by adding a DatasetId to
the key.

type Pivot = CowBytes;

enum PivotKey {
    LeftOuter(Pivot, DatasetId),
    Right(Pivot, DatasetId),
    Root(DatasetId),
}

Also, as the root node of a tree does not have a parent which could index the
node by its pivot, we add another variant which simply denotes that the root
of a given dataset is to be chosen.

We propose that we internally use two kinds of pivot keys. First, the global
PivotKey, which is structured as shown above, including the DatasetId.
Second, the scoped LocalPivotKey, which offers some advantages when designing
interfaces for node operations that do not know which tree they are located
in. This alleviates the need for passing DatasetIds to layers which are
normally unaffected by them. The transformation from LocalPivotKey to PivotKey
is always possible, as all local keys which a tree layer encounters belong to
the tree itself. The reverse direction is not guaranteed to be valid and
should therefore be excluded in the implementation.
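The one-directional transformation could look roughly like the sketch below. It is an illustration under assumptions: `Vec<u8>` stands in for `CowBytes`, `DatasetId` is modelled as a plain newtype, and the method names `to_global` and `d_id` are hypothetical. No conversion from PivotKey back to LocalPivotKey is provided, on purpose.

```rust
// Stand-ins for the crate's types; assumptions made for this sketch.
type Pivot = Vec<u8>;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DatasetId(u64);

#[derive(Debug, Clone, PartialEq, Eq)]
enum LocalPivotKey {
    LeftOuter(Pivot),
    Right(Pivot),
    Root,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum PivotKey {
    LeftOuter(Pivot, DatasetId),
    Right(Pivot, DatasetId),
    Root(DatasetId),
}

impl LocalPivotKey {
    /// Attach the dataset of the tree in which this key was encountered.
    /// Hypothetical name; the tree layer is assumed to know its own id.
    fn to_global(self, d_id: DatasetId) -> PivotKey {
        match self {
            LocalPivotKey::LeftOuter(p) => PivotKey::LeftOuter(p, d_id),
            LocalPivotKey::Right(p) => PivotKey::Right(p, d_id),
            LocalPivotKey::Root => PivotKey::Root(d_id),
        }
    }
}

impl PivotKey {
    /// The dataset this global key points into.
    fn d_id(&self) -> DatasetId {
        match self {
            PivotKey::LeftOuter(_, d) | PivotKey::Right(_, d) | PivotKey::Root(d) => *d,
        }
    }
}
```

Leaving out the reverse conversion means a global key cannot silently be used inside the wrong tree; a caller first has to check `d_id` against the tree at hand.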

Integration

The given PivotKey structure can be used in ObjRef to add a measure of
identification to the cache location or disk location.
We can then use the PivotKey of a tree node to perform operations on specific
nodes which fulfill certain conditions such as access frequency or access
probability. This is helpful in a number of scenarios such as data prefetching
or disk-to-disk migrations.

Since pivots are stored in the internal nodes, we are required to read a
substantial amount of data to learn about all existing pivot keys.
This limits efficient usage to scenarios in which we retrieve pivot keys,
for example from messages emitted by the DMU, and record them as they are used
by the user. Previously, the migration policies followed a similar scheme:
they recorded disk offsets and sent hints to the DMU indicating to which tier
a node should be written.
This limited hints to frequently accessed nodes, which are likely to be
migrated to faster storage, as rarely accessed nodes would never encounter the
sent hints. With PivotKeys we could actively migrate such nodes downwards,
which is the main advantage of PivotKeys. This can be useful in scenarios with
additional small layers like NVRAM, where we are then more flexible in
migrating small amounts of data for better tier utilization.
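How a policy might record pivot keys as they are reported and then pick rarely accessed nodes for downward migration can be sketched as follows. Everything here is hypothetical: `AccessLog`, `record` and `coldest` are illustrative names, the stand-in types replace `CowBytes` and the real `DatasetId`, and the actual message format of the DMU is not modelled.

```rust
use std::collections::HashMap;

// Stand-ins for the crate's types; assumptions made for this sketch.
type Pivot = Vec<u8>;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct DatasetId(u64);

#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum PivotKey {
    LeftOuter(Pivot, DatasetId),
    Right(Pivot, DatasetId),
    Root(DatasetId),
}

/// Hypothetical bookkeeping of a migration policy: counts how often each
/// node, identified by its PivotKey, has been reported as accessed.
#[derive(Default)]
struct AccessLog {
    hits: HashMap<PivotKey, u64>,
}

impl AccessLog {
    /// Record one reported access for `key`.
    fn record(&mut self, key: PivotKey) {
        *self.hits.entry(key).or_insert(0) += 1;
    }

    /// The `n` least accessed known nodes, i.e. candidates for migration to
    /// a slower tier. Ties are broken by key order to stay deterministic.
    fn coldest(&self, n: usize) -> Vec<PivotKey> {
        let mut entries: Vec<(&PivotKey, &u64)> = self.hits.iter().collect();
        entries.sort_by(|a, b| a.1.cmp(b.1).then_with(|| a.0.cmp(b.0)));
        entries.into_iter().take(n).map(|(k, _)| k.clone()).collect()
    }
}
```

Since the recorded identifier can be searched for in the tree, the policy can act on a cold node directly instead of waiting for the node to pass by a hint.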

Drawbacks

The implementation does not require extra members in the node, as we can
generate valid PivotKeys when reading the tree from disk. However, with the
current deserialization scheme (deserializing directly into valid nodes) this
creates hidden states: either nodes with reconstructed PivotKeys or nodes
missing valid PivotKeys, which would encounter errors when trying to actually
read data from their children. To avoid this, the deserialization scheme would
need to be adjusted to deserialize into prototypes ("InternalNodeProto" or a
similar name) which then need to be validated via a transformation
InternalNodeProto -> InternalNode. This makes the deserialization code a bit
more involved, but clearer and harder to misuse.
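A rough sketch of such a two-step scheme is given below. The field layout, the `u64` child pointers, `ChildRef` and the method name `validate` are assumptions for illustration only; the point is that an InternalNode can only be obtained through the transformation, so every child already carries its reconstructed key.

```rust
// Stand-in for the crate's `CowBytes`; an assumption made for this sketch.
type Pivot = Vec<u8>;

#[derive(Debug, Clone, PartialEq, Eq)]
enum LocalPivotKey {
    LeftOuter(Pivot),
    Right(Pivot),
    Root,
}

/// What deserialization yields: raw pivots and child pointers, no keys yet.
/// Child pointers are modelled as plain `u64` values here.
struct InternalNodeProto {
    pivots: Vec<Pivot>,
    children: Vec<u64>,
}

/// A child pointer together with its reconstructed pivot key.
struct ChildRef {
    key: LocalPivotKey,
    ptr: u64,
}

/// A validated node; every child is guaranteed to have a pivot key.
struct InternalNode {
    pivots: Vec<Pivot>,
    children: Vec<ChildRef>,
}

impl InternalNodeProto {
    /// InternalNodeProto -> InternalNode: checks the shape and derives keys.
    fn validate(self) -> Result<InternalNode, &'static str> {
        let InternalNodeProto { pivots, children } = self;
        if pivots.is_empty() || children.len() != pivots.len() + 1 {
            return Err("expected n >= 1 pivots and n + 1 children");
        }
        let children: Vec<ChildRef> = children
            .into_iter()
            .enumerate()
            .map(|(i, ptr)| {
                let key = if i == 0 {
                    LocalPivotKey::LeftOuter(pivots[0].clone())
                } else {
                    LocalPivotKey::Right(pivots[i - 1].clone())
                };
                ChildRef { key, ptr }
            })
            .collect();
        Ok(InternalNode { pivots, children })
    }
}
```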

Alternatives

The search for alternatives is difficult as not many characteristics are present
which allow nodes to be identifiable and searchable. Something like Ids for
example could make nodes identifiable but from an Id we cannot search for the
node in the tree.

An alternative to the pivot method could be the path construction which relies
on the structural uniqueness of the tree, in which we construct a path from the
root by giving directions for each step downwards until the desired node is
located. It can look similar to this:

enum Path {
    // Boxed, as a recursive enum needs indirection to have a known size.
    Child(usize, Box<Path>),
    Terminate,
}

This method does not provide many advantages and carries the disadvantage of
having to maintain additional state about the distance traversed so far when
constructing keys. Arguably, this is not semantically complicated, but many
methods would be affected by this change. Also, misidentification may become
possible, as paths may shift when subtrees are reconstructed. With PivotKeys,
such changes result in a failed search indicating an outdated key.
Currently the only advantage this method has over the PivotKey method is
that its size can be expected to remain comparatively low, with 5 bytes for
each element in the search path. A restriction of key size, which is already
under discussion in the corresponding issue, could solve this problem.
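The misidentification risk of the path alternative can be made concrete with a toy tree. This is a sketch under assumptions: `Node` and `resolve` are hypothetical and unrelated to the real node types, and `Path` is boxed so that the recursive enum compiles.

```rust
/// Path from the root: child index per step, then `Terminate`.
enum Path {
    Child(usize, Box<Path>),
    Terminate,
}

/// Toy tree node for illustration only.
struct Node {
    label: &'static str,
    children: Vec<Node>,
}

/// Follow `path` downwards from `node`; `None` if an index is out of range.
fn resolve<'a>(node: &'a Node, path: &Path) -> Option<&'a Node> {
    match path {
        Path::Terminate => Some(node),
        Path::Child(i, rest) => resolve(node.children.get(*i)?, rest),
    }
}
```

If a new child is inserted in front of the addressed one, the same path still resolves, but to a different node. A pivot-based lookup would instead either find the same node or fail, which signals an outdated key.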

michaelkuhn

Maybe we should also have a section "Motivation" (why do we want to do this? what problems does it solve?)? Starting with the description (which is very technical) might be a bit overwhelming for people not deeply familiar with the topic. The purpose section is similar but a bit too abstract.

docs/src/rfc/1-pivotkey.md

fia0 · 2023-01-18T09:14:34Z

Yeah, the "What problem does this solve" part is a bit scattered over the text at the moment, mainly in the "Purpose" section. We can probably consolidate this into a 4-5 sentence summary/motivation which explains why we open this RFC and what we do.

We can probably also change the "Purpose" section to something like "Integration" to explain how it complements or reworks other components of the stack.

So the structure would look something like this:

Title
Status

Summary/Motivation
Description
Integration (?)
Drawbacks
Alternatives

docs/src/rfc/1-pivotkey.md

SajadKarim

This RFC proposes a new way of identifying nodes in the B-epsilon tree, called PivotKey and LocalPivotKey, that would allow for operations that depend more on the actual layout of the tree. PivotKey relies on the existing pivot elements in the tree as its foundation. Also, with the recent addition of migration policies, the migration of single nodes and the reporting scheme would improve by using this new, more stable identifier instead of the previously used DiskOffsets.

docs: add "Pivot Key" rfc draft

99ad322

fia0 mentioned this pull request Dec 14, 2022

tree: cleanup #25

Merged

Johannes Wünsche added 6 commits December 19, 2022 16:27

docs: Extend purpose pivot key rfc

07dd8ae

docs: add dataset discriminator to pivot key proposal

2f0aee9

rfc: rework variants

8d88fd9

rfc: add missing text block tag

56abfff

docs: add root variant to base version

06693e0

docs: both variants exists

a65aa9e

fia0 mentioned this pull request Jan 16, 2023

tree: Add Pivot Key #33

Merged

michaelkuhn reviewed Jan 17, 2023

Johannes Wünsche added 4 commits January 18, 2023 10:16

rfc: fix typos

30513b4

rfc: update PivotKey drawbacks

0bf86fd

rfc: Add PivotKey Motivation

43a4f35

rfc: Reword PivotKey "Integration"

d1956f5

SajadKarim reviewed Feb 27, 2023

SajadKarim approved these changes May 14, 2023

fia0 merged commit bc01684 into parcio:main May 15, 2023

fia0 deleted the rfc-draft-draft branch September 20, 2023 06:56
