Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "Pivot Key" RFC Draft #29

Merged
merged 11 commits into from
May 15, 2023
Merged

Add "Pivot Key" RFC Draft #29

merged 11 commits into from
May 15, 2023

Conversation

fia0
Copy link

@fia0 fia0 commented Dec 13, 2022

  • Title: Pivot Keys
  • Status: DRAFT

Summary/Motivation

This RFC adds an identifier which allows node defined access to structural
elements in the B-epsilon tree. This measure allows us to perform operations
which are more dependent on the actual present layout of the tree rather than
the semantic content which is stored by individual keys. Furthermore, in
combination with the recently added migration policies we gainthe option to
migrate singular nodes and improve the reporting scheme by using more stable
identifiers compared to the previously used DiskOffsets.

Description

        p1      p2      p3
┌───────┬───────┬───────┬───────┐
│       │       │       │       │
│ Left  │ Right │ Right │ Right │
│ Outer │       │       │       │
└───┬───┴───┬───┴───┬───┴───┬───┘
    │       │       │       │
    ▼       ▼       ▼       ▼

This RFC proposes a new identification for nodes within the B-epsilon tree. We
design it to be as uninstrusive as possible to the struture and current
implemenation of the tree to avoid undesired side effects later on. This
includes reducing the dependency on state as we show in the description here.
The basis of "PivotKey" are pivot elements already present in the tree. We use
an enum indicating the position of the node based on its parent.

type Pivot = CowBytes;

enum LocalPivotKey {
    LeftOuter(Pivot),
    Right(Pivot),
    Root,
}

The property we are piggy-backing on is that pivots are searchable and unique,
furthermore we can structurally define PivotKey directions which keeps the
required memory space relatively low as only a single pivot key is required.
Finally, the pivot keys are more persistent than normal keys in the collection,
as they are only refreshed when the strcture of the tree changes. This might
happen on rebalancing. Although, when we consider this than any node based
algorithm needs to reconsider decisions anyway.

To make the pivot key ready to use over all datasets in a database (which can
have overlapping key sets) we require an additional information to direct the
pivot key to the correct dataset. This can be done by adding a DatasetId to
the key.

type Pivot = CowBytes;

enum PivotKey {
    LeftOuter(Pivot, DatasetId),
    Right(Pivot, DatasetId),
    Root(DatasetId),
}

Also, as the root node of a tree does not have a parent which could index the
node by its pivot - we add another variant which simply denotes that the root
of a given dataset is to be chosen.

We propose that we internally use two kinds of pivot keys. First, the global
PivotKey, which is structured as shown above including the DatasetId.
Second, the scoped LocalPivotKey, which can offer of us some advantages when
designing interfaces for node operations, which do not have the knowledge in
which tree they are located in. This alleviates the need for passing around
DatasetIds to layers which are normally unaffected by it. The transformation
from LocalPivotKey to PivotKey is always possible, as all local keys which a
tree layer will encounter belong to the tree itself, the reverse direction is
not given to be valid and should therefore be excluded in the implementation.

Integration

The given PivotKey structure can be used in ObjRef to add a measure of
identification to the cache location or disk location.
We can then use the PivotKey of a tree node to perform operations on specific
nodes which fulfill certain conditions such as access frequency or access
probability. This is helpful in a number of scenarios such as data prefetching
or disk-to-disk migrations.

Since pivots are stored in the internal nodes we are required to read a
substantial amount of data to retrieve knowledge about all exisiting pivot keys.
This limits the efficient usage to scenarios in which we retrieve pivot keys,
for example from messages emitted by the DMU, and record these as they are used
by the user. Previously a similar scheme has been done by the migration policies
which recorded disk offsets and set hints to the DMU to which tier a node is
advised to be written.
This limited hints to often accessed nodes which are likely to be migrated to
faster storage, as not often accessed nodes would not encounter the sent hints.
With PivotKeys we could actively migrate them downwards, which is the main
advantage of PivotKeys. This can be useful in scenarios with additional small
layers like NVRAM where we are then more flexible to migrate granular amounts of
data for better tier utilization.

Drawbacks

The implementation does not need to require extra members in the node as we can
generate the valid PivotKeys when reading the tree from disk. Although, this
creates with the current deserialization scheme (deserializing directly into
valid nodes) hidden states, either nodes with reconstructed PivotKey or nodes
missing valid PivotKeys which would encounter errors when trying to actual
read data from their children. To avoid this the deserialization scheme would
need to be adjusted to serialize proto-types "InternalNodeProto" or whichever
name which would need to be validated via a transformation InternalNodeProto
-> InternalNode. This makes the deserialization code a bit more involved,
but more clear to avoid misuse.

Alternatives

The search for alternatives is difficult as not many characteristics are present
which allow nodes to be identifiable and searchable. Something like Ids for
example could make nodes identifiable but from an Id we cannot search for the
node in the tree.

An alternative to the pivot method could be the path construction which relies
on the structural uniqueness of the tree, in which we construct a path from the
root by giving directions for each step downwards until the desired node is
located. It can look similar to this:

enum Path {
    Child(usize, Path),
    Terminate,
}

This method does not provide many advantages and carries the disadvantage of
having to maintain additional state when constructing keys about the just
traversed distance. Arguably, this is semantic-wise not complicated but many
methods will be affected by this change. Also, misidentification may become
possible as with reconstruction of subtrees paths may be shifted around. With
PivotKeys these will result in a failed search indicating an outdated key.
Currently the only advantage this method would have to the PivotKey method is
that the size can be expected to remain comparatively low, with 5 bytes for
each element in the search path. A restriction of key size as discussed in the
corresponding issue
could solve
this problem and is already in discussion.

@fia0 fia0 mentioned this pull request Dec 14, 2022
@fia0 fia0 mentioned this pull request Jan 16, 2023
Copy link
Member

@michaelkuhn michaelkuhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should also have a section "Motivation" (why do we want to do this? what problems does it solve?)? Starting with the description (which is very technical) might be a bit overwhelming for people not deeply familiar with the topic. The purpose section is similar but a bit too abstract.

docs/src/rfc/1-pivotkey.md Outdated Show resolved Hide resolved
docs/src/rfc/1-pivotkey.md Outdated Show resolved Hide resolved
docs/src/rfc/1-pivotkey.md Outdated Show resolved Hide resolved
@fia0
Copy link
Author

fia0 commented Jan 18, 2023

Yeah, the "What Problem does this solve" is a bit scatter over the text at the moment, mainly in the "Purpose" section. We can congregate this probably in a 4-5 sentence summary/motivation which explains why we open this RFC and what we do.

We can probably also change the "Purpose" section to something like "Integration" to explain how it complements or reworks other components of the stack.

So the structure would look something like this:

  • Title
  • Status
  1. Summary/Motivation
  2. Description
  3. Integration (?)
  4. Drawbacks
  5. Alternatives

Copy link
Collaborator

@SajadKarim SajadKarim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RFC proposes a new way of identifying nodes in the B-epsilon tree, called PivotKey and LocalPivotKey, that would allow for operations that are more dependent on the actual or present layout of the tree. PivotKey relies on the existing pivot elements in the tree as its foundation. Also, with the recent addition of migration policies, migration of singular nodes and the reporting scheme would improve by using more this new stable identifier than the previously used DiskOffsets.

@fia0 fia0 merged commit bc01684 into parcio:main May 15, 2023
@fia0 fia0 deleted the rfc-draft-draft branch September 20, 2023 06:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants