-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add "Pivot Key" RFC Draft #29
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should also have a section "Motivation" (why do we want to do this? what problems does it solve?)? Starting with the description (which is very technical) might be a bit overwhelming for people not deeply familiar with the topic. The purpose section is similar but a bit too abstract.
Yeah, the "What Problem does this solve" is a bit scatter over the text at the moment, mainly in the "Purpose" section. We can congregate this probably in a 4-5 sentence summary/motivation which explains why we open this RFC and what we do. We can probably also change the "Purpose" section to something like "Integration" to explain how it complements or reworks other components of the stack. So the structure would look something like this:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This RFC proposes a new way of identifying nodes in the B-epsilon tree, called PivotKey and LocalPivotKey, that would allow for operations that are more dependent on the actual or present layout of the tree. PivotKey relies on the existing pivot elements in the tree as its foundation. Also, with the recent addition of migration policies, migration of singular nodes and the reporting scheme would improve by using more this new stable identifier than the previously used DiskOffsets.
Summary/Motivation
This RFC adds an identifier which allows node defined access to structural
elements in the B-epsilon tree. This measure allows us to perform operations
which are more dependent on the actual present layout of the tree rather than
the semantic content which is stored by individual keys. Furthermore, in
combination with the recently added migration policies we gainthe option to
migrate singular nodes and improve the reporting scheme by using more stable
identifiers compared to the previously used
DiskOffset
s.Description
This RFC proposes a new identification for nodes within the B-epsilon tree. We
design it to be as uninstrusive as possible to the struture and current
implemenation of the tree to avoid undesired side effects later on. This
includes reducing the dependency on state as we show in the description here.
The basis of "PivotKey" are pivot elements already present in the tree. We use
an enum indicating the position of the node based on its parent.
The property we are piggy-backing on is that pivots are searchable and unique,
furthermore we can structurally define
PivotKey
directions which keeps therequired memory space relatively low as only a single pivot key is required.
Finally, the pivot keys are more persistent than normal keys in the collection,
as they are only refreshed when the strcture of the tree changes. This might
happen on rebalancing. Although, when we consider this than any node based
algorithm needs to reconsider decisions anyway.
To make the pivot key ready to use over all datasets in a database (which can
have overlapping key sets) we require an additional information to direct the
pivot key to the correct dataset. This can be done by adding a
DatasetId
tothe key.
Also, as the root node of a tree does not have a parent which could index the
node by its pivot - we add another variant which simply denotes that the root
of a given dataset is to be chosen.
We propose that we internally use two kinds of pivot keys. First, the global
PivotKey
, which is structured as shown above including theDatasetId
.Second, the scoped
LocalPivotKey
, which can offer of us some advantages whendesigning interfaces for node operations, which do not have the knowledge in
which tree they are located in. This alleviates the need for passing around
DatasetIds
to layers which are normally unaffected by it. The transformationfrom
LocalPivotKey
toPivotKey
is always possible, as all local keys which atree layer will encounter belong to the tree itself, the reverse direction is
not given to be valid and should therefore be excluded in the implementation.
Integration
The given
PivotKey
structure can be used inObjRef
to add a measure ofidentification to the cache location or disk location.
We can then use the
PivotKey
of a tree node to perform operations on specificnodes which fulfill certain conditions such as access frequency or access
probability. This is helpful in a number of scenarios such as data prefetching
or disk-to-disk migrations.
Since pivots are stored in the internal nodes we are required to read a
substantial amount of data to retrieve knowledge about all exisiting pivot keys.
This limits the efficient usage to scenarios in which we retrieve pivot keys,
for example from messages emitted by the DMU, and record these as they are used
by the user. Previously a similar scheme has been done by the migration policies
which recorded disk offsets and set hints to the DMU to which tier a node is
advised to be written.
This limited hints to often accessed nodes which are likely to be migrated to
faster storage, as not often accessed nodes would not encounter the sent hints.
With
PivotKey
s we could actively migrate them downwards, which is the mainadvantage of
PivotKey
s. This can be useful in scenarios with additional smalllayers like NVRAM where we are then more flexible to migrate granular amounts of
data for better tier utilization.
Drawbacks
The implementation does not need to require extra members in the node as we can
generate the valid
PivotKey
s when reading the tree from disk. Although, thiscreates with the current deserialization scheme (deserializing directly into
valid nodes) hidden states, either nodes with reconstructed
PivotKey
or nodesmissing valid
PivotKey
s which would encounter errors when trying to actualread data from their children. To avoid this the deserialization scheme would
need to be adjusted to serialize proto-types "InternalNodeProto" or whichever
name which would need to be validated via a transformation
InternalNodeProto
->
InternalNode
. This makes the deserialization code a bit more involved,but more clear to avoid misuse.
Alternatives
The search for alternatives is difficult as not many characteristics are present
which allow nodes to be identifiable and searchable. Something like Ids for
example could make nodes identifiable but from an Id we cannot search for the
node in the tree.
An alternative to the pivot method could be the path construction which relies
on the structural uniqueness of the tree, in which we construct a path from the
root by giving directions for each step downwards until the desired node is
located. It can look similar to this:
This method does not provide many advantages and carries the disadvantage of
having to maintain additional state when constructing keys about the just
traversed distance. Arguably, this is semantic-wise not complicated but many
methods will be affected by this change. Also, misidentification may become
possible as with reconstruction of subtrees paths may be shifted around. With
PivotKey
s these will result in a failed search indicating an outdated key.Currently the only advantage this method would have to the
PivotKey
method isthat the size can be expected to remain comparatively low, with 5 bytes for
each element in the search path. A restriction of key size as discussed in the
corresponding issue could solve
this problem and is already in discussion.