PeerStore Persistence #591
As part of the PeerStore improvements epic, we intend to back the PeerStore with a datastore.
Sub-milestones
Overview
Centralizing all the information a peer has about its environment enables us to easily persist this data. This is particularly useful when we need to restart a node as we will be able to start establishing connections with peers that we know (other than the bootstrap nodes).
This should have a big impact on browser nodes, since they have fewer ways of discovering nodes and these discovery services sometimes take longer. Moreover, nodes will reach a larger set of connected peers considerably faster.
Beyond faster connectivity, a persisted PeerStore lets us rely less on the bootstrap nodes (via configuration), which reduces the load on them.
Implementation Design
What to persist
The PeerStore is composed of four components: addressBook, keyBook, metadataBook and protoBook. While some of the content of these books is clearly worth persisting, other parts have less obvious value.
The addressBook contains a list of multiaddrs for each peer, as well as some relevant data for each multiaddr, such as its validity and degree of confidence. While the multiaddrs are very valuable, persisting the other two is debatable. Since we will need to dial the peers again when the node restarts, the validity will be updated anyway, so it is not that relevant to store. On the other hand, the degree of confidence can have a good impact when we have multiple multiaddrs for a peer, since we dial several of them in parallel.
The keyBook content must be stored, as it is crucial for the system to work correctly.
The metadataBook may contain previously set metadata about each peer. As a result, this information should still exist when the node is restarted.
The protoBook contains the list of protocols supported by each peer. As we will need to establish connections with the peers, we can run the identify service to get the updated list of protocols they support and the multiaddrs they are listening on. Therefore, this information does not seem crucial to store. However, it could become important if we had a large number of peers: we could then choose the best peers to connect to based on the protocols they run (although that information may be outdated).
How to persist
A datastore stores data in a key-value fashion. As a result, we need coherent keys so that we do not overwrite data.
A datastore also allows us to query it by key prefix. If we define a consistent namespace, we can find all the stored content without any prior information, such as which peer ids exist.
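To make the prefix-query idea concrete, here is a minimal sketch in Python. `MemoryDatastore` is a hypothetical in-memory stand-in, not the interface-datastore API, and the `/books/...` keys are made up for illustration only.

```python
from typing import Dict, Iterator, Tuple


class MemoryDatastore:
    """Hypothetical in-memory stand-in for a key-value datastore."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Writing to an existing key overwrites it, hence the need for coherent keys.
        self._data[key] = value

    def get(self, key: str) -> bytes:
        return self._data[key]

    def query(self, prefix: str) -> Iterator[Tuple[str, bytes]]:
        # Yield every entry whose key starts with the prefix, in key order.
        for key in sorted(self._data):
            if key.startswith(prefix):
                yield key, self._data[key]


store = MemoryDatastore()
store.put("/books/a/peer1", b"one")
store.put("/books/a/peer2", b"two")
store.put("/books/b/peer1", b"three")

# Everything under one namespace can be found without knowing the peer ids.
found = list(store.query("/books/a/"))
```

The point of the sketch is only that a shared prefix is enough to enumerate a namespace; a real datastore would typically stream results rather than sort all keys in memory.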
The namespaces were defined as follows:
AddressBook
All the known peer addresses are stored with a key pattern as follows:
/peers/addrs/<b32 peer id no padding>
ProtoBook
All the known peer protocols are stored with a key pattern as follows:
/peers/protos/<b32 peer id no padding>
KeyBook
All public keys are stored under the following pattern:
/peers/keys/<b32 peer id no padding>
MetadataBook
Metadata is stored under the following key pattern:
/peers/metadata/<b32 peer id no padding>/<key>
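The four key patterns above could be built and parsed roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the peer id is treated as raw bytes, and the uppercase RFC 4648 base32 alphabet is assumed, since the issue only specifies "b32 ... no padding".

```python
import base64
from typing import Optional, Tuple


def encode_peer_id(peer_id: bytes) -> str:
    # Base32 without padding; the uppercase alphabet is an assumption.
    return base64.b32encode(peer_id).decode("ascii").rstrip("=")


def decode_peer_id(encoded: str) -> bytes:
    # Restore the padding that was stripped so the decoder accepts the input.
    padding = "=" * (-len(encoded) % 8)
    return base64.b32decode(encoded + padding)


def address_key(peer_id: bytes) -> str:
    return f"/peers/addrs/{encode_peer_id(peer_id)}"


def proto_key(peer_id: bytes) -> str:
    return f"/peers/protos/{encode_peer_id(peer_id)}"


def public_key_key(peer_id: bytes) -> str:
    return f"/peers/keys/{encode_peer_id(peer_id)}"


def metadata_key(peer_id: bytes, key: str) -> str:
    return f"/peers/metadata/{encode_peer_id(peer_id)}/{key}"


def parse_key(key: str) -> Tuple[str, bytes, Optional[str]]:
    # Split into at most five parts so a metadata key may itself contain "/".
    parts = key.split("/", 4)
    if len(parts) < 4 or parts[0] != "" or parts[1] != "peers":
        raise ValueError(f"not a PeerStore key: {key}")
    book, encoded = parts[2], parts[3]
    metadata_name = parts[4] if len(parts) == 5 else None
    return book, decode_peer_id(encoded), metadata_name


# Example peer id bytes (made up for illustration).
example_id = b"\x12\x20" + bytes(range(32))
example_key = metadata_key(example_id, "location")
```

Because every book shares the `/peers/` root and each has its own sub-prefix, a restart can query one prefix per book and recover the peer id from each key with `parse_key`.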
Configuration
A user should be able to choose a datastore compatible with the interface-datastore to store the data.
Even though we can set a default for the information that we recommend persisting, users should be able to persist the data they want. To that end, libp2p should allow a custom persistence module.
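One possible shape for such defaults is a set of per-field switches that users can override. The sketch below is illustrative only: `PersistenceOptions`, `select_persisted` and the record layout are hypothetical, and the default values are an assumption derived from the "What to persist" discussion, not a decided configuration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class PersistenceOptions:
    """Hypothetical switches; the defaults are assumptions, not a decided API."""
    multiaddrs: bool = True
    address_confidence: bool = True
    address_validity: bool = False  # refreshed on the next dial anyway
    public_key: bool = True
    metadata: bool = True
    protocols: bool = False  # can be refreshed through identify


def select_persisted(record: Dict[str, Any], options: PersistenceOptions) -> Dict[str, Any]:
    """Return only the parts of a peer record that the options ask to persist."""
    out: Dict[str, Any] = {}
    if options.multiaddrs:
        addresses = []
        for entry in record.get("addresses", []):
            kept = {"multiaddr": entry["multiaddr"]}
            if options.address_confidence and "confidence" in entry:
                kept["confidence"] = entry["confidence"]
            if options.address_validity and "validity" in entry:
                kept["validity"] = entry["validity"]
            addresses.append(kept)
        out["addresses"] = addresses
    if options.public_key and "public_key" in record:
        out["public_key"] = record["public_key"]
    if options.metadata and "metadata" in record:
        out["metadata"] = dict(record["metadata"])
    if options.protocols and "protocols" in record:
        out["protocols"] = list(record["protocols"])
    return out


# Made-up peer record for illustration.
record = {
    "addresses": [
        {"multiaddr": "/ip4/127.0.0.1/tcp/4001", "validity": True, "confidence": 2},
    ],
    "public_key": b"example-public-key",
    "metadata": {"location": b"earth"},
    "protocols": ["/ipfs/id/1.0.0"],
}

default_view = select_persisted(record, PersistenceOptions())
custom_view = select_persisted(record, PersistenceOptions(protocols=True))
```

A fully custom persistence module would go further than boolean switches, but the same idea applies: the defaults are a recommendation and the user has the final say.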
With the persisted datastore, libp2p should provide a way to configure the libp2p-bootstrap nodes to only dial those peers if needed, or to only dial a subset of them according to a metric or percentage.
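A bootstrap policy along these lines might look like the following sketch. The function name, the `min_known_peers` threshold and the `fraction` parameter are hypothetical and are not libp2p-bootstrap options; the number of persisted peers is used here as the "metric" purely as an assumption.

```python
import math
from typing import List, Sequence


def select_bootstrap_peers(
    bootstrap: Sequence[str],
    known_peer_count: int,
    min_known_peers: int = 10,
    fraction: float = 0.25,
) -> List[str]:
    """Pick which bootstrap nodes to dial, given how many peers were restored."""
    # Enough persisted peers: the bootstrap nodes are not needed at all.
    if known_peer_count >= min_known_peers:
        return []
    if not bootstrap:
        return []
    # Otherwise dial only a percentage of them, but always at least one.
    count = max(1, math.ceil(len(bootstrap) * fraction))
    return list(bootstrap[:count])


# Made-up bootstrap list for illustration.
bootstrap_nodes = ["boot-a", "boot-b", "boot-c", "boot-d"]

cold_start = select_bootstrap_peers(bootstrap_nodes, known_peer_count=0, fraction=0.5)
warm_start = select_bootstrap_peers(bootstrap_nodes, known_peer_count=20)
```

Taking the first entries keeps the sketch deterministic; a real policy would more likely pick the subset at random so that load is spread evenly across the bootstrap nodes.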