PeerStore Persistence #591

Closed
vasco-santos opened this issue Mar 25, 2020 · 1 comment
Labels
exp/expert Having worked on the specific codebase is important kind/enhancement A net-new feature or improvement to an existing feature status/ready Ready to be worked

Comments

@vasco-santos
Member

vasco-santos commented Mar 25, 2020

PeerStore Persistence

As part of the PeerStore improvements epic, we intend to back the PeerStore with a datastore.

Sub-milestones

  • Add persistence backend to the books
  • Configurable persistence

Overview

Centralizing all the information a peer has about its environment enables us to easily persist this data. This is particularly useful when we need to restart a node as we will be able to start establishing connections with peers that we know (other than the bootstrap nodes).

This should have a big impact on browser nodes, since they have fewer ways of discovering nodes and these discovery services sometimes take longer. Moreover, nodes will reach a bigger set of connected peers considerably faster.

Other than faster connectivity, a persisted PeerStore enables us to rely less on the bootstrap nodes (via configuration), so that we reduce the load on them.

Implementation Design

What to persist

The PeerStore is composed of four components: addressBook, keyBook, metadataBook and protoBook. While some of the content of these books is highly relevant to persist, the value of persisting the rest is less clear.

The addressBook contains a list of multiaddrs for each peer, as well as some relevant data for each multiaddr, including its validity and degree of confidence. While the multiaddrs are very valuable, the value of these two is debatable. Considering that we will need to dial the peers when the node restarts, the validity will be updated, so it is not that relevant to store. On the other hand, the degree of confidence can have a good impact if we have multiple multiaddrs for the peer, since we will dial multiple ones in parallel.

The keyBook content must be stored, as it is crucial for the system to work correctly.

The metadataBook may contain previously set metadata about each peer. As a result, this information should still exist when the peer is restarted.

The protoBook contains the list of protocols supported by each peer. As we will need to establish connections with the peers anyway, we can run the identify service to get the updated list of protocols they support and the multiaddrs they are listening on. Therefore, this information does not seem crucial to store. However, it could be important if we had a large number of peers. In that case, we could choose the best peers to connect to based on the protocols they run (although this information can also be outdated).

How to persist

A datastore stores the data in a key-value fashion. As a result, we need coherent keys so that we do not overwrite data.

A datastore allows us to query it through a key prefix. This way, if we define a consistent namespace, we can find all the stored content without any prior knowledge of it.

The namespaces are defined as follows:

AddressBook

All the known peer addresses are stored with a key pattern as follows:

/peers/addrs/<b32 peer id no padding>

ProtoBook

All the known peer protocols are stored with a key pattern as follows:

/peers/protos/<b32 peer id no padding>

KeyBook

All public keys are stored under the following pattern:

/peers/keys/<b32 peer id no padding>

MetadataBook

Metadata is stored under the following key pattern:

/peers/metadata/<b32 peer id no padding>/<key>
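The four key patterns above can be sketched as small helper functions. This is only an illustration under assumptions: the helper names (bookKey, metadataKey, parseKey, queryByPrefix) are hypothetical and not part of libp2p, the peer id is assumed to be already encoded as an unpadded base32 string, and a plain Map stands in for a real datastore prefix query.

```typescript
// Hypothetical helpers; the names are illustrative and not part of libp2p.
type Book = 'addrs' | 'protos' | 'keys' | 'metadata'

const NAMESPACE = '/peers'

// Builds the key for a peer's entry in the address, protocol or key book.
// b32PeerId is assumed to be the unpadded base32 form of the peer id.
function bookKey (book: Exclude<Book, 'metadata'>, b32PeerId: string): string {
  return `${NAMESPACE}/${book}/${b32PeerId}`
}

// Builds the key for a single metadata entry of a peer.
function metadataKey (b32PeerId: string, key: string): string {
  return `${NAMESPACE}/metadata/${b32PeerId}/${key}`
}

// Splits a stored key back into its parts; returns null for unrelated keys.
function parseKey (key: string): { book: Book, peerId: string, metadataKey?: string } | null {
  const parts = key.split('/')
  // parts[0] is the empty string because keys start with '/'.
  if (parts.length < 4 || parts[0] !== '' || parts[1] !== 'peers') return null
  const book = parts[2]
  if (book === 'metadata') {
    if (parts.length < 5) return null
    return { book, peerId: parts[3], metadataKey: parts.slice(4).join('/') }
  }
  if (book === 'addrs' || book === 'protos' || book === 'keys') {
    if (parts.length !== 4) return null
    return { book, peerId: parts[3] }
  }
  return null
}

// Stand-in for a datastore prefix query, backed by an in-memory Map.
function queryByPrefix (store: Map<string, Uint8Array>, prefix: string): string[] {
  return Array.from(store.keys()).filter((k) => k.startsWith(prefix)).sort()
}
```

With such helpers, loading everything on startup amounts to one prefix query over /peers/ followed by parseKey on each result to route the value to the right book.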

Configuration

A user should be able to choose a datastore compatible with the interface-datastore to store the data.

Even though we can set a default for the information that we recommend persisting, users should be able to persist the data that they want to. To that end, libp2p should allow a custom persistence module.
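One way to picture a recommended default with user overrides is a small options object. This is a sketch only: PersistenceOptions, resolvePersistence and shouldPersist are hypothetical names, and the default set of books (everything except the protoBook) is an assumption drawn from the discussion in "What to persist", not a decided default.

```typescript
// Hypothetical option shape; not an existing libp2p configuration.
type BookName = 'addressBook' | 'keyBook' | 'metadataBook' | 'protoBook'

interface PersistenceOptions {
  enabled: boolean
  books: BookName[]
}

// Assumed default: persist every book except the protoBook,
// whose content can be refreshed through identify.
const defaultPersistence: PersistenceOptions = {
  enabled: true,
  books: ['addressBook', 'keyBook', 'metadataBook']
}

// Merges user-provided options over the assumed default.
function resolvePersistence (user: Partial<PersistenceOptions> = {}): PersistenceOptions {
  return {
    enabled: user.enabled !== undefined ? user.enabled : defaultPersistence.enabled,
    books: user.books !== undefined ? user.books.slice() : defaultPersistence.books.slice()
  }
}

// Tells a book whether its changes should be written to the datastore.
function shouldPersist (options: PersistenceOptions, book: BookName): boolean {
  return options.enabled && options.books.indexOf(book) !== -1
}
```

A user who also wants protocols persisted would pass books explicitly, and a user who wants no persistence at all would pass enabled: false.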

With the persisted datastore, libp2p should provide a way to configure the libp2p-bootstrap nodes to only dial those peers if needed, or to only dial a subset of them according to a metric or percentage.
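The bootstrap behaviour described above could look roughly like the following. The policy shape and the selectBootstrapPeers function are hypothetical, and both the "enough known peers" threshold and the fixed-fraction selection are assumptions about what "if needed" and "a metric or percentage" might mean in practice.

```typescript
// Hypothetical policy; not an existing libp2p-bootstrap option.
interface BootstrapPolicy {
  // Skip the bootstrap nodes entirely once this many peers were loaded from the datastore.
  minKnownPeers: number
  // Otherwise dial this share of the bootstrap list, between 0 and 1.
  fraction: number
}

// Returns the bootstrap multiaddrs that should be dialed on startup.
function selectBootstrapPeers (
  bootstrap: string[],
  knownPeerCount: number,
  policy: BootstrapPolicy
): string[] {
  // Enough persisted peers: no need to put load on the bootstrap nodes.
  if (knownPeerCount >= policy.minKnownPeers) return []
  // Clamp the fraction so that out-of-range values cannot break the selection.
  const fraction = Math.min(1, Math.max(0, policy.fraction))
  const count = Math.ceil(bootstrap.length * fraction)
  return bootstrap.slice(0, count)
}
```

Taking the first entries keeps the sketch deterministic; a real implementation would more likely pick the subset at random so that load is spread across all bootstrap nodes.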

@vasco-santos vasco-santos added exp/expert Having worked on the specific codebase is important kind/enhancement A net-new feature or improvement to an existing feature status/ready Ready to be worked labels Mar 25, 2020
This was referenced Apr 27, 2020
@vasco-santos
Member Author

Closing this as #619 and #626 got merged
