Persist routing table #254

Open
anacrolix opened this issue Feb 8, 2019 · 21 comments
Labels: P3 Low: Not priority right now · status/in-progress

Comments

@anacrolix
Contributor

Standard DHT implementations don't require bootstrapping on every instance start-up. It should be possible to import a previous instance's routing table and avoid having to bootstrap from a fixed set of bootstrap nodes. Note that regularly refreshing the routing table is still common (and may also be called bootstrapping).

@anacrolix anacrolix self-assigned this Feb 8, 2019
@raulk
Member

raulk commented Feb 8, 2019

Thoughts.

Persistence medium

  • Option A: persist in the peerstore. With the current API, we'd have to attach metadata to a peer indicating whether it belongs in the routing table. This is fine for writes, but reading it back on startup is currently costly: we'd have to add a PeersWithMetadata() method, get a chan of peers, and query the metadata field for each one. Also, from the viewpoint of kad-dht, it doesn't know whether the runtime peerstore is memory-backed or ds-backed. It doesn't make sense to persist the routing table to a mem-backed peerstore, so we'd also need another peerstore method that reports its storage medium.
  • Option B: use go-datastore directly, with the ability to pick the data model that suits us (see the sketch below).
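
A minimal sketch of what Option B could look like: store only the routing table membership (peer IDs) under a single datastore key, leaving addresses in the peerstore. The key name, the JSON encoding and the function names are illustrative rather than a proposed format, and the go-datastore signatures used are the current context-taking ones.

package rtstore

import (
	"context"
	"encoding/json"

	ds "github.com/ipfs/go-datastore"
	"github.com/libp2p/go-libp2p/core/peer"
)

// Illustrative key; not an agreed-upon location.
var rtKey = ds.NewKey("/dht/routing-table")

// SaveRoutingTable stores the member peer IDs as base58 strings.
func SaveRoutingTable(ctx context.Context, d ds.Datastore, members []peer.ID) error {
	ids := make([]string, 0, len(members))
	for _, p := range members {
		ids = append(ids, p.String())
	}
	b, err := json.Marshal(ids)
	if err != nil {
		return err
	}
	return d.Put(ctx, rtKey, b)
}

// LoadRoutingTable returns the previously stored members, or nil if none exist.
func LoadRoutingTable(ctx context.Context, d ds.Datastore) ([]peer.ID, error) {
	b, err := d.Get(ctx, rtKey)
	if err == ds.ErrNotFound {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	var ids []string
	if err := json.Unmarshal(b, &ids); err != nil {
		return nil, err
	}
	out := make([]peer.ID, 0, len(ids))
	for _, s := range ids {
		p, err := peer.Decode(s)
		if err != nil {
			continue // skip unparseable entries rather than failing the whole load
		}
		out = append(out, p)
	}
	return out, nil
}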

Seeding strategy

We need to decide how we're going to seed the routing table on start, because it'll affect what metadata we persist about peers in the routing table, and hence the data model.

Brainstorming options:

  • Take a random sample of peers that we knew to be in the routing table during the previous run, as long as they were active within a certain time threshold (i.e. we don't want to reuse peers if the last run was 1 year ago), falling back to well-known bootstrap peers if none or too few peers fitting the criteria are found.
  • Score peers by latency, helpfulness, and bucket (i.e. peers in closer k-buckets are more valuable), and seed the routing table from the top-scoring peers. This mitigates some attack vectors, but adds others.

@Stebalien
Member

I think we should (a) keep the address information in the peer store and (b) persist a single blob containing the routing table (peer IDs only). Really, we can probably just persist the entire routing table and reconnect to all of it on start; no need to score. Thoughts?

@raulk
Member

raulk commented Feb 8, 2019

go-ethereum seeds with a randomised sample instead of rematerializing the table verbatim, as a way to recover from borked tables. I think it’s a sensible strategy, it’s more secure, and it works.

@Stebalien
Member

Randomized sample from the routing table? Connecting to a subset shouldn't be any more secure than connecting to all, unless I'm missing something.

@raulk
Member

raulk commented Feb 8, 2019

In a scenario where your routing table is borked (e.g. all nodes keep their TCP connections alive but they ignore queries) and you restart, if you materialise the same table you will be stuck again. That’s why it’s more secure to seed a subset, and perform a random walk from those, to fill empty slots with fresh nodes.

@Stebalien
Member

Won't that subset just ignore queries?

@raulk
Member

raulk commented Feb 9, 2019

@Stebalien you’re right. I don’t know, it feels wrong not to introduce a randomisation component over restarts, maybe mixing in well-known/reputable nodes as well during seeding (for now, our bootstrappers; later, nodes we’ve calculated a high reputation for over time). Randomisation is a good starting point to defend against some attacks in a Byzantine scenario.

@raulk
Member

raulk commented Feb 9, 2019

Basically, with a mix of randomised past nodes + a few bootstrappers (to keep it simple), we should be able to recover from a borked table after a restart, albeit slowly, and without manual intervention.

@Stebalien
Member

Mixing in some bootstrappers sounds like a good idea.

@da2x

da2x commented Feb 22, 2019

From looking at BitTorrent client implementations, the only criterion for storing a node for future use is this: you’ve successfully established an outgoing connection to it (it’s reachable).

On the next startup, a number of nodes are selected at random (e.g. 10). If any of them is unreachable, replace it with another randomly selected node. If you run out of nodes, fall back to the bootstrapping nodes.

If you want to be smart about it then you can try selecting some nodes that have a short prefix length from your own public IP and some nodes from a diverse set of subnets. However, I do believe blindly choosing nodes at random is a perfectly valid strategy. Nodes are expected to leave and join the network all the time so it’s probably just a waste of time to try to carefully select the perfect set of nodes.
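
A rough sketch of that procedure in go-libp2p terms, assuming the previously stored nodes are available as []peer.AddrInfo; reachability is approximated by a dial with a short timeout, and the target count and timeout are arbitrary choices, not part of any spec.

package seeding

import (
	"context"
	"math/rand"
	"time"

	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
)

// seedFromStored dials randomly ordered stored nodes until `want` of them are
// connected, replacing unreachable ones simply by moving on to the next
// candidate, and falls back to the bootstrap peers if the stored ones run out.
func seedFromStored(ctx context.Context, h host.Host, stored, bootstrap []peer.AddrInfo, want int) {
	rand.Shuffle(len(stored), func(i, j int) { stored[i], stored[j] = stored[j], stored[i] })

	connected := 0
	for _, ai := range append(stored, bootstrap...) {
		if connected >= want {
			return
		}
		dctx, cancel := context.WithTimeout(ctx, 10*time.Second)
		err := h.Connect(dctx, ai) // an error here means "unreachable": skip it
		cancel()
		if err == nil {
			connected++ // reachable peers end up back in the routing table
		}
	}
}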

@Mikaela

Mikaela commented Feb 22, 2019

As a Yggdrasil and Cjdns user, it would be nice if I automatically got peers from 0200::/7 and fc00::/8, so that not all my traffic would go through clearnet.

If I was always given ten random clearnet peers instead of ever using bootstrap peers, would I get peers from those networks at all?

@yangm97

yangm97 commented Feb 22, 2019

@Mikaela AFAIK, at the moment one way to do that is for the application to talk to the cjdns/yggdrasil API directly to fetch nodes.

But assuming you're not peering over clearnet and the nodes around you are cjdns/ygg native, you could discover them via IPv6 loopback multicast, then switch to cjdns/ygg for data exchange once you discover the IPs these nodes are advertising (and save them, because ygg/cjdns IPs don't change when nodes move around).

@anacrolix
Contributor Author

> From looking at BitTorrent client implementations, the only criterion for storing a node for future use is this: you’ve successfully established an outgoing connection to it (it’s reachable).

Right on. You just store your routing table wholesale.

> On the next startup, a number of nodes are selected at random (e.g. 10). If any of them is unreachable, replace it with another randomly selected node. If you run out of nodes, fall back to the bootstrapping nodes.

Again, essentially standard routing table treatment. Your point about falling back to bootstrap nodes is the missing link in this implementation. Any time the routing table becomes unviable (or a query stalls because of it), a callback that returns bootstrap nodes can be invoked to get things rolling again, like StartingNodes here (a sketch of that shape follows at the end of this comment).

> If you want to be smart about it then you can try selecting some nodes that have a short prefix length from your own public IP and some nodes from a diverse set of subnets. However, I do believe blindly choosing nodes at random is a perfectly valid strategy. Nodes are expected to leave and join the network all the time so it’s probably just a waste of time to try to carefully select the perfect set of nodes.

I agree. It's sufficient to just apply your routing table management policy at all times, both before persisting and after restoring. By definition, the routing table already holds the best available peers, appropriately distributed throughout the keyspace.
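
The fallback callback mentioned above can have a very small surface. A sketch of the shape, with illustrative names loosely mirroring anacrolix/dht's StartingNodes:

// StartingNodes is queried whenever the routing table becomes unviable
// (or a query stalls because of it) to obtain fresh bootstrap peers.
type StartingNodes func() ([]peer.AddrInfo, error)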

@raulk
Member

raulk commented Feb 28, 2019

Let's try not to hardcode any behaviour, and let's make these strategies modular. That's the ethos of libp2p.

Unlike a specific BitTorrent client, libp2p is a library for a multitude of use cases, and no one size fits all.

I'm thinking of two interfaces you can inject via Options:

type Loader interface {
        // Load receives a snapshot of the members of the previous routing table
        // (possibly 0-length), and the peerstore (so it can verify whether we
        // still hold addresses for past members).
        //
        // It selects the candidates to add to the routing table, and it may use
        // any heuristic, such as past observed latencies, failure rate,
        // helpfulness score, etc. (recorded as metadata in the peerstore).
        Load(previous []peer.ID, pstore pstore.Peerstore) (candidates []peer.ID)
}

type Seeder interface {
        // Seed receives the routing table to populate, the candidates the Loader
        // picked, and the fallback peers (static bootstrap peers).
        //
        // Seed must ensure connectivity to any peers it adds, and in that process
        // it can use metrics such as current, real latency to filter peers.
        //
        // In the event that candidates are unviable, it can fall back to the
        // static bootstrap peers to salvage the situation.
        Seed(host libp2p.Host, table kb.RoutingTable, candidates, fallback []peer.ID)
}

We should have default strategies for when the user doesn't pass these options in the constructor, like the ones that have been cited.

We can also define common strategies in exported vars, so that the user can combine them as if they were recipes.
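
For example, a default Loader (written against the interface sketched above, with current import paths) could simply keep the previous members that still have addresses in the peerstore, in random order. The package and type names are illustrative.

package dhtopts // illustrative package name

import (
	"math/rand"

	"github.com/libp2p/go-libp2p/core/peer"
	pstore "github.com/libp2p/go-libp2p/core/peerstore"
)

// RandomLoader keeps previous routing table members that still have known
// addresses, shuffled so restarts don't always replay the same table.
type RandomLoader struct{}

func (RandomLoader) Load(previous []peer.ID, ps pstore.Peerstore) []peer.ID {
	var candidates []peer.ID
	for _, p := range previous {
		if len(ps.Addrs(p)) > 0 {
			candidates = append(candidates, p)
		}
	}
	rand.Shuffle(len(candidates), func(i, j int) {
		candidates[i], candidates[j] = candidates[j], candidates[i]
	})
	return candidates
}

A matching Seeder recipe could then layer the connectivity checks and bootstrap fallback described earlier on top.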

@anacrolix
Contributor Author

After creating a node, it should be possible to seed the routing table with entries. It's already possible to access a node's routing table; we only need to determine what values are required to reconstitute a peer's entry, and expose that. That would include the peer ID, and its known addresses if they're not handled externally. Adding nodes to the routing table directly via a method on the DHT should only be required if the routing table itself does not have sufficient context to receive such an entry type.

Correspondingly, the DHT user should be able to retrieve a snapshot of the routing table before it is closed, or on closing, to persist as they see fit. Again, the only question here is what data is required from the DHT to persist a routing table entry and restore it later. Deferring address management and persistence to the peerstore, or whichever component is responsible for addresses, makes things a little less convenient, but possibly more flexible.

@anacrolix
Contributor Author

I think #283 should be done first, as it may affect what information needs to be persisted.

@anacrolix
Contributor Author

#283 also makes reconstituting a routing table more robust. Currently, if we did this, we'd have to connect to each peer we add in order to maintain the routing table invariant that all entries are connected. That also highlights how flawed the assumption is: if we can't connect to any peers while adding them to the routing table, due to intermittent connectivity issues, we can't restore the routing table.

@anacrolix
Contributor Author

I'm suggesting, more or less:

IpfsDHT.SetRoutingTable([]PeerInfo)
IpfsDHT.GetRoutingTable() []PeerInfo

All other behaviours are at the whim of the consumer.
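
For illustration, hypothetical usage of those two accessors (they don't exist yet), persisting the snapshot to a JSON file on shutdown and feeding it back in on the next start. peer.AddrInfo stands in for PeerInfo, and the file name is arbitrary.

package example

import (
	"encoding/json"
	"os"

	dht "github.com/libp2p/go-libp2p-kad-dht"
	"github.com/libp2p/go-libp2p/core/peer"
)

// saveTable persists the snapshot returned by the proposed accessor.
func saveTable(d *dht.IpfsDHT) error {
	blob, err := json.Marshal(d.GetRoutingTable()) // hypothetical accessor; AddrInfo marshals to JSON
	if err != nil {
		return err
	}
	return os.WriteFile("routing-table.json", blob, 0o600)
}

// restoreTable feeds the snapshot back in via the proposed accessor.
func restoreTable(d *dht.IpfsDHT) error {
	blob, err := os.ReadFile("routing-table.json")
	if err != nil {
		return err
	}
	var entries []peer.AddrInfo
	if err := json.Unmarshal(blob, &entries); err != nil {
		return err
	}
	d.SetRoutingTable(entries) // hypothetical accessor
	return nil
}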

@hsanjuan
Contributor

Note that the DHT may not be the only service that would benefit from remembering peers and reconnecting to them. I kind of see that as a separate, non-dht issue that should be handled by the libp2p host directly.

  1. The peerstore should be persisted (host option).
  2. Hosts should have an option specifying what to do with known peers in the peerstore on boot (ranging from connecting to every peer, to connecting to N peers, to none). This could be part of a host.Bootstrap() method that is called once all services have been added, since they depend on connect notifications. It would allow the DHT to bootstrap itself from scratch if necessary, but also other services (see the sketch below).
  3. The DHT may then choose to persist its own routing table and connect to those peers as well, regardless of what the general host bootstrap process is.

tl;dr: Don't make peerstore persistence a dht-only feature.
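
To make point 2 concrete, a host-level helper could look roughly like this. It is a hypothetical free function, not an existing host method, and it only relies on existing host/peerstore calls.

package hostboot

import (
	"context"
	"math/rand"

	"github.com/libp2p/go-libp2p/core/host"
)

// bootstrapFromPeerstore connects to up to n randomly chosen peers that the
// (persisted) peerstore still has addresses for. Any service, including the
// DHT, then picks these connections up through the usual notifications.
func bootstrapFromPeerstore(ctx context.Context, h host.Host, n int) {
	peers := h.Peerstore().PeersWithAddrs()
	rand.Shuffle(len(peers), func(i, j int) { peers[i], peers[j] = peers[j], peers[i] })

	connected := 0
	for _, p := range peers {
		if connected >= n {
			return
		}
		if p == h.ID() {
			continue // skip ourselves
		}
		if err := h.Connect(ctx, h.Peerstore().PeerInfo(p)); err == nil {
			connected++
		}
	}
}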

@anacrolix anacrolix removed their assignment Aug 5, 2019
@Stebalien
Member

Juan brought up that simply persisting the routing table and reviving from that could be dangerous. If an attacker can ever "take over" a peer's routing table and control their view of the network, recovering from this situation would be impossible without using the centralized bootstrappers.

The proposed solution is to look at historical data. That is, we should bootstrap off of peers that we've seen repeatedly over a long period of time.

A possible solution (probably not optimal) is to keep bucketed counts of how frequently we've seen and used each peer over time:

  • < 1 hour ago
  • < 1 day ago
  • < 1 week ago
  • < 1 month ago.
  • ...

We'd prefer peers that:

  • Are continuously online (i.e., are present in all recent buckets).
  • Are frequently useful (i.e., have high counts).
  • Have an historical presence (present in older buckets as well as new buckets).

This solution is probably quite a bit more complicated than it needs to be, but I hope it illustrates what I think we need.
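
A rough sketch of the bookkeeping this implies: one counter per age bucket per peer, and a score that rewards presence across buckets as well as high counts. The bucket boundaries and weights are arbitrary illustrations, not a proposal.

package seedhist

import (
	"time"

	"github.com/libp2p/go-libp2p/core/peer"
)

// Age buckets, oldest last; the "..." above could extend this list.
var buckets = []time.Duration{
	time.Hour,
	24 * time.Hour,
	7 * 24 * time.Hour,
	30 * 24 * time.Hour,
}

// history[p][i] counts observations of peer p whose age fell into bucket i.
type history map[peer.ID][]int

// record files one observation of p (seen/used at time `seen`) into the
// matching age bucket.
func (h history) record(p peer.ID, seen time.Time) {
	counts, ok := h[p]
	if !ok {
		counts = make([]int, len(buckets))
		h[p] = counts
	}
	age := time.Since(seen)
	for i, limit := range buckets {
		if age < limit {
			counts[i]++
			return
		}
	}
	// Older than the last bucket: drop the observation (or add more buckets).
}

// score prefers peers present in many buckets (continuously online, with a
// historical presence) and with high counts (frequently useful).
func (h history) score(p peer.ID) int {
	s := 0
	for _, c := range h[p] {
		if c > 0 {
			s += 10 // presence in this bucket
		}
		s += c // frequency of usefulness
	}
	return s
}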

@master255

master255 commented Oct 30, 2023

Temporary solution:

// Collect up to ~20 peer addresses from the peerstore (skipping ourselves)
// and serialise them; peer.AddrInfo marshals to JSON directly.
var peersAddrs []peer.AddrInfo
var i int8
for _, peerId := range m.host.Network().Peerstore().PeersWithAddrs() {
	if m.host.ID() == peerId {
		continue
	}
	peersAddrs = append(peersAddrs, m.host.Network().Peerstore().PeerInfo(peerId))
	i++
	if i > 20 {
		break
	}
}
bytes, _ := json.Marshal(peersAddrs)
....

// On the next start, unmarshal the saved addresses and reconnect to them.
var peersAddrs []peer.AddrInfo
err := json.Unmarshal(bytes, &peersAddrs)
....

And I found that Network().Peerstore().PeersWithAddrs() contains more information about peers than kademlia.RoutingTable().ListPeers(). So it is better to use it.
