Remove special cases for Peer and Provider records #584

Open · aschmahmann opened this issue Apr 13, 2020 · 9 comments

aschmahmann commented Apr 13, 2020

Proposal

  1. Use an updated Record interface (see the sketch after this list) that supports:
    a. Merge(rec1, rec2), so that we can support record types that are mergeable (e.g. provider records) and not just sortable (e.g. IPNS records)
    b. Verify(key, rec), which we already have
    c. Optionally, both of these functions could take some external "state" if we wanted to insist that they be pure functions, which is not the case with the existing Record interface
  2. Pass record validators into the DHT with a few extra properties for efficiency (especially helpful given the below changes):
    a. CustomKadID(key string) kad.ID, so that if it makes sense for /a/QmABC to live near QmABC in Kad space, we can represent that
    b. RecordID int, so we can represent record namespaces on the wire using a (var)int instead of a string to save some bytes. There's no reason to expect contention over the slots, since they should be defined per DHT prefix (e.g. /ipfs), and that spec is where the standardization of the record IDs would happen
  3. Rework some of the record processing internals to handle "merged" records instead of just "best" records (there's likely to be some coding complexity here that will fine-tune exactly what information we need from a validator)
  4. Remove the ADD_PROVIDER and GET_PROVIDERS Message Types and instead perform those operations via the /provider namespace of PUT/GET VALUE
  5. Use the /p2p namespace of PUT/GET VALUE for finding and advertising peers. Use a CustomKadID for peer records that puts /p2p/QmABC near SHA256(QmABC) (this will make it behave similarly to how it does today)
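
To make the proposal concrete, here is a rough Go sketch of what these interfaces could look like. Everything here is illustrative: the names, the signatures, and the `KadID` placeholder (standing in for `kad.ID`) are assumptions rather than a finalized API.

```go
// A minimal sketch of the interfaces proposed above; nothing here is final.
package recordsketch

// KadID stands in for the kad.ID type mentioned in the proposal.
type KadID [32]byte

// Record is the updated record interface from point 1.
type Record interface {
	// Merge combines an already-known record with a newly received one
	// (point 1a). For provider records this is a union; for sortable
	// records like IPNS it keeps only the better of the two.
	Merge(rec1, rec2 []byte) (merged []byte, err error)

	// Verify checks that rec is a valid record for key (point 1b).
	Verify(key string, rec []byte) error
}

// Validator adds the per-namespace efficiency hooks from point 2.
type Validator interface {
	Record

	// CustomKadID maps a key to its location in Kad space, e.g. placing
	// /a/QmABC near QmABC (point 2a).
	CustomKadID(key string) KadID

	// RecordID is the (var)int wire identifier for this record
	// namespace, standardized per DHT prefix (point 2b).
	RecordID() int
}
```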

Background

We currently have a few message types that do approximately the same thing:

Generic Puts + Gets

PutValue(key, value): puts an arbitrary piece of data that must follow the rules defined by the key's prefix
Get/SearchValue(key): gets an arbitrary piece of data returning the "best" version of the data found using the rules defined by the key's prefix

Message Types Used: FIND_NODE and PUT_VALUE for PutValue and GET_VALUE for Get/SearchValue

What can you do with the records? The Record interface allows for user-defined types that verify and order records (e.g. the /ipns prefix is associated with a particular record format, a way to verify that an IPNS record is valid, and a way to determine which of two records is better).
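
For reference, the existing interface comes from go-libp2p-record and looks roughly like this (comments paraphrased):

```go
package record // as in github.com/libp2p/go-libp2p-record

// Validator is registered per namespace prefix (e.g. /ipns) and lets the
// DHT verify records and pick the "best" of several.
type Validator interface {
	// Validate returns an error if the record value is invalid for the
	// given key (e.g. an IPNS record with a bad signature).
	Validate(key string, value []byte) error

	// Select returns the index of the best record among values.
	Select(key string, values [][]byte) (int, error)
}
```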

Provider Records

Provide(cid): Puts a provider record (a piece of data signifying that a peer cares about some information)
FindProviders(cid): Gets the set of provider records corresponding to the cid (or really the multihash #422)

Message Types Used: FIND_NODE and ADD_PROVIDER for Provide and GET_PROVIDERS for FindProviders

What can you do with the records? Everything here is implicit: we learn about peers that have expressed interest in some topic (e.g. having the data identified by a multihash) and then decide what to do about it (e.g. connect to them and start using some other protocol). Despite provider records not being signed, we'd still like the network to only advertise that someone is interested in being contacted if they have actually expressed that interest; therefore DHT servers only accept provider records coming directly from the source (as verified by the libp2p crypto handshake).
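
A minimal sketch of that server-side rule, assuming hypothetical handler and store names (the real handler in go-libp2p-kad-dht is organized differently):

```go
package sketch

import (
	"fmt"

	"github.com/libp2p/go-libp2p/core/peer"
)

// handleAddProvider sketches the rule above: a DHT server accepts a
// provider record only when the cryptographically authenticated sender
// is the peer the record advertises. All names here are hypothetical.
func handleAddProvider(from, advertised peer.ID, key []byte, store func(key []byte, p peer.ID)) error {
	if from != advertised {
		// A mismatch would mean a third party advertising on someone
		// else's behalf, which unsigned provider records can't support.
		return fmt.Errorf("provider record for %x was not sent by the advertised peer", key)
	}
	store(key, advertised)
	return nil
}
```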

Peer Records

This one is a little funky and still needs a specification libp2p/go-libp2p#784. The main reason it's strange is because there are a number of optimizations that can be done for peers that are members of the DHT (whether clients or servers) that cannot be performed for others.

FindPeer(peerID): Get a peer's multiaddresses from the DHT
PutPeer(peerID): This function does not exist; instead, it happens implicitly by forming connections to other peers

Message Types Used: FIND_NODE and implicitly the Identify protocol

What can you do with the records? These operate similarly to provider records: you find out the addresses of a peer you are interested in. Like provider records, servers are supposed to ensure that only real addresses they have received from the peers themselves are conveyed in queries.
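
For orientation, the consumer-facing half of this today is the PeerRouting interface from go-libp2p, roughly (comments paraphrased; the DHT implements it on top of FIND_NODE):

```go
package routing // excerpt, as in github.com/libp2p/go-libp2p/core/routing

import (
	"context"

	"github.com/libp2p/go-libp2p/core/peer"
)

// PeerRouting finds address information for a peer.
type PeerRouting interface {
	// FindPeer searches for a peer with the given ID and returns its
	// addresses if found.
	FindPeer(context.Context, peer.ID) (peer.AddrInfo, error)
}
```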

Why Now?

Well, a few of the "quirks" of our DHT message types have finally caught up with us. In particular, we want to:

  1. Make peer records signed (means breaking the message format) Signed Peer records for the DHT #558 - which helps us
    a. Dial peers more reliably, especially peers that are not DHT servers, but are advertising their addresses in the DHT
    b. Trust DHT servers a little less (attack omitted, but you can guess 😁)
    c. Enable third parties to advertise peer records in the DHT
    d. Give less trust to third parties that fetch peer records from the DHT
  2. Make provider records signed (and actual records) Signed Provider records for the DHT #559 - which helps us
    a. Find content more reliably, since we can prioritize which providers to utilize first
    b. Enable third parties to advertise provider records in the DHT
    c. Give less trust to third parties that fetch provider records from the DHT
    d. Enable record rebalancing Correctly Implement Kademlia Rebalancing #354 (more important for provider records than peer records, but useful there too)
    e. Having an actual record format will also enable embedding extra information in the record going forward (e.g. in the case of IPFS, do I have all of a DAG under some CID or just some of it)
    f. Trust DHT servers a little less
  3. Allow refreshing our routing tables beyond 15 buckets Fix Routing Table Refresh after we have a SearchByKadId RPC #556
  4. We want to remove the IPFS-ness from this libp2p component IpfsDHT -> KadDHT #568

Given that 1, 2, and 3 will require rewriting our message types anyway, this seems like an opportune time to tackle this as well. As an added bonus, this will give us an opportunity to take another stab at what the ContentRouting and PeerRouting interfaces could look like in a very low-stakes environment (since we'll end up wrapping the DHT in something that exports the existing Routing interface).

@jacobheun @Stebalien @aarshkshah1992

@willscott

Can you expand a bit on the proposed updated record interface? E.g. how would merge indicate two records couldn't be merged?

The optimizations possible with findPeer seem like they may warrant remaining special-cased. It looks like the proposal is to keep the FIND_NODE records as they exist now, which sounds right.

@aschmahmann (author)

> how would merge indicate two records couldn't be merged

They can always be merged, but maybe there's a better word to use here. The question being asked of the interface is "given that I've received record A, what do I do with this record B I've just received?". In the case of provider records, we return A + B (i.e. emit B, since A has already been sent), while in the case of IPNS and peer records we return Best(A, B) (i.e. emit B if B is better than A, and nothing otherwise).
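
As a toy sketch of those two behaviours (emit and better are hypothetical helpers, not real API):

```go
package sketch

import "bytes"

// Hypothetical helpers standing in for the real record machinery.
var emit = func(rec []byte) { /* forward rec to the caller */ }
var better = func(a, b []byte) []byte { return b /* namespace-specific ordering */ }

// Provider records always merge (a set union): having already emitted A,
// a newly received B is simply emitted as well.
func onReceiveProvider(a, b []byte) { emit(b) }

// IPNS and peer records keep only the best record: B is emitted only if
// it beats the A that was already emitted, and nothing happens otherwise.
func onReceiveBest(a, b []byte) {
	if bytes.Equal(better(a, b), b) {
		emit(b)
	}
}
```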

> it looks like the proposal is to keep the FIND_NODE records as they exist now

These are going to have to change in some form anyway as a result of signed peer records #558. I think we may have to experiment a bit to see whether we can remove the FindPeer special-casing without taking any non-trivial performance hits. If we can get away with it, that would be great, since I'd prefer to think of any "special-casing" as something we should express by extending the interface. For example, if someone wanted to build a DHT with a new type of peer record (e.g. a different signature scheme), it'd be nice if they could benefit from the same optimizations we're using for the "default" FindPeer.

@guillaumemichel

I would argue that the only GET RPC we need is SearchByKadId, described in #556. This SearchByKadId RPC could carry extra context information, e.g. whether you are looking for the {peer, provider, IPNS, new type of} record identified by the provided KadId. A GET request would only contain [KadID, (record_type)].

The principle of a DHT is to address data by a common identifier so that it can be allocated to nodes. So let's address the data using this common identifier, and not a preimage of it. This will allow us to get rid of monsters such as the preimage table.

The implication is that all data handled by the DHT must be indexed by its Kademlia ID, and not by its preimage (e.g. peer.ID, CID, etc.). Namely, the routing table can be a binary trie data structure mapping a KadId to a peer.ID, and the Provider Store should index provider records by their KadId and not by CID.
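
As an illustrative sketch of that indexing idea (not the actual structure of go-libp2p-kad-dht or Zikade):

```go
package sketch

import "github.com/libp2p/go-libp2p/core/peer"

// trieNode is a binary trie keyed by the bits of a KadId: inserts and
// lookups walk those bits and never touch preimages like peer.IDs or CIDs.
type trieNode struct {
	branch [2]*trieNode // children for bit 0 and bit 1
	id     peer.ID      // set on leaves: the peer stored at this KadId
}

// insert walks kadID one bit at a time, creating nodes as needed, and
// records the peer at the leaf reached after the full KadId.
func (n *trieNode) insert(kadID []byte, p peer.ID, depth int) {
	if depth == len(kadID)*8 {
		n.id = p
		return
	}
	bit := (kadID[depth/8] >> (7 - depth%8)) & 1
	if n.branch[bit] == nil {
		n.branch[bit] = &trieNode{}
	}
	n.branch[bit].insert(kadID, p, depth+1)
}
```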

The format of the PUT operation is less critical, because we don't expect to receive closer peers to the key we are putting. But I agree that it makes sense to have a standard PUT operation, which only needs to contain [KadID, data, (record_type)].
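
In Go terms, the two wire shapes described above might look like the following (field names are illustrative; the parenthesized record_type is optional):

```go
package sketch

// GetRequest carries [KadID, (record_type)]: data is addressed by its
// Kademlia ID, never by a preimage such as a CID or peer.ID.
type GetRequest struct {
	KadID      []byte
	RecordType *int // optional: peer, provider, IPNS, a new type, ...
}

// PutRequest carries [KadID, data, (record_type)].
type PutRequest struct {
	KadID      []byte
	Data       []byte
	RecordType *int // optional
}
```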

cc: @aschmahmann @dennis-tra @iand

@aschmahmann (author)

> So let's address the data using this common identifier, and not a preimage of it. This will allow us to get rid of monsters such as the preimage table.

This is mixing concerns; there are two types of operations:

  1. DHT control plane (i.e. FIND_NODE for refreshing buckets)
  2. DHT data plane (i.e. peer records, provider records, ipns records, ....)

The preimage insanity exists because the control plane RPC uses peerIDs (i.e. preimages) rather than kadIDs, which is just wrong. That's not the same as the data plane using preimages, which IIUC you'd like because it helps with a different problem (i.e. you want some request anonymity by asking for a prefix of the key).

Either way, I'd probably still keep the control plane and data plane RPCs separate, rather than as:

  1. Control plane: `SearchByKadId(KadID)`
  2. Data plane: `SearchByKadId(KadID, record-type)`

Sure, they're functionally quite similar but I'd like it to be really obvious that people should not use the control plane lookups for peer records to minimize confusion/issues there.

@guillaumemichel

> That's not the same as the data plane using preimages, which IIUC you'd like because it helps with a different problem (i.e. you want some request anonymity by asking for a prefix of the key).

Yes, that's right!

> Either way, I'd probably still keep the control plane and data plane RPCs separate, rather than as:
>
> 1. Control plane: `SearchByKadId(KadID)`
>
> 2. Data plane: `SearchByKadId(KadID, record-type)`

Totally agree! 👍🏻

@burdiyan commented Jan 29, 2024

Is there any roadmap for implementing what's discussed in this issue?

I've been diving into the internals of the DHT for the past few days, and ended up realizing that it is possibly reaching the limits of the "organic growth" it's been having :)

The reason I started diving into the internals (unrelated to this issue)

Basically, because of a bad design decision of our own, we were periodically trying to connect to lots of peer IDs without addresses, which triggers a thundering herd of DHT peer routing requests, eating up the entire bandwidth of the internet connection. And because most of those PeerIDs are either offline or just dummy keys from tests, and there's no backoff delay, we end up completely ruining the network for ourselves :)

@guillaumemichel

Hi @burdiyan

IIUC your problem isn't exactly related to this issue. The DHT is essentially a distributed key-value store, and if your application is making a massive number of requests, it will simply exhaust your internet connection. I would suggest implementing a backoff delay on the application side.
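
For example, a minimal application-side backoff along those lines might look like this (illustrative only, not part of any libp2p API):

```go
package sketch

import (
	"context"
	"time"

	"github.com/libp2p/go-libp2p/core/peer"
	"github.com/libp2p/go-libp2p/core/routing"
)

// findPeerWithBackoff retries FindPeer with capped exponential backoff so
// that unresolvable peer IDs don't become a thundering herd of lookups.
func findPeerWithBackoff(ctx context.Context, r routing.PeerRouting, p peer.ID) (peer.AddrInfo, error) {
	delay := time.Second
	for {
		info, err := r.FindPeer(ctx, p)
		if err == nil {
			return info, nil
		}
		select {
		case <-ctx.Done():
			return peer.AddrInfo{}, ctx.Err()
		case <-time.After(delay):
		}
		if delay < 5*time.Minute {
			delay *= 2 // back off exponentially, capped at ~5 minutes
		}
	}
}
```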

Concerning this issue, it probably won't be implemented anytime soon, because it is a breaking change and the teams currently have other priorities.

@burdiyan

Sorry, I added a clarification that my issue is unrelated to this one; it just made me dive into the DHT internals (that's why I put it under a toggle initially, though it wasn't clear that it's actually not very related). We are already implementing a backoff delay to avoid it.

> Concerning this issue, it probably won't be implemented anytime soon, because it is a breaking change and the teams currently have other priorities.

That's sad, because the ideas here could be really helpful. Even though it's a breaking change, it could probably be implemented as a different protocol, with nodes supporting both protocols for some time until most of the network migrates.

I remember seeing some DHT refactoring efforts, but I can't find them now. Is there any place to follow up on what's next for the DHT in general? Is there a roadmap document?

@guillaumemichel

Yes, we had a full work stream to modernize the DHT: make it upgradable and generally more agile, migrate the network, implement privacy improvements, etc. But because of team restructuring this work stream has been stopped until we find funding to continue it.

Zikade is the modern Go DHT implementation we have been working on.

You can view the outdated Roadmap on Starmap.
