Simple, minimal peer exchange protocol #222

Open
raulk opened this issue Oct 25, 2019 · 6 comments · May be fixed by #590

Comments

@raulk
Member

raulk commented Oct 25, 2019

This issue proposes a general-use peer exchange (PEX) protocol that is not embedded in any specific protocol like gossipsub/episub.

The goal of PEX is to enable peers to share records about other peers they're connected to in a 1:1, ad-hoc fashion. It does not intend to produce deterministic results like DHTs, nor does it rely on a structured network or shared heuristic. BitTorrent uses PEX to streamline tracker-less peer discovery.

In the context of gossipsub, PEX proves useful to find additional peers in a topic we subscribe to, as a way of strengthening our topic mesh. Through subscription beaconing (i.e. peers gossiping about which topics they're subscribed to), it can even be possible to bootstrap a topic subscription without hitting the DHT or other structured discovery mechanisms at all.

I'm thinking we should spec out a minimal PEX protocol, consisting of a simple advertisement schema, and two operations: advertise, lookup.

Advertisement schema

An advertisement struct consists of a peer address record and a set of CIDs we are advertising, signed by the peer's key to prevent MITM attacks.

Local advertisement record maintenance

Our advertisement record is kept in memory and updated at runtime. It is populated with:

  • our peer ID.
  • own addresses (which may change over time; the PEX protocol can subscribe to updates via the eventbus).
  • advertised CIDs.

The Host API would expose methods so that downstream components (e.g. protocols) can manage advertised CIDs, e.g.:

// We don't want to add an accessor for PEX in the Host interface.
// The host-service refactor is a prerequisite to be able to do this.
svc, ok := host.GetService(&PEX{})
if !ok {
    return nil
}
pexsvc := svc.(PEXService)

ad := pex.NewAd("gossipsub:topic_name")
cancelFn, err := pexsvc.Advertise(ad)
if err != nil {
    return err
}

// ... store the cancelFn in state ...

// atomically replace the advertised value, possibly not useful for gossipsub, 
// but it will be for other protocols.
// Helps mitigate add/remove noise when sending deltas.
ad.Replace("gossipsub:topic_name_b")

// when done / closing down
err = cancelFn() // err was already declared above, so assign rather than redeclare
if err != nil {
    return err
}

Advertise operation

Upon establishing a libp2p connection:

  1. We open a stream for protocol ID /libp2p/pex/v01.
  2. If successful, we push our local advertisement record.
  3. When receiving a record, we store it in memory.

We repeat the above when advertisements or addresses change. Note that this process looks a lot like the identify protocol logic. We could extend the identify protocol to support advertised CIDs. However, protocol IDs alone are insufficient to contextualise an advertisement (e.g. we want to know that a peer is a member of gossipsub topic abc, not that it supports gossipsub).
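As a rough sketch of step 3 and the "repeat on change" behaviour, the receiving side could keep one record per peer and let a newer push replace an older one. The `Record` and `Store` types and the `Seq` field are hypothetical; the proposal does not say how stale pushes are detected, so the sequence number is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// Record is a hypothetical in-memory form of a received advertisement.
type Record struct {
	PeerID string
	Addrs  []string
	CIDs   []string
	Seq    uint64 // assumed: incremented by the sender on every change
}

// Store keeps the latest record per peer.
type Store struct {
	mu      sync.RWMutex
	records map[string]Record
}

func NewStore() *Store {
	return &Store{records: make(map[string]Record)}
}

// Put stores rec unless we already hold a record from the same peer with an
// equal or newer Seq. It reports whether the record was stored.
func (s *Store) Put(rec Record) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if old, ok := s.records[rec.PeerID]; ok && old.Seq >= rec.Seq {
		return false
	}
	s.records[rec.PeerID] = rec
	return true
}

// Get returns the stored record for a peer, if any.
func (s *Store) Get(peerID string) (Record, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	rec, ok := s.records[peerID]
	return rec, ok
}

// Remove drops a peer's record, e.g. when the connection closes.
func (s *Store) Remove(peerID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.records, peerID)
}

func main() {
	st := NewStore()
	st.Put(Record{PeerID: "peer-B", CIDs: []string{"gossipsub:topic_name"}, Seq: 1})
	// The peer changes its advertisement and pushes again.
	st.Put(Record{PeerID: "peer-B", CIDs: []string{"gossipsub:topic_name_b"}, Seq: 2})
	// A delayed copy of the first push arrives late and is ignored.
	stored := st.Put(Record{PeerID: "peer-B", CIDs: []string{"gossipsub:topic_name"}, Seq: 1})

	rec, _ := st.Get("peer-B")
	fmt.Println("stale push stored:", stored, "current CIDs:", rec.CIDs)
}
```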

Lookup operation

When the local application/protocol intends to look up peers advertising a specific CID, it sends a lookup RPC to all connected neighbours, who reply with the advertisement records of all directly connected peers they know to be advertising the CID.

If a peer returns irrelevant/malformed/badly signed ads, we decrease their score on the grounds of displaying malicious behaviour. Below a certain threshold, we blacklist/disconnect the peer.
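The requester side of a lookup, including the scoring rule, might look roughly like the sketch below. The `Scorer` type, the fixed penalty per bad ad, and the threshold value are assumptions; `SigOK` stands in for real signature verification, which is not shown.

```go
package main

import "fmt"

// Ad is a simplified advertisement. SigOK is a stand-in for the result of
// real signature verification.
type Ad struct {
	PeerID string
	CIDs   []string
	SigOK  bool
}

func (a Ad) advertises(cid string) bool {
	for _, c := range a.CIDs {
		if c == cid {
			return true
		}
	}
	return false
}

// Scorer tracks a per-neighbour score (hypothetical scoring model).
type Scorer struct {
	scores    map[string]int
	penalty   int // subtracted for every bad ad in a reply
	threshold int // below this score the neighbour should be dropped
}

func NewScorer(penalty, threshold int) *Scorer {
	return &Scorer{scores: make(map[string]int), penalty: penalty, threshold: threshold}
}

// HandleReply filters the ads that neighbour `from` returned for `cid`.
// Badly signed or irrelevant ads are discarded and lower the neighbour's
// score. drop reports whether the score is now below the threshold.
func (s *Scorer) HandleReply(from, cid string, ads []Ad) (accepted []Ad, drop bool) {
	for _, ad := range ads {
		if ad.SigOK && ad.advertises(cid) {
			accepted = append(accepted, ad)
			continue
		}
		s.scores[from] -= s.penalty
	}
	return accepted, s.scores[from] < s.threshold
}

func main() {
	s := NewScorer(10, -15)
	reply := []Ad{
		{PeerID: "peer-B", CIDs: []string{"cid-1"}, SigOK: true},
		{PeerID: "peer-C", CIDs: []string{"cid-2"}, SigOK: true},  // irrelevant
		{PeerID: "peer-D", CIDs: []string{"cid-1"}, SigOK: false}, // bad signature
	}
	accepted, drop := s.HandleReply("peer-A", "cid-1", reply)
	fmt.Println("accepted:", len(accepted), "drop neighbour:", drop)
}
```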

In its basic form, the lookup operation extends our view of the network by degree 2 (we reach peers of our peers), but it can be further enhanced by a TTL mechanism that allows the request to be relayed N number of hops. Thus, if a peer knows of zero peers advertising the CID, it could relay the request to its neighbours.

I propose we don't venture with relayed lookup requests at this stage, as it requires thoughtful modelling of rate-limiting, quotas, and scoring, to prevent DDoS attacks. But it's definitely something to keep on the radar.

Privacy reflections

Just like with DHTs, it's hard to guarantee reader privacy. PEX could be used to map out how peers interested in a certain subject are effectively connected. We can introduce randomness to deter such attempts.

@vyzo
Contributor

vyzo commented Oct 25, 2019

We might also want to have a push protocol for advertisements instead of relying on poll lookup.

@jbenet
Contributor

jbenet commented Oct 28, 2019

PEX

great to see this here! 👍 -- we've needed something like PEX in libp2p for a long time

ad := pex.NewAd("gossipsub:topic_name")

oh cool, i didn't recall PEX kept specific topics/swarms associated with each peer. makes sense. We probably want to do something like tags actually:

Get("gossipsub:topic_name") # get all peers related to this gossipsub topic
Get("providers:<selector>") # get all peers related to this ipld selector
Get("transport:QUIC") # get all peers that have QUIC
Get("kad-dht") # get all peers that speak kad-dht
Get("filecoin") # get all peers that speak filecoin
Get("filecoin:retrieval") # get all peers that speak filecoin:retrieval
Get("kad-dht", "gossipsub:topic_name") # get all peers related to this gossipsub topic, and who speak kad-dht

In this sense, maybe we should be doing pathing (/ separated), and re-using the protocol identifiers we already use (for uniqueness and default simplicity):

Get(Path(gossipsub.ProtocolID, "topic_name"))
Get(Path(providers.ProtocolID, selector))
Get(Path("transport", quic.ProtocolID))
Get(Path(filecoin.ProtocolID, filecoin.RetrievalProtocolID))
Get(Path(gossipsub.ProtocolID, "topic_name"), Path("kad-dht"))
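A small sketch of how `Path` and a multi-tag `Get` could behave, assuming `Get` with several tags means "peers carrying all of them". The `Index` type is hypothetical, and the tag strings are illustrative placeholders rather than real protocol IDs. `Path` here is plain "/" joining; tags are matched as opaque strings, so nothing depends on parsing the separator back out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Path joins segments with "/". The result is only ever compared as a
// whole string, never split again.
func Path(segments ...string) string {
	return strings.Join(segments, "/")
}

// Index maps each tag to the set of peers advertising it (hypothetical).
type Index struct {
	byTag map[string]map[string]struct{}
}

func NewIndex() *Index {
	return &Index{byTag: make(map[string]map[string]struct{})}
}

// Add records that peer advertises every given tag.
func (ix *Index) Add(peer string, tags ...string) {
	for _, t := range tags {
		if ix.byTag[t] == nil {
			ix.byTag[t] = make(map[string]struct{})
		}
		ix.byTag[t][peer] = struct{}{}
	}
}

// Get returns, in sorted order, the peers that advertise all given tags.
func (ix *Index) Get(tags ...string) []string {
	if len(tags) == 0 {
		return nil
	}
	var out []string
	for peer := range ix.byTag[tags[0]] {
		all := true
		for _, t := range tags[1:] {
			if _, ok := ix.byTag[t][peer]; !ok {
				all = false
				break
			}
		}
		if all {
			out = append(out, peer)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	topic := Path("gossipsub", "topic_name")

	ix := NewIndex()
	ix.Add("peer-A", "kad-dht", topic)
	ix.Add("peer-B", topic)

	fmt.Println(ix.Get(topic))            // peers in the topic
	fmt.Println(ix.Get(topic, "kad-dht")) // peers in the topic that also speak kad-dht
}
```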

In its basic form, the lookup operation extends our view of the network by degree 2 (we reach peers of our peers), but it can be further enhanced by a TTL mechanism that allows the request to be relayed N number of hops. Thus, if a peer knows of zero peers advertising the CID, it could relay the request to its neighbours.

not sure we should even reach peers-of-peers, but maybe.

I propose we don't venture with relayed lookup requests at this stage, as it requires thoughtful modelling of rate-limiting, quotas, and scoring, to prevent DDoS attacks. But it's definitely something to keep on the radar.

Yeah i think this needs to be explicitly out of scope for this protocol. this should be a very simple 1-1 protocol (or just about).

Just like with DHTs, it's hard to guarantee reader privacy. PEX could be used to map out how peers interested in a certain subject are effectively connected. We can introduce randomness to deter such attempts.

yes 👍

Security Considerations

  • PEX MUST be easy to turn off and never required
  • PEX SHOULD NOT give all peers one is connected to. this would be a potential attack vector.
    • limiting by topic/protocol is good (PEX interface should allow making it only respond to certain labels, e.g. only gossipsub peers, etc)
    • limiting by number of randomly chosen peers per label can also work (e.g. will return at most 5 peers per label)
  • PEX SHOULD be able to return peers that are no longer connected -- it should be able to use the local cache / peerbook, which may have lots and lots of peers, even if not directly connected atm -- this should be an option that can be off.
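The first two considerations could be sketched as a responder with an off switch, a label allow-list, and a cap on randomly chosen peers per label. The `Responder` type and its fields are hypothetical. `math/rand` keeps the sketch short; an implementation worried about adversaries predicting the sample may want a stronger source of randomness.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Responder decides which known peers to reveal for a label (hypothetical).
type Responder struct {
	Enabled       bool            // PEX must be easy to turn off
	AllowedLabels map[string]bool // only these labels are ever answered
	MaxPerLabel   int             // cap on peers returned per label
	peers         map[string][]string
}

// Respond returns at most MaxPerLabel randomly chosen peers for label, or
// nothing if PEX is disabled or the label is not on the allow-list.
func (r *Responder) Respond(label string) []string {
	if !r.Enabled || !r.AllowedLabels[label] {
		return nil
	}
	out := append([]string(nil), r.peers[label]...) // copy before shuffling
	rand.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	if len(out) > r.MaxPerLabel {
		out = out[:r.MaxPerLabel]
	}
	return out
}

func main() {
	r := &Responder{
		Enabled:       true,
		AllowedLabels: map[string]bool{"gossipsub:topic_name": true},
		MaxPerLabel:   5,
		peers: map[string][]string{
			"gossipsub:topic_name": {"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"},
			"kad-dht":              {"p1", "p9"},
		},
	}
	fmt.Println(len(r.Respond("gossipsub:topic_name")), "peers returned for the allowed label")
	fmt.Println(len(r.Respond("kad-dht")), "peers returned for a label not on the allow-list")
}
```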

@thomaseizinger
Contributor

I think the rendezvous protocol might allow to implement this. Advertisements are essentially registrations. Namespaces are free-form text so clients can store all kinds of stuff in there. As long as the format is agreed upon, it can be used to advertise gossipsub topics, CIDs, etc

@Menduist
Contributor

I think the rendezvous protocol might allow to implement this.

I second this, what's missing from rendezvous to be considered a Peer Exchange protocol?
We could have a mode where it's stored in a db, and another where it feeds directly from the PeerStore / other structure in memory

We just need an "auto register", which will register us every time we connect to a peer with RDV enabled

@thomaseizinger
Contributor

I think the rendezvous protocol might allow to implement this.

What is missing is some kind of standardisation, how the registrations are structured, i.e. what the namespace is.

For example, how do you take a gossip-sub topic and advertise it via rendezvous?

It should probably be prefixed with the protocol and then some protocol specific parameters, e.g.:

/gossipsub/1.1.0/topic/my_room

AFAIK, protocol IDs are completely opaque so we can't rely on / or any other char being a separator.

I second this, what's missing from rendezvous to be considered a Peer Exchange protocol?
We could have a mode where it's stored in a db, and another where it feeds directly from the PeerStore / other structure in memory

That is IMO entirely an implementation consideration and does not need to be part of a spec / protocol.

We just need an "auto register", which will register us every time we connect to a peer with RDV enabled

That is also an implementation choice IMO.

@thomaseizinger
Contributor

Trying to reboot this in a simpler form: #587

@thomaseizinger thomaseizinger linked a pull request Oct 22, 2023 that will close this issue