RFC: Signed Address Records #217

Merged
merged 17 commits into from
Nov 19, 2020
Changes from 10 commits
108 changes: 108 additions & 0 deletions RFC/0002-signed-envelopes.md
@@ -0,0 +1,108 @@
# RFC 0002 - Signed Envelopes

- Start Date: 2019-10-21
- Related RFC: [0003 Address Records][addr-records-rfc]

## Abstract

This RFC proposes a "signed envelope" structure that contains an arbitrary byte
string payload, a signature of the payload, and the public key that can be used
to verify the signature.

This was spun out of an earlier draft of the [address records
RFC][addr-records-rfc], since it's generically useful.
Contributor

It may be worth considering how generically useful this structure is given that the payload must be kept exactly as it is received (instead of allowing it to be deserialized and then reserialized).

If we chose to use a deterministic encoding scheme (e.g. Canonical CBOR or IPLD) instead of Protobufs this would be less of a problem. However, if we'd like to keep using Protobufs then it'd be great to have some documentation letting people know.

Thanks @yusefnapora for the great work putting this together

Member

@raulk Nov 11, 2019

The envelope contains both the byte payload, and the signature over that byte payload. The serialisation scheme is irrelevant at this layer.

The recipient of this payload validates that the signature matches the plaintext and the key, then deserialises the payload with the serialisation format mandated for the payload type, in order to process it (e.g. to consume the multiaddrs).

If the recipient intends to relay this payload (as is the case of p2p discovery mechanisms), it does not send a re-serialised form, but rather it forwards the original envelope. In general, it's bad and fragile practice to reconstitute a payload in the hope that it'll continue matching the original signature that was annexed to it.

Contributor

> The serialisation scheme is irrelevant at this layer....
>
> In general, it's bad and fragile practice to reconstitute a payload in the hope that it'll continue matching the original signature that was annexed to it.

These two statements are tied together and follow a rule set that you may think is correct, but is not obvious. Non-obvious restrictions dictating how the data may be interacted with should be documented. Additionally, this restriction does not have to exist; it's just something that's been decided is ok/insufficiently problematic to bother dealing with.

I also disagree with it being "bad" to allow consistent serialization/deserialization of objects. If I have data which I need to propagate frequently and access infrequently then I'll just store the message bytes and deserialize every time I need to access the data. If I frequently propagate and access the data I'll store the data both serialized and deserialized. If however, I infrequently propagate the data and frequently access it I'm now forced to waste space by storing both the serialized and deserialized versions for no reason other than we like Protobufs.

Member

@raulk Nov 12, 2019

> These two statements are tied together and are following a rule set that you may think is correct, but is not obvious.

They are not. You are mixing up the concerns of a cryptographic envelope, with the details of how the inner opaque payload is constructed. These two layers are decoupled, and @yusefnapora has done a good job of modelling that in this spec.

> I also disagree with it being "bad" to allow consistent serialization/deserialization of objects.

That's not what I said.

> If however, I infrequently propagate the data and frequently access it I'm now forced to waste space by storing both the serialized and deserialized versions...

Yes, and it's a cost you assume to preserve the integrity of a signature.

> for no reason other than we like Protobufs.

Incorrect. Systems preserve the original data along with the signature for many reasons including reducing the surface for bugs, traceability/auditability, and others.


I insist it is a terrible idea to assume that, even with canonical serialisation, your system will be perennially capable of reconstituting a payload out of its constituents, in a way that it matches the original signature. Developers introduce bugs, systems change, schemas change, and maintaining such hypothetical logic is error-prone, brittle and convoluted.

Contributor

Could you please explain what you think the downsides are of utilizing a format with canonical serialization?

I've already given a use case that would be helped by enabling canonical serialization, a record type that is infrequently propagated but frequently used would benefit from reduced memory and storage consumption.

Are you suggesting that we are intentionally using a format that has non-canonical serialization to dissuade other people from making design decisions you think are "terrible"? Are there other reasons you feel using IPLD or Canonical CBOR would be bad?

The point I'm trying to make here is that you seem to think "it's a terrible idea" for people to assume de+re serializing data will keep it identical, I think that in some situations it could be useful. Could you please list some of the negatives of utilizing a canonical serialization format and enabling developers to make their own decisions about whether to rely on its ability to de+re serialize accurately?

Contributor

@aschmahmann Nov 12, 2019

@raulk this "generic and not opinionated" wrapper cannot be used if someone wanted to share (using a CID as a reference) a collection of envelopes and still access them efficiently.

Concrete example. If IPNS was being created today it could easily use one of these signed envelopes to contain its data. However, if I wanted to share over IPFS a set of IPNS records (e.g. here are the 10 public keys corresponding to my favorite website authors) I could not just take the IPNS records and stuff them into an IPLD object without compromising on storing two copies of the envelope.

I'm not saying the above example is common or something we should definitely do, but it shows a use case that your approach blocks. If we have a justification for ignoring this use case (e.g. you think protobuf is a more "amply supported, performant, well-vetted format" than the alternatives that support canonical serialization) then that's a fine rationale.

Member

@aschmahmann

  1. This spec does not rely on the serialization of the outer object. When signing and verifying the signature, it takes the outer envelope and deterministically re-serializes it (not using protobuf).
  2. The inner object is just bytes. Those bytes can be CBOR or anything else.

Even better, we have a type field so we can:

  1. Put the IPLD codec in the type field.
  2. Put an arbitrary embedded IPLD object in the content field.

TL;DR: you are absolutely free to re-serialize the content on the fly as long as you have chosen a format with a deterministic serialization.

Contributor Author

Thanks for all the discussion :) I definitely feel @aschmahmann about signing structured data that doesn't have a deterministic encoding - it just feels kind of wrong. Serializing to bytes before signing side-steps the issue, but it's also a bit awkward.

My first pass at this did use IPLD, mostly because of this issue of deterministic encoding, and also because I think the IPLD schema DSL is pretty cool. I ended up backing away from that, but I don't think I explained my thought process very well.

IPLD is attractive because you can get deterministic output with the CBOR encoding, but I was hesitant to rely on that, mostly because IPLD is still pretty new. If we start assuming that we can always serialize IPLD to the same bytes, that seems like it kind of limits the future evolution of the IPLD CBOR format. If we ever need to change how IPLD gets serialized to CBOR, any signatures made with the older implementation will be invalid.

The other problem with IPLD is just that we seem to be in the middle of a Cambrian explosion of libp2p implementations, and it seems like a tough ask to make libp2p implementers also implement IPLD.

I don't think either of those arguments really apply to just using plain CBOR & requiring the canonical encoding (sorted map keys, etc). CBOR has broad language support, and the canonical encoding is (hopefully) stable. And of course, if we did use CBOR, you could embed our records into an IPLD graph as-is without having to treat them as opaque blobs, since a valid CBOR map will presumably always be valid IPLD.

Honestly, I ended up going with protobuf instead simply because it seemed easier to define a protobuf schema than to specify the map keys, value types, etc that we'd need to define for a CBOR-based format. Also, since we need to include the public key & there's already a protobuf definition for that. That's mostly just me being lazy though, and I'd rather revisit this now than after we've baked it into a bunch of implementations.

I do like the idea of having a standard way to ship signed byte arrays around, but it's also possible that because I'm focused on this one use case of routing records that it's not actually as generic or broadly useful as I'm hoping.

You could certainly argue that it would be even more useful to have a standard way of shipping signed structured data around. We could potentially define an envelope as something like

{
  publicKey: {
    // cbor map containing key
  },
  contents: {
    // cbor map containing whatever you want
  },
  signature: "byte blob containing sig of canonically-serialized contents map"
}

Contributor Author

I realized I didn't address @raulk's point in my last comment

> I insist it is a terrible idea to assume that, even with canonical serialisation, your system will be perennially capable of reconstituting a payload out of its constituents, in a way that it matches the original signature

That's the other reason I "gave up" on IPLD / CBOR and just went with the signed binary blob, although I don't know if I feel as strongly as Raúl does about it. We could potentially try to guard against differences in encoder implementations by having a ton of test vectors, but of course there's no way to guarantee we'd catch everything.

Contributor

@yusefnapora thanks for the detailed explanation here. I get IPLD being a big ask here, although it's probably worth thinking about (for the future) if there's a minimal subset of IPLD that it would be useful for libp2p to have access to.

It being easier to implement and wanting to get this shipped are totally reasonable reasons for us to want to go with protobufs. I guess I just wanted to clarify why the decision was made.

Also, I'm not sure if this is what @raulk was trying to explain but after speaking with @Stebalien I see that if a new format came along that wanted to have a consistent hash for a set of envelopes that we could just define the encoding for that new format. It's unfortunate, from a developer perspective, that we'd have to define and implement a canonical protobuf encoding instead of just using a pre-standardized and packaged encoder but it's still achievable within the spec. Given that IPLD defines codecs for each serialization format we import if we're not going with a pre-supported IPLD format then we'd have to define a new codec anyway.

@yusefnapora your suggestion would certainly do the job.


## Problem Statement

Sometimes we'd like to store some data in a public location (e.g. a DHT, etc),
or make use of potentially untrustworthy intermediaries to relay information. It
would be nice to have an all-purpose data container that includes a signature of
the data, so we can verify that the data came from a specific peer and that it hasn't
been tampered with.

## Domain Separation

Signatures can be used for a variety of purposes, and a signature made for a
specific purpose MUST NOT be considered valid for a different purpose.

Without this property, an attacker could convince a peer to sign a payload in
one context and present it as valid in another, for example, presenting a signed
address record as a pubsub message.

We separate signatures into "domains" by prefixing the data to be signed with a
string unique to each domain. This string is not contained within the payload or
the outer envelope structure. Instead, each libp2p subsystem that makes use of
signed envelopes will provide their own domain string when constructing the
envelope, and again when validating the envelope. If the domain string used to
validate is different from the one used to sign, the signature validation will
fail.

Domain strings may be any valid UTF-8 string, but should be fairly short and
descriptive of their use case, for example `"libp2p-routing-record"`.

## Type Hinting

The envelope record can contain an arbitrary byte string payload, which will
need to be interpreted in the context of a specific use case. To assist in
"hydrating" the payload into an appropriate domain object, we include a "type
hint" field. The type hint consists of a [multicodec][multicodec] code,
optionally followed by an arbitrary byte sequence.

This allows very compact type hints that contain just a multicodec, as well as
"path" multicodecs of the form `/some/thing`, using the ["namespace"
multicodec](https://github.com/multiformats/multicodec/blob/master/table.csv#L23),
whose binary value is equivalent to the UTF-8 `/` character.
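
As a rough illustration of the two forms described above, the sketch below builds a type hint from a multicodec code plus an optional byte sequence. It assumes the code is written as an unsigned varint, following the multiformats unsigned-varint convention. The helper names `encode_uvarint` and `make_type_hint` and the code value `300` are invented for this example and do not come from the RFC.

```python
def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def make_type_hint(code: int, suffix: bytes = b"") -> bytes:
    """Multicodec code (as a varint), optionally followed by arbitrary bytes."""
    return encode_uvarint(code) + suffix


# The "namespace" multicodec has the same byte value as the UTF-8 "/" character.
NAMESPACE_CODE = 0x2F

# Compact form: just a code. 300 is an arbitrary value chosen for illustration.
compact_hint = make_type_hint(300)

# Path form: the namespace code followed by the rest of the path.
path_hint = make_type_hint(NAMESPACE_CODE, b"some/thing")
```

Because the namespace code is a single byte equal to `/`, the path form is byte-for-byte the same as the UTF-8 string `/some/thing`.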

## Wire Format

Since we already have a [protobuf definition for public keys][peer-id-spec], we
can use protobuf for this as well and easily embed the key in the envelope:


```protobuf
message SignedEnvelope {
  PublicKey publicKey = 1; // see peer id spec for definition
Member

Food for thought. Including the pubkey may be superfluous for some signature schemes.

Given an ECDSA signature, one can recover the public key provided we know the curve, the hash function, and the plaintext that was signed. Bitcoin and Ethereum use that trick heavily to validate transactions.

See:

https://crypto.stackexchange.com/questions/18105/how-does-recovering-the-public-key-from-an-ecdsa-signature-work
https://crypto.stackexchange.com/questions/60218/recovery-public-key-from-secp256k1-signature-and-message

  bytes typeHint = 2; // type hint
Contributor

shouldn't this be a string?

Member

Few points:

Contributor Author

+1 on the field naming - I like payload_type better than "type hint" & it looks like snake case is the protobuf way.

I did make the type field bytes based on @Stebalien's suggestion to use multicodec-prefixed byte strings. Question about that: do we have a go library that validates / parses multicodecs? I found go-multicodec-packed, but it was deprecated a while ago.

Contributor

Not sure if this is what you're looking for, but go-cid has references to the codecs in the multicodec table at https://github.com/ipfs/go-cid/blob/9bb7ea69202c6c9553479eb355ab8a8a97d43a2e/_rsrch/cidiface/enums.go

  bytes contents = 3; // payload
  bytes signature = 4; // see below for signing rules
}
```

The `publicKey` field contains the public key whose secret counterpart was used
to sign the message. This MUST be consistent with the peer id of the signing
peer, as the recipient will derive the peer id of the signer from this key.

The `typeHint` field contains a [multicodec][multicodec]-prefixed type hint as
described in the [Type Hinting section](#type-hinting).

The `contents` field contains the arbitrary byte string payload.

The `signature` field contains a signature of all fields except `publicKey`,
generated as described below.

## Signature Production / Verification

When signing, a peer will prepare a buffer by concatenating the following:

- The length of the [domain separation string](#domain-separation) in
  bytes
- The domain separation string, encoded as UTF-8
- The length of the `typeHint` field in bytes
Member

I'd feel better if we also signed the key. I'm concerned users will use this in a "commitment" and/or "ticket" protocol as follows:

  1. Victim publishes a signed envelope stating "I (unspecified) claim X" to a blockchain.
  2. Attacker generates a key such that the key validates the signed envelope.
  3. Attacker claims X.

Basically, "an attacker can't make a public key that validates a target signature S" is not a guarantee of public key cryptography. However, it's one that's often assumed.

Member

You know, this is probably more work than necessary.

- The value of the `typeHint` field
- The length of the `contents` field in bytes
- The value of the `contents` field
Member

This may not be an issue but this format doesn't allow us to add new signed fields. We could alternatively have a rule that says: take the protobuf fields, sort them, then sign them (not sure how hard that one is).

Contributor Author

Sorting the protobuf fields seems possible but awkward in go. There's a Properties type that will give you the field names, but you need to use reflection on the message struct and use the reflected Type of each struct field to get the properties.

Member

Sort by tag:

  1. Split into raw fields.
  2. Sort the fields.
  3. Reserialize.

However, this won't handle protobufs within protobufs...

(we can probably punt on this)


The length values for each field are encoded as 64-bit unsigned integers in
Contributor

64-bit is rather large; we can get away with u32 here and perhaps save a few precious bytes.

Member

Yeah, or even a uvarint, unless we're worried about alignment.

Member

Or just use minimal uvarints?

Contributor Author

Does space matter here? The length fields aren't stored or sent on the wire, just used when preparing the buffer to sign / validate. I guess all that time writing javascript has warped my memory efficiency sensibilities 😆

network order (big-endian).

Then they will sign the buffer according to the rules in the [peer id
spec][peer-id-spec] and set the `signature` field accordingly.

To verify, a peer will "inflate" the `publicKey` into a domain object that can
verify signatures, prepare a buffer as above and verify the `signature` field
against it.
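
The buffer preparation described in this section can be sketched as follows. This only builds the bytes to be signed, using 64-bit big-endian length prefixes as the text currently specifies. The actual signing and verification follow the peer id spec and are left out. The function names `length_prefix` and `signing_buffer` are hypothetical.

```python
import struct


def length_prefix(data: bytes) -> bytes:
    """Length of `data` as a 64-bit unsigned integer in network order."""
    return struct.pack(">Q", len(data))


def signing_buffer(domain: str, type_hint: bytes, contents: bytes) -> bytes:
    """Concatenate the length-prefixed domain string, type hint and contents."""
    domain_bytes = domain.encode("utf-8")
    return b"".join(
        [
            length_prefix(domain_bytes),
            domain_bytes,
            length_prefix(type_hint),
            type_hint,
            length_prefix(contents),
            contents,
        ]
    )


# The same payload yields different bytes to sign under different domains,
# so a signature made for one domain does not verify under another.
record_buf = signing_buffer("libp2p-routing-record", b"/t", b"payload")
pubsub_buf = signing_buffer("some-other-domain", b"/t", b"payload")
```

The length prefixes keep field boundaries unambiguous: without them, a type hint of `ab` with contents `c` would produce the same buffer as a type hint of `a` with contents `bc`.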

[addr-records-rfc]: ./0003-routing-records.md
[peer-id-spec]: ../peer-ids/peer-ids.md
[multicodec]: https://github.com/multiformats/multicodec
[uvarint]: https://github.com/multiformats/unsigned-varint
243 changes: 243 additions & 0 deletions RFC/0003-routing-records.md
@@ -0,0 +1,243 @@
# RFC 0003 - Peer Routing Records

- Start Date: 2019-10-04
- Related Issues:
  - [libp2p/issues/47](https://github.com/libp2p/libp2p/issues/47)
  - [go-libp2p/issues/436](https://github.com/libp2p/go-libp2p/issues/436)

## Abstract

This RFC proposes a method for distributing peer routing records, which contain
a peer's publicly reachable listen addresses, and may be extended in the future
to contain additional metadata relevant to routing. This serves a similar
purpose to [Ethereum Node Records][eip-778]. Like ENR records, libp2p routing
records should be extensible, so that we can add information relevant to as-yet
unknown use cases.

The record described here does not include a signature, but it is expected to
be serialized and wrapped in a [signed envelope][envelope-rfc], which will
prove the identity of the issuing peer. The dialer can then prioritize
self-certified addresses over addresses from an unknown origin.

## Problem Statement

All libp2p peers keep a "peer store", which maps [peer ids][peer-id-spec] to a
set of known addresses for each peer. When the application layer wants to
contact a peer, the dialer will pull addresses from the peer store and try to
initiate a connection on one or more addresses.

Addresses for a peer can come from a variety of sources. If we have already made
a connection to a peer, the libp2p [identify protocol][identify-spec] will
inform us of other addresses that they are listening on. We may also discover
their address by querying the DHT, checking a fixed "bootstrap list", or perhaps
through a pubsub message or an application-specific protocol.
Contributor

probably worth mentioning rendezvous here.

Contributor

also Peer eXchange.

Member

Yes, and mDNS. Linking to the specs would be ideal.


In the case of the identify protocol, we can be fairly certain that the
addresses originate from the peer we're speaking to, assuming that we're using a
secure, authenticated communication channel. However, more "ambient" discovery
methods such as DHT traversal and pubsub depend on potentially untrustworthy
third parties to relay address information.

Even in the case of receiving addresses via the identify protocol, our
confidence that the address came directly from the peer is not actionable, because
the peer store does not track the origin of an address. Once added to the peer
store, all addresses are considered equally valid, regardless of their source.

We would like to have a means of distributing _verifiable_ address records,
which we can prove originated from the addressed peer itself. We also need a way to
track the "provenance" of an address within libp2p's internal components such as
the peer store. Once those pieces are in place, we will also need a way to
prioritize addresses based on their authenticity, with the most strict strategy
being to only dial certified addresses.

### Complications

While producing a signed record is fairly trivial, there are a few aspects to
this problem that complicate things.

1. Addresses are not static. A given peer may have several addresses at any given
time, and the set of addresses can change at arbitrary times.
2. Peers may not know their own addresses. It's often impossible to automatically
infer one's own public address, and peers may need to rely on third party
peers to inform them of their observed public addresses.
3. A peer may inadvertently or maliciously sign an address that they do not
control. In other words, a signature isn't a guarantee that a given address is
valid.
4. Some addresses may be ambiguous. For example, addresses on a private subnet
are valid within that subnet but are useless on the public internet.

The first point can be addressed by having records contain a sequence number
that increases monotonically when new records are issued, and by having newer
records replace older ones.

The other points, while worth thinking about, are out of scope for this RFC.
However, we can take care to make our records extensible so that we can add
additional metadata in the future. Some thoughts along these lines are in the
[Future Work section below](#future-work).

## Address Record Format

Here's a protobuf that might work:

```protobuf
// RoutingState contains the listen addresses for a peer at a particular point in time.
message RoutingState {
  // AddressInfo wraps a multiaddr. In the future, it may be extended to
  // contain additional metadata, such as "routability" (whether an address is
  // local or global, etc).
  message AddressInfo {
    bytes multiaddr = 1;
  }

  // the peer id of the subject of the record (who these addresses belong to).
  bytes peerId = 1;

  // A monotonically increasing sequence number, used for record ordering.
  uint64 seq = 2;

  // All current listen addresses
  repeated AddressInfo addresses = 4;
}
```

The `AddressInfo` wrapper message is used instead of a bare multiaddr to allow
us to extend addresses with additional metadata [in the future](#future-work).

The `seq` field contains a sequence number that MUST increase monotonically as
new records are created. Newer records MUST have a higher `seq` value than older
records. To avoid persisting state across restarts, implementations MAY use Unix
epoch time as the `seq` value; however, they MUST NOT attempt to interpret a
`seq` value from another peer as a valid timestamp.
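
One possible way to pick the next `seq` value from Unix time is sketched below. The helper `next_seq` is hypothetical. It guards against issuing two records within the same second, but it does not protect against the system clock moving backwards across a restart, which an implementation would need to consider separately.

```python
import time
from typing import Optional


def next_seq(last_seq: int, now: Optional[int] = None) -> int:
    """Return a seq value strictly greater than `last_seq`.

    Uses the current Unix time when it is ahead of the last issued value,
    otherwise falls back to incrementing the last value.
    """
    if now is None:
        now = int(time.time())
    return max(now, last_seq + 1)
```

For example, `next_seq(1570215229, now=1570215229)` gives `1570215230`, so two records issued in the same second still have distinct, increasing sequence numbers.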

#### Example

```javascript
{
  peerId: "QmAlice...",
  seq: 1570215229,

  addresses: [
    {
      multiaddr: "/ip4/1.2.3.4/tcp/42/p2p/QmAlice",
    },
    {
      multiaddr: "/ip4/10.0.1.2/tcp/42/p2p/QmAlice",
    }
  ]
}
```


## Certification / Verification

This structure can be serialized and contained in a [signed
envelope][envelope-rfc], which lets us issue "self-certified" address records
that are signed by the peer that the addresses belong to.

To produce a "self-certified" address, a peer will construct a `RoutingState`
containing all of their publicly-reachable listen addresses. A peer SHOULD only
include addresses that it believes are routable via the public internet, ideally
having confirmed that this is the case via some external mechanism such as a
successful AutoNAT dial-back.

In some cases we may want to include localhost or LAN-local addresses; for
example, when testing the DHT using many processes on a single machine. To
support this, implementations may use a global runtime configuration flag or
environment variable to control whether local addresses will be included.

Once the `RoutingState` has been constructed, it should be serialized to a byte
string and wrapped in a [signed envelope][envelope-rfc]. The peer id derived
from the envelope's `publicKey` field MUST match the `peerId` contained in the
record. If it does not, the record MUST be rejected as invalid.
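
The consistency check between the envelope key and the record can be sketched as below. How a peer id is derived from a public key is defined by the peer id spec and is not reproduced here, so the derivation is passed in as a function. Both `signer_matches_record` and the `derive_peer_id` parameter are hypothetical names.

```python
from typing import Callable


def signer_matches_record(
    envelope_public_key: bytes,
    record_peer_id: bytes,
    derive_peer_id: Callable[[bytes], bytes],
) -> bool:
    """True if the envelope's key derives the peer id named in the record.

    A record for which this returns False must be rejected as invalid.
    """
    return derive_peer_id(envelope_public_key) == record_peer_id
```

This check is separate from signature verification: a record can carry a valid signature and still be rejected because it was signed by a key that does not belong to the peer it describes.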

### Signed Envelope Domain

Signed envelopes require a "domain separation" string that defines the scope
or purpose of a signature.

When wrapping a `RoutingState` in a signed envelope, the domain string MUST be
`libp2p-routing-state`.

### Signed Envelope Type Hint

Signed envelopes contain a "type hint" that indicates how to interpret the
contents of the envelope.

Ideally, we should define a new multicodec for routing records, so that we can
identify them in a few bytes. While we're still spec'ing and working on the
initial implementation, we can use the UTF-8 string
`"/libp2p/routing-state-record"` as the type hint value.

## Peer Store APIs

We will need to add a few methods to the peer store:

- `AddCertifiedAddrs(envelope) -> Maybe<Error>`
  - Add a self-certified address, wrapped in a signed envelope. This should
    validate the envelope signature & store the envelope for future reference.
    If any certified addresses already exist for the peer, only accept the new
    envelope if it has a greater `seq` value than existing envelopes.

- `CertifiedAddrs(peerId) -> Set<Multiaddr>`
  - return the set of self-certified addresses for the given peer id

- `SignedRoutingState(peerId) -> Maybe<SignedEnvelope>`
  - retrieve the signed envelope that was most recently added to the peerstore
    for the given peer, if any exists.

And possibly:

- `IsCertified(peerId, multiaddr) -> Boolean`
  - has a particular address been self-certified by the given peer?
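
To make the intended behaviour of these methods concrete, here is a minimal in-memory sketch. It is not the API of any libp2p implementation. The class `CertifiedAddrBook`, the Python `RoutingState` dataclass and the `open_envelope` hook (which is assumed to verify the signature and decode the payload) are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class RoutingState:
    peer_id: bytes
    seq: int
    addresses: Tuple[str, ...]


class CertifiedAddrBook:
    """In-memory sketch of the certified-address part of a peer store."""

    def __init__(self, open_envelope: Callable[[bytes], RoutingState]):
        # Hypothetical hook: verifies the envelope signature and decodes the
        # payload, raising ValueError if either step fails.
        self._open_envelope = open_envelope
        self._latest: Dict[bytes, Tuple[bytes, RoutingState]] = {}

    def add_certified_addrs(self, envelope: bytes) -> Optional[Exception]:
        """Return None on success, or an error describing the rejection."""
        try:
            state = self._open_envelope(envelope)
        except ValueError as err:
            return err
        current = self._latest.get(state.peer_id)
        if current is not None and state.seq <= current[1].seq:
            return ValueError("record is not newer than the stored one")
        # Keep the original envelope bytes so they can be relayed unchanged.
        self._latest[state.peer_id] = (envelope, state)
        return None

    def certified_addrs(self, peer_id: bytes) -> Set[str]:
        entry = self._latest.get(peer_id)
        return set(entry[1].addresses) if entry else set()

    def signed_routing_state(self, peer_id: bytes) -> Optional[bytes]:
        entry = self._latest.get(peer_id)
        return entry[0] if entry else None

    def is_certified(self, peer_id: bytes, addr: str) -> bool:
        return addr in self.certified_addrs(peer_id)
```

The sketch stores the envelope bytes exactly as received, in line with the envelope RFC's expectation that relaying peers forward the original envelope rather than a re-serialized copy.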


We'll also need a method that constructs a new `RoutingState` containing our
listen addresses and wraps it in a signed envelope. This may belong on the Host
instead of the peer store, since it needs access to the private signing key.

## Dialing Strategies

Once self-certified addresses are available via the peer store, we can update
the dialer to prefer using them when possible. Some systems may want to _only_
dial self-certified addresses, so we should include some configuration options
to control whether non-certified addresses are acceptable.
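
A dialer preference of this kind could look roughly like the sketch below. The function `order_for_dial` and its `certified_only` flag are hypothetical and stand in for whatever configuration option an implementation chooses.

```python
from typing import Iterable, List, Set


def order_for_dial(
    addrs: Iterable[str], certified: Set[str], certified_only: bool = False
) -> List[str]:
    """Put self-certified addresses first, keeping the original order otherwise.

    With certified_only=True, addresses that are not self-certified are dropped.
    """
    addrs = list(addrs)
    preferred = [a for a in addrs if a in certified]
    if certified_only:
        return preferred
    return preferred + [a for a in addrs if a not in certified]
```

For instance, given known addresses `["/ip4/10.0.1.2/tcp/42", "/ip4/1.2.3.4/tcp/42"]` of which only the second is certified, the default ordering dials `/ip4/1.2.3.4/tcp/42` first, and the strict mode dials only that address.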

## Changes to core libp2p protocols

How to publish these to the DHT? Are there backward compatibility issues with
older unsigned address records? Maybe we just publish these to a different key
prefix...

Should we update identify and mDNS discovery to use signed records?
Contributor

rendezvous needs to be updated to support signed address records as well.


## Future Work

Some things that were originally considered in this RFC were trimmed so that we
can focus on delivering a basic self-certified record, which is a pressing need.

This includes a notion of "routability", which could be used to communicate
whether a given address is global (reachable via the public internet),
LAN-local, etc. We may also want to include some kind of confidence score or
priority ranking, so that peers can communicate which addresses they would
prefer other peers to use.

To allow these fields to be added in the future, we wrap multiaddrs in the
`AddressInfo` message instead of having the `addresses` field be a list of "raw"
multiaddrs.

Another potentially useful extension would be a compact protocol table or bloom
filter that could be used to test whether a peer supports a given protocol
before interacting with them directly. This could be added as a new field in the
`RoutingState` message.



[identify-spec]: ../identify/README.md
[peer-id-spec]: ../peer-ids/peer-ids.md
[autonat]: https://github.com/libp2p/specs/issues/180
[ipld]: https://ipld.io/
[ipld-schema-schema]: https://github.com/ipld/specs/blob/master/schemas/schema-schema.ipldsch
[envelope-rfc]: ./0002-signed-envelopes.md
[eip-778]: https://eips.ethereum.org/EIPS/eip-778