-
Notifications
You must be signed in to change notification settings - Fork 28
Conversation
pb/pstore.proto
Outdated
|
||
// A serialized SignedEnvelope containing the most recent RoutingState record. | ||
// TODO: import the SignedEnvelope type instead | ||
bytes signedRoutingRecord = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, instead of duplicating everything, we can just decode with two different protobufs.
// Used by the peerstore to find addresses for a given peer.
message AddrEntry {
message SignedRoutingRecord {
PublicKey publicKey = 1;
RoutingState contents = 3;
}
SignedRoutingRecord record = 1;
repeated AddressEntry additionalAddresses = 2;
}
// Use to pick out the signed record (for the DHT).
message OpaqueAddrEntry {
bytes record = 1;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an interesting technique.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did an initial pass; thanks for this first iteration @yusefnapora!
In general, I think this changeset can be simplified. I'll reason through how these changes would be used in practice (or at least my understanding), and what I think should happen on each action.
-
When the DHT receives a routing record for a peer, it will hand it off to the peerstore, telling it: "memorize this".
- It's possible that the peer already had a routing record, in which case we'll replace it atomically if the incoming seq is higher.
- We add all addresses from the record into our peerstore.
-
When the DHT is responding to incoming
FIND_PEER
,FIND_PROVS
,GET
requests, it will need to ask the peerstore for the latest routing record of the closest peers.- A variadic
func SignedPeerRecords(ids ...peer.ID) ([]SignedPeerRecord, error)
would be useful to batch the query. - If we have no signed peer record for one or many peers, the DHT would call the existing
Addrs()
for those.
- A variadic
-
We should remember the latest signed peer routing all addresses expire. When that happens, the signed peer routing record would be scrapped with it.
I don't think we need to inline the address status (certified
flag) with each address. Unless there's a data access pattern that would benefit from that data layout? It's also a bit funky because addresses that are certified with seq = 1, may disappear from seq = 2, so we'd need to reset the certified
flag of all addrs with every update, at which point the value of the field starts being questionable.
I'd like to explore this design:
addr_book_record
\_ peer_id: bytes
\_ signed_addrs: []AddrEntry
\_ unsigned_addrs: []AddrEntry
\_ certified_record
\_ seq: bytes
\_ raw: bytes
WDYT?
@raulk thanks for the feedback, that makes a lot of sense. I especially like splitting the addr_entries into two sets, rather than having the certified flag. The flag does seem likely to get out of sync. And having a separate list makes it easy to clear out older certified addrs.
For this, do you mean that when we expire the last entry from the |
Yup. |
I realized yesterday that with the addrs split into two lists, it's possible to have duplicates. Also, updating the TTL using the The last commit updates the Add/Set addr methods in both implementations to ensure that an addr only exists in either the unsigned or signed list, but not both. It also changes the SetAddr behavior to update the TTL of existing signed addrs, instead of making a duplicate unsigned address entry with the new TTL as it was doing before. Also, I found a bug! The I fixed it here, but I can also open a new PR with just the fix if that seems better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the memory peerstore, it makes more sense to have each segment hold both signed and unsigned addresses.
This will simplify the code, and also obviate the need to take two locks when retrieving addresses.
- removes CertifiedAddrs method from addr book implementations - asserts AddrBooks are also CertifiedAddrBooks - ensures we either have certified addrs or uncertified, but not both - updates tests for CertifiedAddrs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yusefnapora I stopped my review midway because I realised that this PR is assuming that a peer can have both signed and unsigned addresses, which we later decided against. Mind revisiting this logic and updating it to the current model/thinking?
pb/pstore.proto
Outdated
@@ -14,6 +14,11 @@ message AddrBookRecord { | |||
// The multiaddresses. This is a sorted list where element 0 expires the soonest. | |||
repeated AddrEntry addrs = 2; | |||
|
|||
repeated AddrEntry signedAddrs = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Snake case for protobufs, please 🙏 (here and below)
test/addr_book_suite.go
Outdated
id, _ := peer.IDFromPrivateKey(priv) | ||
allAddrs := GenerateAddrs(10) | ||
certifiedAddrs := allAddrs[:5] | ||
uncertifiedAddrs := allAddrs[5:] | ||
rec := peer.NewPeerRecord() | ||
rec.PeerID = id | ||
rec.Addrs = certifiedAddrs | ||
signedRec, err := rec.Sign(priv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a lot of assignments! Enclosing these in a var ()
block could help readability, due to alignment.
} | ||
|
||
func (s *addrSegments) get(p peer.ID) *addrSegment { | ||
return s[byte(p[len(p)-1])] | ||
func (segments *addrSegments) get(p peer.ID) *addrSegment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Receiver names are initials in go. Try ss
(plural of s
-- segment).
since signed and unsigned addrs are now mutually exclusive, there's no benefit to keeping two lists of addresses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few important changes needed here.
// this has to be done after we add the addresses, since if | ||
// we try to flush a datastore record with no addresses, | ||
// it will just get deleted | ||
pr, err := ab.loadRecord(p, true, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to lock the record here.
log.Errorf("unable to load record for peer %s: %v", p.Pretty(), err) | ||
return nil | ||
} | ||
if pr.CertifiedRecord == nil || len(pr.CertifiedRecord.Raw) == 0 || len(pr.Addrs) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to lock the record for read here.
// GetPeerRecord returns a record.Envelope containing a peer.PeerRecord for the | ||
// given peer id, if one exists. | ||
// Returns nil if no signed PeerRecord exists for the peer. | ||
func (ab *dsAddrBook) GetPeerRecord(p peer.ID) *record.Envelope { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmm... I think this this method should return (*record.Envelope, ok bool, err error)
. Rationale:
- the
ok
bool is idiomatic in go. - we are currently swallowing errors.
addrs := make([]ma.Multiaddr, len(pr.Addrs)) | ||
for i, a := range pr.Addrs { | ||
addrs[i] = a.Addr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for this change?
@@ -98,6 +110,7 @@ func (mab *memoryAddrBook) gc() { | |||
now := time.Now() | |||
for _, s := range mab.segments { | |||
s.Lock() | |||
var collectedPeers []peer.ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use a single slice for all segments, resetting to [:0]
on each iteration. That would produce a smaller memory footprint.
pidSet := peer.NewSet() | ||
for _, s := range mab.segments { | ||
s.RLock() | ||
for pid, _ := range s.addrs { | ||
pids = append(pids, pid) | ||
for pid, amap := range s.addrs { | ||
if amap != nil && len(amap) > 0 { | ||
pidSet.Add(pid) | ||
} | ||
} | ||
s.RUnlock() | ||
} | ||
return pids | ||
return pidSet.Peers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? peer.Set
is synchronized. Aren't peer IDs unique anyway?
amap = make(map[string]*expiringAddr, len(addrs)) | ||
amap, ok := s.addrs[p] | ||
if !ok { | ||
amap = make(map[string]*expiringAddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove the initial length?
pstoremem/addr_book.go
Outdated
@@ -171,6 +240,21 @@ func (mab *memoryAddrBook) AddAddrs(p peer.ID, addrs []ma.Multiaddr, ttl time.Du | |||
} | |||
} | |||
} | |||
|
|||
// when adding signed addrs, make sure only the addrs from the input list remain. | |||
if signed { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should do the removal before. This can remove valid signed addresses if we're transitioning from unsigned to signed.
Seq: rec.Seq, | ||
} | ||
s.Unlock() // need to release the lock, since addAddrs will try to take it | ||
mab.addAddrs(rec.PeerID, rec.Addrs, ttl, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this behave well in the transition from unsigned to signed? Or when updating the record? Shouldn't we clear the address set first? This is not a purely additive operation, it's a replacement operation.
pstoremem/addr_book.go
Outdated
// if we have a valid peer record, ignore unsigned addrs | ||
peerRec := mab.GetPeerRecord(p) | ||
if peerRec != nil { | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a tricky case. We don't want to add random addresses from the network, but we might want to add our own observations (not to advertise, just for local use).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concrete problem: We have an old record for a peer and now ask IPFS to dial the peer at a new address. IPFS will try to add the new address to the peerstore and fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. We might have to bite the bullet and extend the write methods (Add, Set) to accept option flags, or provenances. This really steps on the toes of the proposal for peerstore v2, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, I'd like to have different "levels" of addresses and, mark the address "level" when returning it (locally set > signed > hearsay). However, that might be a bit tricky to work into the API.
Temporarily store all addresses in the addrbook without classifying their provenance/authoritativeness; even in the presence of a signed peer record.
We want to merge this changeset as soon as possible, but in its original form (certified addresses erase uncertified ones, and block any uncertified addresses added thereafter), it would cause regressive behaviour for a few use cases:
The final solution will revolve around address labels, which are being incubated here: libp2p/go-libp2p-core#123 In the meantime, @Stebalien and I have agreed to merge this PR lifting the restrictions on addresses. Basically, the peerstore will swallow all addresses. This means that dials can be insecure (i.e. hitting uncertified addresses when a certified record exists), but it will allow us to integrate certified records downstream, while we work on tagging addresses properly so they can be subsequently filtered. |
There's a huge refactor of the peerstore coming anyway. |
This adds new methods to add "certified" addresses to the peerstore, as described in this RFC: libp2p/specs#217.
It fits the API defined in this core PR: libp2p/go-libp2p-core#73.
There are a few missing pieces still:
SignedRoutingState(peerId)
method. Should it expire also?bytes
blob, since I couldn't figure out how to import the .proto file from the -core PR into the peerstore .proto file. It would be nice to have the datastore entry just contain aSignedEnvelope
protobuf message instead of serializing the envelope and storing the bytes.Also, @Stebalien, @raulk & I talked in the RFC comments about hoisting the
seq
counter from the routing record to the envelope. I didn't, but now I kind of wish I had, since it's awkward to have to deserialize the routing record payload before you can compareseq
numbers. In this branch so far, I'm storing both the last received routing record (still in the envelope) and separately storing the seq number, so I can skip unwrapping the envelope just to compareseq
s.