-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Implement own peer set for availability-distribution #2177
Comments
In practice, the code here should primarily affect the network bridge in implementation, but there will be some small fallout in the other networking subsystems so they can receive dormancy notifications. I also think that we could implement dormancy as asymmetric. i.e. peers A and B might view each other differently. And that when we consider a peer active for gossip it means that we expect them to send us things, whereas we send them things only if they consider us active. That would leave us with four peer states: // bitflags in practice
enum PeerActive {
Dormant,
ActiveReceive,
ActiveSend,
ActiveBoth,
} The main reason I am thinking about asymmetry is because it makes the process of selecting other peers to be active with much easier. With asymmetrical activity each peer can just make random selections from the peer set to request to become active with. Sending data is cheaper than receiving as there is no data validation or signature checking, and so we should have different tolerances for how many I guess we want some new network messages: // i don't care about names, just purposes
enum Msg {
...
MakeMeActive, // request to be sent gossip messages
MadeActive, // request accepted
RejectActive, // request rejected
MakeMeDormant, // request to stop being sent gossip messages.
MadeDormant, // request accepted. rejection not allowed.
} We can also rotate our set of active peers over time by making some dormant and randomly selecting new ones. That would prevent us from being active only with byzantine peers. So we also need a new |
Around this paritytech/substrate#7797 and paritytech/substrate#7700 we do not currently know if we want longer-lived gossip priority groups, which makes space for low overhead connections, or if gossip messages should ideally be more random for some reason. We could not go purely random gossip because politeness requires tracking our peers' knowledge, which supports tracking longer-lived gossip priority groups like in paritytech/substrate#7700 Yet, all this knowledge required by politeness should normally be fairly localized in time, so we actually could always be phasing peers into and out of the gossip priority group, while still giving us a much more randomized topology. This is clearly future work, and not so important right now, but I wanted to (a) put it on people's radar in case it impacts any lower level design decisions, and (b) use it to explain why the current approach looks like the right one. |
@rphmeier What you described is very similar to the code that is already in
It is totally possible to leave TCP connections open without opening a channel (i.e. keeping a peer dormant). The protocol doesn't require you to open a channel when you open a connection. For the sake of simplicity, the handshake is currently not customizable (it is hardcoded to a |
The exact topology is not so important, as long as it's more-or-less an expander graph. What is important, which we don't do currently, is ensure that all validators are connected to each other, not separated by an attacker. This could be done by defining a topology such as a d-regular random topology over the validators, calculated implicitly on-chain for every epoch, and having validators follow this in addition to any random non-validator full-node peers. |
By having everything peer set related depend directly on the enum the code becomes more clear and it is also straight forward to add more peersets/protocols as the compiler will complain if you forget to implement parts of it.
By having everything peer set related depend directly on the enum the code becomes more clear and it is also straight forward to add more peersets/protocols as the compiler will complain if you forget to implement parts of it.
By having everything peer set related depend directly on the enum the code becomes more clear and it is also straight forward to add more peersets/protocols as the compiler will complain if you forget to implement parts of it.
By having everything peer set related depend directly on the enum the code becomes more clear and it is also straight forward to add more peersets/protocols as the compiler will complain if you forget to implement parts of it.
#2263) * More doc fixes. * Minor refactorings in the process of #2177 By having everything peer set related depend directly on the enum the code becomes more clear and it is also straight forward to add more peersets/protocols as the compiler will complain if you forget to implement parts of it. * Add peer set infos on startup properly For feature real_overseer. + Fixes from review. Thanks @coriolinus and @ordian! * More structure in network-bridge Some changes, which would have helped me in groking the code faster. Entry points/public types more to the top. Factored out implementation in their own files, to clear up the top-level view. * Get rid of local ProtocolName type definition. Does not add much at this level. * Fix tests + import cleanup. * Make spaces tabs. * Clarify what correct parameters to send_message are * Be more less vague in docs of send_message. * Apply suggestions from code review Extend copyright on new files to 2021 as well. Co-authored-by: Andronik Ordian <write@reusable.software> Co-authored-by: Andronik Ordian <write@reusable.software>
Closing in favor of #2306. |
…tytech#2177 (paritytech#2263) * More doc fixes. * Minor refactorings in the process of paritytech#2177 By having everything peer set related depend directly on the enum the code becomes more clear and it is also straight forward to add more peersets/protocols as the compiler will complain if you forget to implement parts of it. * Add peer set infos on startup properly For feature real_overseer. + Fixes from review. Thanks @coriolinus and @ordian! * More structure in network-bridge Some changes, which would have helped me in groking the code faster. Entry points/public types more to the top. Factored out implementation in their own files, to clear up the top-level view. * Get rid of local ProtocolName type definition. Does not add much at this level. * Fix tests + import cleanup. * Make spaces tabs. * Clarify what correct parameters to send_message are * Be more less vague in docs of send_message. * Apply suggestions from code review Extend copyright on new files to 2021 as well. Co-authored-by: Andronik Ordian <write@reusable.software> Co-authored-by: Andronik Ordian <write@reusable.software>
For availability distribution and availability recovery, we need to be able to connect to a lot of peers and communicate with them directly. In both subsystems we need to send node specific data to lots of different nodes, which is most efficient when just sent to them directly.
The problem: Right now the network subsystem treats all connections equally and will use all of them for example for gossiping. Gossiping to a thousand peers would be very inefficient though, therefore we need to provide a way to connect to peers without starting each supported protocol with them. @rphmeier coined the term dormant connections for this.
Use cases:
Use cases 1 and 2 could be implemented by just keeping connections open to all validators. For use case 3 this is not an option, as we don't expect nodes to permanently depend on availability recovery from validators. This is more a security measure in case a node behaves badly. (Collator hides a block sent to Validators from full nodes of its parachain and could thus stall the parachain.) So for use case 3, those connections should in general be short lived and not used much, to limit the load on validators.
Issue #1979 directly depends on this.
The text was updated successfully, but these errors were encountered: