This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Deprecation of Sentry Nodes #6845

Closed
mxinden opened this issue Aug 7, 2020 · 12 comments
Labels
J1-meta A specific issue for grouping tasks or bugs of a specific category. U2-some_time_soon Issue is worth doing soon. Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase.

Comments

@mxinden
Contributor

mxinden commented Aug 7, 2020

Summary

With the upcoming release of Substrate and Polkadot, support for sentry nodes will be deprecated. We currently plan to remove support for sentry nodes by October 2020. Operators who protect their validators with sentry nodes today should decommission those sentry nodes and ensure their validator nodes are publicly routable before support is removed.

Status Quo

Today, one way of operating validator nodes in a secure fashion involves running one or more sentry nodes in front of one's validators. For an example setup, see the current Polkadot Secure Validator project. Sentry nodes, operating as full nodes, act as proxies for the validator nodes and thus offer the following security improvements:

  • Sentry nodes serve as a DoS protection mechanism, as multiple sentry nodes can handle a large amount of traffic for a single validator node. Sentry nodes operate on the application layer. They can therefore filter out bogus messages, e.g. invalid block announcements.
  • Sentry nodes don't have access to relevant private key material. This indirection protects the key material against certain classes of software exploits, though not the most severe exploits that lead to remote code execution.

While sentry nodes can improve validator security, there are multiple trade-offs involved.

  • The notion of sentry nodes adds complexity to both the node implementation itself and the overall network topology. One example concerns any component using the DHT. When publishing a validator's addresses, the authority discovery module on a validator node cannot do so directly, but has to forward the signed addresses to one of its sentry nodes for it to publish them on the DHT (see Authority-discovery should be performed only by sentries #6264).
  • The extra hop for all traffic destined for a validator behind sentry nodes adds latency. This latency is not to be confused with the latency a commodity layer-4 proxy would introduce. Instead, as the sentry node operates as a full node, the additional latency does not only involve packet forwarding or transport-layer-security de- and encryption, but also things like block validation.

Reasoning for Deprecation

While the complexity required to support sentry nodes is manageable for simple blockchains that do most if not all communication through a gossip network, this complexity increases heavily for more sophisticated network topologies like the one required for Polkadot.

One example of the additional complexity can be found within Polkadot. In order to support parachains in a scalable manner, one cannot do all collator node to validator node communication through gossiping, as it would overwhelm the network. Instead, collator nodes of parachains need to be able to talk to validator nodes of the relay chain directly.

With sentry nodes in mind, validators would not be directly reachable, but instead only reachable through their sentry nodes. A validator would need to tell its sentry node to allow traffic from a specific collator before that collator could forward messages through the sentry node to the validator. Collators would need to discover not validators but the sentry nodes of those validators. ... For details on the Polkadot topology you can e.g. consult the Polkadot overview paper.

Taking all of the above into account, we have decided to deprecate support for sentry nodes. This decision might be revisited in the long-term future, e.g. once the parachain protocols have stabilized.

Actions Required / Recommended

Required

All operators of validator nodes are required to make the TCP port of the P2P protocol of their validator nodes routable via the public internet. The TCP port of the RPC endpoint should stay unchanged and protected.
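One way to verify the requirement above is to attempt a plain TCP connection to the validator's P2P port from a machine outside your own network. The following is a minimal sketch using only the Python standard library. The function name is made up for this example, and the host name and port shown in the comment are placeholders (30333 is assumed here as the P2P port; use whatever port your node is actually configured with).

```python
import socket


def is_tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established.

    This only checks layer-4 reachability. It does not speak the P2P
    protocol, so it cannot tell whether a healthy node is listening.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example call (placeholder host and assumed port, run from an external machine):
#   is_tcp_port_reachable("validator.example.org", 30333)
```

Running the same check against the RPC port from the outside should return False, since that port is supposed to stay protected.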

Suggested

While the P2P protocol port of a validator node needs to be publicly routable, one can still protect the endpoint on layer 4 (TCP) and below. Depending on your required security level, you might want to put a mature TCP proxy (e.g. Nginx) in front of your validator. You can operate a stateful firewall yourself or use a hosted firewall / DoS protection service from your favorite cloud provider. You can consider reaching out to a large CDN. ...
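To illustrate the kind of decision a layer-4 component can make, here is a minimal sketch of a per-source connection rate limiter. It is not a replacement for a mature proxy or firewall, and the class and parameter names are invented for this example. Note that its only inputs are the source address and the time of the connection attempt, never the content of the traffic.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class ConnectionRateLimiter:
    """Allow at most `max_connections` new connections per source IP per window."""

    def __init__(self, max_connections: int, window_seconds: float):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        # Maps a source IP to the timestamps of its recently accepted connections.
        self._history = defaultdict(deque)

    def allow(self, source_ip: str, now: Optional[float] = None) -> bool:
        """Return True if a new connection from `source_ip` should be accepted."""
        if now is None:
            now = time.monotonic()
        history = self._history[source_ip]
        # Forget accepted connections that have fallen out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_connections:
            return False
        history.append(now)
        return True
```

A proxy or firewall would call something like `allow(peer_address)` on every incoming connection attempt and drop the connection when it returns False.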

Once supported, we recommend using remote signing, doing all relevant cryptographic operations not on the validator node itself, but on a separate node. There might be an intermediate feature version allowing cryptographic operations to happen in a different OS process on the same machine.

Follow operational best practices. Only expose a minimal number of ports. Make sure to record logs. Set up monitoring for each machine and application involved. Configure alerting software. ...

Timeline

Deprecation of support for sentry nodes will happen with the next release of Substrate and Polkadot. Updates to the Polkadot secure validator project will happen thereafter. We don't expect the actual removal before October 2020.


Deprecation warning will be introduced through #6779.

@mxinden mxinden added J1-meta A specific issue for grouping tasks or bugs of a specific category. U2-some_time_soon Issue is worth doing soon. Z2-medium Can be fixed by a coder with good Rust knowledge but little knowledge of the codebase. labels Aug 7, 2020
@bkchr
Member

bkchr commented Aug 7, 2020

Duplicate of #6762 ?

@lamafab

lamafab commented Aug 7, 2020

@mxinden Regarding the firewall, are there some options like limiting the amount of libp2p messages or "weighing" those (such as preventing nodes from sending a lot of "expensive" traffic)? The nginx proxy you refer to is for RPC calls, I assume?

@LukeWheeldon

Looking forward to some guidance on nginx so I can start testing before sentry support is completely dropped. Thank you.

@mxinden
Contributor Author

mxinden commented Aug 10, 2020

> Regarding the firewall, are there some options like limiting the amount of libp2p messages or "weighing" those (such as preventing nodes from sending a lot of "expensive" traffic)?

I am not aware of a Substrate-specific (layer 7) firewall. Thus one needs to use firewalls operating on layer 4 (TCP) and below. Given that these firewalls do not understand the application-specific traffic, there is no way for them to "weigh" the impact of a request.

With the ongoing backpressure efforts, the node itself would be able to "weigh" per peer and thus ensure fairness.
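As a rough illustration of what weighing per peer could mean at the application layer, the sketch below gives every peer a token budget that refills over time and charges each message a cost depending on its kind. This is not how Substrate implements backpressure. The message kinds, their costs, and all names are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical relative processing costs per message kind (not from Substrate).
MESSAGE_COST = {"block_announce": 1, "block_request": 5, "state_request": 20}


class PeerBudget:
    """Token bucket per peer, where expensive messages consume more tokens."""

    def __init__(self, capacity: float, refill_per_second: float):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        # Every peer starts with a full budget.
        self._tokens = defaultdict(lambda: capacity)
        self._last_seen = {}

    def try_consume(self, peer_id: str, message_kind: str, now: float) -> bool:
        """Return True if the peer may send this message now, charging its cost."""
        last = self._last_seen.get(peer_id, now)
        self._last_seen[peer_id] = now
        # Refill according to the time elapsed, capped at the bucket capacity.
        refilled = self._tokens[peer_id] + (now - last) * self.refill_per_second
        tokens = min(self.capacity, refilled)
        cost = MESSAGE_COST[message_kind]
        if cost > tokens:
            self._tokens[peer_id] = tokens
            return False
        self._tokens[peer_id] = tokens - cost
        return True
```

Because each peer has its own bucket, one peer sending many expensive requests exhausts only its own budget, which is the fairness property a layer-4 firewall cannot provide.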

> The nginx proxy you refer to is for RPC calls, I assume?

This issue only addresses the P2P port. It does not address the RPC port. The RPC port should stay as it is: secured and private.

@amnay-mo

amnay-mo commented Oct 5, 2020

Does this affect the rules for applying for "Kusama’s Thousand Validators Programme"?

In the rules, it is required that validators run "at least one sentry node".

@haikoschol

@amnay-mo I believe it does and that those rules need updating. @wpank?

mxinden added a commit to mxinden/1k-validators-be that referenced this issue Oct 21, 2020
Sentry nodes have been deprecated (see paritytech/substrate#6845 for details). Thus there is no need to require sentry node uptime.
mxinden added a commit to mxinden/substrate that referenced this issue Oct 21, 2020
The notion of sentry nodes has been deprecated (see [1] for details).
This commit removes support for sentry nodes in the
`client/authority-discovery` module.

While removing `Role::Sentry` this commit also introduces
`Role::Discover`, allowing a node to discover addresses of authorities
without publishing ones own addresses. This will be needed in Polkadot
for collator nodes.

[1] paritytech#6845
mxinden added a commit to paritytech/polkadot that referenced this issue Oct 21, 2020
The notion of sentry nodes has been deprecated (see [1] for details).
Support for sentry nodes in the `client/authority-discovery` module has
been removed.

This commit adjusts the instantiation of the authority discovery worker
accordingly, only spawning the module on authority nodes.

[1] paritytech/substrate#6845
ghost pushed a commit that referenced this issue Oct 26, 2020
* client/authority-discovery: Remove sentry node logic

The notion of sentry nodes has been deprecated (see [1] for details).
This commit removes support for sentry nodes in the
`client/authority-discovery` module.

While removing `Role::Sentry` this commit also introduces
`Role::Discover`, allowing a node to discover addresses of authorities
without publishing ones own addresses. This will be needed in Polkadot
for collator nodes.

[1] #6845

* client/authority-discovery/service: Improve PeerId comment
ghost pushed a commit to paritytech/polkadot that referenced this issue Oct 26, 2020
…1835)

* node/service/src/lib: Do not spawn authority discovery on sentries

The notion of sentry nodes has been deprecated (see [1] for details).
Support for sentry nodes in the `client/authority-discovery` module has
been removed.

This commit adjusts the instantiation of the authority discovery worker
accordingly, only spawning the module on authority nodes.

[1] paritytech/substrate#6845

* "Update Substrate"

Co-authored-by: parity-processbot <>
infinity0 added a commit to w3f/research that referenced this issue Jan 22, 2021
overview paper: deprecate sentry nodes - see also paritytech/substrate#6845
@mxinden
Contributor Author

mxinden commented Feb 9, 2021

> Deprecation of support for sentry nodes will happen with the next release of Substrate and Polkadot. Updates to the Polkadot secure validator project will happen thereafter. We don't expect the actual removal before October 2020.

#8079 will remove support for the concept of sentry nodes.

@tomaka
Contributor

tomaka commented Feb 18, 2021

Closing after #8079

@tomaka tomaka closed this as completed Feb 18, 2021
@mxinden
Contributor Author

mxinden commented Mar 18, 2021

With rust-libp2p now supporting the circuit relay protocol, there is a way to support libp2p-level sentry nodes in a non-intrusive way in Substrate / Polkadot:

Sentry nodes would operate on the libp2p layer only. On top of the standard transport (tcp + noise + yamux) stack, they would support the relay protocol only. See this example as a POC.

Validator nodes would need to support the relay protocol as well. They would keep a constant connection to their sentry nodes, thus listening for incoming connections. In order for other nodes to connect to them via their sentry node, they would not advertise their own address, but instead would advertise their relayed addresses (ip4/<sentry-ip>/p2p/<sentry-peer-id>/p2p-circuit/<validator-peer-id>). This is already supported today as soon as the validator node listens for incoming connections via the sentry node.

Nodes trying to connect to validator nodes would connect to the relayed addresses (ip4/<sentry-ip>/p2p/<sentry-peer-id>/p2p-circuit/<validator-peer-id>) discovered e.g. through the DHT, first connecting to the validator's sentry node which then relays the connection to the validator node.
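The relayed address format can be sketched with two small helpers. The addresses in the text above are written in shorthand. The sketch below assumes the usual multiaddr text form, which adds a leading slash, a TCP port component, and a `/p2p/` prefix before the destination peer ID. The function names are made up for this example, and the peer IDs in the usage note are placeholders, not valid peer IDs.

```python
from typing import Tuple


def relayed_multiaddr(sentry_ip: str, sentry_port: int,
                      sentry_peer_id: str, validator_peer_id: str) -> str:
    """Build the address a validator would advertise instead of its own."""
    return (
        f"/ip4/{sentry_ip}/tcp/{sentry_port}/p2p/{sentry_peer_id}"
        f"/p2p-circuit/p2p/{validator_peer_id}"
    )


def split_relayed_multiaddr(addr: str) -> Tuple[str, str]:
    """Split a relayed address into (relay address, destination part).

    A dialing node first connects to the relay address, then asks the
    relay to forward the connection to the destination peer.
    """
    relay, separator, destination = addr.partition("/p2p-circuit")
    if not separator:
        raise ValueError("not a relayed address: " + addr)
    return relay, destination
```

For example, `relayed_multiaddr("203.0.113.10", 30333, "SENTRY_PEER_ID", "VALIDATOR_PEER_ID")` yields `/ip4/203.0.113.10/tcp/30333/p2p/SENTRY_PEER_ID/p2p-circuit/p2p/VALIDATOR_PEER_ID`, and splitting it returns the sentry's address and `/p2p/VALIDATOR_PEER_ID`.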

All of the above would be transparent to upper layer protocols like e.g. the Grandpa gossip system.

Given that this would happen on the libp2p-level only, one would only defend against attacks on the libp2p-level and below. Still, given its non-intrusive architecture, I would deem it to be worth the extra security.

@burdges

burdges commented Mar 18, 2021

We've serious networking problems throughout polkadot, caused in part by libp2p, so sentry nodes should wait until everything else works cleanly. We also don't know what libp2p's circuit relay protocol leaks. That does not prevent others from playing around with sentry nodes, however.

@liamsi

liamsi commented Jul 28, 2021

> We've serious networking problems throughout polkadot, caused in part by libp2p

@burdges @mxinden Of what nature are these problems? Is this mostly about the debuggability of libp2p, or rather that it leaks peers it is not supposed to leak (or both)?

Over at @celestiaorg we are exploring different alternatives for our p2p stack, and libp2p still seems like the best (and only) solution around. The only alternative seems to be writing one's own specific p2p/networking layer (which has other downsides).
I'd love any insights from a team that is heavily involved and invested in libp2p.

@burdges

burdges commented Jul 28, 2021

I only mention libp2p as an aside there. I'm mostly speaking about the non-commutativity of difficulty in software development.

It's hard to take a system doing X and make it do X+Y because Y is some conceptually hard feature. It's straightforward to take a system doing X+Y and add some invasive but conceptually simple feature Z, like sentry nodes or HSM support, mostly because whoever writes Z can see Y working. It's nearly impossible to take a system doing X+Z and add Y because Z simply proves too distracting while doing the hard thing Y.

We require harder networking tricks in polkadot, like our availability solution, parachain collators asking approval checkers for their xcmp messages, parachain capture defenses, and later killing off all mempools with sassafras.

9 participants