Sybil-like NTP-level attack #1592
Comments
Has someone detailed what the actual attacks are that you can do if you control the time of an arbitrary number of nodes? Presumably they would concern liveness but not safety? One possible mitigation is to reject any date/time that falls outside a given range of the current system time (which should be stable enough to be trusted not to drift more than a few seconds per day). This is already implemented in the standard ntpd as the "panic" flag, set to 1000 s by default. By advising validators to set it to a much smaller value, could we mitigate this attack, or at least force the attacker to introduce their skew "slowly", which would make it potentially detectable long before the attack can be executed?
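To make the "force the attacker to go slowly" argument concrete, here is a minimal sketch. It is a simplified model, not ntpd's actual behavior: it assumes every correction within the threshold is applied in full and every larger one is rejected. `OffsetGuard` and `corrections_needed` are hypothetical names used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class OffsetGuard:
    """Reject clock corrections larger than a panic threshold (in seconds)."""
    panic_threshold_s: float  # 1000 s is the ntpd default mentioned above

    def accept(self, measured_offset_s: float) -> bool:
        # Accept only corrections whose magnitude is within the threshold.
        return abs(measured_offset_s) <= self.panic_threshold_s


def corrections_needed(target_skew_s: float, panic_threshold_s: float) -> int:
    """Minimum number of accepted corrections an attacker needs to accumulate
    target_skew_s, assuming each one is capped by the threshold (simplified)."""
    if panic_threshold_s <= 0:
        raise ValueError("threshold must be positive")
    return math.ceil(target_skew_s / panic_threshold_s)


if __name__ == "__main__":
    print(corrections_needed(384, 1000))  # one epoch of skew, default threshold
    print(corrections_needed(384, 5))     # same skew, 5 s threshold
```

Under these assumptions, a 384 s skew fits into a single correction with the default threshold but needs 77 separate corrections with a 5 s threshold, which is the window in which the drift could be noticed.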
Severe clock disparity, around an epoch duration (384 s), can lead to a liveness violation, i.e. blocks cannot be justified/finalized. Actually, since the p2p-interface spec requires blocking/delaying early messages, any clock disparity > 500 ms can impede message flow. If the disparity is not severe, e.g. around a slot duration or so, the beacon chain should be able to tolerate it; however, it can be used to facilitate other attacks. Robustness, e.g. performance under attack, will definitely be affected (validators getting lower or no rewards). So even low disparities look like a problem, and a low 'panic' threshold value won't help to prevent all attacks. However, lowering the 'panic' flag to a fraction of an epoch probably makes sense, to filter out the most severe consequences, which prevent inclusion of attestations in blocks. In general, I think that relying on administrative mitigations is not enough. Some mechanism to detect such problems should be introduced. And if there is a detector, then that is half of a clock synchronization protocol :).
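The thresholds in this comment can be put side by side in a rough sketch. The constants `SECONDS_PER_SLOT = 12` and `SLOTS_PER_EPOCH = 32` are assumptions chosen to be consistent with the 384 s epoch quoted above; the band boundaries and labels are illustrative only, since the comment describes them as approximate ("around", "or so").

```python
SECONDS_PER_SLOT = 12   # assumed; 12 * 32 matches the 384 s epoch above
SLOTS_PER_EPOCH = 32    # assumed
EPOCH_DURATION_S = SECONDS_PER_SLOT * SLOTS_PER_EPOCH
GOSSIP_TOLERANCE_S = 0.5  # the 500 ms bound mentioned above


def classify_disparity(disparity_s: float) -> str:
    """Map a clock disparity to the rough consequence described in the comment."""
    d = abs(disparity_s)
    if d <= GOSSIP_TOLERANCE_S:
        return "within gossip tolerance"
    if d < EPOCH_DURATION_S:
        return "impedes message flow, degrades rewards"
    return "may block justification/finalization"


if __name__ == "__main__":
    for skew in (0.3, -12, 400):
        print(skew, "->", classify_disparity(skew))
```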
It's an interesting theoretical question whether safety can be compromised. But the beacon chain protocol contains other sub-protocols, so practical safety can be violated: otherwise honest validators may be forced to be inactive by a time-level attack, and after their balances fall due to the 'inactivity leak', an adversary that also has enough validators of its own can take control. So a pure time attack should not cause safety violations, but if the safety property is adjusted to account for the above, then, in theory, a combined time + malicious validator attack could break the adjusted safety property. I'm not sure I will have time to properly formalize the above, as I think the attack would be extremely difficult to carry out in reality: validators will definitely respond with counter-measures when their balances start falling.
I've written a draft proposal about how the problem can be solved: https://hackmd.io/GnJ_Cf4FSZW-BZImH8KF1w
After the Medalla incident in mid-2020, the consensus (that I've seen) is that we should use existing clock synchronization strategies and not roll our own.
The beacon chain protocol assumes validator clocks are roughly synchronized, e.g. the fork choice spec states:
However, no mechanism to ensure this assumption holds is specified. Thus, it is the validator's responsibility. It's highly probable that many validators will use NTP to synchronize their clocks to the world standard (i.e. UTC), since it is easy to set up and the alternatives can be expensive.
It's also highly probable that such NTP setups will use the NTP pool. An excerpt from the NTP pool page:
Or they can use servers from other public NTP server lists, e.g. http://support.ntp.org/bin/view/Servers/WebHome.
The NTP pool is free to enter (an excerpt from the NTP pool page):
This can result in an implicit dependency of the beacon chain on the NTP pool (and/or other public NTP lists) if validators are not careful enough in configuring their NTP servers.
Such a situation creates an opportunity for a Sybil attack on the beacon chain protocol under certain conditions: an adversary can populate the NTP pool (or any public NTP server list) with many Byzantine-faulty NTP servers, which will report the wrong time to validator nodes.
NTP can tolerate certain errors, e.g. it detects "falsetickers" by comparing results from several NTP servers. However, if there are many faulty NTP servers in the pool, there is a high probability that a correct server will look like a "falseticker".
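The following sketch illustrates how a majority of colluding servers can make correct servers look like falsetickers. It uses a simplified Marzullo-style interval intersection, not ntpd's actual selection algorithm; the server names, offsets, and the fixed error half-width are made up for illustration.

```python
from typing import Dict, List, Tuple


def majority_interval(intervals: List[Tuple[int, int]]) -> Tuple[int, int, int]:
    """Sweep over interval endpoints and return (count, lo, hi) for the region
    covered by the largest number of intervals."""
    events = []
    for lo, hi in intervals:
        events.append((lo, -1))  # interval start; -1 sorts before +1 on ties
        events.append((hi, +1))  # interval end
    events.sort()
    best = count = 0
    best_lo = best_hi = 0
    for i, (point, kind) in enumerate(events):
        count -= kind  # +1 on a start, -1 on an end
        if count > best:  # can only trigger on a start, so events[i + 1] exists
            best, best_lo, best_hi = count, point, events[i + 1][0]
    return best, best_lo, best_hi


def split_servers(offsets_ms: Dict[str, int], half_width_ms: int):
    """Return (truechimers, falsetickers) as sorted lists of server names."""
    spans = {n: (o - half_width_ms, o + half_width_ms) for n, o in offsets_ms.items()}
    _, lo, hi = majority_interval(list(spans.values()))
    agree = sorted(n for n, (a, b) in spans.items() if a <= hi and b >= lo)
    reject = sorted(n for n in spans if n not in agree)
    return agree, reject


if __name__ == "__main__":
    # Honest majority: the single faulty server is rejected.
    print(split_servers({"h1": 0, "h2": 10, "h3": -10, "bad1": 10000}, 50))
    # Sybil majority: the honest servers are the ones rejected.
    print(split_servers({"h1": 0, "h2": 10, "bad1": 10000,
                         "bad2": 10010, "bad3": 10020}, 50))
```

In the second call the three colluding servers agree with each other, so the two correct servers fall outside the majority interval and are classified as falsetickers.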
NTP pool servers are also monitored by the pool software. However, if the adversary knows the IP addresses of beacon chain protocol participants, its faulty NTP servers can report the wrong time only to clients whose IP addresses are on that list, and so can evade the pool's monitoring. This is why the NTP servers controlled by the adversary are considered Byzantine-faulty (two-faced clocks).
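A toy model of the two-faced behavior shows why monitoring alone does not catch it. Everything here is hypothetical: `reported_time`, `monitor_passes`, the tolerance, and the addresses (taken from the documentation-only ranges) are illustrative and do not describe how the pool's monitoring actually works.

```python
from typing import Set


def reported_time(true_time_s: float, client_ip: str,
                  targets: Set[str], skew_s: float) -> float:
    """Two-faced server model: skew the answer only for targeted clients."""
    return true_time_s + skew_s if client_ip in targets else true_time_s


def monitor_passes(true_time_s: float, monitor_ip: str,
                   targets: Set[str], skew_s: float,
                   tolerance_s: float = 0.1) -> bool:
    """A monitor only sees the answer sent to its own address."""
    seen = reported_time(true_time_s, monitor_ip, targets, skew_s)
    return abs(seen - true_time_s) <= tolerance_s


if __name__ == "__main__":
    targets = {"203.0.113.7"}  # hypothetical validator address
    print(monitor_passes(1000.0, "192.0.2.10", targets, skew_s=30.0))
    print(reported_time(1000.0, "203.0.113.7", targets, skew_s=30.0))
```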
If only a few validators' clocks are distorted by such an attack, the beacon chain protocol can tolerate it. However, the key problem with this scenario is that many validators can be vulnerable to the attack if they are not careful enough when setting up NTP. So multiple correlated faults can be induced, alone or together with other means of attacking the beacon chain protocol.
E.g. the p2p-interface spec prescribes delaying early messages, so such an attack can be used to delay or break message flow in the beacon chain p2p graph.
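A minimal sketch of the "delay early messages" rule shows how a skewed clock turns into message delay. It assumes 12-second slots and a 500 ms disparity allowance; `delay_for` is a hypothetical helper, not a function from the spec.

```python
SECONDS_PER_SLOT = 12   # assumed slot duration
MAX_DISPARITY_S = 0.5   # assumed allowance for clock disparity


def slot_start(genesis_time_s: float, slot: int) -> float:
    """Wall-clock time at which a slot begins."""
    return genesis_time_s + slot * SECONDS_PER_SLOT


def delay_for(message_slot: int, local_time_s: float,
              genesis_time_s: float = 0.0) -> float:
    """Seconds a node holds a message that looks early by its own clock."""
    earliest = slot_start(genesis_time_s, message_slot) - MAX_DISPARITY_S
    return max(0.0, earliest - local_time_s)


if __name__ == "__main__":
    true_time = slot_start(0.0, 10)          # a message for slot 10 arrives on time
    print(delay_for(10, true_time))          # node with a correct clock
    print(delay_for(10, true_time - 30.0))   # node whose clock is 30 s behind
```

Under these assumptions, a node whose clock has been pushed 30 s behind holds an on-time message for 29.5 s before forwarding it, so enough skewed nodes can stall propagation through the p2p graph.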
Note that since non-validator nodes can participate in the p2p graph, they can be used to attack the beacon chain protocol too.
The attack is described in more detail in a separate document.
It's relatively easy to withstand the attack, e.g. beacon chain participants should be careful when configuring NTP. However, if it's risky to use NTP servers from public NTP server lists, where should they obtain NTP servers?
Using NTP servers controlled by big corporations, non-profits, or government agencies is a possibility; however, it can lead to a similar correlated implicit dependency and a lack of decentralization, which may not be desirable for various reasons.
Wealthy validators can set up their own time servers; however, this significantly raises the entry barrier to running a validator node.
We will elaborate in more detail on possible ways of achieving reliable clock synchronization in another document, including BFT clock synchronization solutions and/or anonymous access to public time services (e.g. GNSS, radio clocks, public NTP servers, etc.).
The main goal of this issue is to warn Ethereum 2.0 implementers and researchers that it can be dangerous to rely on the default NTP setup and on public NTP server pools and lists.
It's also dangerous to assume that most validators can set up NTP/time service in a secure manner.
Thus, it's a risk to the overall beacon chain protocol.
As very minimal counter-measures, we propose:
These minimal counter-measures are hardly enough, so the best solution would be to design a BFT clock synchronization protocol, so that validator and non-validator node administrators are relieved of the burden of setting up a secure time service.
However, such a BFT protocol can be prohibitively expensive given the expected scale of the beacon chain protocol (thousands of nodes), so cheaper solutions should be investigated too.
We stress that the beacon chain protocol can tolerate a limited number of validators with vulnerable NTP setups, so a separate BFT clock synchronization protocol may be excessive if there is a way to prevent correlated NTP-level failures.