ETCM-168: Discovery part4 #770
Conversation
> I needed fixes in Scalanet at the same time. Those were published locally and copied to Mantis before it was started:
Should a new version of scalanet be published, then? Which commit is required (so I can test it)?
> These can be added to `mantis-3.0/conf/mordor.conf` to cut down the noise of unreachable nodes
Where did you get the new bootstrap nodes? So far we have been copying the bootstrap list from ETC's geth, though we haven't updated it in more than a month.
But if you found better bootstrap nodes, we should definitely add them!
It's been published now; it was the PR I referred to in the description.
Sorry, I was vague: by "adding to ./conf/mordor.conf" I meant adding it to the unzipped config file, to override the default values. These 4 nodes are a subset of the defaults. I got them from looking at the block sync logs; they were the only ones logged as serving any data.
> … definitely captured.
I recommend we also include input-output-hk/scalanet#103; it makes the initial lookup phase much faster on mordor.
When trying to connect to mainnet, it turned out that the settings from the PR description led to a rather slow uptake of nodes: it discovered ~1000 nodes in ~30 minutes. Trying these different settings was faster:

```bash
./mantis-3.0/bin/mantis-launcher etc -Dmantis.metrics.enabled=true \
  -Dmantis.network.discovery.discovery-enabled=true \
  -Dmantis.network.discovery.reuse-known-nodes=false \
  -Dmantis.network.discovery.scan-interval=30.seconds \
  -Dmantis.network.discovery.kademlia-bucket-size=1024 \
  -Dmantis.network.discovery.kademlia-alpha=1024
```

The difference is that the algorithm will always keep the closest 1024 nodes in the lookup loop, not just 16, so it will try to bond with many more at the same time. That resulted in ~500 nodes discovered in 2 minutes and ~2100 in ~14 minutes. However, most nodes rejected the TCP handshake because they already had too many peers, and this was the point where the node found 3 peers it could use to pick a pivot block (it needs a majority agreement between them; arguably 2 peers would be enough as long as they agree). The handshaked nodes are persisted, so restarting with normal settings later should see the node reconnect to those:

```bash
./mantis-3.0/bin/mantis-launcher etc -Dmantis.metrics.enabled=true \
  -Dmantis.network.discovery.discovery-enabled=true \
  -Dmantis.network.discovery.reuse-known-nodes=true \
  -Dmantis.network.discovery.scan-interval=30.seconds \
  -Dmantis.network.discovery.kademlia-bucket-size=16 \
  -Dmantis.network.discovery.kademlia-alpha=3
```

However, after a restart it could still only connect to 2 nodes and failed to pick a pivot block, possibly because the other side thought we were already connected and it took them some time to clean up the broken connection.
Some of the default params can be tweaked; for example, the discovery interval of 15 minutes is surely not frequent enough. Other clients differ here: trinity only did discovery when it didn't have enough candidate peers and did no periodic lookups, while the Go client does a self lookup and 3 random lookups every 30 minutes. The bucket size is more or less part of the spec, I think. Alpha can be raised if needed, up to the bucket size (I don't think trinity used alpha; the Go client does). We can have another look at the go-ethereum codebase too if you think it would be useful.
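For instance, a shorter scan interval and a higher alpha could be tried without any code changes, reusing the flags from the commands above; the values below are illustrative only, not recommendations:

```bash
# Illustrative values only; the flags are the same ones used in the commands above.
./mantis-3.0/bin/mantis-launcher etc \
  -Dmantis.network.discovery.discovery-enabled=true \
  -Dmantis.network.discovery.scan-interval=5.minutes \
  -Dmantis.network.discovery.kademlia-alpha=16
```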
I agree with what Konrad mentioned about creating a separate task to tweak the parameters! What we have is probably good enough to start deploying on the testnet.
By the way @ntallar, I'm not sure how exactly it works in Mantis now, but I think the situation where we can't connect to almost anyone out of 2000+ nodes because they all have too many peers could be improved in the new network. For example, take a look at the grafting and pruning strategy in gossipsub. Perhaps our node could also have a min-target-max range for incoming/outgoing connections and drop some if they go above the threshold, opening up room for newcomers.
Apart from this and the minor comment update, LGTM!
```scala
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class PeerDiscoveryManagerSpec
```
Should we add tests for our new PeerDiscoveryManager?
I don't know; the actor doesn't have much logic in it, so I thought I'd leave it up to you: are you happy with the three of us having seen it and having tested it with mordor/etc, or would you prefer unit tests with a mock service?
What would be cool is if there was some end-to-end test that started at least a couple of Mantis nodes that discovered each other, so we'd know the service builder is correct as well as the actor.
Added unit tests.
That would be cool... I'm not sure if we had that on Midnight or not 🤔
Kinda: it builds the project (although I think that could be skipped) and starts the node as a subprocess to test the wallet. Not sure if multiple nodes are tested though; probably not.
Only a single comment remains, but LGTM!
Description
Replaces the node discovery currently in Mantis with the one in Scalanet, implementing the Ethereum Discovery v4 spec.
Requires fixes from input-output-hk/scalanet#100
Important changes
Mantis previously lacked automation to resolve its own external IP address. It used to send `0.0.0.0` in the `Ping` messages as its own address, which the other nodes probably ignored in favour of the address visible on the connection. However, clients fully implementing the discovery protocol get the address from the ENR, so it should be set correctly. The PR adds an external IP resolution mechanism; if that were to fail, the address can and should be set manually with the `mantis.network.discovery.host` setting. If it's not set, the discovery module will log an error and not discover peers; it will only serve previously discovered and persisted peers. See the example below for setting it via command-line arguments.

Testing
To test it, package Mantis, extract it, and start it with discovery and metrics enabled.
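A minimal sketch of such an invocation, assuming the launcher accepts the network name the same way as `etc` in the commands above; the host value is a documentation-range placeholder, needed only if automatic external IP resolution fails:

```bash
# Sketch only: flags are taken from commands elsewhere in this thread.
# The host value is a placeholder, needed only if external IP resolution fails.
./mantis-3.0/bin/mantis-launcher mordor \
  -Dmantis.metrics.enabled=true \
  -Dmantis.network.discovery.discovery-enabled=true \
  -Dmantis.network.discovery.host=203.0.113.10
```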
With metrics running, the number of connections can be checked with `curl`.
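A hypothetical check, assuming Mantis exposes a Prometheus-style endpoint when `mantis.metrics.enabled=true`; the port and metric name are assumptions, so check the `mantis.metrics` section of the config for the actual values:

```bash
# Assumption: a Prometheus-style metrics endpoint; the port (13798) and the
# metric names are guesses; check the mantis.metrics config for actual values.
curl -s http://localhost:13798/metrics | grep -i peer
```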
I needed fixes in Scalanet at the same time. Those were published locally and copied to Mantis before it was started:
```bash
mill scalanet.discovery.publishLocal
cp ~/.ivy2/local/io.iohk/scalanet-discovery_2.12/0.4-SNAPSHOT/jars/scalanet-discovery_2.12.jar \
  ../mantis/target/universal/mantis-3.0/lib/io.iohk.scalanet-discovery_2.12-0.4-SNAPSHOT.jar
```
The `mordor` config comes with 20-something bootstrap nodes, but I noticed that only a few of them seem to work. These can be added to `mantis-3.0/conf/mordor.conf` to cut down the noise of unreachable nodes (I also temporarily disabled syncing and consensus to see anything in the logs; otherwise they fill up very quickly and get zipped up by log rotation).
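A sketch of the kind of override meant here; the config key and enode URIs are placeholders, since the actual working subset isn't listed in this thread:

```bash
# The key and enode values below are placeholders for illustration; the real
# key and the full default list ship with the packaged mordor config.
cat >> mantis-3.0/conf/mordor.conf <<'EOF'
mantis.blockchains.mordor.bootstrap-nodes = [
  "enode://<node-id-1>@<host-1>:<port>",
  "enode://<node-id-2>@<host-2>:<port>"
]
EOF
```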
The logs also indicate how many peers have been discovered: