Off chain permissioning issue with Discovery peers #1252

Closed
joshuafernandes opened this issue Jul 23, 2020 · 3 comments
Labels
bug Something isn't working P3 Medium (ex: JSON-RPC request not working with a specific client library due to loose spec assumption) TeamRevenant GH issues worked on by Revenant Team

Comments

@joshuafernandes
Contributor

Description

While spinning up a network of 8 nodes, I'd like to set up offchain permissioning from the beginning. I followed the docs, put the enodes in a permissions.json file, and started the nodes. This works fine and I can see the chain head progressing, but every node reports a peer count of 1 instead of 7. When I turn permissions off, all 7 peers are reported straight away. I'm also not seeing any errors in the logs about disallowed permissions, and the chain is progressing, so the permissions themselves look correct.

I've tried stopping and restarting a node, which resulted in it reporting 3 or 4 peers (and the nodes it connected to also reported more peers), but not the full 7. I'm not sure when the discovery rounds fire; maybe this will be eventually consistent? Ping @arash009 or @joshuafernandes for more details.
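For anyone reproducing this, the per-node peer counts described above can be collected with the standard `net_peerCount` JSON-RPC method. The following is a minimal sketch, assuming the HTTP JSON-RPC interface is enabled on each node; the URLs passed to `report_peer_counts` are hypothetical and depend on how the docker-compose setup maps ports.

```python
import json
import urllib.request


def parse_peer_count(response_body: str) -> int:
    """Decode the hex quantity returned by net_peerCount, e.g. "0x7" -> 7."""
    payload = json.loads(response_body)
    return int(payload["result"], 16)


def peer_count(rpc_url: str, timeout: float = 5.0) -> int:
    """Ask one node for its current peer count over HTTP JSON-RPC."""
    body = json.dumps(
        {"jsonrpc": "2.0", "method": "net_peerCount", "params": [], "id": 1}
    ).encode()
    request = urllib.request.Request(
        rpc_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_peer_count(response.read().decode())


def report_peer_counts(rpc_urls):
    """Print the peer count for every node URL (the URLs are assumptions)."""
    for url in rpc_urls:
        try:
            print(url, peer_count(url))
        except OSError as exc:  # URLError is a subclass of OSError
            print(url, "unreachable:", exc)
```

Calling `report_peer_counts` with one URL per node (for example `http://localhost:8545` for the first node, if that is how the port is mapped) every few seconds after startup shows whether the counts climb towards 7 or stay at 1.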

Acceptance Criteria

Offchain node permissioning reports more peers initially than just the bootnode

Steps to Reproduce (Bug)

Ping @arash009 or @joshuafernandes for more details - test setup can be provided

Expected behavior:
With offchain node permissioning enabled, every node should report all 7 peers within a reasonable time frame.

Actual behavior:
With offchain node permissioning enabled, every node reports only 1 peer out of 7.

Frequency:
100%

Versions (Add all that apply)

Besu 1.5.x on docker-compose

Additional Information

@lucassaldanha lucassaldanha added the TeamRevenant GH issues worked on by Revenant Team label Jul 23, 2020
@pinges pinges added the bug Something isn't working label Jul 23, 2020
@macfarla macfarla self-assigned this Jul 26, 2020
@macfarla
Contributor

macfarla commented Jul 27, 2020

I can reproduce this behaviour. Best guess is something to do with how/when discovery is initiated.

Modifying the dependencies in docker-compose so that the nodes start sequentially (rather than the bootnode first and then the rest all at once) seems to improve things, i.e. after about 3 minutes most nodes have n-1 peers. However, this is still fewer than expected, and one node is stuck with only 1 peer.

That node initiates a neighbours round with 1 candidate (the bootnode) but does not receive any neighbours packets:

{"timestamp":"2020-07-27T03:21:59,300","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Initiating neighbours round with 1 candidates from 1 tracked nodes","throwable":""}

After restarting this one node, neighbours packets are received, and all nodes then end up with n peers.

```
{"timestamp":"2020-07-27T03:46:38,407","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Start peer search.","throwable":""}
{"timestamp":"2020-07-27T03:46:38,408","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Skipping bonding round because no candidates are available","throwable":""}
{"timestamp":"2020-07-27T03:46:38,409","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Initiating neighbours round with 3 candidates from 4 tracked nodes","throwable":""}
{"timestamp":"2020-07-27T03:46:38,430","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Received neighbours packet with 4 neighbours","throwable":""}
{"timestamp":"2020-07-27T03:46:38,431","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Received neighbours packet with 4 neighbours","throwable":""}
{"timestamp":"2020-07-27T03:46:38,432","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Received neighbours packet with 4 neighbours","throwable":""}
{"timestamp":"2020-07-27T03:46:38,433","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Skipping bonding round because no candidates are available","throwable":""}
{"timestamp":"2020-07-27T03:46:38,434","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Initiating neighbours round with 1 candidates from 4 tracked nodes","throwable":""}
{"timestamp":"2020-07-27T03:46:38,442","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Received neighbours packet with 4 neighbours","throwable":""}
{"timestamp":"2020-07-27T03:46:38,443","container":"981c7511f6cb","level":"DEBUG","thread":"vert.x-eventloop-thread-2","class":"RecursivePeerRefreshState","message":"Skipping bonding round because no candidates are available","throwable":""}
```
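To compare a stuck node with a healthy one, it can help to pair each neighbours round with the number of neighbours packets that came back before the next round started. The sketch below does this for JSON log lines in the format shown above; it assumes one JSON object per line and matches only the two message texts that appear in these excerpts. The function and pattern names are illustrative, not part of Besu.

```python
import json
import re

# Message patterns copied from the DEBUG log excerpts in this thread.
ROUND_RE = re.compile(
    r"Initiating neighbours round with (\d+) candidates from (\d+) tracked nodes"
)
PACKET_RE = re.compile(r"Received neighbours packet with (\d+) neighbours")


def summarize_rounds(lines):
    """Return one dict per neighbours round with the packets received after it."""
    rounds = []
    for raw in lines:
        text = raw.strip()
        if not text.startswith("{"):
            continue  # skip blank lines, fences, or non-JSON output
        try:
            entry = json.loads(text)
        except json.JSONDecodeError:
            continue
        if entry.get("class") != "RecursivePeerRefreshState":
            continue
        message = entry.get("message", "")
        started = ROUND_RE.search(message)
        if started:
            rounds.append(
                {
                    "candidates": int(started.group(1)),
                    "tracked": int(started.group(2)),
                    "packets": 0,
                }
            )
        elif PACKET_RE.search(message) and rounds:
            # Attribute the packet to the most recently started round.
            rounds[-1]["packets"] += 1
    return rounds
```

Run against the stuck node's log, a round with 1 candidate and 0 packets stands out immediately; after the restart, the same summary shows packets arriving for every round.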

@macfarla
Contributor

TRACE logging at startup shows that some connections are being rejected by the Node Permissioning Controller with reason "Sync Status", which should not be checked for offchain permissioning.
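To illustrate why such a check would produce the observed symptom, here is a deliberately simplified model, not Besu's implementation. It assumes that a sync-status gate, while the node is not yet in sync, only permits connections involving a bootnode; the thread itself only states that connections were rejected with reason "Sync Status". All names (`is_permitted`, `in_sync`, the node labels) are hypothetical.

```python
from typing import Callable, Optional, Set


def is_permitted(
    source: str,
    destination: str,
    allowlist: Set[str],
    bootnodes: Set[str],
    in_sync: Optional[Callable[[], bool]] = None,
) -> bool:
    """Toy permission check; in_sync=None means no sync-status gate is applied."""
    if in_sync is not None and not in_sync():
        # Assumed behaviour of the gate: until synced, only bootnode links pass.
        return source in bootnodes or destination in bootnodes
    # Offchain (allowlist) check: both ends must be listed.
    return source in allowlist and destination in allowlist


allowlist = {"boot", "node-a", "node-b"}
bootnodes = {"boot"}

# With the gate applied and the node not yet in sync, only the bootnode peers.
gated = is_permitted("node-a", "node-b", allowlist, bootnodes, in_sync=lambda: False)
# Without the gate, the allowlist alone decides and both nodes can peer.
ungated = is_permitted("node-a", "node-b", allowlist, bootnodes)
```

In this model every allowlisted node can still reach the bootnode, which would match each node reporting exactly 1 peer while the chain keeps progressing.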

@macfarla
Contributor

macfarla commented Jul 29, 2020

@joshuafernandes I did a fresh checkout of the quorum-dev repo, reproduced your problem, updated docker-compose.yaml to use my local build, and it works: peering AND producing blocks. This PR has been merged into master now; please let us know whether it fixes the problem for you.

@macfarla macfarla added the P3 Medium (ex: JSON-RPC request not working with a specific client library due to loose spec assumption) label Aug 6, 2020