
K8S Permissioning to use Service IPs rather than pod IPs, which can fail #1190

Closed
joshuafernandes opened this issue Jul 2, 2020 · 0 comments
Labels: bug (Something isn't working), P2 High (e.g. degrading performance issues, unexpected behavior of core features (DevP2P, syncing, etc.))

@joshuafernandes (Contributor) commented on Jul 2, 2020

Description

As an Ops Engineer, I want to deploy Besu to K8S for consortium-type networks and use permissioning to allow/disallow members.

Acceptance Criteria

Permissioning works on K8S using Service IPs (or equivalent) rather than pod IPs. Pods can fail and restart with a new IP, and a node restarted that way cannot participate because its IP no longer matches the permissioned one.

Steps to Reproduce (Bug)

  1. Use the new functionality in the Besu NAT manager
  2. Set up local permissioning (the behaviour is the same with on-chain permissioning, but local is easier to edit and restart) using the Helm chart here; a sample permissions file is sketched after this list
  3. Deploy to K8S
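
A minimal sketch of the kind of local permissions file used in step 2, assuming file-based node permissioning; the public keys are placeholders, the Service IPs are taken from the logs below, and depending on the Besu version the key is `nodes-allowlist` or the older `nodes-whitelist`:

```toml
# permissions_config.toml -- a hedged sketch, not the exact file from the Helm chart.
# Enodes are pinned to Kubernetes Service IPs (10.0.x.x) rather than pod IPs
# (10.244.x.x), so a pod restart (which assigns a new pod IP) does not change
# the permissioned address.
nodes-allowlist=[
  "enode://<validator1-public-key>@10.0.179.209:30303",
  "enode://<validator2-public-key>@10.0.6.71:30303"
]
```

Each node is then started with node permissioning enabled (e.g. --permissions-nodes-config-file-enabled) and the NAT manager pointed at Kubernetes, so the node advertises its Service IP as its p2p-host.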

Expected behavior:
That permissioning works:

  • initially, specifying just the enodes with public keys and validator Service IPs
  • that I can add a new node in (it is acceptable for a new node to be deployed as a pod plus a Service)

Actual behavior:
Nodes don't join and start the network or chain. They check permissions and allow the Services, but some of the socket connections appear to use the pod IP, which halts the whole network.

Frequency:
Always

Versions

  • Software version: besu 1.5.1-SNAPSHOT
  • Cloud VM, type, size: Azure AKS is all I've tested this on so far; I suspect this affects all clouds

Additional Information

  • Besu 1.5.1 commit 50db46f or later will work
  • Have been working with @matkt on this to put some examples and templates in place; he has done some really fantastic work building out the NAT manager to allow for ClusterIP services too and automating that. The root of the problem appears to be at the Vert.x level, and I believe this needs a ticket to get it looked at and fixed. At the very least, we need something that makes permissioning on K8S more tolerant of failure at the pod level.

Some log messages, like the one below, show that although comms should only be on 10.0.x.x (Service IPs), enodes are reported coming from 10.244.x.x (pod IPs). The initial config for the nodes had the p2p-host set to 10.0.x.x via the NAT manager, so it should respect that:
2020-07-02 04:59:33.731+00:00 | vert.x-eventloop-thread-1 | TRACE | DiscoveryProtocolLogger | <<< Sending NEIGH packet to peer 0x5fc1f8dc9f0c03087128e4bd724530e8 (enode://5fc1f8dc9f0c03087128e4bd724530e883d7de1a431269876dff9c95b8952f73c7e85ac7b49d85a2ad4950e967319482af435e07a0eab0a98d98449437787a00@10.244.1.38:30303): Packet{type=NEIGHBORS, data=NeighborsPacketData{peers=[DiscoveryPeer{status=bonded, enode=enode://5fc1f8dc9f0c03087128e4bd724530e883d7de1a431269876dff9c95b8952f73c7e85ac7b49d85a2ad4950e967319482af435e07a0eab0a98d98449437787a00@10.244.1.38:30303, firstDiscovered=1593665973436, lastContacted=1593665973518, lastSeen=1593665973526}, DiscoveryPeer{status=bonded, enode=enode://5d812c3c25ff398ab416968fce9009c2be7ed70a87abc8ea30bd667ce17a9287a6341fbf6ce757bb8148436c39c71296639ea81afcc94cdf908b6e1344f26188@10.0.179.209:30303, firstDiscovered=1593665966323, lastContacted=1593665971451, lastSeen=1593665967307}], expiration=1593666033}, hash=0xbcf0644e18e05d0b1ca03d50f634e1f51772522d29c68ad24268d84390688318, signature=SECP256K1.Signature{r=89146151500703781876909065874071324192052354470190482233597100072946940926983, s=52435911823807714770890529437538386980170334409869377928205490124983253432422, recId=1}, publicKey=0x00b20ab6a385a2403d64637b3d93cb6d83215a08f29adb6feb4b8bf03387b734444e8b060f53150dea4b9b897823540d19918c13d6f57a5153d190b5fad7bf51}

Log file from one validator attached:
logs-from-validator3-in-besu-validator3-0.tar.gz

We can also see logs like Received PING packet from peer 0xa77d23cab569301854728da8a3b4e0d5 (enode://a77d23cab569301854728da8a3b4e0d55e1df694d7302102a5228a16b4fc86bbdb198fd58550c533a6e2da5dea4f06dbac5e274498e49960f6effd53ad8a1dba@10.244.0.29:30303): Packet{type=PING, data=PingPacketData{from=Endpoint{host='10.0.6.71', udpPort=30303, getTcpPort=30303}}}. In this log the valid IP (the Service IP) is inside the PingPacketData message, but there is also an invalid IP (the pod IP) added by Vert.x, and Besu is using the IP added by Vert.x.
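
To make that mismatch concrete, below is a minimal, hypothetical Vert.x sketch of the UDP receive path; it is not Besu's actual code, just an illustration of where the two addresses come from (parseAdvertisedHost is a made-up stand-in for Besu's PING payload decoding):

```java
// Hedged illustration of the two addresses in play, using the plain Vert.x
// core datagram API (io.vertx.core.datagram). Not Besu's actual code.
import io.vertx.core.Vertx;

public class DiscoverySenderDemo {
  public static void main(String[] args) {
    Vertx vertx = Vertx.vertx();
    vertx.createDatagramSocket().listen(30303, "0.0.0.0", res -> {
      if (res.failed()) {
        res.cause().printStackTrace();
        return;
      }
      res.result().handler(packet -> {
        // Address 1: the UDP source address reported by Vert.x. On K8S this is
        // the sending pod's IP (10.244.x.x), because a pod sends with its own
        // pod IP as the source even when it advertises a Service IP.
        String socketHost = packet.sender().host();

        // Address 2: the endpoint the peer advertised inside the PING payload
        // (the 10.0.x.x Service IP). parseAdvertisedHost is hypothetical; in
        // Besu this would be the decoded PingPacketData 'from' endpoint.
        // String advertisedHost = parseAdvertisedHost(packet.data());

        System.out.println("UDP sender (what gets checked today): " + socketHost);
      });
    });
  }
}
```

A pod-failure-tolerant permissioning check would prefer the advertised endpoint over the transient socket address, since the socket address changes every time the pod restarts.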

@MadelineMurray added the bug and P2 labels on Jul 3, 2020
@matkt self-assigned this on Jul 8, 2020
@lucassaldanha added the TeamRevenant (GH issues worked on by Revenant Team) label on Jul 15, 2020
@lucassaldanha removed the TeamRevenant label on Jul 17, 2020
@matkt closed this as completed on Aug 26, 2020