Fluffy state network now enabled by default and improve status logs #2640

Merged: bhartnett merged 4 commits into master from fluffy-log-status on Sep 19, 2024

Contributor

@bhartnett bhartnett commented Sep 19, 2024

Changes in this PR:

  • State network is now enabled by default
  • Status log loop added to the beacon and state networks to expose their respective routing table sizes.
  • Status log loop added to the portal node for metrics that are global and shared between all subnetworks.
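
A status log loop of the kind listed above could look roughly like the following sketch. It assumes chronos and chronicles; `StateNetworkStub`, `routingTableNodes`, `runStatusLog`, and the 60-second interval are hypothetical stand-ins for illustration, not names or values taken from this PR.

```nim
import chronos, chronicles

type StateNetworkStub = ref object
  # Hypothetical stand-in for a subnetwork; not the type used in Fluffy.
  routingTableNodes: int
  statusLogLoop: Future[void]

proc runStatusLog(n: StateNetworkStub) {.async: (raises: []).} =
  try:
    while true:
      # Log the current routing table size, then sleep until the next round.
      info "State network status", routingTableNodes = n.routingTableNodes
      await sleepAsync(60.seconds) # assumed interval
  except CancelledError:
    # Cancellation is the normal way to end the loop.
    trace "Status log loop cancelled"

proc start(n: StateNetworkStub) =
  doAssert n.statusLogLoop.isNil
  n.statusLogLoop = n.runStatusLog()
```

Keeping the future in a field lets a later stop procedure cancel the loop and wait for it to finish.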

… beacon networks. Create status log loop for portal node. Implement stop functions.
@bhartnett bhartnett changed the title from "Improve Fluffy status logs" to "Improve Fluffy status logs and support graceful shutdown" Sep 19, 2024
setupForeignThreadGc()

notice "Got interrupt, Fluffy shutting down..."
node.stop()
Contributor Author

@bhartnett bhartnett Sep 19, 2024

Currently this doesn't actually wait for every chronos task to complete because we are using cancelSoon() rather than cancelAndWait() in most places.
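
The difference between the two chronos calls can be sketched as follows. `Worker`, `stopSoon`, and `stop` are hypothetical names used only for illustration; `cancelSoon()` and `cancelAndWait()` are the chronos procedures mentioned in the comment.

```nim
import chronos

type Worker = ref object
  # Hypothetical component that owns one background loop.
  loopFuture: Future[void]

proc loop(w: Worker) {.async: (raises: [CancelledError]).} =
  while true:
    await sleepAsync(1.seconds)

proc start(w: Worker) =
  doAssert w.loopFuture.isNil
  w.loopFuture = w.loop()

proc stopSoon(w: Worker) =
  # Requests cancellation and returns at once; the loop may still be
  # unwinding when the caller continues.
  if not w.loopFuture.isNil:
    w.loopFuture.cancelSoon()
    w.loopFuture = nil

proc stop(w: Worker) {.async: (raises: []).} =
  # Requests cancellation and suspends until the loop future has finished.
  if not w.loopFuture.isNil:
    await w.loopFuture.cancelAndWait()
    w.loopFuture = nil
```

A caller that runs `waitFor w.stop()` knows the loop has finished when the call returns, whereas `w.stopSoon()` gives no such guarantee.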

Contributor

Yes, any close call that does not actually wait for everything below it to close should be deprecated, and the awaited version should be used instead.

Like this one in discv5: https://github.com/status-im/nim-eth/blob/master/eth/p2p/discoveryv5/protocol.nim#L1120

Contributor

So this would mean a fair number of changes in portal wire and in the different networks. If you want, you can split those changes off from this PR, which activates the state network.

Contributor Author

Thanks for sharing that example. I'll follow that pattern. Yes, it's turning out to be a bigger change than I expected so I'll do it in two PRs.

@bhartnett bhartnett marked this pull request as draft September 19, 2024 08:37
@@ -320,8 +320,7 @@ proc start*(self: var LightClientManager) =
doAssert self.loopFuture == nil
self.loopFuture = self.loop()

proc stop*(self: var LightClientManager) {.async: (raises: []).} =
Contributor

@kdeme kdeme Sep 19, 2024

All the stop calls should remain async so that they can be awaited.

Otherwise, shutting them down in the ctrl-c handler without awaiting will not actually ensure proper clean-up, which defeats the point of having a graceful shutdown in the first place.
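
One way to read this is that every layer awaits the layer below it, so a single awaited call at the top covers the whole tree. The sketch below assumes chronos; `SubNetwork` and `PortalNodeStub` are hypothetical stand-ins, not the types in this PR.

```nim
import chronos

type
  SubNetwork = ref object
    # Hypothetical stand-in for a Portal subnetwork.
    loopFuture: Future[void]
  PortalNodeStub = ref object
    # Hypothetical stand-in for the node that owns the subnetworks.
    networks: seq[SubNetwork]

proc loop(n: SubNetwork) {.async: (raises: [CancelledError]).} =
  while true:
    await sleepAsync(1.seconds)

proc start(n: SubNetwork) =
  n.loopFuture = n.loop()

proc stop(n: SubNetwork) {.async: (raises: []).} =
  if not n.loopFuture.isNil:
    await n.loopFuture.cancelAndWait()
    n.loopFuture = nil

proc stop(node: PortalNodeStub) {.async: (raises: []).} =
  # The node only reports completion after each subnetwork has stopped.
  for n in node.networks:
    await n.stop()
```

With this shape, the shutdown path can call `waitFor node.stop()` once and rely on it returning only after all loops have ended.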

Contributor Author

Yes, thanks. I'm working on fixing this now. I'm planning to make all the cancel calls use cancelAndWait.

Contributor Author

Will revert this for now

node.stop()
quit QuitSuccess

setControlCHook(controlCHandler)
Contributor

I think a raw exception might be thrown here and would need to be caught?
At least that is what is being done in nimbus-eth2: https://github.com/status-im/nimbus-eth2/blob/bf4abf8b9e07c35442b00966f51cc9af5857af33/beacon_chain/nimbus_beacon_node.nim#L2264
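
A minimal sketch of a guarded ctrl-c setup follows, assuming the concern is an exception escaping while the hook is installed or while the handler runs. It is not copied from nimbus-eth2; `shutdownRequested`, `installHandler`, and `runUntilInterrupted` are hypothetical names, and the handler only sets a flag so that the awaited shutdown can happen outside it.

```nim
import chronos

var shutdownRequested = false # hypothetical flag read by the main loop

proc controlCHandler() {.noconv.} =
  # Keep the handler minimal: record the request and return.
  shutdownRequested = true

proc installHandler() =
  try:
    setControlCHook(controlCHandler)
  except Exception as exc:
    # Guard against anything raised while installing the hook.
    echo "Cannot set ctrl-c handler: ", exc.msg

proc runUntilInterrupted() =
  installHandler()
  while not shutdownRequested:
    waitFor sleepAsync(100.milliseconds)
  # An awaited shutdown, e.g. `waitFor node.stop()`, would go here.
```

Moving the shutdown work out of the handler is a design choice of this sketch: it keeps the handler free of calls that can raise and lets the stop procedures be awaited on the main event loop.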

Contributor Author

Thanks, I'll look into that

@bhartnett bhartnett changed the title from "Improve Fluffy status logs and support graceful shutdown" to "Fluffy state network now enabled by default and improve status logs" Sep 19, 2024
@bhartnett bhartnett marked this pull request as ready for review September 19, 2024 12:36
@bhartnett
Contributor Author

I've reverted the graceful shutdown changes, which will now be part of my next PR.

info "History network status",
radiusPercentage = radiusPercentage.toString(10) & "%",
radius = n.portalProtocol.dataRadius().toHex(),
dbSize = $(n.contentDB.size() div 1000) & "kb",
Contributor Author

@kdeme Did you have any concerns about moving this log out into Portal node? I thought it would make sense to do so because these fields are based on the contentDb and are shared between all the portal sub networks.

Contributor

Yeah, that's fine.

@bhartnett bhartnett merged commit a9ad10c into master Sep 19, 2024
10 checks passed
@bhartnett bhartnett deleted the fluffy-log-status branch September 19, 2024 13:38