cmd/blsync, light/beacon: standalone beacon light sync tool (refactored version, WIP) #26874
beacon/light/sync/head_sync.go
	}
}
for _, server := range servers {
	if server, ok := server.(signedHeadServer); ok {
	server, ok := server.(signedHeadServer)
	if !ok {
		continue
	}
	...
Please unindent
beacon/light/sync/scheduler.go
}

// call before starting the scheduler
func (s *Scheduler) RegisterModule(m syncModule) {
Please document what a module represents, and why we need several of them
beacon/light/sync/scheduler.go
func (s *Scheduler) syncLoop() {
	s.lock.Lock()
	for {
		wait := true
This sync loop is not very straightforward, with the modules abstraction, the logic regarding 'changed', the async wait for reqDone, and the combination with s.trigger.
What is the goal here?
I have the feeling that there are many possible "not happy paths" that can occur if some internal assumption about consistency is broken.
Indeed, it was a quick and dirty first version; I think it's a lot better now.
Nice
// SerializedCommitteeRoot calculates the root hash of the binary tree representation
// of a sync committee provided in serialized format
func (s *SerializedCommittee) Root() common.Hash {
What hash format is this? Is it a guerrilla SSZ hasher you implemented here?
Basically yes :) I wrote some simple tools for binary Merkle trees because I needed some fairly special things, but then I started using them for hashing everything :) I have now started using tools from github.com/protolambda/zrnt for hashing beacon blocks; I'll check whether I can easily use them here too.
…d state proof db storage
@lightclient and I did a longer review session on this PR today, and it became clear that there is a big issue with the overall structure of the sync implementation. The change introduces the […] If we leave the framework in, then nobody but @zsfelfoldi will be able to maintain the light client code.

The core idea is pretty neat, however. Having reviewed a lot of LES protocol code also written by Zsolt in the past, I can totally see where he is coming from with this, and would like to figure out a way forward. Let me first try to explain why the […]

active vs. passive components

We tend to distinguish 'active' and 'passive' components in our designs. Examples of active components are […]

On the other end, we have passive components, which are essentially just data structures. Examples are […]

The motivation for splitting up a problem into active and passive components is that active components with their goroutines and channels can be hard to test. This is especially true when timers are involved or the logic becomes complex. The best you can hope for is end-to-end tests on toy data, which will never be exhaustive and usually take kind of long to run. If you have a complex 'passive component' driven by a simpler 'active component', it is much easier to come up with exhaustive test cases that will run in almost no time at all. Over time, we have added tools such as package common/mclock to help with testing of such components and make it independent of the system clock, so even timeout scenarios can be tested.

request.Scheduler/Module, the good parts

So what about the structure of sync in this PR? What @zsfelfoldi has realized here is a system where the whole mechanism is defined using mostly passive components. In his design, the client is split into […]

Each […]

Since modules run sequentially and not concurrently, they don't need to be safe for concurrent use.

What's also nice is that the scheduler abstracts away most of the complexity that comes with having multiple servers. The modules don't have to care about peers too much; all they need to do is react to changes of the 'best chain head' and then use […]

Finally, the abstraction allows having optional modules defined across multiple packages. This PR uses this to good effect by defining the block-syncing logic and engine API driver in cmd/blsync, and these modules are not very large at all!

the bad parts: trigger system

As mentioned in the beginning, the main issue with the proposed design is additional complexity for code reviewers. When you look at this thing for the first time, it is very hard to see how sync actually works, because there is no obvious place where 'the steps happen'. Rather, it's the interaction between modules 'triggering' each other that makes it all work. I kind of feel @zsfelfoldi has taken the abstraction a bit too far.

The triggers themselves are a bit weird too. They are identified by name, and can be subscribed or non-subscribed. Modules are chained together by registering a trigger of the same name. So any module can trigger any other module and you won't really know why, unless you look at trace logs. Wait, there is no logging of scheduler triggers? How did you ever debug this, Zsolt!?

the bad parts: servers and requests

Request sending is heavily integrated within the […]

Modules must discover the available request capabilities of servers at runtime by testing for interface implementation. It's a neat use of Go interfaces, but is it really necessary? Can't we just commit to a fixed interface provided by all servers?

In the current implementation, modules can store per-server state by putting it into an […]

And finally there is the issue of handling responses. While a lot of the logic runs within […]

how to fix it

I like the idea of structuring the beacon client code as independent simple passive components. It will definitely help with testing and future extensibility. We can make this work. Here are some ideas for improving the trigger system.
An idea for the requests:
the alternative

The alternative is rewriting the sync completely from scratch and just throwing the framework away. The lower-level light client types like […]

I'm personally not drawn to this alternative, because I think the current approach results in a more robust and flexible codebase if done properly. Even if we ship something very simple now, we will need to add support for multiple servers and strange edge cases later. It won't stay simple. However, it is up to the people who will actually work on it to decide.
Closed in favor of #28822
This PR is the new refactored version of #25901. It is still WIP (no tests, no comments on the new parts) and some cosmetic changes are also planned, but it is already working.
Please note that my Lodestar test node is resyncing atm, so you can try it with the Nimbus node:
See also this top-level overview doc intended to help the review:
https://gist.github.com/zsfelfoldi/254d9356632d384a05a56905fa401db6