eth/protocols/snap: use ephemeral channels to avoid cross-sync delveries #22678
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Snap sync has a run loop. All events are passed though this, which ensures race free consistency. This run loop can be stopped and later restarted if for example the pivot block moves.
Unfortunately, the run loop used persistent channels across sync cycles. Whilst this seemed like a good and clean idea, this means extra care needs to be taken to not leak "events" from one cycle into the next. The dirty-snapshot PR surfaced an interesting data race where a very quick stop/restart cycle ended up delivering stale events to the new sync. Since events assume certain invariants, such stale deliveries crashed.
This PR replaces the persistent channels with ephemeral ones, new for every sync cycle. This way even if a great deal of events are piled up, nobody will consume them upon a sync restart and they will just get dropped on the floor via the
cancel
channel.An alternative approach would be some multi-stage shutdown where we first disable accepting new packets and then wait for all active threads to terminate. That seems overly complicated though and a lot more things can be messed up with the complex sync logic.