Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Don't send ActiveLeaves from leaves in db on startup in Overseer #6727

Merged
merged 6 commits into from
Mar 6, 2023

Conversation

tdimitrov
Copy link
Contributor

@tdimitrov tdimitrov commented Feb 16, 2023

This PR modifies Overseer not to send ActiveLeaves on startup with the leaves passed to it on construction. These leaves are obtained from the database and they usually are stale. Sending them to subsystems for processing leads to failing runtime calls (because the leaves are already pruned).

The new behaviour is only to import the leaves but not to notify the subsystems about it. First ActiveLeaves will be sent when fresh a fresh leaf is imported.

Partially addresses paritytech/polkadot-sdk#793. Fixes #6694.

Also check #6689 for more context.

cumulus companion: paritytech/cumulus#2275

@tdimitrov tdimitrov added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. T4-parachains_engineering This PR/Issue is related to Parachains performance, stability, maintenance. labels Feb 16, 2023
@tdimitrov
Copy link
Contributor Author

And if this gets merged we should revert #6691

ActiveLeavesUpdate::start_work(ActivatedLeaf { hash, number, status, span });
self.broadcast_signal(OverseerSignal::ActiveLeaves(update)).await?;
}
self.on_head_activated(&hash, None).await;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this call actually do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and why do we care about stale leaves here?

Copy link
Contributor Author

@tdimitrov tdimitrov Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on_head_activated creates a span for the leaf and notifies external listeners (in case someone is subscribed for it). I think it's better not to remove it.

The entries it creates are cleaned up on finalization.

and why do we care about stale leaves here?

My theory: the initial idea was to start the subsystems as fast as possible. And there is a contract, that first leaf is special and it should be used just for a initialization. For example the dispute-coordinator waits for the first leaf before starting its main loop.

But at some point the subsystems started doing actual work when the first leaf is received and we started seeing the state discarded issues.

I misunderstood your question. After discussing it online - we don't care. I'll remove it.

@tdimitrov tdimitrov requested a review from rphmeier February 16, 2023 08:36
Copy link
Member

@eskimor eskimor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tdimitrov
Copy link
Contributor Author

I want to test how this performs on Versi and then I'll merge

@ordian
Copy link
Member

ordian commented Feb 27, 2023

And if this gets merged we should revert #6691

Not sure about that. The issue #6452 was addressing is related to node falling out of sync after startup. So it will go into major sync from time to time due to some slowness. It could be due to network issues or CPU/IO or some load problems in dispute-coordinator or other subsystems.

Reverting #6691 would reintroduce #6656 I'm afraid.

This PR doesn't seem to address finality notifications being sent during major sync (probably by design), so I believe #6691 and #6452 are still relevant.

@eskimor
Copy link
Member

eskimor commented Mar 2, 2023

@tdimitrov the recent tests on versi - have they been based on this one? And if so - can we merge?

@tdimitrov
Copy link
Contributor Author

The recent tests didn't include this PR.
I had it on versi for a quick run, it didn't cause any issues, but also I haven't tested it thoroughly.
Let me run some more tests and I'll merge?

@tdimitrov
Copy link
Contributor Author

The PR is working fine on versi for 3+ days. I don't think it solves paritytech/polkadot-sdk#793 completely but definitely makes it better.

I'm merging once CI is green,

@tdimitrov
Copy link
Contributor Author

bot merge

@paritytech-processbot
Copy link

Waiting for commit status.

@paritytech-processbot
Copy link

Merge cancelled due to error. Error: Statuses failed for 4c52942

@tdimitrov
Copy link
Contributor Author

bot merge

@paritytech-processbot
Copy link

Waiting for commit status.

@paritytech-processbot paritytech-processbot bot merged commit 177ceca into master Mar 6, 2023
@paritytech-processbot paritytech-processbot bot deleted the tsv-no-db-leaves branch March 6, 2023 15:02
paritytech-processbot bot pushed a commit to paritytech/cumulus that referenced this pull request Mar 6, 2023
* Don't pass `leaves()` to `Overseer::builder()`

This is a companion for paritytech/polkadot#6727

* update lockfile for {"substrate", "polkadot"}

---------

Co-authored-by: parity-processbot <>
ordian added a commit that referenced this pull request Mar 7, 2023
* master: (27 commits)
  bump `zombienet` version to v1.3.37 (#6773)
  Bump `blake2b_simd` to 1.0.1 (#6829)
  changelog: update template for new label behavior (E3/E4) (#6804)
  Companion for paritytech/substrate#12828 (#6380)
  Don't send `ActiveLeaves` from leaves in db on startup in Overseer (#6727)
  Polkadot XCM Body constants (#6788)
  Decrease expected peer count in zombinenet tests (#6826)
  Additional tracing in `provisioner`, `vote_selection` and `dispute-coordinator` (#6775)
  Change node-key for bootnodes (#6772)
  Change handle_import_statements to FatalResult (#6820)
  Introduce XCM matcher for writing barriers (#6756)
  Freeze note on `SessionInfo`. (#6818)
  Bump parity-db (#6816)
  Removing Outdated References to Misbehavior Arbitration Subsystem (#6814)
  Forgotten re-export for `MatchedConvertedConcreteId` (#6815)
  Companion for substrate#13509: bump API versions of {Beefy,Mmr}Api (#6809)
  Migrate to `Weight::from_parts` (#6794)
  [XCM] Multiple `FungiblesAdapter`s support + `WeightTrader::buy_weight` more accurate error (#6739)
  Get rid of unnecessary cloning and work. (#6808)
  changelog: fix migration listing (#6806)
  ...
ordian added a commit that referenced this pull request Mar 7, 2023
* master: (27 commits)
  bump `zombienet` version to v1.3.37 (#6773)
  Bump `blake2b_simd` to 1.0.1 (#6829)
  changelog: update template for new label behavior (E3/E4) (#6804)
  Companion for paritytech/substrate#12828 (#6380)
  Don't send `ActiveLeaves` from leaves in db on startup in Overseer (#6727)
  Polkadot XCM Body constants (#6788)
  Decrease expected peer count in zombinenet tests (#6826)
  Additional tracing in `provisioner`, `vote_selection` and `dispute-coordinator` (#6775)
  Change node-key for bootnodes (#6772)
  Change handle_import_statements to FatalResult (#6820)
  Introduce XCM matcher for writing barriers (#6756)
  Freeze note on `SessionInfo`. (#6818)
  Bump parity-db (#6816)
  Removing Outdated References to Misbehavior Arbitration Subsystem (#6814)
  Forgotten re-export for `MatchedConvertedConcreteId` (#6815)
  Companion for substrate#13509: bump API versions of {Beefy,Mmr}Api (#6809)
  Migrate to `Weight::from_parts` (#6794)
  [XCM] Multiple `FungiblesAdapter`s support + `WeightTrader::buy_weight` more accurate error (#6739)
  Get rid of unnecessary cloning and work. (#6808)
  changelog: fix migration listing (#6806)
  ...
ggwpez pushed a commit to ggwpez/runtimes that referenced this pull request Jul 13, 2023
* Don't pass `leaves()` to `Overseer::builder()`

This is a companion for paritytech/polkadot#6727

* update lockfile for {"substrate", "polkadot"}

---------

Co-authored-by: parity-processbot <>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. T4-parachains_engineering This PR/Issue is related to Parachains performance, stability, maintenance.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

overseer: signal completion of major sync
4 participants