This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Some code cleanup in overseer #2008

Merged
5 commits merged into from
Nov 25, 2020

Conversation

bkchr
Member

@bkchr bkchr commented Nov 24, 2020

  • Switches to select! in the overseer run loop to be more fair about
    message processing between the different sources.
  • Added a check to only send ActiveLeaves if the update actually
    contains any data.

@bkchr bkchr added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Nov 24, 2020
@bkchr bkchr force-pushed the bkchr-overseer-random-stuff branch from 8a06925 to e741864 Compare November 24, 2020 15:53
@@ -1402,7 +1413,9 @@ where

self.clean_up_external_listeners();

- self.broadcast_signal(OverseerSignal::ActiveLeaves(update)).await?;
+ if !update.is_empty() {
Member

This could be empty if we import the same block twice. How can that happen?
Maybe we should log it.

Member Author

That is actually a good question. I will take another look at the code to see why we get empty updates at all.

Member Author

You were right. After looking at the code, it was clear that the finalization function was the problem: we will very likely already have closed all leaves by the time they are finalized, which leads to empty updates.
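A minimal sketch of how finalization can produce an empty update, and of the guard that skips it. Everything here is a simplified assumption for illustration: the struct mirrors only the `activated`/`deactivated` fields seen in this PR, block hashes are modelled as `u64`, and `on_finalized` and `should_broadcast` are hypothetical helpers, not the overseer's actual functions.

```rust
/// Hypothetical, simplified stand-in for the update type; block hashes
/// are modelled as `u64` here.
#[derive(Debug, Default)]
pub struct ActiveLeavesUpdate {
    pub activated: Vec<u64>,
    pub deactivated: Vec<u64>,
}

impl ActiveLeavesUpdate {
    /// An update carries no information if nothing was activated or
    /// deactivated.
    pub fn is_empty(&self) -> bool {
        self.activated.is_empty() && self.deactivated.is_empty()
    }
}

/// Builds the update for a finalization event: only leaves that are
/// still active get deactivated. (Assumed logic, for illustration only.)
pub fn on_finalized(active: &mut Vec<u64>, finalized: &[u64]) -> ActiveLeavesUpdate {
    let mut update = ActiveLeavesUpdate::default();
    active.retain(|leaf| {
        if finalized.contains(leaf) {
            update.deactivated.push(*leaf);
            false
        } else {
            true
        }
    });
    update
}

/// Returns `true` if the signal would be broadcast under the new guard.
pub fn should_broadcast(update: &ActiveLeavesUpdate) -> bool {
    !update.is_empty()
}
```

If a leaf was already closed before finalization reaches it, `on_finalized` finds nothing to deactivate, the update is empty, and the guard skips the broadcast.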

Comment on lines +1332 to +1336
select! {
msg = self.events_rx.next().fuse() => {
let msg = if let Some(msg) = msg {
msg
} else {
Contributor

In fact we probably need the priority of the external events to signal subsystems about deactivated blocks and such

Member Author

Yeah, my reasoning was probably bad. However, I don't expect these events to fail to reach the overseer on time.

Member

I don't think this should be a problem in practice. However, a bigger problem is that we have bounded channels between the overseer and the subsystems: if a subsystem is busy, this can block the overseer from any other action until the subsystem receives a message/signal. Maybe we should consider dropping messages instead.

Contributor

I don't think dropping messages is a good idea. It would be nice to have backpressure on block import instead, but Substrate isn't tooled for that yet.

Contributor

Still, these channels are pretty performant and the number of messages coming through is relatively low (~hundreds or low thousands per block). I would be highly surprised if the overseer could not deliver at least an order of magnitude beyond that.
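For readers unfamiliar with the trade-off discussed in this thread, here is a small sketch of a full bounded channel. It uses the standard library's blocking `sync_channel` as a stand-in for the async channels between the overseer and the subsystems, so it only models the behaviour; `Delivery`, `try_deliver`, and `bounded` are hypothetical names. A plain `send` on a full bounded channel waits for the receiver, which is the blocking concern raised above. `try_send` returns immediately and hands the message back, leaving the caller to decide whether to retry, drop, or apply backpressure.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Outcome of a non-blocking delivery attempt.
#[derive(Debug, PartialEq)]
pub enum Delivery {
    Sent,
    /// The receiver's queue is full; the message is handed back.
    Full(u32),
    /// The receiving side has gone away.
    Disconnected,
}

/// Tries to deliver `msg` without blocking the caller.
pub fn try_deliver(tx: &SyncSender<u32>, msg: u32) -> Delivery {
    match tx.try_send(msg) {
        Ok(()) => Delivery::Sent,
        Err(TrySendError::Full(msg)) => Delivery::Full(msg),
        Err(TrySendError::Disconnected(_)) => Delivery::Disconnected,
    }
}

/// Creates a bounded channel with room for `capacity` queued messages.
pub fn bounded(capacity: usize) -> (SyncSender<u32>, Receiver<u32>) {
    sync_channel(capacity)
}
```

With a capacity of one, the second `try_deliver` call reports `Full` until the receiver takes the first message. The thread above does not settle on dropping messages, so this sketch only shows where that decision would be made.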

- self.activated.iter().collect::<HashSet<_>>() == other.activated.iter().collect::<HashSet<_>>() &&
- self.deactivated.iter().collect::<HashSet<_>>() == other.deactivated.iter().collect::<HashSet<_>>()
+ self.activated.len() == other.activated.len() && self.activated.iter().all(|a| other.activated.contains(a))
+ && self.deactivated.len() == other.deactivated.len() && self.deactivated.iter().all(|a| other.deactivated.contains(a))
Member

I'd reorder the checks a bit: we can check the length of both activated and deactivated first as a fast path, although this probably doesn't matter performance-wise. One thing to note, though, is that this check is now quadratic.

Member Author

Good point on reordering the length checks. Regarding the quadratic check: while that is true, I strongly assume this will be faster here, because we don't allocate and most of the time there is only one element in the update.
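The two comparison strategies from this thread can be sketched side by side. This is a standalone illustration under assumptions, not the code in `node/subsystem/src/lib.rs`: hashes are modelled as `u64`, and `Update`, `eq_via_sets`, `eq_via_contains`, and `updates_equal` are hypothetical names. `updates_equal` applies the suggested reordering, checking both lengths before doing any membership scans.

```rust
use std::collections::HashSet;

/// Old approach: collect both sides into sets and compare them.
/// Allocates two `HashSet`s per call.
pub fn eq_via_sets(a: &[u64], b: &[u64]) -> bool {
    a.iter().collect::<HashSet<_>>() == b.iter().collect::<HashSet<_>>()
}

/// New approach: compare lengths, then check membership.
/// No allocation, but quadratic in the number of elements.
pub fn eq_via_contains(a: &[u64], b: &[u64]) -> bool {
    a.len() == b.len() && a.iter().all(|x| b.contains(x))
}

/// Hypothetical, simplified stand-in for the update type.
pub struct Update {
    pub activated: Vec<u64>,
    pub deactivated: Vec<u64>,
}

/// Order-insensitive comparison with both length checks up front as a
/// fast path, followed by the membership scans.
pub fn updates_equal(a: &Update, b: &Update) -> bool {
    a.activated.len() == b.activated.len()
        && a.deactivated.len() == b.deactivated.len()
        && a.activated.iter().all(|x| b.activated.contains(x))
        && a.deactivated.iter().all(|x| b.deactivated.contains(x))
}
```

Both strategies ignore element order. They agree when neither side contains duplicates; with duplicates they can differ, for example `[1, 1]` and `[1, 2]` compare unequal as sets but equal under the length-plus-`contains` check.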

node/subsystem/src/lib.rs (outdated, resolved)
@montekki
Contributor

bot merge

@ghost

ghost commented Nov 25, 2020

Trying merge.

@ghost ghost merged commit 9a32ab1 into master Nov 25, 2020
@ghost ghost deleted the bkchr-overseer-random-stuff branch November 25, 2020 09:27
ordian added a commit that referenced this pull request Nov 25, 2020
* master:
  Some code cleanup in overseer (#2008)
  PoV Distribution optimization (#1990)
  Approval Distribution Subsystem (#1951)
  Session management for approval voting (#1973)
  Do not send messages twice in bitfield distribution (#2005)
This pull request was closed.
4 participants