Conversation
- Switches to `select!` in the overseer run loop to be more fair about message processing between the different sources.
- Added a check to only send `ActiveLeaves` if the update actually contains any data.
8a06925 to e741864
node/overseer/src/lib.rs
Outdated
@@ -1402,7 +1413,9 @@
 	self.clean_up_external_listeners();

-	self.broadcast_signal(OverseerSignal::ActiveLeaves(update)).await?;
+	if !update.is_empty() {
This could be empty if we import the same block twice; how can this happen? Maybe we should log it.
That is actually a good question. I will take another look at the code to see why we get empty updates at all.
You were right: after looking at the code, it was clear that the finalization function was the problem. By the time leaves are finalized, we will very likely already have deactivated all of them, leading to empty updates.
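To make the empty-update scenario concrete, here is a minimal, self-contained sketch of the guard the diff adds. The struct is a simplified stand-in for the real `ActiveLeavesUpdate` in `node/subsystem` (field types and `is_empty` signature are assumptions for illustration, not the actual API):

```rust
// Hypothetical, simplified stand-in for ActiveLeavesUpdate; the real type
// holds block hashes, u32 is used here only for brevity.
#[derive(Debug, Default)]
struct ActiveLeavesUpdate {
    activated: Vec<u32>,
    deactivated: Vec<u32>,
}

impl ActiveLeavesUpdate {
    // An update carries data only if it activates or deactivates something.
    fn is_empty(&self) -> bool {
        self.activated.is_empty() && self.deactivated.is_empty()
    }
}

fn main() {
    // On finalization, the affected leaves may already have been deactivated
    // by earlier imports, producing an empty update; the guard skips the
    // broadcast in that case instead of waking every subsystem for nothing.
    let update = ActiveLeavesUpdate::default();
    if !update.is_empty() {
        println!("broadcasting ActiveLeaves");
    } else {
        println!("skipping empty update");
    }
}
```

The point of the check is purely to avoid signalling subsystems when there is nothing to report.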
select! {
	msg = self.events_rx.next().fuse() => {
		let msg = if let Some(msg) = msg {
			msg
		} else {
In fact, we probably need to prioritize the external events, so we can signal subsystems about deactivated blocks and such.
Yeah, my reasoning was probably off; still, I expect that these events will reach the overseer in time.
I don't think this should be a problem in practice. However, a bigger problem is that we have bounded channels between the overseer and the subsystems: if a subsystem is busy, this can block the overseer from any other action until the subsystem receives a message/signal. Maybe we should consider dropping messages instead.
I don't think dropping messages is a good idea. It would be nice to have backpressure on block import instead, but Substrate isn't tooled for that yet.
Still, these channels are pretty performant, and the number of messages coming through is relatively low (~hundreds or low thousands per block). I would be highly surprised if the overseer could not deliver at least an order of magnitude beyond that.
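The blocking behaviour being discussed can be sketched with a std bounded channel. This is not the overseer's actual channel (the real code uses futures mpsc, and the capacity here is arbitrary), but it shows the failure mode: once the buffer is full, a blocking `send` would stall the sender until the receiver drains a slot.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

fn main() {
    // Bounded channel with capacity 2, standing in for an
    // overseer-to-subsystem channel.
    let (tx, rx) = sync_channel::<u32>(2);

    tx.try_send(1).unwrap();
    tx.try_send(2).unwrap();

    // The buffer is now full. A blocking send() here would wedge the sender,
    // which is exactly how a busy subsystem could stall the overseer loop.
    // try_send reports the condition instead of blocking.
    assert!(matches!(tx.try_send(3), Err(TrySendError::Full(3))));

    // Once the receiver drains a message, a slot frees up again.
    assert_eq!(rx.recv().unwrap(), 1);
    tx.try_send(3).unwrap();
}
```

Dropping messages corresponds to discarding on `Full` instead of blocking; backpressure corresponds to letting the blocking `send` propagate the stall upstream.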
node/subsystem/src/lib.rs
Outdated
- self.activated.iter().collect::<HashSet<_>>() == other.activated.iter().collect::<HashSet<_>>() &&
- 	self.deactivated.iter().collect::<HashSet<_>>() == other.deactivated.iter().collect::<HashSet<_>>()
+ self.activated.len() == other.activated.len() && self.activated.iter().all(|a| other.activated.contains(a))
+ 	&& self.deactivated.len() == other.deactivated.len() && self.deactivated.iter().all(|a| other.deactivated.contains(a))
I'd reorder the checks a bit: we could check the lengths of both activated and deactivated first as a fast path, although this probably doesn't matter performance-wise. One thing to note though: this check is now quadratic.
Good point on the length check reordering. Regarding the quadratic check: while that is true, I strongly suspect we will be faster here, because we don't allocate and most of the time there is only one element in the update.
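The trade-off discussed above can be sketched side by side. These are standalone illustrative functions, not the actual PR code; the element type is arbitrary (the real comparison is over block hashes):

```rust
use std::collections::HashSet;

// O(len(a) * len(b)) membership check with no allocation. With ~1 element
// per update, the quadratic cost is negligible and the lack of allocation
// is the win.
fn eq_quadratic(a: &[u32], b: &[u32]) -> bool {
    a.len() == b.len() && a.iter().all(|x| b.contains(x))
}

// O(len(a) + len(b)) expected time, but allocates two HashSets on every
// comparison — the approach the diff replaces.
fn eq_hashed(a: &[u32], b: &[u32]) -> bool {
    a.iter().collect::<HashSet<_>>() == b.iter().collect::<HashSet<_>>()
}

fn main() {
    // Both treat the slices as sets: order does not matter.
    assert!(eq_quadratic(&[1, 2], &[2, 1]));
    assert!(eq_hashed(&[1, 2], &[2, 1]));
    assert!(!eq_quadratic(&[1, 2], &[1, 3]));
}
```

Note that, like the code in the diff, `eq_quadratic` relies on the inputs being duplicate-free to be a true set-equality check; with only a handful of unique leaves per update, that assumption holds in practice.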
bot merge
Trying merge.