Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: VM Execution Lanes #10551

Merged
merged 22 commits into from
Mar 30, 2023
Merged

feat: VM Execution Lanes #10551

merged 22 commits into from
Mar 30, 2023

Conversation

vyzo
Copy link
Contributor

@vyzo vyzo commented Mar 23, 2023

Overview

This pr implements effective execution lanes for the fvm, supporting a default and a priority execution lane.

The priority execution lane reserves a number of execution units so that there is always some capacity for it even if default execution load spikes, typically through user load.

Implementation Notes

  • Configuration is handled through the LOTUS_FVM_CONCURRENCY (also used by the fvm itself to reserve execution units) and LOTUS_FVM_CONCURRENCY_RESERVED env vars.
  • Currently, only ApplyBlocks execution is prioritized, as it is critical for sync/consensus.

@vyzo vyzo requested a review from raulk March 23, 2023 15:20
@maciejwitowski maciejwitowski added this to the Lotus - v1.23.0 milestone Mar 23, 2023
@vyzo vyzo marked this pull request as ready for review March 23, 2023 19:12
@vyzo vyzo requested a review from a team as a code owner March 23, 2023 19:12
chain/consensus/compute_state.go Outdated Show resolved Hide resolved
chain/stmgr/call.go Outdated Show resolved Hide resolved
chain/vm/execution.go Show resolved Hide resolved

const (
DefaultAvailableExecutionLanes = 4
DefaultPriorityExecutionLanes = 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed synchronously, we need to justify this number by looking at the critical path usages in the chain sync subsystem. From my understanding, those are the following, but @magik6k will probably have better ideas.

  • Incoming block validation via gossip
  • Head coalescer executions.
  • Actual deferred execution.

Also note that time-sensitive miner gas estimations and mpool pushes will fall under the "user" section. I'm not sure how much of a problem that could be, probably not much since those daemons won't be facing arbitrary user workloads. But worth confirming.

chain/vm/vmi.go Outdated Show resolved Hide resolved
metrics/metrics.go Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed this whole file yet. Pending.

Copy link
Contributor

@arajasek arajasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly looks like it should work, some notes.

chain/vm/execution.go Outdated Show resolved Hide resolved
chain/vm/execution.go Show resolved Hide resolved
chain/consensus/compute_state.go Outdated Show resolved Hide resolved
chain/stmgr/call.go Outdated Show resolved Hide resolved
chain/vm/vmi.go Outdated Show resolved Hide resolved
@vyzo vyzo requested a review from Stebalien March 27, 2023 16:16
Copy link
Contributor

@fridrik01 fridrik01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just some minor comments

chain/vm/execution.go Outdated Show resolved Hide resolved
chain/vm/vmi.go Outdated Show resolved Hide resolved
// DefaultPriorityExecutionLanes is the number of reserved execution lanes for priority computations.
// This is purely userspace, but we believe it is a reasonable default, even with more available
// lanes.
DefaultPriorityExecutionLanes = 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment somewhere that DefaultAvailableExecutionLanes include the DefaultPriorityExecutionLanes so they are not added together (unless I am somehow terribly mistaken).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, this should not be confusing.

Comment on lines +137 to +141
reserving := 0
if e.reserved > 0 {
e.reserved--
reserving = 1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I understand, is the executionToken.reserved mainly used to handle the case where we have a ExecutionLanePriority where all the priority lanes are used/full but there are free default ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly, it is the spill mechanism.

Copy link
Contributor

@arajasek arajasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but let's let @fridrik01 also approve.

Copy link
Contributor

@fridrik01 fridrik01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM also

@Stebalien
Copy link
Member

Doing this entirely in lotus is looking pretty gnarly. Someone is going to forget to call Done in some error path and cause a deadlock (or call Done multiple times).

Please don't merge this yet, I think there's a better way and I'll spend a few minutes trying to flesh it out.

@Stebalien
Copy link
Member

Ok, so:

  1. The benefit of this approach is that it limits the lotus node to LOTUS_FVM_CONCURRENCY concurrent machines, limiting memory usage.
  2. The downsides are:
    1. This prevents interleaving message execution. This doesn't matter for throughput, but it increases the latency of otherwise fast requests by forcing them to block on other slower requests.
    2. It infects everything with .Done() statements.

An alternative approach is to handle this at the ApplyMessage level. I.e.:

  1. ApplyBlocks calls ApplyMessage with a priority of 0.
  2. Everything else calls ApplyMessage with the default priority (e.g., 100).
  3. Internally, ApplyMessage uses a semaphore to stop "normal" priority calls when the number of available lanes reaches LOTUS_FVM_CONCURRENCY_RESERVED.

Unlike the current approach, this doesn't require any deferred calls to some "done" method. Really, this can be entirely handled with a flag in the context passed to ApplyMessage.

In the future, this can be refined further: Instead of always reserving a couple of lanes, ApplyBlocks can raise the number of reserved priority lanes when called and reduce them when it returns.

@vyzo
Copy link
Contributor Author

vyzo commented Mar 29, 2023

calling Done multiple times is idempotent, so that part is safe.

@vyzo
Copy link
Contributor Author

vyzo commented Mar 29, 2023

Your proposal would use a similar mechanism (ie tokens for execution lanes).

The advantage it has is that the lock is more finegrained and that the Done call is limited in scope at ApplyMessage.

@vyzo
Copy link
Contributor Author

vyzo commented Mar 29, 2023

I am not opposed to doing it that way, but the benefit really is marginal imo.

@vyzo
Copy link
Contributor Author

vyzo commented Mar 29, 2023

Also, we dont need the context option at all, we can do the locking at the ApplyMessage shim and have the priority come from vmopts as it currently does.

In short, we can hide the whole getToken/putToken/Done business inside the ApplyMessage shim of the executor, and it's not the end of the world if you forget to call Done outside.

That's a change I am more willing to implement.

@vyzo
Copy link
Contributor Author

vyzo commented Mar 29, 2023

See #10590 for implementation of the finegrained lock.

Honestly, I don't think perf will improve (it will likely be slightly worse for ApplyBlocks as we'll have to take the lock a thousand times), but not having to call Done is a win safety-wise.

chain/vm/execution.go Outdated Show resolved Hide resolved
chain/consensus/compute_state.go Outdated Show resolved Hide resolved
chain/consensus/compute_state.go Outdated Show resolved Hide resolved
chain/consensus/compute_state.go Outdated Show resolved Hide resolved
chain/stmgr/call.go Outdated Show resolved Hide resolved
chain/vm/execution.go Show resolved Hide resolved
chain/vm/execution.go Show resolved Hide resolved
@vyzo vyzo merged commit bf666a3 into master Mar 30, 2023
@vyzo vyzo deleted the vyzo/feat/exec-lanes branch March 30, 2023 19:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants