Execution context to enable flow-like scheduler #7875
Labels: cosmic-swingset, enhancement, needs-design, performance, SwingSet
What is the Problem Being Solved?
Our current scheduler, implemented by swingset and cosmic-swingset, is simplistic and has run-to-completion semantics for every I/O event: an input event (cosmos action, time update) adds entries to the swingset run queue, and controller.run() executes until no more work can be done (the run queue is empty), modulo block execution limits. This leaves the execution vulnerable to runaway message loops, either direct or timer based. See #7847.
We need to switch to a more advanced scheduler design that is capable of interleaving executions, so that new I/O events are not starved by existing work.
Description of the Design
The high level approach is to associate messages together in an "execution flow", which would be roughly (but not exactly) equivalent to the messages that would have been processed in the "run to completion" scheduler. Unlike run to completion, multiple "execution flows" may be active at the same time. These "execution flows" are managed by the host application (cosmic-swingset), which may use them to instruct Swingset which message should be processed next.
This assumes Swingset has multiple queues of messages that are eligible to be processed next, instead of the single run queue we have now. Most likely these would be the per-vat inbound and outbound queues described in #5025, in order to implement the message ordering guarantees defined in #3465: messages from a given vat to the same presence must be delivered in the order they were sent, even if these messages are associated with separate flows.
At its core, every message in a queue would be associated with a "flow id" or "execution context". Swingset is responsible for replicating this "flow id" / "execution context" during vat execution: every message send, promise resolution or subscription made during a delivery automatically inherits the execution flow of the message which triggered the vat execution.
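The inheritance rule above can be sketched as follows. This is purely illustrative: `FlowId`, `QueuedMessage`, and `inheritFlow` are hypothetical names, not actual SwingSet kernel structures.

```typescript
// Hypothetical types; SwingSet's real kernel message shapes differ.
type FlowId = string;

interface QueuedMessage {
  target: string; // e.g. a kref of the target object or promise
  method: string;
  flowId: FlowId; // the execution context this message belongs to
}

// During a delivery, every message send (and promise resolution or
// subscription) inherits the flow of the message that triggered the
// vat execution.
function inheritFlow(
  triggering: QueuedMessage,
  target: string,
  method: string,
): QueuedMessage {
  return { target, method, flowId: triggering.flowId };
}

const input: QueuedMessage = { target: 'ko42', method: 'wake', flowId: 'flow-7' };
const followUp = inheritFlow(input, 'ko43', 'notify');
// followUp carries flow-7 without the vat ever seeing the flow id
```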
The run policy would be replaced by a mechanism that allows the host application to select which queue to process next, if any. Information such as the queue details (vat inbound or outbound, or promise queue, queue depth, etc.) as well as the details of the topmost message of the queue (message type, "flow id"/"execution context") would be available to the host application.
Even if the host does not use the "execution flow" to decide which queue to process next, this would allow users to gain better visibility into the state of the execution triggered by their action.
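A rough sketch of the host-facing shape this could take; `QueueInfo` and `selectNextQueue` are hypothetical names and the selection policy shown is a trivial placeholder, not a proposed policy.

```typescript
// Hypothetical host-facing view of SwingSet's active queues.
interface QueueInfo {
  queueId: string; // e.g. 'v12-inbound'
  kind: 'vat-inbound' | 'vat-outbound' | 'promise';
  depth: number;
  // Details of the topmost message, when the queue is non-empty.
  head?: { type: string; flowId: string };
}

// Replaces the run policy: instead of "run until empty", the host
// inspects the active queues and picks one to service next.
function selectNextQueue(queues: QueueInfo[]): string | undefined {
  const active = queues.filter(q => q.depth > 0);
  // Placeholder policy for illustration only: deepest queue first.
  active.sort((a, b) => b.depth - a.depth);
  return active[0]?.queueId;
}
```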
Single stream limitation
This "execution flow" is an implicit dynamic context which is not revealed to vats. Because of that, we have two general limitations:
The AsyncContext proposal does not enable us to detect this.
Interaction with timers / devices
In order to properly associate a timer event with an "execution flow", the host implementation of the timer device should be provided with the "flow id" / "execution context" that triggered the queueing. That way, once the host selects a new event from the timer queue, it can restore the correct "execution flow". This assumes #7846.
Similarly, when executing devices (timer wake, bridge inbound), the host must be able to set what the current "flow id" / "execution context" is. This is actually how new "execution flows" are created.
When a vat makes a device call, swingset would provide the related context info to the host, which can then transmit it forward if appropriate. One possible use case is to automatically annotate vstorage writes with the transaction info that originally triggered them (which may be different from the block height / time at which the write happens).
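The vstorage annotation use case might look roughly like this. All names here (`FlowContext`, `annotatedVstorageWrite`) are hypothetical, not cosmic-swingset APIs.

```typescript
// Hypothetical context the host could attach when it creates a flow
// for an inbound cosmos action.
interface FlowContext {
  flowId: string;
  originBlockHeight: number; // block in which the flow was created
  txHash?: string; // originating transaction, if any
}

// Annotate a vstorage write with the flow that originally triggered
// it, which may be older than the block in which the write lands.
function annotatedVstorageWrite(path: string, value: string, ctx: FlowContext) {
  return {
    path,
    value,
    meta: {
      flowId: ctx.flowId,
      originBlock: ctx.originBlockHeight,
      tx: ctx.txHash,
    },
  };
}
```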
Deferral of prioritization decision
Swingset itself would not have any logic to decide which active queue should be serviced next. The active queues simply enforce basic ordering guarantees, and in the future will enable partial parallelization of execution. The scheduling decision is offloaded to the host application by allowing it to select the order in which active queues should be processed.
The mechanism described here does not define how the host should implement its prioritization. It simply adds a dimension to the information available to the host to make scheduling decisions.
One possibility would be for the host to select the next queue / message to process based on the amount of execution a flow has seen to date, prioritizing flows that are in progress but have not yet consumed too much execution (a bell-curve-like weighting), perhaps mixing in a priority assigned to certain flows.
It's also possible that the prioritization of certain flows may be influenced by some available economic data, like paid prioritization.
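The bell-curve idea above could be sketched like this. Everything here is hypothetical: the names, the choice of computrons as the execution metric, and the Gaussian-shaped weighting are illustrative assumptions, not a proposed policy.

```typescript
// Hypothetical per-flow stats the host might track.
interface FlowStats {
  flowId: string;
  computronsUsed: number; // cumulative execution seen by this flow
  priorityBoost: number;  // e.g. from paid prioritization; 0 by default
}

// Bell-curve-like weight: flows peak in priority after some execution
// has happened, then fade as they consume more and more. The peak
// value is an arbitrary illustrative constant.
function flowWeight(f: FlowStats, peak = 1_000_000): number {
  const x = (f.computronsUsed - peak) / peak;
  return Math.exp(-x * x) + f.priorityBoost;
}

// Pick the flow whose queue should be serviced next.
function pickFlow(flows: FlowStats[]): string | undefined {
  let best: FlowStats | undefined;
  for (const f of flows) {
    if (!best || flowWeight(f) > flowWeight(best)) best = f;
  }
  return best?.flowId;
}
```

With this weighting, a brand-new flow and a long-running flow both rank below a flow that is mid-execution, matching the "in progress but not for too long" intuition.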
Security Considerations
None that I can think of right now, as this is only information internal to the host application and never exposed to contract code.
Scaling Considerations
By itself this issue does not impact scaling; however, it enables various scheduling changes which will likely have an impact on perceived performance.
Test Plan
Since this issue is about associating flow information with existing messages in queues, the only testing surface is making sure the flow information is propagated as expected.
A cosmic-swingset scheduler built on top of this information would be the interesting bit to test.
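The propagation test surface mentioned above could look something like this in spirit; the fake delivery here stands in for running a real vat and is purely illustrative.

```typescript
// Stand-in for a kernel delivery: each message the vat sends during
// the delivery must inherit the triggering message's flow id.
type Msg = { method: string; flowId: string };

function fakeDeliver(input: Msg, sendsMade: string[]): Msg[] {
  // A real test would run a vat through the kernel and inspect the
  // queues; here we just model the expected inheritance.
  return sendsMade.map(method => ({ method, flowId: input.flowId }));
}

const out = fakeDeliver({ method: 'poke', flowId: 'f1' }, ['a', 'b']);
const allInherited = out.every(m => m.flowId === 'f1');
// allInherited should hold for every message enqueued by the delivery
```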