vat creation must wait between cranks for the vat to finish startup #2908

Closed
warner opened this issue Apr 18, 2021 · 1 comment
Labels: bug, SwingSet

warner commented Apr 18, 2021

Describe the bug

While trying to use the xsnap worker for dynamic vats, I found a problem with the way we spin up dynamic vats. The sequence looks like this:

  • crank 1: client (parent) vat sends message to vatAdmin, asking it to create a new vat, getting a result promise in return
  • crank 2: vatAdmin does syscall.invoke to the vat-creation device, which leads into the kernel's createVatDynamically() function, which:
    • allocates a new vatID
    • calls vatManagerFactory(vatID, options), which kicks off the vat creation process and returns a Promise for when it's done
    • that Promise is wired to a function that will enqueue a message to vatAdmin when the vat is ready
    • createVatDynamically() returns (synchronously) the newly-allocated vatID
  • (still crank 2): vatAdmin arranges to react to the vat-is-ready message by resolving the result promise to the client (parent) vat
  • at this point, the kernel appears to be idle: the ready-promise queue is empty, (probably) nothing is on the run-queue
    • so e.g. c.run() would return, the host application would allow a block to be finished
  • (some indeterminate time later, possibly in the middle of some other crank): the vat creation process finishes
    • the "vat is ready" message is enqueued on the run-queue
  • (next time the kernel is run) (crank N): vat-is-ready is delivered to vatAdmin, which resolves the result promise
  • crank N+1: client (parent) vat gets notified of their result promise being resolved, which gives them the admin facet and the new vat's root object Presence
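To make the gap concrete, here is a minimal toy model of that sequence. It is a sketch, not the real SwingSet kernel: only the names `createVatDynamically` and `vatManagerFactory` come from the description above, and `makeToyKernel`, `slowFactory`, the message shapes, and `startupMs` are hypothetical stand-ins.

```javascript
// Toy model of the current flow; not the real kernel code.
// Everything except createVatDynamically and vatManagerFactory is hypothetical.
function makeToyKernel(vatManagerFactory) {
  const runQueue = [];
  let nextVatNum = 1;

  function createVatDynamically(options) {
    const vatID = `v${nextVatNum}`;
    nextVatNum += 1;
    // Kick off vat creation, but do not wait for it.
    vatManagerFactory(vatID, options).then(() => {
      // Enqueued whenever startup happens to finish.
      runQueue.push({ type: 'vat-is-ready', vatID });
    });
    return vatID; // returned synchronously
  }

  async function run() {
    let cranks = 0;
    while (runQueue.length > 0) {
      const msg = runQueue.shift();
      if (msg.type === 'create-vat') {
        createVatDynamically(msg.options);
      }
      cranks += 1;
    }
    return cranks; // the kernel looks idle once the run-queue is empty
  }

  return { runQueue, run };
}

// Stand-in for a worker process that takes wall-clock time to start.
const slowFactory = (vatID, options) =>
  new Promise(resolve => setTimeout(resolve, options.startupMs));

async function demo() {
  const kernel = makeToyKernel(slowFactory);
  kernel.runQueue.push({ type: 'create-vat', options: { startupMs: 20 } });
  await kernel.run();
  // run() has returned, but the vat is still starting up.
  const queueAfterRun = kernel.runQueue.length;
  await new Promise(resolve => setTimeout(resolve, 50));
  // The vat-is-ready message appeared while the kernel was "idle".
  const queueLater = kernel.runQueue.length;
  return { queueAfterRun, queueLater };
}
```

In this model `demo()` resolves to `{ queueAfterRun: 0, queueLater: 1 }`: `run()` returns with an empty run-queue, and the vat-is-ready message shows up afterwards.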

There are several significant problems with this:

  • the kernel appears idle, but in fact work is taking place behind the scenes (mostly in a child process)
    • so we might finish a block too early
  • the vat-is-ready message is enqueued at an unpredictable time: non-determinism
  • multiple vats being created at the same time will enqueue their vat-is-ready messages in an unpredictable order: non-determinism
  • any syscalls that the new vat makes during setup (e.g. syscall.vatstoreSet if it creates virtual objects during buildRootObject) happen at an unpredictable time, so the resulting state changes are non-deterministic too
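The ordering problem can be shown with a small sketch. The names `startVat` and `readyOrder` are hypothetical, and the timer delays stand in for however long each worker takes to start:

```javascript
// Hypothetical stand-in for worker startup with a varying wall-clock delay.
const startVat = (vatID, startupMs) =>
  new Promise(resolve => setTimeout(() => resolve(vatID), startupMs));

// Start one vat per delay, all at once, and record the order in which
// their vat-is-ready notifications would land on the run-queue.
async function readyOrder(delays) {
  const runQueue = [];
  const pending = delays.map((ms, i) =>
    startVat(`v${i + 1}`, ms).then(vatID => {
      runQueue.push(vatID);
    }),
  );
  await Promise.all(pending);
  return runQueue;
}
```

Here `readyOrder([30, 5])` yields `['v2', 'v1']` while `readyOrder([5, 30])` yields `['v1', 'v2']`: the queue order follows startup timing, not the order in which the vats were requested.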

Since we can't wait for the vat to become ready during vatAdmin's syscall.invoke, I think the next best policy is to wait for the vat to become ready during a special crank-like window immediately after that invoke calls createVatDynamically. Basically, each time createVatDynamically is called, we should push the vatID onto a queue. When the normal crank finishes, we check the queue, pop the first vatID off it, and run a special "wait for vatID X to get ready" pseudo-crank. This fixes each of the problems above: c.run() waits for the pseudo-crank (we don't pretend the kernel is idle), nothing else gets to run in the meantime (restoring determinism), the pseudo-crank has a transaction window for setup-time syscalls to happen (so syscall.vatstoreSet has a home), and the whole thing finishes before we let the next vat creation happen (so the vat-is-ready messages are queued strictly in the order in which syscall.invoke/createVatDynamically ran during the last crank).
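A rough sketch of that policy, again as a toy model rather than the real kernel. The `pendingVats` queue, the `create-vats` message type, and `slowFactory` are hypothetical; only `createVatDynamically` and `vatManagerFactory` are names from this issue. In this variant creation still starts inside `createVatDynamically`, and only the waiting moves into the pseudo-crank:

```javascript
// Toy model of the proposed startup pseudo-crank; not the real kernel code.
// Everything except createVatDynamically and vatManagerFactory is hypothetical.
function makeToyKernel(vatManagerFactory) {
  const runQueue = [];
  const pendingVats = []; // vatIDs waiting for their startup pseudo-crank
  const readyPromises = new Map(); // vatID -> Promise from vatManagerFactory
  const delivered = []; // vat-is-ready messages, in delivery order
  let nextVatNum = 1;

  function createVatDynamically(options) {
    const vatID = `v${nextVatNum}`;
    nextVatNum += 1;
    readyPromises.set(vatID, vatManagerFactory(vatID, options));
    pendingVats.push(vatID);
    return vatID;
  }

  async function run() {
    while (runQueue.length > 0) {
      const msg = runQueue.shift();
      if (msg.type === 'create-vats') {
        msg.optionsList.forEach(options => createVatDynamically(options));
      } else if (msg.type === 'vat-is-ready') {
        delivered.push(msg.vatID);
      }
      // Normal crank is done: run one pseudo-crank per pending vat, in order.
      while (pendingVats.length > 0) {
        const vatID = pendingVats.shift();
        // run() cannot return while a vat is still starting up.
        await readyPromises.get(vatID);
        readyPromises.delete(vatID);
        runQueue.push({ type: 'vat-is-ready', vatID });
      }
    }
    return delivered;
  }

  return { runQueue, run };
}

// Stand-in for a worker process that takes wall-clock time to start.
const slowFactory = (vatID, options) =>
  new Promise(resolve => setTimeout(resolve, options.startupMs));

async function demo() {
  const kernel = makeToyKernel(slowFactory);
  // One crank creates two vats; the second one starts up faster.
  kernel.runQueue.push({
    type: 'create-vats',
    optionsList: [{ startupMs: 30 }, { startupMs: 5 }],
  });
  const order = await kernel.run();
  return { order, leftover: kernel.runQueue.length };
}
```

In this model `demo()` resolves to `{ order: ['v1', 'v2'], leftover: 0 }` even though v2 finishes starting first: the messages follow creation order, and `run()` only returns once both vats are ready.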

I think this might mean we want to defer even starting vat creation until this pseudo-crank. In that case, the queue we manage will contain both the allocated vatID and all the managerOptions we want to deliver to vatManagerFactory. The actual call to vatManagerFactory() won't happen until the top of the pseudo-crank, and the pseudo-crank won't finish until the factory is done and we've enqueued the vat-is-ready message.
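The deferred variant might look something like the following sketch. `makeDeferredCreator`, `runStartupPseudoCranks`, and `recordingFactory` are hypothetical names, and the real `managerOptions` would carry whatever `vatManagerFactory` actually needs:

```javascript
// Sketch of deferring vat creation to the pseudo-crank; not the real kernel code.
// Everything except createVatDynamically, vatManagerFactory, and managerOptions
// is hypothetical.
function makeDeferredCreator(vatManagerFactory) {
  const pendingVats = []; // entries are { vatID, managerOptions }
  const runQueue = [];
  let nextVatNum = 1;

  // Called during a normal crank: allocate the vatID and remember the
  // options, but do not start the vat yet.
  function createVatDynamically(managerOptions) {
    const vatID = `v${nextVatNum}`;
    nextVatNum += 1;
    pendingVats.push({ vatID, managerOptions });
    return vatID;
  }

  // Called after the normal crank finishes: one pseudo-crank per vat.
  async function runStartupPseudoCranks() {
    while (pendingVats.length > 0) {
      const { vatID, managerOptions } = pendingVats.shift();
      // The factory is only invoked at the top of the pseudo-crank.
      await vatManagerFactory(vatID, managerOptions);
      runQueue.push({ type: 'vat-is-ready', vatID });
    }
  }

  return { createVatDynamically, runStartupPseudoCranks, runQueue };
}

async function demo() {
  const events = [];
  // Stand-in factory that records when each startup begins and ends.
  const recordingFactory = (vatID, managerOptions) => {
    events.push(`start ${vatID}`);
    return new Promise(resolve =>
      setTimeout(() => {
        events.push(`done ${vatID}`);
        resolve();
      }, managerOptions.startupMs),
    );
  };
  const creator = makeDeferredCreator(recordingFactory);
  creator.createVatDynamically({ startupMs: 30 });
  creator.createVatDynamically({ startupMs: 5 });
  await creator.runStartupPseudoCranks();
  return { events, ready: creator.runQueue.map(msg => msg.vatID) };
}
```

With this shape the startups are fully serialized: `demo()` records `start v1`, `done v1`, `start v2`, `done v2`, so setup-time syscalls from one vat can never interleave with another vat's startup.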

cc @FUDCo @dckc

warner commented Apr 23, 2021

fixed by #2946

warner closed this as completed Apr 23, 2021