-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernel should commitCrank at end of start() to record device initialization state writes #4450
Comments
Hrm, |
Huh, right, so For vats, we dealt with this by not doing any state-changing work as we bring the vats online (for the first time, or any time). Now that we want vats to be able to use I wonder if we need to do something simliar with devices. the wrinkle is that they legitimately(??) need to read their state on each host restart. Except.. well, they could defer that until someone invokes the device, so even the first read happens in the context of a crank. I think we went through this before, with the mailbox or timer device, where having a flag in RAM to avoid re-reading DB state cause a consensus failure. Here, it's not a flag in RAM, but rather state-mutating vatstore calls happening at startup, outside the context of a crank. Needs more thought. |
So maybe the rule is that raw devices are not allowed to perform syscalls during I made this change on So it's highly probable that this ticket should be closed without code changes. Maybe we should document this prohibition in the raw-device docs (docs/devices.md), maybe we should provide some sort of enforcement. |
@FUDCo with the new SQLite -based storage (and the removal of the crank buffer), do you think this problem is fixed now? |
What is the Problem Being Solved?
While working on #4372, I noticed that one unit test (test/virtualObjects/test-representatives.js 'exercise cache') was unexpectedly broken by the introduction of a new devices (
devices.bundle
). This (raw) device does avatstoreGet
andvatstoreSet
as it starts up, to initialize the "next device node" counter. The unit test uses an instrumentedkvStore
that records all gets and sets, which appear there after the crank buffer is flushed (so we'll observe all the reads happen first, then a batch of writes, etc).The unit test was flunked by the unexpected pair of
devices.bundle
operations. However, they didn't happen next to each other. Instead, I saw theset
occur a long time after the initialget
.I think the issue is that we aren't doing a
commitCrank()
at the end ofkernel.js start()
. So any kvstore writes that happen during startup (e.g. while devices are being loaded and executed) are left in the crank buffer until the first real crank happens, then they get flushed along with the real work.This is a bit weird: the delivery to vat-bootstrap shouldn't be entangled with state initialization for the whole kernel. It probably wouldn't cause a problem unless the host application performed
start()
and then immediately committed the hostDB and exited. In that case, the device writes would probably get dropped. This would cause the device to do them again at next reboot, beforebootstrap()
is delivered, which would cause a consensus failure.It seems to me that we ought to flush all of these out these changes by adding a
commitCrank()
to the end ofstart()
. The consequences I can think of are that the crank numbers will be one higher (which will probably break a few unit tests).A related issue that occurred to me: device
vatstore
writes that happen because of vat invocations will get unwound if that vat delivery is aborted (along with any other state changes caused by the abandoned delivery). But if the devices invoked its endowments, those changes can't be unwound. Anything the device does is supposed to go into a transaction or buffer or something that doesn't get committed until the host application commits (hangover inconsistency prevention), but it might not be obvious to device authors that they need to pay attention to this restriction.Description of the Change
in
kernel.js
, at the end ofstart()
, just after thekernelKeeper.loadStats()
, add akernelKeeler.commitCrank()
.Security Considerations
none that I can think of
Test Plan
Some tests may need to be updated, with a higher crank number than before.
cc @FUDCo for thoughts and in case you grab it before I do
The text was updated successfully, but these errors were encountered: