
exported object ID counter vs upgrade #4730

Closed
warner opened this issue Mar 3, 2022 · 2 comments · Fixed by #4898 or #4926
Labels: enhancement (New feature or request), SwingSet (package: SwingSet)

warner commented Mar 3, 2022

What is the Problem Being Solved?

While reviewing @gibson042 's #4717, I realized that liveslots keeps its "next exported ID" counters in RAM (nextExportID and nextPromiseID, incremented via allocateExportID() and allocatePromiseID()). That's fine within a single vat-version lifetime, but when the vat is upgraded those RAM counters are reset, so the new version will start exporting objects (and promises) from o+1 and p+1 all over again.
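
Roughly, the in-RAM allocation looks like this (a simplified sketch; the real liveslots code tracks more state, and the vref formatting here is illustrative):

```js
// Simplified sketch of the RAM-only counters: they live in closure state,
// so a fresh makeLiveSlots() (i.e. an upgraded vat-version) starts over at 1.
function makeAllocators() {
  let nextExportID = 1;
  let nextPromiseID = 1;
  const allocateExportID = () => `o+${nextExportID++}`; // exported object vref
  const allocatePromiseID = () => `p+${nextPromiseID++}`; // allocated promise vref
  return { allocateExportID, allocatePromiseID };
}
```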

That means the new vat-version will export objects whose vrefs collide with the earlier version's exports. We might terminate all non-durable object exports during upgrade (I'm still undecided whether vats should reconnect their "precious" Far/Remotable exports, or whether those should simply be terminated), which might reduce the scope of the problem. But I think we still use that counter to allocate the base vref for new virtual/durable Kinds, and those allocations must not collide.

Promises are troublesome too. Half of them aren't a problem: we intend to reject all outstanding promises during upgrade, so we'll remove them from the upgraded vat's c-list (the kernel rejects them, so vat-version2 doesn't need to, which means it doesn't even need to reference them; and since no references to them exist in durable data, vat-version2 doesn't need to be told about them either).

But that's determined by the decider, not by the vpid allocator. If vat-version1 sent a message and attached a result promise with vref p+1, that promise will be decided by some other vat, so we won't reject it when the sending vat is upgraded. What we could do is remove the upgraded vat from the subscribers list, so it won't receive resolution notifications, and then the kpid's absence from the c-list won't matter. That sounds expensive, though: we have an index from kpid to subscribing VatIDs, but nothing mapping a VatID to the list of promises it subscribes to. So maybe we should leave the upgraded vat on the subscriber list but delete all c-list entries involving promises. When the notification arrives, handleNotify will see that the kpid is missing from the c-list and conclude that the vat somehow already received the resolution (batched resolutions can cause this).
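
A hedged sketch of that "delete the promise c-list entries, leave the subscription" approach, with the c-list modeled as a plain Map from kpid to vpid (the real kernel keeps it in the kvStore, and handleNotify does more than this):

```js
// At upgrade: drop every promise entry from the vat's c-list, but leave the
// vat on each promise's subscriber list.
function deletePromiseCListEntries(clist) {
  for (const [kpid, vpid] of [...clist.entries()]) {
    if (vpid.startsWith('p')) {
      clist.delete(kpid);
    }
  }
}

// Later, when a resolution notification arrives, a missing c-list entry is
// treated as "already handled", just like the batched-resolution case.
function handleNotifySketch(clist, kpid, resolution, deliver) {
  const vpid = clist.get(kpid);
  if (vpid === undefined) {
    return; // kpid not in c-list: deleted at upgrade, nothing to deliver
  }
  deliver(vpid, resolution);
}
```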

Ok, so nextPromiseID isn't a problem: the c-list seen by vat-version2 will be entirely devoid of promise references.

But nextExportID might be. The fix is to keep it in the vatStore, durably, but that sounds expensive too.

It's starting to sound like a pre-upgrade "cache flush" step would be convenient. We already plan to do a bringOutYourDead before upgrade, to write out the virtual object cache. Maybe we should write nextExportID to the vatStore at the same time? Kinda gross but maybe ok.
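
For example, the flush step could look something like this (vatstoreSet/vatstoreGet are the existing vatStore syscalls; the 'idCounters' key name and record shape are assumptions):

```js
// At bringOutYourDead (or any pre-upgrade flush point): persist the counters.
function flushIDCounters(syscall, nextExportID, nextPromiseID) {
  const record = { exportID: nextExportID, promiseID: nextPromiseID };
  syscall.vatstoreSet('idCounters', JSON.stringify(record));
}

// In the next vat-version's makeLiveSlots(): resume where version1 left off.
function loadIDCounters(syscall) {
  const stored = syscall.vatstoreGet('idCounters');
  return stored ? JSON.parse(stored) : { exportID: 1, promiseID: 1 };
}
```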

cc @FUDCo

Description of the Design

Security Considerations

Test Plan


warner commented Mar 16, 2022

To unsubscribe the promises, we might also walk the upgraded vat's c-list during upgrade and examine the status of each promise it mentions as we remove it. If the promise is unresolved and the vat is a subscriber, remove the vat from the subscriber list. If the promise is unresolved and the vat is the decider, reject it. It might be both.
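
In sketch form (promise records modeled as plain objects with state/decider/subscribers fields, mirroring the kernel's promise table; the reject callback stands in for the kernel's rejection path):

```js
// Walk the upgraded vat's promises as their c-list entries are removed.
function cleanupVatPromises(promises, vatID, reject) {
  for (const [kpid, p] of promises.entries()) {
    if (p.state !== 'unresolved') {
      continue; // already settled, nothing to do
    }
    // subscriber: stop sending this vat resolution notifications
    p.subscribers = p.subscribers.filter(v => v !== vatID);
    // decider: nobody can resolve it now, so reject it (a vat can be both)
    if (p.decider === vatID) {
      reject(kpid, Error('vat upgraded'));
    }
  }
}
```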


FUDCo commented Mar 22, 2022

We already have a bunch of this kind of thing going on in the kernel, so I don't know that using the DB for this will appreciably add to the load. Certainly we can use a write-through cache to replace the read-modify-write cycle with a simple write, and I think with a little additional cleverness we can arrange for that write to happen at most once per crank.
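
Something like the following write-through cache would do it (a sketch; the end-of-crank flush hook and the key name are assumptions):

```js
// Counters are read once at startup; allocation only mutates RAM and marks
// the cache dirty; a single flush at end-of-crank writes everything back.
function makeCounterCache(syscall, key = 'idCounters') {
  const stored = syscall.vatstoreGet(key);
  const counters = stored ? JSON.parse(stored) : {};
  let dirty = false;
  function allocate(name, initial = 1) {
    const value = counters[name] ?? initial;
    counters[name] = value + 1;
    dirty = true; // no read-modify-write against the DB
    return value;
  }
  function flushIfDirty() {
    if (dirty) {
      syscall.vatstoreSet(key, JSON.stringify(counters)); // <= 1 write per crank
      dirty = false;
    }
  }
  return { allocate, flushIfDirty };
}
```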

Tartuffo added this to the Mainnet 1 milestone Mar 23, 2022
mergify bot closed this as completed in #4898 Mar 24, 2022
warner added a commit that referenced this issue Mar 26, 2022
Liveslots maintains three counters: the Export ID (used for `o+NN`
exported object vrefs), the Promise ID (for `p+NN` allocated
promises), and the Collection ID (one per virtual/durable
collection).

Originally, these counters were only held in RAM: initialized during
`makeLiveSlots()` and incremented when needed. To fix #4730,
`makeLiveSlots()` was changed to read them from the DB, and then
liveslots writes them back to the DB at the end of any delivery which
modifies them. This ensures that a future version of the vat+liveslots
can pick up allocation where the previous version left off.

However, the DB read that initialized the in-RAM copy did not set the
'dirty' flag. This caused DB writes to vary across vats: `startVat` would
not write out the initial values if `buildRootObject` did not happen to
export any additional objects. In addition, since the DB value is a simple
`JSON.stringify` of the counter record, the order of its keys varied
depending upon whether an Object, Promise, or Collection ID was allocated
first.

This commit changes the counter initialization process to use a fixed
order for the counter names, and to always perform the
initialization (and DB write) at the same time in all vats: during
`startVat`, with a write at the end of the delivery. This should make
the initial kvStore contents (just after `startVat`) more consistent.

The new initialization code should behave correctly when we add new
counters in the future: it will merge the new counter names (and
initial values) into the record it reads from the DB during startVat.

A handful of unit tests were updated to expect the new order.
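
The merge-on-startVat behavior described above might look roughly like this (the `idCounters` key name is an assumption; the initial values and record shape are illustrative):

```js
// Known counters, in a fixed order so JSON.stringify output is stable.
const initialIDCounters = { exportID: 1, promiseID: 1, collectionID: 1 };

function initializeIDCounters(syscall) {
  const stored = syscall.vatstoreGet('idCounters');
  const loaded = stored ? JSON.parse(stored) : {};
  const counters = {};
  for (const name of Object.keys(initialIDCounters)) {
    // merge: keep stored values, fill in defaults for newly added counters
    counters[name] = name in loaded ? loaded[name] : initialIDCounters[name];
  }
  // always write during startVat, so every vat's kvStore starts consistent
  syscall.vatstoreSet('idCounters', JSON.stringify(counters));
  return counters;
}
```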
warner added a commit that referenced this issue Mar 26, 2022
warner added a commit that referenced this issue Mar 28, 2022