Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

can/should zoe expose vat-termination on the instance admin facet? #9716

Closed
Tracked by #10137 ...
warner opened this issue Jul 15, 2024 · 1 comment · Fixed by #10267
Closed
Tracked by #10137 ...

can/should zoe expose vat-termination on the instance admin facet? #9716

warner opened this issue Jul 15, 2024 · 1 comment · Fixed by #10267
Assignees
Labels
enhancement New feature or request Zoe package: Zoe

Comments

@warner
Copy link
Member

warner commented Jul 15, 2024

What is the Problem Being Solved?

To perform #9483 , we need a want to terminate about ten soon-to-be-unused vats (eg v29-price_feed-ATOM and v46-scaledPriceAuthority-ATOM). We can think of three approaches:

  • 1: use the new controller.terminateVat() API from within a chain-halting upgrade to kill the vats by their vatID
  • 2: use governance or a core-eval to "upgrade" the vat to a version whose only job is to call vatPowers.exitVat() and self-terminate
  • 3: add a feature to Zoe that allows a bootstrap-vat core-eval to terminate the contract vats

This ticket is about the third option. (The first is awkward because of how cosmic-swingset works, the second is awkward because the "upgrade" code must also re-defined all the durable Kinds or else the upgrade won't work long enough to reach the self-termination).

Description of the Design

TBD, but currently, when someone asks Zoe to instantiate a previously-installed contract, Zoe creates a new contract vat for the instance, starts ZCF, tells ZCF to start the contract, and then Zoe returns some sort of "zoe instance admin facet" to the caller. This facet currently has an upgrade() method. The plan would be to add a terminate() method as well.

Internally, Zoe holds on to the vat-admin service's adminNode (one of the values returned by E(vatAdminSvc).createVat()), so Zoe has the ability to call E(adminNode).terminate(), but Zoe does not currently have any pathway to do that. So we'd add that pathway, exposed as a terminate() on the instance admin facet.

The core-evals which create the price feeds are storing these instance admin facets in the vat-bootstrap promise space. So once the admin facets have a .terminate() method, a later core-eval could pull the facet from promise space and then invoke terminate() on them.

Security Considerations

Nominally, this is not an increase in the authority of the instance-admin-facet holders: they already have the ability to upgrade the vat, which includes (as an annoying-to-use subset) the ability to upgrade to a version which just self-terminates. So from that argument, creating a fast-path-to-termination isn't anything new.

But, making termination a lot easier is a security concern in its own right. We could imagine audit procedures or linting rules that scans proposed contract upgrades for calls to vatPowers.exitVat(), or for leaks of the vatPowers object, and this external-to-the-vat termination pathway would completely bypass those scans.

Scaling Considerations

We should not exercise this new power until the #8928 slow-vat deletion code (and all necessary cosmic-swingset integration code) is in place, otherwise the sudden all-at-once deletion of a large price-feed vat could crash the chain.

Once that is in place, all vat terminations become slow vat deletions. The slow-deletion scheduler will limit the work done during any single block, so even if multiple vats are terminated at the same time, we'll still delete just one at a time. Nevertheless, we should start small, by deleting the smallest stale vat (probably v112) and watching it carefully, so that we'll have time to adjust rates or make changes to the process before we initiate termination of the next one. Each core-eval that triggers this will require a stake-holder vote, but the core-eval could delete one or all of them, so the community needs to keep the scaling concerns in mind when evaluating those proposals.

Test Plan

This should be testable by packages/zoe/ unit tests.

In a3p, once slow-deletion is in place, and once we've upgraded/replaced some price-feed vats, we should have an a3p core-eval step to delete an old vat. That would serve as a better end-to-end integration test, before we exercise this on mainnet. Note that slow-deletion means the a3p chain will keep doing deletion work even after the core-eval completes, which might challenge our assumptions about when an a3p step should be considered "done" (and when we make a snapshot to prepare for the next step). We might need to adjust the way a3p works to wait longer, and avoid leaving lingering cleanup work that would confuse later a3p steps.

Upgrade Considerations

Deploying this will require an upgrade to vat-zoe, as well as a new method on an existing Zoe Kind/Exo.

cc @Chris-Hibbert @erights @dckc

@warner warner added enhancement New feature or request Zoe package: Zoe labels Jul 15, 2024
@Chris-Hibbert
Copy link
Contributor

The adminNode is stored in InstanceStorageManager, and is accessible inside ZCF to make it possible to terminate vats from the inside. That OCap could be passed out to external view in a refactoring--the hard question is how to pass it out to some place where it can continue to be tightly controlled. (There is no creatorFacet for Zoe.)

I think the best answer is to add another method to the configFacet. This means that anything that has the ability to updateZcfBundleId() can also do this. I think that's acceptable, as both should only be called from bootstrap.

@warner warner assigned Chris-Hibbert and unassigned warner Aug 14, 2024
@mergify mergify bot closed this as completed in #10267 Oct 17, 2024
mergify bot added a commit that referenced this issue Oct 17, 2024
closes: #9716

## Description

Add the ability to terminate a vat if you have the adminFacet.

### Security Considerations

Doesn't add authority, because anyone with the adminFacet could also upgrade the contract to something that calls `zcf.shutdown()`.  In practice, this will be controlled by governance or the bootstrap space.

### Scaling Considerations

The scaling issue is about cleaning up abandoned vats that are holding memory we'd like to release.

### Documentation Considerations

We should (not urgently) update the zoe docs to mention this ability.

### Testing Considerations

Adding a unit test wouldn't show anything, since it relies on the adminNode. 

A SwingSet test shows that the vat is dead.

I will pass this to @warner as a draft, so he can do extensive testing of the cleanup process.

### Upgrade Considerations

The upgrade requires only merging the code and upgrading Zoe. This will redefine all the existing `adminFacets`. The holders of those facets (in the bootstrap space) will then be able to invoke the method and observe the results.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Zoe package: Zoe
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants