Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Meters, assign to dynamic vats to track compute usage #3508

Merged
merged 8 commits into from
Jul 25, 2021

Conversation

warner
Copy link
Member

@warner warner commented Jul 22, 2021

refs #3308

@warner warner force-pushed the 3308-meter-cranks branch from c345047 to 6a24d11 Compare July 22, 2021 17:22
@warner
Copy link
Member Author

warner commented Jul 22, 2021

Still to do:

  • change API to provide/require BigInts everywhere: the per-crank meter consumption is limited to whatever C type XS uses (probably a 32-bit integer), but the long-term Meter value will accumulate many cranks (thousands?), so some more headroom would be better
  • figure out the TypeScript complaints
  • split into smaller PRs, the first few commits are refactorings that could land earlier
  • add a test of a Meter underflowing and reaching the notify threshold in the same crank (to make sure the notification message isn't unwound along with dead vat's messages)
  • document the new Meter API
  • add createInfiniteMeter, as a placeholder for the metered: true that Zoe et al are currently using
  • update all external callers to use it

@warner warner force-pushed the 3308-meter-cranks branch from 6a24d11 to a301ac1 Compare July 24, 2021 21:04
@warner warner changed the base branch from master to 3518-remove-local-metering July 24, 2021 21:04
@warner warner added this to the Testnet: Metering Phase milestone Jul 24, 2021
@warner warner added metering charging for execution (was: package: tame-metering and transform-metering) SwingSet package: SwingSet labels Jul 24, 2021
@warner warner self-assigned this Jul 24, 2021
@warner warner force-pushed the 3308-meter-cranks branch from a301ac1 to f5ade3a Compare July 24, 2021 21:29
@warner warner requested review from FUDCo and katelynsills July 24, 2021 21:29
@warner
Copy link
Member Author

warner commented Jul 24, 2021

I'm still working on docs, but I'm not going to be changing any of the actual code, so I think this is ready for reviewers to look at.

@warner
Copy link
Member Author

warner commented Jul 24, 2021

Er, ok I gotta fix those tests too.

@warner
Copy link
Member Author

warner commented Jul 24, 2021

Oh, right, I removed metered: from the createVat options, but didn't provide a replacement. I'm going to add a createInfiniteMeter for things like Zoe to use for now, until @katelynsills changes it to use a charge account.

@warner warner removed request for katelynsills and FUDCo July 24, 2021 22:02
@warner warner force-pushed the 3308-meter-cranks branch 2 times, most recently from 39865e3 to 2491051 Compare July 24, 2021 23:50
warner added 5 commits July 24, 2021 17:14
This introduces the "meterID" and the "meter record": a pair of
Nats (`remaining` and `threshold`). kernelKeeper functions are added to
create and manipulate them, and test-state.js is enhanced to exercise these.

refs #3308
This introduces user-visible Meter objects, and allows new dynamic vats to be
bound to a single Meter, both exposed through the vatAdmin facet. Meters are
long-term reservoirs of execution credits, denominated in "computrons". Each
bound vat will deduct credits from its meter until exhausted, at which point
the vat will be terminated. This limits the long-term CPU consumption of a
vat, in addition to the fixed per-crank computron limit applied to any
metered vat.

Meters can be refilled and queried through their API. Each meter also has a
Notifier, and a configurable notification threshold: the notifier will be
updated if/when the remaining credits drop below the threshold. This should
allow a supervisor in userspace enough time to refill the meter before the
associated vat(s) are terminated.

See `docs/metering.md` for documentation.

Some notes:
* The vatAdmin facet now offers `createMeter()`, which returns a `Meter`
  object with methods to manipulate its `remaining` and `threshold` values, as
  well as a `getNotifier` to subscribe to threshold-passing events.
  * It also offers `createUnlimitedMeter()`, which never deducts.
* The vatAdmin `createVat()` call now takes a `meter: Meter` option instead
  of `metered: boolean`. If a Meter is provided, two things happen:
  * Each delivery to that (XS) vat is subject to a per-crank compute limit.
  * Each delivery deducts the compute usage from the Meter.
* When a Meter's `remaining` drops below its `threshold`, the notifier is
  triggered with the current `remaining` value. The actual Meter's value might
  have changed by the time the subscriber hears about the update.
* When a vat's Meter reaches zero, the vat is terminated, just as if it had
  violated the per-crank limit.
  * Currently the termination message (used to reject the control facet's
    `.done()` Promise) is different for the per-crank limit vs the Meter limit,
    but this may change.
* Meter deductions and threshold notifications are stashed in a new
  'postAbortActions' record, to make sure they happen even if the crank is
  aborted and all other state changes are unwound.
* The vatManager `managerOptions` still use `metered: boolean`, because the
  vat manager doesn't know about Meters: it only need to know whether to apply
  the per-crank limits or not.

closes #3308
Now that dynamic vats require a Meter object in the `meter:` option, rather
than just a `metered:` boolean flag, the swingset-runniner `--meter` option
doesn't work anymore. I'm removing it to allow the test to keep passing.
Eventually it might be interesting to improve the CLI arguments to create and
provide a Meter, but the need for user-level code to supervise and refill it
makes that non-trivial.
Zoe and spawner use `createVat()` to do their jobs. Previously, they used `{
metered: true }`, which provided a per-crank limit but not cumulative limit.
This changes both to use `createUnlimitedMeter()`, then pass that as a `{
meter }` option, to achieve the same effect.

When Zoe is ready to maintain (and refill) a Meter, change that to use
`createMeter()` instead of `createUnlimitedMeter()`.

This also adds `createMeter` and `createUnlimitedMeter` methods to the fake
vatAdmin objects used by the zoe/spawner/pegasus unit tests, so the new
Zoe/spawner code will work against the mocks.

refs #3308
This improves the error messages when something goes wrong.
@warner warner force-pushed the 3308-meter-cranks branch from 2491051 to a886f41 Compare July 25, 2021 00:14
@warner
Copy link
Member Author

warner commented Jul 25, 2021

Ok, adding the "unlimited" meter should fix that. Ready for review. I have one test to add, but no further code changes to make.

@warner warner requested review from FUDCo and katelynsills July 25, 2021 00:16
@warner warner marked this pull request as ready for review July 25, 2021 00:16
When a crank causes the compute Meter to both hit zero *and* cross the
notification threshold at the same time, we need extra code to make sure the
run-queue push of the notification message does not get deleted by the
`abortCrank()` that unwinds the vat's side-effects.

The mechanism I wrote for this worked, but was overkill. After writing a test
for it, I noticed the test still passed even if I commented out the mechanism
that I thought was necessary. I simplified that code (we only need to repeat
the `deductMeter`, because that will take care of repeating the
notification), and added the test.

refs #3308
Base automatically changed from 3518-remove-local-metering to master July 25, 2021 07:29
Copy link
Contributor

@FUDCo FUDCo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whew!

packages/SwingSet/docs/metering.md Outdated Show resolved Hide resolved

## Meter Objects

The kernel manages `Meter` objects. Each one has a `remaining` capacity and a notification `threshold`. The Meter has a `Notifier` which can inform interested parties when the capacity drops below the threshold, so they can refill it before any associated vats are in danger of being terminated due to an underflow.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we'd want to avoid (or at least, provide the means to avoid if desired) the race between the notification going out to be acted upon and ongoing further computation in the vat underflowing the meter. Instead of just a threshold for notification, it would make sense to me to have a threshold that when reached causes computation to stop in that vat until the meter is replenished, since this is a recoverable condition whereas meter underflow is not (though we'd need an answer as to who pays the freight for any message traffic that piles up in the meantime).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a great idea, although I think we'll need more sophistication in the scheduler first. We need a state for message to be in that says "I want to run this, everything that came before it has been delivered, but for various reasons this one is blocked". I don't know what ordering properties we should impose on the messages that follow: maybe it's just everything for the "paused in arrears" vat that gets stalled, but I think "E order" has something to say about that.

The #3517 "Flow Escalator" approach might provide a mechanism for this, when a flow's first message is blocked, the whole flow gets moved onto a special pseudo-queue that doesn't get serviced until the target vat is resumed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like @FUDCo's idea, which seems related to @zarutian's ideas:

Perhaps chargeAccounts could be „over-charged“, that is go into negative. When that happens ZCFVats of the contract instances drawing from that chargeAccount wont run until the chargeAccount is in the positive again.

He goes on to explain a mechanism for how users can add to the chargeAccount for a contract instance.

Termination seems too extreme, if stopping delivering messages is plausible. Making this a future goal sounds like the right thing to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tracked in #3528


## runPolicy

TODO: The host application can limit the number of cranks processed in a single call to `controller.run()` by providing a `runPolicy` object. This policy object is informed about each crank and the number of computrons it consumed. By comparing the cumulative computrons against an experimentally (and externally) determined threshold, the `runLimit` object can tell the kernel to stop processing before the run-queue is drained. For a busy kernel, with an ever-increasing amount of work to do, this can limit the size of a commitment domain (e.g. the "block" in a blockchain / consensus machine).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is a partial answer to my above comment.

Copy link
Contributor

@katelynsills katelynsills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! One question about the relationship between enforcing a per-crank limit and subtracting from a meter. It seems like we might want them separately, so perhaps the settings should be separate.

import test from 'ava';
// eslint-disable-next-line import/order
import { test } from '@agoric/swingset-vat/tools/prepare-test-env-ava.js';
// eslint-disable-next-line import/order
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I haven't seen this eslint setting before. Tiny thing, but can you tell me more about why we need to disable it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yarn lint-fix keeps rearranging my import statements, I think it wants all "external" imports to appear in alphabetical order, followed by all "internal" imports in order, and this idiom seems to work to keep the prepare-test-env-ava as the very first one (since it has side-effects and won't work if it appears too late). I wouldn't be surprised if there is a way to phrase it with just a single override, but I haven't found it yet, and I got tired of fighting with eslint so I stopped once I got something that worked :).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may be running a different (newer) version of ESLint than I am. My version (which I believe is the agoric-sdk version, but I should check) doesn't enforce that. Here's the output after removing the eslint-disable:

Screen Shot 2021-07-25 at 12 04 19 PM


## Meter Objects

The kernel manages `Meter` objects. Each one has a `remaining` capacity and a notification `threshold`. The Meter has a `Notifier` which can inform interested parties when the capacity drops below the threshold, so they can refill it before any associated vats are in danger of being terminated due to an underflow.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like @FUDCo's idea, which seems related to @zarutian's ideas:

Perhaps chargeAccounts could be „over-charged“, that is go into negative. When that happens ZCFVats of the contract instances drawing from that chargeAccount wont run until the chargeAccount is in the positive again.

He goes on to explain a mechanism for how users can add to the chargeAccount for a contract instance.

Termination seems too extreme, if stopping delivering messages is plausible. Making this a future goal sounds like the right thing to me.


The default (omitting a `meter` option) leaves the vat unmetered.

Assigning a Meter to a vat activates the per-crank limit. To achieve a per-crank limit without a Meter object (which must be refilled occasionally to keep the vat from being terminated), use an unlimited meter:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that seems like a setting that we might want to make explicit, especially since per-crank limits don't really have much to do with reducing a meter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add this as a future goal to consider. My half-considered opinion is that having a per-crank limit but not a Meter isn't really much of a limit: it protects against infinite loops but not infinite send-to-self cycles (with the assistance of a second vat, e.g. purse.getCurrentAmount()). So a vat in that state could consume effectively all of the CPU time.

The only use case I can think of for that mode would be the REPL, which is in a sort of intermediate state between "protect against accidents" but "trust that the code isn't malicious" / "if you break it, you get to keep both pieces". I added UnlimitedMeter to serve as a bridge from the old behavior to the new (charge account) one, and maybe the REPL will stay on it, but I figured real contract vats should not be allowed to use it.

I'll add a ticket with these thoughts where we can think through the use cases. It wouldn't be hard to implement, but I didn't want to leave an attractive nuisance lying around in the API :).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added as #3527

@warner warner merged commit 4ea700d into master Jul 25, 2021
@warner warner deleted the 3308-meter-cranks branch July 25, 2021 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
metering charging for execution (was: package: tame-metering and transform-metering) SwingSet package: SwingSet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants