Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for Pebble check-failed and check-recovered events #1281

Merged
merged 6 commits into from
Jul 17, 2024

Conversation

tonyandrewmeyer
Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer commented Jul 3, 2024

Two new event types are supported (actually getting these events will also require a Juju change):

  • PebbleChangeFailedEvent - emitted when a Pebble check in a container hits the failure threshold
  • PebbleChangeRecoveredEvent - emitted when a Pebble check in a container passes after previously hitting the failure threshold

Each event type has a single new attribute (in addition to those inherited from WorkloadEvent):

  • info - this is a shortcut to Pebble's get_check method (not otherwise accessible from the charm), but will only query Pebble if more than the name is requested, and will cache the result on the event instance.

Two new enums are added to Pebble, to make it easier to work with these events, particularly with Harness:

  • ChangeStatus - the possible values for ops.pebble.Change.status (note that this is the full list, but we only expect charms to see Doing, Done, and Error for now).
  • ChangeKind - the possible values for ops.pebble.Change.kind (again, this is the current full list, but we only expect charms to see perform-check and recover-check at the moment).

Harness.pebble_notify is extended to support emitting the two new events. This requires implementation knowledge (that the check events are triggered by a Pebble notice of type change-updated), but provides minimal support for testing with Harness, and will be supplemented by user-friendly testing in Scenario.

The testing Pebble backend gains an implementation for get_change and get_checks. These are not used directly here, but will be used in Scenario, and it makes more sense to have the implementation in ops.

This is the ops portion of OP046. It requires a corresponding change to Pebble and change to Juju.

This also exposes PebbleCheckEvent, which I don't love, but it works much better for the documentation without repeating a bunch of content.
@tonyandrewmeyer
Copy link
Contributor Author

This is not particularly usable until the Juju and Pebble changes land, but it should be ok to review now (but not merge: we need to wait in case any changes in the Juju/Pebble side have impacts here).

@tonyandrewmeyer tonyandrewmeyer marked this pull request as ready for review July 5, 2024 03:58
@tonyandrewmeyer tonyandrewmeyer requested review from dimaqq and benhoyt July 5, 2024 03:58
benhoyt added a commit to canonical/pebble that referenced this pull request Jul 5, 2024
…-check (#444)

For notices of kind `change-updated`, the notice data will include the
`kind` field that is standard for these notices, but also when `kind` is
`perform-check` or `recover-check`, it will also include `check-name`,
set to the name of the check that is being performed or recovered.

This is the Pebble portion of
[OP046](https://docs.google.com/document/d/13ItH8l5ZSmmv9WqZXpqjV8EifZU5e3zxINiNLubnC68/edit).
It is used by Juju ([PR](juju/juju#17655)) and
ops ([PR](canonical/operator#1281)).

---------

Co-authored-by: Ben Hoyt <benhoyt@gmail.com>
Copy link
Contributor

@dimaqq dimaqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review from a mobile device

ops/charm.py Outdated Show resolved Hide resolved
ops/model.py Show resolved Hide resolved
ops/pebble.py Show resolved Hide resolved
ops/pebble.py Show resolved Hide resolved
ops/pebble.py Outdated Show resolved Hide resolved
ops/charm.py Outdated Show resolved Hide resolved
ops/pebble.py Show resolved Hide resolved
@tonyandrewmeyer tonyandrewmeyer requested review from benhoyt and dimaqq July 8, 2024 06:45
Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

test/test_testing.py Show resolved Hide resolved
ops/testing.py Show resolved Hide resolved
ops/testing.py Show resolved Hide resolved
ops/testing.py Show resolved Hide resolved
ops/model.py Show resolved Hide resolved
Comment on lines +854 to +856
d = super().snapshot()
d['check_name'] = self.info.name
return d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super-minor: why d? because it's a dict? delta? data?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
d = super().snapshot()
d['check_name'] = self.info.name
return d
return {**super().snapshot(), "check_name": self.info.name}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of the d either, but it's consistent with all of the others. @benhoyt what's your take on this? A separate PR to adjust them all? Have this one be inconsistent? Change them all in this PR as a drive-by? Leave it alone for now?

Copy link
Collaborator

@benhoyt benhoyt Jul 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a separate PR to adjust them all would be fine. Probably dct or snapshot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimaqq would you like to open one for that? I can do it if not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My vote is for an inline dictionary merge, as it obviates local variable entirely, while is just as readable.

ops/model.py Show resolved Hide resolved
jujubot added a commit to juju/juju that referenced this pull request Jul 16, 2024
#17655

This creates new `<container-name>-pebble-check-failed` and `<container-name>-pebble-check-failed` hooks, which are triggered by Pebble change-update notices of particular kinds & statuses, with the effect that we'll now respond to Pebble checks reaching the failure threshold, or passing after having reached the threshold, by sending a pebble-check-failed or pebble-check-recovered event to the charm.

## QA steps

### Test the hooks directly:
```shell
# Pack and deploy the test charm
$ juju add-model t
$ cd testcharms/charms/pebble-checks
$ charmcraft pack
$ juju deploy ./juju-qa-pebble-checks_ubuntu-22.04-amd64.charm --resource ubuntu-image=public.ecr.aws/ubuntu/ubuntu:22.04

# The check is set to succeed if `/trigger/` exists, so make it.
$ juju ssh --container ubuntu juju-qa-pebble-checks/0 mkdir /trigger/

$ juju status # unit status should settle to "active" after pebble-ready

# Make the check fail
$ juju ssh --container ubuntu juju-qa-pebble-checks/0 rmdir /trigger/
$ juju status # will now say status "maintenance" with message "check failed: exec-check"

# Make the check pass again
$ juju ssh --container ubuntu juju-qa-pebble-checks/0 mkdir /trigger/
$ juju status # will now say status "active" with message "check recovered: exec-check"
```

### Test the shell test:
`./main.sh -p k8s -c minikube sidecar test_pebble_checks`

## Documentation changes

 prdesc [CHARMTECH-160](https://warthogs.atlassian.net/browse/CHARMTECH-160), [CHARMTECH-161](https://warthogs.atlassian.net/browse/CHARMTECH-161), [CHARMTECH-162](https://warthogs.atlassian.net/browse/CHARMTECH-162) cover updating the juju.is (main and SDK) docs

## Links

<!-- Link to all relevant specification, documentation, bug, issue or JIRA card. -->

* [OP046](https://docs.google.com/document/d/13ItH8l5ZSmmv9WqZXpqjV8EifZU5e3zxINiNLubnC68/edit)
* [juju/charm PR](juju/charm#432)
* [ops PR](canonical/operator#1281)
* [pebble PR](canonical/pebble#444)
* [jhack PR](canonical/jhack#172)
* [Scenario PR](canonical/ops-scenario#151)

**Jira card:** [CHARMTECH-109](https://warthogs.atlassian.net/browse/CHARMTECH-109)

[CHARMTECH-160]: https://warthogs.atlassian.net/browse/CHARMTECH-160?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
@tonyandrewmeyer tonyandrewmeyer merged commit e3eae86 into canonical:main Jul 17, 2024
27 checks passed
@tonyandrewmeyer tonyandrewmeyer deleted the pebble-check-events branch July 17, 2024 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants