Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add plugin status plumbing and update server to use it #2041

Merged
merged 8 commits into from
Feb 7, 2020

Conversation

patrick-east
Copy link
Contributor

@patrick-east patrick-east commented Jan 28, 2020

This PR adds in a concept of plugin's being able to report their own status to the plugin manager. There is then plumbing added for status update listeners, which the status plugin does. The OPA server has been updated to use the plugin statuses for the bundle and discovery plugin to more correctly report on /health?bundle=true for readiness probes (it used to not include discovery at all). In addition the server has a new option for /health?plugins that will return a status based on all plugins registered (and reporting status). This gives plugin developers a way to influence the result of the health check if a plugin goes into an error state or needs time to initialize.

Its broken up into smaller bite-sized commits, each with a commit message explaining more about what changed.

Fixes: #2010
Fixes: #2015

@patrick-east patrick-east requested a review from tsandall January 28, 2020 23:55
Copy link
Member

@tsandall tsandall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few high level comments. Once the mutex cleanup is done we can merge this.

docs/content/management.md Outdated Show resolved Hide resolved
docs/content/rest-api.md Outdated Show resolved Hide resolved
plugins/bundle/plugin.go Outdated Show resolved Hide resolved
@@ -173,8 +181,15 @@ func (p *Plugin) Reconfigure(ctx context.Context, config interface{}) {
}
p.downloaders[name] = p.newDownloader(name, source)
p.downloaders[name].Start(ctx)
readyNow = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting change. The most pressing issue is that readiness checks in K8s don't take into account discovery. This PR fixes that however if bundles are added/updated and fail to download within a certain amount of time, the liveness probe will fail and the OPA pod will get whacked. Maybe that's the right behaviour. Do we document this somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, and I think that is probably the right behavior for some use-cases.. At that point in time the OPA is not in a state that it should be answering queries since its missing bundles, maybe some applications it doesn't matter, some it does.

Specific to the OPA HTTP server reporting a 500 on /health we document that any non-activated and configured bundle will cause the health check to fail. Maybe what i'll do is revert the change to the example deployment to omit the ?plugin option on the liveness probe to avoid any accidental confusion on this. People should probably opt in to using it versus just copying the default and having this more aggressive behavior.

This defines a new status API on the plugins.Manager for plugins
to be able to update their status.

Signed-off-by: Patrick East <east.patrick@gmail.com>
The discovery plugin will update its status based on the first time
it initializes and activates configuration. If a bundle is configured
it will switch to OK the first time it activates the configuration of
that bundle. If no dynamic configuration is present it will just go
to an "ok" status at Start() time.

Signed-off-by: Patrick East <east.patrick@gmail.com>
The bundle plugin will emit status based on it having an activated
bundle for each configured bundle. At startup it will remain in a "not
ready" state until all bundles are activated. Upon reconfiguration it
will drop back into "not ready" if new bundles are added or the config
for an existing one was changed, until the bundles have been
activated.

Signed-off-by: Patrick East <east.patrick@gmail.com>
The log plugin will report its status as "ok" at start time and "not
ready" at stop. It currently doesn't change status or report any
errors. This can potentially be extended in the future.

Signed-off-by: Patrick East <east.patrick@gmail.com>
This adds in the status updates for the status plugin (reported to the
plugin manager) as well as hooking the status plugin up as a listener
for *all* plugin status updates to include in the status API events.

Signed-off-by: Patrick East <east.patrick@gmail.com>
The health check now supports a `?plugin` option which will make the
response depend on whether or not all configured plugins are in an OK
state.

The `bundle` parameter will now use the bundle *and* discovery plugin
statuses to determine if the bundles are ready. This corrects an issue
where discovery bundles, and bundles defined by the discovery dynamic
config, were not included with `/health?bundle=true` checks.

The URL parameter parsing has also changed to allow for omitting the
value for the `bundle` option. It will default to `true` so that
`/health?bundle=true` can be shortened to `/health?bundle`.

Fixes: open-policy-agent#2010
Fixes: open-policy-agent#2015
Signed-off-by: Patrick East <east.patrick@gmail.com>
This deprecates `/health?bundle` in favor of (the plural)
`/health?bundles`.

Both have the same API result, taking into account *all* bundle
statuses. This just lessen any confusion with regards to whether
`bundle` means a single bundle or all. It fits better with the new
`plugins` option.

Signed-off-by: Patrick East <east.patrick@gmail.com>
Signed-off-by: Patrick East <east.patrick@gmail.com>
Copy link
Member

@tsandall tsandall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tsandall tsandall merged commit 52691bf into open-policy-agent:master Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OPA server ready check doesn't account for discovery bundles Allow plugins to change Health endpoint status
2 participants