Bug 1876680: Reduce influence of webhooks and convert some errors to non-fatal #697

Merged: 5 commits, Sep 24, 2020

@JoelSpeed JoelSpeed commented Sep 9, 2020

This should allow our validation/defaulting webhooks to respond with some of their validations as warnings rather than errors.

The webhooks were given opinionated defaults and now require those values during validation. This may cause issues for existing users, so we need to rein these changes in and ensure that we only error when a missing value would prevent a Machine from being created successfully.

This PR should make the webhooks less opinionated and ensure that we don't break any existing installations.

Contributor

@michaelgugino michaelgugino left a comment

We should do a separate PR for k8s version bump.

For instance, openshift/client-go looks like it should be release-4.6 but we're using something much older.

return admission.Denied(err.Error())
ok, warnings, errs := h.webhookOperations(m, h.clusterID)
if !ok {
return responseWithWarnings(admission.Denied(errs.Error()), warnings)
Contributor

This will result in a denial still?

Contributor Author

Indeed, the intention here is that within the webhookOperations we should be able to define fatal and non-fatal errors.

Fatal errors are added to errs; non-fatal errors are added to warnings.

If there are any fatal errors, we return an admission.Denied, as we currently do for any error.

If there are no fatal errors, we return admission.Allowed, as we currently do when there are no errors.

However, in both cases we can also return non-fatal errors (warnings). The request is still allowed or denied as appropriate, and the warnings are sent back to the user either way.

For a worked example, imagine everything else is valid and, as a user, I try to set the disk size to 60gb. The validation logic thinks this should be 120gb or more, so it sets a warning. There are no other errors, so it returns allowed with the warning "disk size should be greater than 120gb". The user's update request is accepted, but kubectl shows them this warning.
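The fatal/non-fatal split described above can be sketched in isolation. This is a minimal, self-contained illustration, not the PR's code: the `response` struct is a hypothetical stand-in for `admission.Response`, and `validateDisk`, `admit`, and `minRecommendedDiskGB` are names invented here to mirror the 60gb/120gb worked example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// response is a hypothetical stand-in for admission.Response.
// It is not the controller-runtime type.
type response struct {
	Allowed  bool
	Message  string
	Warnings []string
}

// minRecommendedDiskGB is an illustrative threshold taken from the worked example.
const minRecommendedDiskGB = 120

// validateDisk separates fatal errors (errs) from non-fatal ones (warnings).
func validateDisk(diskSizeGB int) (warnings []string, errs []error) {
	if diskSizeGB <= 0 {
		// Assumed to be fatal for this sketch: a Machine cannot have an empty disk.
		errs = append(errs, errors.New("disk size must be greater than zero"))
		return warnings, errs
	}
	if diskSizeGB < minRecommendedDiskGB {
		// Non-fatal: the request is still accepted, but the user is told.
		warnings = append(warnings,
			fmt.Sprintf("disk size should be greater than %dgb", minRecommendedDiskGB))
	}
	return warnings, errs
}

// admit denies on any fatal error and allows otherwise.
// Warnings are attached to the response in both cases.
func admit(diskSizeGB int) response {
	warnings, errs := validateDisk(diskSizeGB)
	if len(errs) > 0 {
		msgs := make([]string, 0, len(errs))
		for _, err := range errs {
			msgs = append(msgs, err.Error())
		}
		return response{Allowed: false, Message: strings.Join(msgs, "; "), Warnings: warnings}
	}
	return response{Allowed: true, Message: "Machine valid", Warnings: warnings}
}

func main() {
	for _, size := range []int{60, 200, 0} {
		r := admit(size)
		fmt.Printf("size=%d allowed=%t warnings=%d message=%q\n",
			size, r.Allowed, len(r.Warnings), r.Message)
	}
}
```

In this sketch a 60gb disk is allowed with one warning, a 200gb disk is allowed with none, and a zero-size disk is denied.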

}

return admission.Allowed("Machine valid")
return responseWithWarnings(admission.Allowed("Machine valid"), warnings)
Contributor

This seems confusing to me. We're saying it's valid, but also there may be warnings?

Contributor

+1

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 12, 2020
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 16, 2020
@JoelSpeed JoelSpeed changed the title Allow admission controllers to respond with warnings Reduce influence of webhooks and convert some errors to non-fatal Sep 16, 2020
Comment on lines +413 to +404
func responseWithWarnings(response admission.Response, warnings []string) admission.Response {
response.AdmissionResponse.Warnings = warnings
return response
}

Contributor Author

I intend for this to be replaced by kubernetes-sigs/controller-runtime#1157 once it is merged

Member

Maybe add a TODO comment?

Contributor Author

Since it's merged, and to keep the scope of this PR small, I'm going to raise a follow-up PR, which we can keep open until we are happy to merge it, rather than adding the TODO.

}
if providerSpec.OSDisk.ManagedDisk.StorageAccountType == "" {
errs = append(errs, field.Required(field.NewPath("providerSpec", "osDisk", "managedDisk", "storageAccountType"), "storageAccountType must be provided"))
if providerSpec.OSDisk.DiskSizeGB <= 0 || providerSpec.OSDisk.DiskSizeGB >= 32768{
Member

Move 32768 into a constant?
Missing space before {
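One way the suggestion above could look is sketched here. The constant name `maxOSDiskSizeGB` and the helper `diskSizeInRange` are hypothetical; only the bound 32768 and the shape of the comparison come from the diff.

```go
package main

import "fmt"

// maxOSDiskSizeGB names the upper bound that the diff writes inline as 32768.
// The constant name is hypothetical.
const maxOSDiskSizeGB = 32768

// diskSizeInRange is the inverse of the diff's check, which flags
// sizes <= 0 or >= the bound.
func diskSizeInRange(sizeGB int32) bool {
	return sizeGB > 0 && sizeGB < maxOSDiskSizeGB
}

func main() {
	for _, size := range []int32{0, 128, 32767, 32768} {
		fmt.Printf("%d in range: %t\n", size, diskSizeInRange(size))
	}
}
```

Naming the bound keeps the validation message and the check in sync if the limit ever changes.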

workspaceWarnings, workspaceErrors := validateVSphereWorkspace(providerSpec.Workspace, field.NewPath("providerSpec", "workspace"))
warnings = append(warnings, workspaceWarnings...)
errs = append(errs, workspaceErrors...)

errs = append(errs, validateVSphereNetwork(providerSpec.Network, field.NewPath("providerSpec", "network"))...)

if providerSpec.NumCPUs != 0 && providerSpec.NumCPUs < minVSphereCPU {
Member

Shouldn't we remove the providerSpec.NumCPUs != 0 check? Wouldn't providerSpec.NumCPUs < minVSphereCPU be enough?

Contributor Author

Yep, good point, will fix
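For reference, the two forms of the check discussed in this thread differ only when the value is zero, assuming the minimum is positive. The sketch below uses an assumed `minCPU` of 2; the real value of minVSphereCPU is not shown in this conversation, and both helper names are hypothetical.

```go
package main

import "fmt"

// minCPU is an assumed positive minimum for illustration only.
const minCPU = 2

// belowMinSkippingZero mirrors the original condition: zero is treated as unset and skipped.
func belowMinSkippingZero(n int32) bool { return n != 0 && n < minCPU }

// belowMin mirrors the simplified condition suggested in review.
func belowMin(n int32) bool { return n < minCPU }

func main() {
	for _, n := range []int32{0, 1, 2, 4} {
		fmt.Printf("n=%d skippingZero=%t simplified=%t\n",
			n, belowMinSkippingZero(n), belowMin(n))
	}
}
```

With the simplified form, an unset (zero) value is also flagged, so it only makes sense if a zero value should not pass the check silently.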

Member

enxebre commented Sep 16, 2020

can we also drop defaultGCPServiceAccounts on GCP? we can always add it back later.

enxebre added a commit to enxebre/cluster-api-actuator-pkg that referenced this pull request Sep 16, 2020
Member

enxebre commented Sep 16, 2020

/retitle Bug 1876680: Reduce influence of webhooks and convert some errors to non-fatal

@openshift-ci-robot openshift-ci-robot changed the title Reduce influence of webhooks and convert some errors to non-fatal Bug 1876680: Reduce influence of webhooks and convert some errors to non-fatal Sep 16, 2020
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Sep 16, 2020
@openshift-ci-robot
Contributor

@JoelSpeed: This pull request references Bugzilla bug 1876680, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1876680: Reduce influence of webhooks and convert some errors to non-fatal

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Sep 16, 2020
@JoelSpeed
Contributor Author

> can we also drop defaultGCPServiceAccounts on GCP? we can always add it back later.

In my testing, if there was no GCP Service Account then the node wouldn't join the cluster. Based on what I've done in this PR so far, I'd suggest making this a warning if changing at all, WDYT?

Member

enxebre commented Sep 16, 2020

/approve

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 16, 2020
The webhooks were given opinionated defaults and now require those
values during validation. This may cause issues for existing users, so
we need to rein these changes in and ensure that we only error when a
missing value would prevent a Machine from being created
successfully.
Member

enxebre commented Sep 23, 2020

/retest

Contributor

@elmiko elmiko left a comment

this generally looks good to me. i'm withholding the label for now, though, because i couldn't tell whether there were still changes requested. happy to revisit if there are no more questions.

Member

enxebre commented Sep 24, 2020

/retest

Member

enxebre commented Sep 24, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 24, 2020
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 5cf5a28 into openshift:master Sep 24, 2020
@openshift-ci-robot
Contributor

@JoelSpeed: All pull requests linked via external trackers have merged:

Bugzilla bug 1876680 has been moved to the MODIFIED state.

@JoelSpeed JoelSpeed deleted the webhook-warnings branch November 18, 2020 14:16
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
8 participants