Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRO admission plugin: never mutate below namespace minimums #18553

Merged
merged 6 commits into from
Feb 13, 2018

Conversation

frobware
Copy link
Contributor

@frobware frobware commented Feb 9, 2018

Update the ClusterResourceOverride admission plugin to never mutate
container resource requirements below the floor specified by any
LimitRange from the same namespace.

Reason:

  • Web console stripped out cluster resource override config awareness,
    but the user experience is broken right now when users select a
    memory limit that is the floor of the LimitRange value.

  • You can see this on free-int by setting a memory limit of 100Mi,
    your deployment will not work because we allowed
    ClusterResourceOverride to go below the LimitRange floor.

The implication of this change is the over-commit ratio is skewed at
the lower end of the resource consumption scale (but honestly, that
makes sense).

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 9, 2018
@frobware
Copy link
Contributor Author

frobware commented Feb 9, 2018

/cc @derekwaynecarr PTAL

@openshift-ci-robot
Copy link

@frobware: GitHub didn't allow me to request PR reviews from the following users: PTAL.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @derekwaynecarr PTAL

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -45,10 +44,12 @@ func Register(plugins *admission.Plugins) {
glog.Infof("Admission plugin %q is not configured so it will be disabled.", api.PluginName)
return nil, nil
}
return newClusterResourceOverride(pluginConfig)
return newClusterResourceOverride(pluginConfig, defaultGetNamespaceLimitRanges)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an odd injection pattern... why not have a LimitRangeLister in the plugin and populate it like this:

// SetInternalKubeInformerFactory implements the WantsInternalKubeInformerFactory interface.
func (p *clusterResourceOverridePlugin) SetInternalKubeInformerFactory(f informers.SharedInformerFactory) {
	p.limitRangesLister = f.Core().InternalVersion().LimitRanges().Lister()
}


for _, limitRange := range limitRanges {
for _, limit := range limitRange.Spec.Limits {
if limit.Type != kapi.LimitTypeContainer {
Copy link
Contributor

@liggitt liggitt Feb 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with what this means... if there's a pod limitrange that says min is 256MB, and a container limit range that says the min is 384MB, does that not mean the min on a container is 384? if so, don't we need to include this type of limit?

I also expected opting into specifically recognized types (pod and maybe container), rather than negative matching (this logic would include PVC limit ranges, for example)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also expected opting into specifically recognized types (pod and maybe container), rather than negative matching (this logic would include PVC limit ranges, for example)

Doesn't this only consider container types?

There's an explicit test case for checking against non-container types:
e76cb4d#diff-62f7f5e71423c1d5dfa6407e3984804bR306

Happy to drop the continue and do the appends iff it's a container type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me. You are only modifying pod objects and not pvcs in this scenario. Checking for limit type of Container only is fine.

Copy link
Member

@derekwaynecarr derekwaynecarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One structural nit so it’s clearer. Also not clear why we have the integration changes in this pr but no big deal there. Update if you agree then lgtm

// minResourceLimits finds the minimum CPU and minimum Memory resource
// values across all the limits in limitRanges. Nil is returned if
// there is CPU or Memory resource respectively.
func minResourceLimits(limitRanges []*kapi.LimitRange) (*resource.Quantity, *resource.Quantity) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this is clearer if minResourceLimits(limits, resourceName) quantity

Then calling side you can just call twice with cpu and memory


for _, limitRange := range limitRanges {
for _, limit := range limitRange.Spec.Limits {
if limit.Type != kapi.LimitTypeContainer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me. You are only modifying pod objects and not pvcs in this scenario. Checking for limit type of Container only is fine.

@@ -123,7 +123,7 @@ func TestRegistryClientConnectPulpRegistry(t *testing.T) {
}, imageNotFoundErrorPatterns...)
if err != nil {
if strings.Contains(err.Error(), "x509: certificate has expired or is not yet valid") {
t.Skip("SKIPPING: due to expired certificate of %s: %v", pulpRegistryName, err)
t.Skipf("SKIPPING: due to expired certificate of %s: %v", pulpRegistryName, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this in this pr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a drive-by fix; noticed this when building and running the unit tests for this change. Can drop it and follow up with discrete change.

@frobware
Copy link
Contributor Author

Also not clear why we have the integration changes in this pr

  1. The commit fixing calls to Skip I've just dropped.
  2. I needed to change the CRO integration test because we now floor to the minimums and the existing integration test was expecting the defaults.

Commit b886341 is my change to accommodate this new behaviour. I think its worthwhile taking a look that to see if a) the change makes sense and b) that the test overall makes sense now that we clamp to the floor of the minimums.

@derekwaynecarr derekwaynecarr added this to the 3.9.0 milestone Feb 13, 2018
@derekwaynecarr
Copy link
Member

/approve
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 13, 2018
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, frobware

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 13, 2018
@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue (batch tested with PRs 18576, 18553).

@openshift-merge-robot openshift-merge-robot merged commit 060b9d6 into openshift:master Feb 13, 2018
@frobware frobware deleted the cro-admission-fix branch February 13, 2018 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants