Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP 2200: Block service ExternalIPs via admission #2176

Merged
merged 1 commit into from
Jan 27, 2021

Conversation

thockin
Copy link
Member

@thockin thockin commented Dec 8, 2020

This KEP proposes to add a surgical admission controller (optional) to
block use of the Service.spec.externalIPs misfeature.

See CVE-2020-8554: Man in the middle using LoadBalancer or ExternalIPs (kubernetes/kubernetes#97110)
for details.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Dec 8, 2020
@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 8, 2020
@thockin thockin force-pushed the externalips-admission branch 3 times, most recently from c9eb4d0 to 81b596d Compare December 8, 2020 05:56
@maplain
Copy link

maplain commented Dec 8, 2020

/cc @maplain

@k8s-ci-robot
Copy link
Contributor

@maplain: GitHub didn't allow me to request PR reviews from the following users: maplain.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @maplain

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

## Proposal

This KEP proposes to add a built-in admission controller
"BlockServiceExternalIPs", which rejects any CREATE or UPDATE operation which
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see the plugin created with the ability for a cluster-admin (or someone with sufficient privilege) to set this value. You can see an example of how this has been done by other distros here: https://github.com/openshift/kubernetes/blob/9b48f001e5488bbfb1199fe1401dd70bda8ca5b0/openshift-kube-apiserver/admission/network/externalipranger/externalip_admission.go#L159-L165 .

I also think that ranges allowed by that plugin reflect a balance that we found useful over these years, but are slightly less important.

FWIW, that plugin has been in use since 1.2, if we're looking for an implementation that has walked the usability/security balance for a while.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not clear what you are asking for?

I don't think allowing ranges is that useful at the cluster granularity. Nothing stops 2 users from using the same IPs, even if they are in-range.

What are you actually asking for?

### Risks and Mitigations

Some installations may want to use this feature in a more controlled way. They
can use a custom webhook admission controller or a policy controller to enforce
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat related to this, there were past discussions in SIG Cloud Provider on having the cloud provider module stand up an admission webhook server as part of cloud-controller-manager and cloud providers would just have to implement a ValidateService() and MutateService() interface method. The main use-cases were to validate port protocols (since some don't support UDP or mixed protocols) and defaulting spec.allocateLoadBalancerNodePorts, but this seems like another useful reason to have it.

This would just be one of many custom webhooks for policy control but seems like an easier/faster way to enable this policy control for users since many clusters already deploy cloud-controller-manager (or will in the future). ExternalIP validation on cloud providerse could be more granular than the proposed admission control here. Would that be a valuable framework to have and do you think we should tackle it in parallel to this? cc @cheftako @cici37 @maplain

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this particular knob - externalIPs is really cloud-provider centric. I don't know how I feel about your proposal overall (would have to think it thru) but I think it is orthogonal to this.

In this case I really just want a big hammer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's orthogonal. It wouldn't replace this, just an added alternative to it.

It is a slippery-slope to other ad hoc policies. Counter: this is very
surgical and overwhelmingly not a useful feature.

## Alternatives
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not change validation instead of making the admission plugin?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a breaking change. Admission is opt-in. We could add a totally different flag to control it, but this seems more appropriate as a plugin amongst the other plugins.

@thockin thockin changed the title KEP NNNN: Block service ExternalIPs via admission KEP 2200: Block service ExternalIPs via admission Dec 18, 2020
@thockin
Copy link
Member Author

thockin commented Dec 18, 2020

Updated with feedback so far.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 29, 2020
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 4, 2021
Copy link
Member Author

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some smallish updates - hope to get this approved in the next weeks.

#### Alpha -> GA Graduation

- Tests and code complete
- One cycle in alpha
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lavalamp @johnbelamaric Regarding alpha for a new admission control: I could add a gate that controls this admission control - is that the normal lifecycle? The controller itself is trivial - what is the norm here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems sort of weird, since it's already opt-in and doesn't have any API associated with it. What is the risk you're trying to mitigate with a staged alpha -> beta -> ga rollout?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just trying to not subvert the existing processes. If we're all comfortable with it, I'd happily skip this. Will add a commit to remove this and if we like, we keep.

- sig-auth
- sig-security
- sig-api-machinery
status: implementable
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have taken the liberty of marking it implementable because kubernetes/kubernetes#97395 exists

@tallclair PTAL

Copy link
Member

@tallclair tallclair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No fundamental issues with this, just a bunch of questions.

keps/sig-network/2200-externalips-admission/README.md Outdated Show resolved Hide resolved
## Design Details

One simple admission controller should be enough to disable this misfeature.
Unfortunately it can not be on by default (that would be breaking).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An argument against this model for triggering enablement in the past is that it requires k8s providers to surface the configuration option (or enable the protection) for customers. I don't have a strong stance on this in this case, but it's something to call out / consider.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, with my provider hat on, I am keenly aware of it. In general I would say to avoid this pattern. That said, there are a few factors to consider:

  1. This is a pretty egregious misfeature.
  2. We do not have an API for dynamically enabling admission controllers. If we want that, I'm game to help design it, but that seems like a different can of worms for providers. Perhaps we want a class of control that isn't as broad as "all the built in admission controls" but is more built-in that "run a webhook". The problem with that right now is that we have a single example (that I know of). If we had 3 or 4 we might want to tackle that. I posit that we can tackle that additively.
  3. I'd argue that sooner is better than later. Ideally we;d have had this before we dropped a CVE, but it didn't seem feasible before (not sure why).

I'll add some of this to the text.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are others:

  • PodSecurityPolicy (deprecated, but likely to be replaced with another controller
  • AlwaysPullImages - not sure who is surfacing this today, but there's a lot of requests for it
  • ImagePolicy (depends on additonal config)

I agree with the goal of getting this out there, and the admission controller could be a first step towards removing the feature.

#### Alpha -> GA Graduation

- Tests and code complete
- One cycle in alpha
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems sort of weird, since it's already opt-in and doesn't have any API associated with it. What is the risk you're trying to mitigate with a staged alpha -> beta -> ga rollout?

* Force users to use policy controllers as webhooks. Forever.
* Make a breaking API change and disable or rip-out the feature.
* Add a new flag telling validation logic to dissallow this field.
* Make a more complex API to define which namespaces can use this feature
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a subset of this bullet, but what about requiring a status update, similar to how load balancer services work? In other words, the externalIP field would be treated as a request, but only the status field would actually be respected by kube-proxy. Then, either a privileged user or controller would need to "approve" the request by setting the status IP?

Probably not worth the trouble for a feature that should just be derpecated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You expressed my main point - not worth the effort to try to solve as built-in. At least not without real use-cases to justify it. It would still be a breaking change and much more complicated. Very low RoI.

@thockin
Copy link
Member Author

thockin commented Jan 21, 2021

ping to @tallclair

@tallclair
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 26, 2021
@annajung
Copy link
Contributor

annajung commented Jan 27, 2021

Hi, forgot to mention, all KEP now requires a production readiness review to be included.

Please don't forget to follow the instruction to include PRR correctly. Specifically, you are required to create a prod-readiness/<sig>/<KEP number>.yaml file, with the PRR approver's GitHub handle for the specific stage.

It doesn't have to be part of this PR if you're planning on submitting another one. thank you!

Copy link
Member

@johnbelamaric johnbelamaric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRR looks mostly good, some minor points. Also will need to create a PRR approval request yaml as described in here: https://github.com/kubernetes/community/blob/master/sig-architecture/production-readiness.md#submitting-a-kep-for-production-readiness-approval

Example PR: https://github.com/kubernetes/enhancements/pull/2179/files

That's how we handle PRR approval now so that we can leverage the OWNERS files in the approval process.

@thockin
Copy link
Member Author

thockin commented Jan 27, 2021

I think all fixed.

@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 27, 2021
@johnbelamaric
Copy link
Member

/approve for prod readiness

there are a couple tweaks needed in kep.yaml since it's going straight to stable now.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johnbelamaric, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 27, 2021
This KEP proposes to add a surgical admission controller (optional) to
block use of the `Service.spec.externalIPs` misfeature.

See CVE-2020-8554: Man in the middle using LoadBalancer or ExternalIPs
(https://www.first.org/cvss/calculator/3.0#CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:L)
for details.
@johnbelamaric
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 27, 2021
@k8s-ci-robot k8s-ci-robot merged commit 0f59248 into kubernetes:master Jan 27, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Jan 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.