-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP 2200: Block service ExternalIPs via admission #2176
Conversation
c9eb4d0
to
81b596d
Compare
/cc @maplain |
@maplain: GitHub didn't allow me to request PR reviews from the following users: maplain. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
81b596d
to
4348dbf
Compare
## Proposal | ||
|
||
This KEP proposes to add a built-in admission controller | ||
"BlockServiceExternalIPs", which rejects any CREATE or UPDATE operation which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see the plugin created with the ability for a cluster-admin (or someone with sufficient privilege) to set this value. You can see an example of how this has been done by other distros here: https://github.com/openshift/kubernetes/blob/9b48f001e5488bbfb1199fe1401dd70bda8ca5b0/openshift-kube-apiserver/admission/network/externalipranger/externalip_admission.go#L159-L165 .
I also think that ranges allowed by that plugin reflect a balance that we found useful over these years, but are slightly less important.
FWIW, that plugin has been in use since 1.2, if we're looking for an implementation that has walked the usability/security balance for a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not clear what you are asking for?
I don't think allowing ranges is that useful at the cluster granularity. Nothing stops 2 users from using the same IPs, even if they are in-range.
What are you actually asking for?
### Risks and Mitigations | ||
|
||
Some installations may want to use this feature in a more controlled way. They | ||
can use a custom webhook admission controller or a policy controller to enforce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat related to this, there were past discussions in SIG Cloud Provider on having the cloud provider module stand up an admission webhook server as part of cloud-controller-manager
and cloud providers would just have to implement a ValidateService()
and MutateService()
interface method. The main use-cases were to validate port protocols (since some don't support UDP or mixed protocols) and defaulting spec.allocateLoadBalancerNodePorts
, but this seems like another useful reason to have it.
This would just be one of many custom webhooks for policy control but seems like an easier/faster way to enable this policy control for users since many clusters already deploy cloud-controller-manager (or will in the future). ExternalIP validation on cloud providerse could be more granular than the proposed admission control here. Would that be a valuable framework to have and do you think we should tackle it in parallel to this? cc @cheftako @cici37 @maplain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this particular knob - externalIPs is really cloud-provider centric. I don't know how I feel about your proposal overall (would have to think it thru) but I think it is orthogonal to this.
In this case I really just want a big hammer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it's orthogonal. It wouldn't replace this, just an added alternative to it.
It is a slippery-slope to other ad hoc policies. Counter: this is very | ||
surgical and overwhelmingly not a useful feature. | ||
|
||
## Alternatives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not change validation instead of making the admission plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be a breaking change. Admission is opt-in. We could add a totally different flag to control it, but this seems more appropriate as a plugin amongst the other plugins.
4348dbf
to
9a08d89
Compare
Updated with feedback so far. |
9a08d89
to
06c003a
Compare
5d1917e
to
924fea4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some smallish updates - hope to get this approved in the next weeks.
#### Alpha -> GA Graduation | ||
|
||
- Tests and code complete | ||
- One cycle in alpha |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lavalamp @johnbelamaric Regarding alpha for a new admission control: I could add a gate that controls this admission control - is that the normal lifecycle? The controller itself is trivial - what is the norm here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems sort of weird, since it's already opt-in and doesn't have any API associated with it. What is the risk you're trying to mitigate with a staged alpha -> beta -> ga rollout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just trying to not subvert the existing processes. If we're all comfortable with it, I'd happily skip this. Will add a commit to remove this and if we like, we keep.
- sig-auth | ||
- sig-security | ||
- sig-api-machinery | ||
status: implementable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have taken the liberty of marking it implementable because kubernetes/kubernetes#97395 exists
@tallclair PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No fundamental issues with this, just a bunch of questions.
## Design Details | ||
|
||
One simple admission controller should be enough to disable this misfeature. | ||
Unfortunately it can not be on by default (that would be breaking). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An argument against this model for triggering enablement in the past is that it requires k8s providers to surface the configuration option (or enable the protection) for customers. I don't have a strong stance on this in this case, but it's something to call out / consider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, with my provider hat on, I am keenly aware of it. In general I would say to avoid this pattern. That said, there are a few factors to consider:
- This is a pretty egregious misfeature.
- We do not have an API for dynamically enabling admission controllers. If we want that, I'm game to help design it, but that seems like a different can of worms for providers. Perhaps we want a class of control that isn't as broad as "all the built in admission controls" but is more built-in that "run a webhook". The problem with that right now is that we have a single example (that I know of). If we had 3 or 4 we might want to tackle that. I posit that we can tackle that additively.
- I'd argue that sooner is better than later. Ideally we;d have had this before we dropped a CVE, but it didn't seem feasible before (not sure why).
I'll add some of this to the text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are others:
- PodSecurityPolicy (deprecated, but likely to be replaced with another controller
- AlwaysPullImages - not sure who is surfacing this today, but there's a lot of requests for it
- ImagePolicy (depends on additonal config)
I agree with the goal of getting this out there, and the admission controller could be a first step towards removing the feature.
#### Alpha -> GA Graduation | ||
|
||
- Tests and code complete | ||
- One cycle in alpha |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems sort of weird, since it's already opt-in and doesn't have any API associated with it. What is the risk you're trying to mitigate with a staged alpha -> beta -> ga rollout?
* Force users to use policy controllers as webhooks. Forever. | ||
* Make a breaking API change and disable or rip-out the feature. | ||
* Add a new flag telling validation logic to dissallow this field. | ||
* Make a more complex API to define which namespaces can use this feature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a subset of this bullet, but what about requiring a status update, similar to how load balancer services work? In other words, the externalIP field would be treated as a request, but only the status field would actually be respected by kube-proxy. Then, either a privileged user or controller would need to "approve" the request by setting the status IP?
Probably not worth the trouble for a feature that should just be derpecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You expressed my main point - not worth the effort to try to solve as built-in. At least not without real use-cases to justify it. It would still be a breaking change and much more complicated. Very low RoI.
924fea4
to
3c1b851
Compare
ping to @tallclair |
/lgtm |
Hi, forgot to mention, all KEP now requires a production readiness review to be included. Please don't forget to follow the instruction to include PRR correctly. Specifically, you are required to create a
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PRR looks mostly good, some minor points. Also will need to create a PRR approval request yaml as described in here: https://github.com/kubernetes/community/blob/master/sig-architecture/production-readiness.md#submitting-a-kep-for-production-readiness-approval
Example PR: https://github.com/kubernetes/enhancements/pull/2179/files
That's how we handle PRR approval now so that we can leverage the OWNERS files in the approval process.
3c1b851
to
4ac83d7
Compare
I think all fixed. |
/approve for prod readiness there are a couple tweaks needed in |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnbelamaric, thockin The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
4ac83d7
to
96adcff
Compare
This KEP proposes to add a surgical admission controller (optional) to block use of the `Service.spec.externalIPs` misfeature. See CVE-2020-8554: Man in the middle using LoadBalancer or ExternalIPs (https://www.first.org/cvss/calculator/3.0#CVSS:3.0/AV:N/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:L) for details.
96adcff
to
3d565d0
Compare
/lgtm |
This KEP proposes to add a surgical admission controller (optional) to
block use of the
Service.spec.externalIPs
misfeature.See CVE-2020-8554: Man in the middle using LoadBalancer or ExternalIPs (kubernetes/kubernetes#97110)
for details.