design: reduce scope of node on node object #911
Conversation
> [initializer](admission_control_extension.md) mechanism, a centralized controller can register an initializer for the node object and build the sensitive fields by consulting the machine database. The `cloud-controller-manager` is an obvious candidate to house such a controller.
Since those fields change over time I'm not even sure that initializers are required for anything except strong exclusion rules.
I think you are right. We do want strong exclusion of sensitive labels, taints, and addresses, though. I will keep the initializer and add a stanza that allows the central controller to reconcile objects as well.
"sensitive labels" is going to be really tough to pin down. Examples of node self-labeling I've seen just in the past two days are CPU policy and kernel version, both of which could be used for capability steering (a workload requiring a specific CPU policy or kernel version to run) or for security purposes ("find nodes running a kernel with a known vulnerability", "schedule pods to a known-good kernel").

This has come up a lot in security auditing.
> dedicated nodes in the workload controller running `customer-info-app`.
>
> Since the nodes self-report labels upon registration, an intruder can easily register a compromised node with label `foo/dedicated=customer-info-app`.
The label part seems minor in comparison to the problem of compromised nodes registering themselves at all.
Which is another way to say: if we can't trust a node to set appropriate labels, then why are we trusting it at all?
Labels allow steering workloads. We need to be able to set up labeled topologies of nodes to keep classes of workloads separate, and to be able to isolate a compromised node by tainting it and unlabeling it, knowing that it didn't steer workloads to itself by adding to its own labels.
> Since the nodes self-report labels upon registration, an intruder can easily register a compromised node with label `foo/dedicated=customer-info-app`. The scheduler will then bind `customer-info-app` to the compromised node, potentially giving the intruder easy access to the PII.
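To make the steering concrete: nodeSelector matching is plain label equality, so a self-reported label is all it takes to attract the workload. A minimal Go sketch of that matching step (illustrative only, not the actual scheduler code; the maps stand in for the real Node and PodSpec types):

```go
package main

import "fmt"

// matches reports whether a node's labels satisfy a pod's nodeSelector.
// Every selector key/value pair must be present on the node; nothing
// distinguishes a label set by an admin from one the node set on itself.
func matches(nodeLabels, nodeSelector map[string]string) bool {
	for k, v := range nodeSelector {
		if nodeLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	// A compromised node that self-reported the dedicated label at registration.
	compromised := map[string]string{"foo/dedicated": "customer-info-app"}
	selector := map[string]string{"foo/dedicated": "customer-info-app"}
	fmt.Println(matches(compromised, selector)) // true: the workload is steered here
}
```

Nothing in the match distinguishes who set the label, which is exactly why the proposal moves label authority off the node.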
The same applies to allowing a node to remove taints (or delete its own Node API object while tainted).

cc @kubernetes/sig-node-proposals
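An admission-time check for taint self-removal could diff the old and updated Node objects and reject the update when the request comes from the node itself. A simplified Go sketch, assuming a stripped-down `Taint` type rather than the real `core/v1` API:

```go
package main

import "fmt"

// Taint is a simplified stand-in for the Kubernetes core/v1 Taint type.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

// removedTaints returns taints present on the old Node object but missing
// from the update — the case an admission check would reject when the
// updater is the kubelet on that node.
func removedTaints(old, updated []Taint) []Taint {
	present := map[Taint]bool{}
	for _, t := range updated {
		present[t] = true
	}
	var removed []Taint
	for _, t := range old {
		if !present[t] {
			removed = append(removed, t)
		}
	}
	return removed
}

func main() {
	old := []Taint{{Key: "quarantine", Value: "true", Effect: "NoSchedule"}}
	updated := []Taint{} // the node tries to shed its quarantine taint
	fmt.Println(len(removedTaints(old, updated))) // 1
}
```

This is only a sketch of the diffing idea; the real check would also need to cover deletion of the whole Node object, as noted above.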
This was discussed in sig-auth and the proposal needs some further consideration.

This also came up in the GCE cloud provider trying to determine what zones exist by looking at the node API objects and trusting whatever zone they reported they were in (kubernetes/kubernetes#52322 (comment)).
@mikedanese - are you aiming to get this merged and something implemented for 1.9? Or is this a longer-term proposal?
I'd like to see the labeling/tainting approach agreed on and implemented for 1.9. The node addresses are more involved and have more ties to cloud providers and variance between cloud/non-cloud environments.
> (e.g. `foo/dedicated=customer-info-app`) on the node and to select these dedicated nodes in the workload controller running `customer-info-app`.
>
> Since the nodes self-report labels upon registration, an intruder can easily
I'm all for belts and suspenders, but is "an intruder registers a node into our cluster" a high-prio attack vector?
It turns "can launch a VM inside a given infrastructure account" into "can root the entire infrastructure account" when you take into account that the masters need certain privileges on the infrastructure account. So I would say yes.
> ```
> kubernetes.io/hostname
> failure-domain.[beta.]kubernetes.io/zone
> ```
Maybe simpler to say:

```
kubernetes.io/hostname
kubernetes.io/os
kubernetes.io/arch
kubernetes.io/instance-type
[*.]beta.kubernetes.io/* (deprecated)
failure-domain.kubernetes.io/*
[*.]kubelet.kubernetes.io/*
[*.]node.kubernetes.io/*
```

Could we maybe argue to allow all of naked `kubernetes.io`?

Or at least make it clear that this list may change in the future. Concretely, we might add more top-level things, we might enable new prefixes, and we might even provide policy rules to users.
> Maybe simpler to say
> ...

Updated to just enumerate allowed labels.

> Could we maybe argue to allow all of naked `kubernetes.io`?

I'd prefer to start with this specific set, and document that the allowed set may grow or shrink in the future.

> Or at least make it clear that this list may change in the future. Concretely, we might add more top-level things, we might enable new prefixes, and we might even provide policy rules to users.

+1
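The allow-list approach discussed above boils down to an exact-match set plus a few `[*.]prefix/*` rules. A hedged Go sketch of such a check — illustrative only, not the NodeRestriction plugin's actual implementation, and using only a subset of the prefixes listed above:

```go
package main

import (
	"fmt"
	"strings"
)

// allowedSelfLabel reports whether a kubelet may set the given label key on
// its own Node object, in the allow-list style discussed above. The exact
// set is illustrative; the proposal notes it may grow or shrink over time.
func allowedSelfLabel(key string) bool {
	exact := map[string]bool{
		"kubernetes.io/hostname":      true,
		"kubernetes.io/os":            true,
		"kubernetes.io/arch":          true,
		"kubernetes.io/instance-type": true,
	}
	if exact[key] {
		return true
	}
	// [*.]prefix/* rules: the prefix itself, or any subdomain of it.
	for _, p := range []string{
		"failure-domain.kubernetes.io/",
		"kubelet.kubernetes.io/",
		"node.kubernetes.io/",
	} {
		if strings.HasPrefix(key, p) || strings.Contains(key, "."+p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(allowedSelfLabel("kubernetes.io/hostname"))                 // true
	fmt.Println(allowedSelfLabel("foo/dedicated"))                          // false
	fmt.Println(allowedSelfLabel("node-restriction.kubernetes.io/pci-dss")) // false
}
```

Note that the `node-restriction.kubernetes.io/*` prefix proposed below deliberately falls outside this allow list, so a node can never set it on itself.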
> ```
> [*.]node.kubernetes.io/*
> ```
>
> 2. Reserve/recommend the `node-restriction.kubernetes.io/*` label prefix for users
Is this name already codified somewhere? I don't like the length or specificity of it.

Spitballing:

```
protected.kubernetes.io
user.kubernetes.io
admin.kubernetes.io
local.kubernetes.io
site.kubernetes.io
my.kubernetes.io
x.kubernetes.io
```

Or maybe a distinct TLD?

```
[.]local/
[.].k8s/
```

Think through whether this prefix will be applicable in any other context (one advantage of node-restriction is that it is pretty clearly node-related).

Naming is hard, but I am OK with the rest of this proposal.
> Is this name already codified somewhere? I don't like the length or specificity of it.

No, it's new to this proposal. I agree on the length, but I like the specificity.

> Think through whether this prefix will be applicable in any other context (one advantage of node-restriction is that it is pretty clearly node-related).

Having stewed on it for a few days, I actually like this prefix for this use:
- it is clearly node-related
- it only reserves a single specific label prefix
- it connects the label to the admission plugin that enforces it
- it connotes both a restriction on the node itself, and reads OK as a node selector (`node-restriction.kubernetes.io/fips=true`, `node-restriction.kubernetes.io/pci-dss=true`, etc.)
I'll LGTM for merge now, but would appreciate just a BIT more thought on naming and scope.

/lgtm
Who is working on the implementation?
I am, will update the PR today after thinking through Tim's last comments |
See https://github.com/kubernetes/kubernetes/pull/70555/files for another proposed use of a label to control daemonset addons.
…within the kubernetes.io/k8s.io label namespace

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull request has been approved by: smarterclayton, thockin
/hold cancel
/lgtm
design: reduce scope of node on node object
Super mini design doc about centralizing reporting of some sensitive kubelet attributes.
@kubernetes/sig-auth-misc
@roberthbailey
@kubernetes/sig-cluster-lifecycle-misc as it relates to registration