
design: reduce scope of node on node object #911

Merged
merged 2 commits into kubernetes:master Nov 12, 2018

Conversation

mikedanese (Member):

Super mini design doc about centralizing reporting of some sensitive kubelet attributes.

@kubernetes/sig-auth-misc
@roberthbailey
@kubernetes/sig-cluster-lifecycle-misc as it relates to registration

@k8s-ci-robot k8s-ci-robot added sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 14, 2017
[initializer](admission_control_extension.md) mechanism, a centralized
controller can register an initializer for the node object and build the
sensitive fields by consulting the machine database. The
`cloud-controller-manager` is an obvious candidate to house such a controller.
Contributor:

Since those fields change over time, I'm not even sure that initializers are required for anything except strong exclusion rules.

Member Author (mikedanese):

I think you are right. I think we want strong exclusion of sensitive labels, taints, and addresses though. I will keep the initializer and add a stanza that allows the central controller to reconcile objects as well.
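For illustration, a minimal sketch of such a reconcile stanza, assuming a hypothetical machineDB interface as the authoritative source; the package, interface, and function names are invented here, not taken from the proposal.

```go
package nodeattrs

import (
	"context"

	corev1 "k8s.io/api/core/v1"
)

// machineDB is a hypothetical stand-in for the cloud provider's machine database.
type machineDB interface {
	AuthoritativeLabels(ctx context.Context, nodeName string) (map[string]string, error)
}

// reconcileNode overwrites sensitive labels with the machine database's view,
// so a kubelet-set value never survives a reconcile pass. It reports whether
// the node was mutated and should be written back to the API server.
func reconcileNode(ctx context.Context, db machineDB, node *corev1.Node) (changed bool, err error) {
	want, err := db.AuthoritativeLabels(ctx, node.Name)
	if err != nil {
		return false, err
	}
	if node.Labels == nil {
		node.Labels = map[string]string{}
	}
	for key, val := range want {
		if node.Labels[key] != val {
			node.Labels[key] = val // the central controller wins over the kubelet
			changed = true
		}
	}
	return changed, nil
}
```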

Member:

"sensitive labels" is going to be really tough to pin down. Examples of node self-labeling I've seen just in the past two days are cpu policy and kernel version, both of which could be used for capability steering (a workload requiring a specific CPU policy or kernel version to run) or for security purposes ("find nodes running a kernel with a known vulnerability", "schedule pods to a known good kernel")

@smarterclayton (Contributor):

Has come up a lot in security auditing.

@k8s-github-robot k8s-github-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Aug 15, 2017
dedicated nodes in the workload controller running `customer-info-app`.

Since nodes self-report labels upon registration, an intruder can easily
register a compromised node with label `foo/dedicated=customer-info-app`. The
Contributor:

The label part seems minor in comparison to the problem of compromised nodes registering themselves at all.

Contributor:

Which is another way to say: if we can't trust a node to set appropriate labels, then why are we trusting it at all?

Member:

Labels allow steering workloads. We need to be able to set up labeled topologies of nodes to keep classes of workloads separate, and to be able to isolate a compromised node by tainting it and unlabeling it, knowing that it didn't steer workloads to itself by adding to its own labels (see the sketch below).
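A minimal sketch of that isolation step, assuming a hypothetical quarantine taint key; nothing here is prescribed by the proposal.

```go
package isolation

import corev1 "k8s.io/api/core/v1"

// isolate taints a compromised node so nothing new is scheduled to it
// (NoExecute also evicts pods already running there), and strips the label
// that was steering the dedicated workload to it.
func isolate(node *corev1.Node) {
	node.Spec.Taints = append(node.Spec.Taints, corev1.Taint{
		Key:    "quarantine", // hypothetical taint key
		Value:  "compromised",
		Effect: corev1.TaintEffectNoExecute,
	})
	delete(node.Labels, "foo/dedicated") // the steering label from the example above
}
```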

Since nodes self-report labels upon registration, an intruder can easily
register a compromised node with label `foo/dedicated=customer-info-app`. The
scheduler will then bind `customer-info-app` to the compromised node potentially
giving the intruder easy access to the PII.
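To make the steering mechanism concrete, a sketch of the workload side using the k8s.io/api types (the image name is a placeholder): the pod template's nodeSelector is exactly what lets a self-labeled node attract these pods.

```go
package workload

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// dedicatedPodTemplate is the pod template a workload controller would use to
// pin customer-info-app to the dedicated nodes.
func dedicatedPodTemplate() corev1.PodTemplateSpec {
	return corev1.PodTemplateSpec{
		ObjectMeta: metav1.ObjectMeta{
			Labels: map[string]string{"app": "customer-info-app"},
		},
		Spec: corev1.PodSpec{
			// The scheduler binds these pods only to nodes carrying this label;
			// any node that can set the label on itself can attract the pods.
			NodeSelector: map[string]string{"foo/dedicated": "customer-info-app"},
			Containers: []corev1.Container{{
				Name:  "customer-info-app",
				Image: "registry.example.com/customer-info-app:v1", // placeholder image
			}},
		},
	}
}
```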
Member:

The same applies to allowing a node to remove taints (or delete its own Node API object while tainted).

@liggitt (Member) commented Aug 24, 2017:

cc @kubernetes/sig-node-proposals

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/design Categorizes issue or PR as related to design. labels Aug 24, 2017
@mikedanese (Member, Author):

This was discussed in sig-auth and the proposal needs some further consideration:

  • We need to dig into the label/taint whitelisting mechanism. We need to make it easy for kubelet to set some of these on itself. We need to make it clear that kubelet self-set labels cannot be used to implement strong exclusion.
  • We need to clarify what the expectations are for a deployment running without a cloud provider.
  • We need to decide whether a node should be able to delete itself, thereby potentially removing taints from itself.
  • We may need to consider adjusting how the kubelet starts up.

@liggitt (Member) commented Sep 22, 2017:

this also came up in the GCE cloud provider trying to determine what zones exist by looking at the node API objects and trusting whatever zone they reported they were in (kubernetes/kubernetes#52322 (comment))

@tallclair (Member):

/cc @davidopp @mtaufen

@roberthbailey (Contributor):

@mikedanese - are you aiming to get this merged and something implemented for 1.9? Or is this a longer term proposal?

@liggitt (Member) commented Oct 9, 2017:

> are you aiming to get this merged and something implemented for 1.9? Or is this a longer term proposal?

I'd like to see the labeling/tainting approach agreed on and implemented for 1.9

The node addresses are more involved and have more ties to cloud providers and variance between cloud/non-cloud environments.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2018
(e.g. `foo/dedicated=customer-info-app`) on the node and to select these
dedicated nodes in the workload controller running `customer-info-app`.

Since nodes self-report labels upon registration, an intruder can easily
Member:

I'm all for belts and suspenders, but is "an intruder registers a node into our cluster" a high-prio attack vector?

Contributor:

It turns "can launch a VM inside a given infrastructure account" into "can root the entire infrastructure account" when you take into account that the masters need certain privileges on the infrastructure account. So I would say yes.


```
kubernetes.io/hostname
failure-domain.[beta.]kubernetes.io/zone
```
Member:

Maybe simpler to say:

```
kubernetes.io/hostname
kubernetes.io/os
kubernetes.io/arch
kubernetes.io/instance-type
[*.]beta.kubernetes.io/* (deprecated)
failure-domain.kubernetes.io/*
[*.]kubelet.kubernetes.io/*
[*.]node.kubernetes.io/*
```

Could we maybe argue to allow all of naked kubernetes.io?

Or at least make it clear that this list may change in the future. Concretely, we might add more top-level things, we might enable new prefixes, and we might even provide policy rules to users.

@liggitt (Member) commented Nov 12, 2018:

> Maybe simpler to say ...

Updated to just enumerate allowed labels

> Could we maybe argue to allow all of naked kubernetes.io?

I'd prefer to start with this specific set, and document the allowed set may grow or shrink in the future.

> Or at least make it clear that this list may change in the future. Concretely, we might add more top-level things, we might enable new prefixes, and we might even provide policy rules to users.

+1

```
[*.]node.kubernetes.io/*
```
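For illustration, a sketch of how such an allowlist check could be expressed. This is not the actual NodeRestriction or kubelet code, and the treatment of the `[*.]` patterns is one reading of the list above.

```go
package kubeletlabels

import "strings"

// Exact keys the kubelet may set on its own Node object (from the list above).
var allowedKeys = map[string]bool{
	"kubernetes.io/hostname":      true,
	"kubernetes.io/os":            true,
	"kubernetes.io/arch":          true,
	"kubernetes.io/instance-type": true,
}

// Domains whose keys ("domain/name") are allowed; wildcard entries also allow
// any subdomain, mirroring the "[*.]" patterns above.
var allowedDomains = []struct {
	domain   string
	wildcard bool
}{
	{"beta.kubernetes.io", true},
	{"failure-domain.kubernetes.io", false},
	{"kubelet.kubernetes.io", true},
	{"node.kubernetes.io", true},
}

// KubeletMaySetLabel reports whether a label key falls in the allowlist.
func KubeletMaySetLabel(key string) bool {
	if allowedKeys[key] {
		return true
	}
	slash := strings.Index(key, "/")
	if slash < 0 {
		return false // unprefixed keys other than the exact set are denied
	}
	domain := key[:slash]
	for _, d := range allowedDomains {
		if domain == d.domain || (d.wildcard && strings.HasSuffix(domain, "."+d.domain)) {
			return true
		}
	}
	return false
}
```

Under this reading, `gpu.node.kubernetes.io/model` would be allowed while `foo/dedicated` would be rejected.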

2. Reserve/recommend the `node-restriction.kubernetes.io/*` label prefix for users
Member:

Is this name already codified somewhere? I don't like the length or specificity of it.

spitballing:

```
protected.kubernetes.io
user.kubernetes.io
admin.kubernetes.io
local.kubernetes.io
site.kubernetes.io
my.kubernetes.io
x.kubernetes.io
```

Or maybe a distinct TLD?

```
[.]local/
[.].k8s/
```

Think through whether this prefix will be applicable in any other context (one advantage of node-restriction is that it is pretty clearly node-related).

Naming is hard, but I am OK with the rest of this proposal

Member:

> Is this name already codified somewhere? I don't like the length or specificity of it.

No, it's new to this proposal. I agree on the length, but I like the specificity.

> Think through whether this prefix will be applicable in any other context (one advantage of node-restriction is that it is pretty clearly node-related).

Having stewed on it for a few days, I actually like this prefix for this use:

  • it is clearly node-related
  • it only reserves a single specific label prefix
  • it connects the label to the admission plugin that enforces it
  • it connotes both a restriction on the node itself, and reads ok as a node selector (node-restriction.kubernetes.io/fips=true, node-restriction.kubernetes.io/pci-dss=true, etc); see the selector sketch below
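As a sketch of that last point, the prefix in selector position using the k8s.io/api types; the function name and label values follow the examples in the comment above.

```go
package scheduling

import corev1 "k8s.io/api/core/v1"

// fipsOnlyAffinity restricts scheduling to nodes that an administrator, not
// the kubelet, labeled node-restriction.kubernetes.io/fips=true.
func fipsOnlyAffinity() *corev1.Affinity {
	return &corev1.Affinity{
		NodeAffinity: &corev1.NodeAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
				NodeSelectorTerms: []corev1.NodeSelectorTerm{{
					MatchExpressions: []corev1.NodeSelectorRequirement{{
						Key:      "node-restriction.kubernetes.io/fips",
						Operator: corev1.NodeSelectorOpIn,
						Values:   []string{"true"},
					}},
				}},
			},
		},
	}
}
```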

@thockin (Member) commented Nov 8, 2018:

I'll LGTM for merge now, but would appreciate just a BIT more thought on naming and scope.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2018
@krmayankk:

Who is working on the implementation?

@liggitt (Member) commented Nov 9, 2018:

I am; I will update the PR today after thinking through Tim's last comments.

@thockin (Member) commented Nov 9, 2018 via email

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2018
…within the kubernetes.io/k8s.io label namespace
@smarterclayton (Contributor):

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2018
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: smarterclayton, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2018
@liggitt (Member) commented Nov 12, 2018:

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 12, 2018
@k8s-ci-robot k8s-ci-robot merged commit 732907c into kubernetes:master Nov 12, 2018
@thockin (Member) commented Nov 19, 2018 via email

@mikedanese mikedanese deleted the limit-node branch November 19, 2018 20:14
justaugustus pushed a commit to justaugustus/community that referenced this pull request Dec 1, 2018
design: reduce scope of node on node object
MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this pull request Nov 30, 2021
design: reduce scope of node on node object
danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.