
server: Support denying serving Ignition to active nodes and pods #784

Closed

Conversation

@cgwalters (Member) commented May 21, 2019

Ignition may contain secret data; pods running on the cluster
shouldn't have access.

This PR closes off access to any IP that responds on port 22, since that is a port that is:

  • Known to be active by default
  • Not firewalled

A previous attempt at this used an auth token, but this approach doesn't require changing the installer or people's PXE setups.

In the future we may reserve a port in the 9xxx range and have the MCD respond on it, so that admins who disable or firewall SSH don't end up with indirectly reduced security.
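
For illustration, a minimal Go sketch of the kind of probe described above; the function name, timeout, and wiring are assumptions, not this PR's actual code:

package main

import (
	"net"
	"time"
)

// isLikelyProvisionedHost is a hypothetical helper: if the requesting IP
// answers on TCP port 22 (SSH), treat it as an already-active host rather
// than a machine that legitimately needs its Ignition config served.
func isLikelyProvisionedHost(remoteIP string) bool {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(remoteIP, "22"), 2*time.Second)
	if err != nil {
		// No answer on port 22: could be a genuinely new machine, or a firewalled one.
		return false
	}
	conn.Close()
	return true
}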

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 21, 2019
@openshift-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 21, 2019
@cgwalters (Member, Author) commented May 21, 2019

To test this, I did:
oc debug node/<worker>, then chroot /host iptables -I OPENSHIFT-BLOCK-OUTPUT -p tcp --dport 22623 -j ACCEPT to undo the SDN filtering. Then, from that same pod, I see:

# curl -k --head https://10.0.131.197:22623/config/master
HTTP/2 403 
content-length: 0
date: Tue, 21 May 2019 15:34:35 GMT

And in the MCS logs:

oc logs pods/machine-config-server-6snf2
I0521 15:27:25.572401       1 start.go:37] Version: 4.0.0-alpha.0-437-g307e6bea-dirty (307e6bea9ed1025202d431be21184fe9ea4f6066)
I0521 15:27:25.574374       1 api.go:52] launching server
I0521 15:27:25.574434       1 api.go:52] launching server
I0521 15:27:42.202818       1 api.go:106] Denying unauthorized request: Node ip-10-0-134-102.us-east-2.compute.internal with address 10.0.134.102 is already provisioned

@ashcrow (Member) commented May 21, 2019

 func manifestsMachineconfigserverClusterroleYamlBytes() ([]byte, error) {
make: *** [verify] Error 1
hack/../pkg is out of date. Please run make update

@cgwalters (Member, Author)

From a disaster recovery perspective, I think if you have e.g. a master node with a static IP that you want to reprovision, then in order to allow access you'd need to edit the node object to drop its status/addresses data.

Or we could add a config flag to allow this.

In practice I think most people are going to start by trying to recover a master in-place rather than completely reset it, and that path won't be affected by this.

@cgwalters (Member, Author)

OK, passing the main tests now; the upgrade job looks stuck, but I doubt it's related.

I also verified that scaling up the worker machineset still works, i.e. it doesn't deny the request.

@cgwalters (Member, Author)

Maybe a better architecture would be for the MCS to make an HTTP request to the MCC asking if a given IP is OK? That would increase the availability requirement for the MCC, though. Eh, this is probably fine: if there's a transient apiserver issue, a node's Ignition request will fail, but Ignition will keep retrying.

@cgwalters (Member, Author)

/retest

@abhinavdahiya (Contributor)

In practice I think most people are going to start by trying to recover a master in-place rather than completely reset it, and that path won't be affected by this.

Do we have concrete data from the DR team on this? It seems like Amazon taking a VM away might also be common...

How do we know that the interface that acts as the source IP is the one the node is reporting as its internal/public IP?

@cgwalters (Member, Author)

How do we know that the interface that acts as the source IP is the one the node is reporting as its internal/public IP?

Bear in mind this PR is not claiming to be a complete solution to the problem. It's adding a layer of defense, much like the other layers we added. For example, it does nothing about external access. The auth key approach would be much stronger.

I'll check with the network team about this, but remember: the primary thing we're trying to prevent with this is in-cluster pods accessing the endpoint. I have some confidence that that access will appear to come from the IP address associated with the node that the kubelet reports. But again, let's see what the SDN team says.

Seems like Amazon taking a VM away might also be common....

Right, instance retirement definitely happens. As I said, in that case, if a newly provisioned master happens to get the same IP, you'd need to explicitly drop out the node object.

Or alternatively, we could tweak this PR to only disallow reachable nodes, which would be pretty easy.

@danwinship (Contributor)

the primary thing we're trying to prevent with this is in-cluster pods accessing the endpoint. I have some confidence that that access will appear to come from the IP address associated with the node that the kubelet reports.

Assuming the pod isn't using an egress IP or egress router, then traffic from a pod on node A addressed to node B's primary node IP will appear to come from node A's primary node IP.

But we didn't change the MCS to not listen on 0.0.0.0, did we? So a pod on node A could connect to node B's tun0 IP instead, and that would appear to come from the pod's IP directly. (So you'd want to filter out all connections with source IPs in the pod network as well. Or, more simply, filter out connections if the destination IP is the tun0 IP.)

Also, in some environments, if nodes have multiple IPs, then a connection from a pod on node A to a non-primary IP on node B might appear to come from node A's non-primary IP rather than its primary IP.

if a newly provisioned master happens to get the same IP, you'd need to explicitly drop out the node object.

VMware at least definitely leaves stale node objects around when dynamically scaling nodes. (I don't know if that would ever affect masters though.)

@cgwalters (Member, Author)

(So you'd want to filter out all connections with source IPs in the pod network as well. Or more simply, filter out connections if the destination IP is the tun0 IP.)

Ah, OK. Hmm, though I'm not sure we can get that from the Go side of the HTTP request... but thinking about this, why don't we just add iptables rules to the masters requiring that the destination IP for the MCS is not tun0?

@danwinship (Contributor)

iptables doesn't see the packet because it's delivered by OVS

@cgwalters (Member, Author)

/hold
Per discussion above, this needs more work to implement the suggestions in #784 (comment)

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 22, 2019
@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 24, 2019
@cgwalters (Member, Author)

/hold cancel

OK, updated 🆕 to also deny requests coming from tun0. I verified both checks work, making a request to the master's main IP directly as well as one targeting its tun0 IP.

oc logs pods/machine-config-server-l7wd2
I0523 22:27:09.952731       1 start.go:37] Version: 4.0.0-alpha.0-444-g9eebd1d4-dirty (9eebd1d4a17eb2d26ae74709252cf6ea77330703)
I0523 22:27:09.955394       1 api.go:52] launching server
I0523 22:27:09.955474       1 api.go:52] launching server
I0524 18:00:24.919941       1 api.go:99] Pool master requested by 10.0.137.206:32870
I0524 18:00:24.936254       1 api.go:106] Denying unauthorized request: Node ip-10-0-137-206.us-east-2.compute.internal with address 10.0.137.206 is already provisioned
I0524 18:06:39.321439       1 api.go:99] Pool master requested by 10.131.0.1:59360
I0524 18:06:39.330104       1 api.go:106] Denying unauthorized request: Requesting host 10.131.0.1 is within pod network CIDR 10.128.0.0/14

We also now only deny requests from nodes that are NodeReady=true.

And I verified with this patch that scaling up a machineset still works.
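
Roughly, the two checks described above amount to something like the following Go sketch (names, types, and wiring are assumptions for illustration, not the actual diff):

package main

import (
	"fmt"
	"net"

	corev1 "k8s.io/api/core/v1"
)

// denyReason is a hypothetical sketch of the two checks above. It returns a
// human-readable reason and true if the request should be denied.
func denyReason(remoteIP string, podCIDR *net.IPNet, nodes []corev1.Node) (string, bool) {
	ip := net.ParseIP(remoteIP)
	if ip == nil {
		return "", false
	}
	// Requests arriving via tun0 show up with a source IP from the pod network.
	if podCIDR.Contains(ip) {
		return fmt.Sprintf("Requesting host %s is within pod network CIDR %s", remoteIP, podCIDR), true
	}
	for _, node := range nodes {
		// Ignore NodeReady=false nodes; this allows reprovisioning them.
		ready := false
		for _, cond := range node.Status.Conditions {
			if cond.Type == corev1.NodeReady && cond.Status == corev1.ConditionTrue {
				ready = true
			}
		}
		if !ready {
			continue
		}
		for _, addr := range node.Status.Addresses {
			if addr.Address == remoteIP {
				return fmt.Sprintf("Node %s with address %s is already provisioned", node.Name, remoteIP), true
			}
		}
	}
	return "", false
}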

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 24, 2019
}

for _, node := range nodes.Items {
// Ignore NodeReady=false nodes; this allows reprovisioning them.
Contributor (review comment):


Node not ready doesn't mean the node is getting reprovisioned. Where did we get that assumption?

@abhinavdahiya (Contributor)

Also, in some environments, if nodes have multiple IPs, then a connection from a pod on node A to a non-primary IP on node B might appear to come from node A's non-primary IP rather than its primary IP.

This is still not solved.

@cgwalters (Member, Author)

[multi-NIC] is still not solved.

Right, but...again, not claiming a comprehensive solution here. The issue is important enough to warrant layers of defense.

That said...one approach we could take (derived from an approach @ericavonb mentioned) is to run a daemonset service on each node (probably as part of the MCD) that listens on a well-known port and provides a static reply. Then we could have the MCS "call back" to the requesting IP - if it gets a reply, it knows to deny the request.

@cgwalters (Member, Author)

Or...maybe way simpler, just try to connect to port 22.

@cgwalters (Member, Author)

OK, this came up in a conversation again; rebased 🆕 - only compile-tested for now, though.

@cgwalters (Member, Author)

Since this is now informational only, I think what we need to do now is export a Prometheus metric for denials and then roll that up into telemetry?
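
A denial counter with the standard Prometheus Go client would only be a few lines; a hedged sketch, with a made-up metric name and label (not from this PR):

package main

import "github.com/prometheus/client_golang/prometheus"

// deniedRequests uses a hypothetical metric name; the real name and labels
// would be decided in the follow-up work discussed above.
var deniedRequests = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Name: "machine_config_server_denied_requests_total",
		Help: "Ignition requests the MCS denied (or would have denied).",
	},
	[]string{"reason"},
)

func init() {
	prometheus.MustRegister(deniedRequests)
}

// At the deny site in the request handler:
//   deniedRequests.WithLabelValues("already_provisioned").Inc()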

@jamescassell

This PR closes off access to any IP that responds on port 22, since that is a port that is:

Known to be active by default
Not firewalled

I think the way this works is broken. All I (as an attacker) need to do to get the Ignition config is to block SSH on my local machine, then request it.

@cgwalters (Member, Author)

All I (as an attacker) need to do to get the Ignition config is to block SSH on my local machine, then request it.

This isn't intended to block access from outside the cluster. For that, I think one should start with IaaS firewalling, and the default OpenShift installs do so.

But I know this is a "gotcha" in various UPI scenarios - we should absolutely consider a way to address that better. Maybe a simple "provisioning enabled" boolean that would turn off the MCS entirely - in non-machineAPI scenarios, booting new machines is unusual.

(Only privileged code running on the cluster could block SSH; that's why this works in-cluster.)

@michaelgugino (Contributor)

What I think we should do is secure the endpoint with authentication. A firewall or iptables rule is not a substitute for authentication. The machine-api will be available in bare metal clusters soon, as well as on VMware, and there's no telling what the network topology of those will look like. It's entirely possible that the MCS will be behind a routable IP.

@cgwalters (Member, Author)

What I think we should do is secure the endpoint with authentication.

Yep, that was #736, but where it died is that requiring it would be a huge UX issue for many bare metal flows.

That said...I'm increasingly feeling like a "good enough" mitigation would be something like switching to requiring an auth token after the cluster is initialized.

@michaelgugino (Contributor)

What I think we should do is secure the endpoint with authentication.

Yep, that was #736, but where it died is that requiring it would be a huge UX issue for many bare metal flows.

That said...I'm increasingly feeling like a "good enough" mitigation would be something like switching to requiring an auth token after the cluster is initialized.

Consider the reverse: optionally disabling it for bare metal flows.

@jomeier commented May 13, 2020

We can access the ignition files from any PC in our network that doesn't belong to the vSphere UPI OKD 4 cluster:

okd-project/okd#176

The curl command described there doesn't work out of the box in pods running in the cluster, but I'm not sure if this can be enabled somehow by a potential attacker.

I assume that vSphere credentials are also inside the Ignition files? This would be a major security leak. As proposed earlier in this PR: is it possible to secure the API endpoint?

If that's not possible in the short term, what is the proposed workaround to secure the Ignition files with a firewall? Could you provide a best-practice network layout for that? Our load balancer for port 22623 is in a different network than our cluster VMs, and it might be a bit cumbersome to configure that, so any best-practice setup hint is welcome.

@crawford (Contributor)

/hold

This needs an enhancement. Casually skimming the history, it's clear that there are still open questions.

@cgwalters cgwalters changed the title server: Deny serving Ignition to provisioned nodes server: Support denying serving Ignition to active nodes and pods May 20, 2020
@cgwalters (Member, Author)

So in the middle of this epic PR discussion, the change turned from "deny" to "warn, plus an opt-in mechanism to deny". I forgot to change the PR title, which probably led to a lot of confusion.

I completely agree we need an enhancement if we try to do anything that would deny (and particularly anything that ties together machineAPI and the MCO, or affects the bare metal provisioning flow, disaster recovery, etc.). I'm less in agreement that we need an enhancement to log by default. If it helps, I can remove the ability to deny.

(But probably, instead of logging to the pod, we really want saner observability like an event and Prometheus metrics; I need to look at that.)

@jomeier commented May 20, 2020

Is there any hint in the docs that a firewall should be set up to prevent anyone from pulling the Ignition files with the cloud credentials contained in them? Or can this deny switch be configured during installation of the cluster, for example by providing a switch in install-config.yaml?

@cgwalters (Member, Author)

After openshift/enhancements#368 lands, we'll be in a better place to enforce an auth token for MAO-managed setups.

I do agree with Alex that we want an enhancement for this, but it should basically be:

  • installer generates oc -n openshift-config create secret generic provisioning token=<random token>
  • installer injects that into its user data
  • MCO also does so for the pointer configs it manages
  • MCO denies requests which don't have a header with the token

It's harder to do better than that unless we go to per-machine user data, but this would suffice to start. In MAO-managed scenarios we should be able to iteratively upgrade later. But if we start requiring this token, e.g. to be specified on the PXE command line on UPI metal, then it will become a bit of an "API".
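
On the MCS side, the token check in that scheme could be as small as comparing a request header against the secret. A hedged Go sketch, assuming a hypothetical header name and handler wrapper:

package main

import (
	"crypto/subtle"
	"net/http"
)

// requireProvisioningToken is a hypothetical wrapper around the MCS config
// handler; the header name and how the token gets mounted are assumptions.
func requireProvisioningToken(expected string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := r.Header.Get("X-Provisioning-Token")
		if subtle.ConstantTimeCompare([]byte(got), []byte(expected)) != 1 {
			// No (or wrong) token: deny the Ignition request.
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}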

@cgwalters (Member, Author)

Working on an enhancement for this: https://hackmd.io/k7Mfb1lpSIWTzRFvo6o9ig

cgwalters added a commit to cgwalters/enhancements that referenced this pull request Aug 19, 2020
See openshift/machine-config-operator#784

The Ignition configuration can contain secrets, and we want to avoid it being accessible both inside and outside the cluster.
@openshift-ci-robot (Contributor)

@cgwalters: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-etcd-quorum-loss 3d295414e006ce429707e0abb37b254be87162b2 link /test e2e-etcd-quorum-loss
ci/prow/e2e-aws-disruptive fc2b74b6a3f41d1a80c7d37e2c5a6ebe781de532 link /test e2e-aws-disruptive
ci/prow/e2e-vsphere 24cecd593e9e31843dddd35db1381928b695271e link /test e2e-vsphere
ci/prow/e2e-aws-proxy 75db5c2 link /test e2e-aws-proxy
ci/prow/e2e-upgrade 75db5c2 link /test e2e-upgrade

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

cgwalters added a commit to cgwalters/enhancements that referenced this pull request Sep 24, 2020
See openshift/machine-config-operator#784

The Ignition configuration can contain secrets, and we want to avoid it being accessible both inside and outside the cluster.
cgwalters added a commit to cgwalters/enhancements that referenced this pull request Oct 6, 2020
See openshift/machine-config-operator#784

The Ignition configuration can contain secrets, and we want to avoid it being accessible both inside and outside the cluster.
@openshift-merge-robot (Contributor)

@cgwalters: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-agnostic-upgrade 75db5c2 link /test e2e-agnostic-upgrade
ci/prow/e2e-aws-serial 75db5c2 link /test e2e-aws-serial

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cgwalters (Member, Author)

Will be obsoleted by #2223

@cgwalters closed this Nov 12, 2020
cgwalters added a commit to cgwalters/enhancements that referenced this pull request Nov 25, 2020
See openshift/machine-config-operator#784

The Ignition configuration can contain secrets, and we want to avoid it being accessible both inside and outside the cluster.
cgwalters added a commit to cgwalters/enhancements that referenced this pull request Nov 30, 2020
See openshift/machine-config-operator#784

The Ignition configuration can contain secrets, and we want to avoid it being accessible both inside and outside the cluster.
Labels
  • approved - Indicates a PR has been approved by an approver from all required OWNERS files.
  • do-not-merge/hold - Indicates that a PR should not merge because someone has issued a /hold command.
  • size/L - Denotes a PR that changes 100-499 lines, ignoring generated files.