[occm] PROXY protocol breaks connections from inside the cluster #1287

Closed
bgagnon opened this issue Oct 23, 2020 · 18 comments · Fixed by #1449
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@bgagnon

bgagnon commented Oct 23, 2020

/kind bug

This is a manifestation of kubernetes/kubernetes#66607. When a service of type LoadBalancer is exposed via an IP address (which is the case for OCCM), kube-proxy's iptables rules "optimize" the traffic and send the packets directly to the underlying pods, bypassing the load balancer.

The problem is that:

  • the client isn't aware it needs to send the PROXY header (it may not even be capable of doing that)
  • the server, per the specification, must refuse the connection if the header is missing

The workaround used by other cloud providers is to expose the LB's external endpoint via the .Hostname field instead of the .IP field.
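
For context, here is a minimal sketch (in Go, with illustrative addresses; not code from this project) of the PROXY protocol v1 header a proxy-aware client has to send before its normal payload. The in-cluster clients short-circuited by iptables never send this line, which is exactly why a backend that enforces the protocol drops the connection:

package main

import (
	"fmt"
	"net"
)

func main() {
	// Dial what the client believes is the load balancer; inside the cluster,
	// iptables may actually deliver this straight to a backend pod.
	conn, err := net.Dial("tcp", "203.0.113.10:443") // illustrative address
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// PROXY protocol v1 header: "PROXY TCP4 <src-ip> <dst-ip> <src-port> <dst-port>\r\n".
	// A normal in-cluster client never writes this, hence the failure described above.
	fmt.Fprintf(conn, "PROXY TCP4 %s %s %d %d\r\n", "192.0.2.10", "203.0.113.10", 56324, 443)
	// ... the TLS/HTTP exchange would follow here ...
}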

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 23, 2020
@bgagnon
Author

bgagnon commented Oct 26, 2020

Digital Ocean experienced the same issue in its own cloud provider. Their workaround was to introduce a new annotation, service.beta.kubernetes.io/do-loadbalancer-hostname, which allows the user to specify a domain under which to expose the LB. By filling the .Hostname field of the status instead of .IP, they effectively bypass this problem.

I propose the OpenStack provider should have a similar escape hatch: a new annotation that switches the controller from returning an .IP to returning a .Hostname.

Some considerations for the design:

  • this may only be strictly needed when the PROXY protocol is active
  • for integration with other systems, the IP address must ultimately still be discoverable by components inside Kubernetes (e.g., external-dns)
  • to save the user from having to provision a domain record, the ability to use a "magic" domain such as x.x.x.x.xip.io or x.x.x.x.nip.io might prove useful

KEP-1860, once implemented, will introduce an official way for the cloud provider to inform kube-proxy of the desired "ingress mode". With this, the .IP field will no longer be a problem, as long as it is accompanied by ingressMode set to Proxy instead of VIP (the default). Implementation is ongoing in kubernetes/kubernetes#92312.
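
To make the .Hostname-vs-.IP distinction concrete, here is a hedged sketch using the k8s.io/api/core/v1 types and the DigitalOcean annotation named above; the helper itself is hypothetical, not part of any provider:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// buildLBStatus is a hypothetical helper: if the user supplied a hostname via the
// DO-style annotation, publish only Hostname so kube-proxy has no ingress IP to
// short-circuit; otherwise publish the IP as usual.
func buildLBStatus(svc *corev1.Service, lbIP string) corev1.LoadBalancerStatus {
	if hostname := svc.Annotations["service.beta.kubernetes.io/do-loadbalancer-hostname"]; hostname != "" {
		return corev1.LoadBalancerStatus{Ingress: []corev1.LoadBalancerIngress{{Hostname: hostname}}}
	}
	return corev1.LoadBalancerStatus{Ingress: []corev1.LoadBalancerIngress{{IP: lbIP}}}
}

func main() {
	svc := &corev1.Service{}
	svc.Annotations = map[string]string{
		"service.beta.kubernetes.io/do-loadbalancer-hostname": "lb.example.com",
	}
	fmt.Printf("%+v\n", buildLBStatus(svc, "203.0.113.10"))
}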

@jichenjc
Contributor

Good suggestion. Should we wait for the KEP, or refer to
https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/cloud-controller-manager/do/loadbalancers.go#L334 and implement our own?

Seems reasonable to implement it first...

@jichenjc
Contributor

@lingxiankong do you think we can add this to the ensureLoadBalancer function, per the comments above?
In particular, we could refer to the DO implementation linked above.

@lingxiankong
Contributor

lingxiankong commented Oct 29, 2020

Is there any PoC available for OCCM? If it proves to work, we could have it.

@zetaab
Member

zetaab commented Oct 30, 2020

I can try to do this later today

@zetaab
Member

zetaab commented Oct 30, 2020

OK, here it goes:

Starting situation: we have an ingress controller exposed using the PROXY protocol (Octavia). Traffic from outside the cluster to the ingress works fine:

% curl https://kaas-proxy.foobar.com -I
HTTP/2 204
date: Fri, 30 Oct 2020 06:14:50 GMT
strict-transport-security: max-age=15724800; includeSubDomains
server: hide

However, when trying from inside the cluster it does not work, as this issue describes:

% kubectl run curl-test4 --image=radial/busyboxplus:curl -i --tty --rm
If you don't see a command prompt, try pressing enter.
[ root@curl-test4:/ ]$ curl https://kaas-proxy.foobar.com -I
curl: (35) Unknown SSL protocol error in connection to kaas-proxy.foobar.com:443

First I tried simply copying the IP address into the hostname field, but it turns out there is validation against that:

I1030 06:05:54.850119       1 event.go:291] "Event occurred" object="ingress-nginx-internal/ingress-nginx-internal" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to update load balancer status: Service \"ingress-nginx-internal\" is invalid: status.loadBalancer.ingress[0].hostname: Invalid value: \"10.222.139.88\": must be a DNS name, not an IP address"

The next thing I tried was using those "magical" IPs, which do actually work!

% kubectl get svc
NAME                             TYPE           CLUSTER-IP      EXTERNAL-IP             PORT(S)                      AGE
ingress-nginx-internal           LoadBalancer   100.69.161.9    10.222.138.185.nip.io   80:30968/TCP,443:30785/TCP   41d

Then trying again from inside the cluster:

% kubectl run curl-test4 --image=radial/busyboxplus:curl -i --tty --rm
If you don't see a command prompt, try pressing enter.
[ root@curl-test4:/ ]$ curl https://kaas-proxy.foobar.com -I
HTTP/1.1 204 No Content
Date: Fri, 30 Oct 2020 06:19:21 GMT
Connection: keep-alive
Strict-Transport-Security: max-age=15724800; includeSubDomains
server: hide

The only question that now remains: can we really create these things using magical IPs?
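
The rejection above comes from the API server's validation of status.loadBalancer.ingress[].hostname. A rough, hedged approximation of that check (not the upstream code itself) shows why a bare IP fails while the x.x.x.x.nip.io form passes:

package main

import (
	"fmt"
	"net"

	"k8s.io/apimachinery/pkg/util/validation"
)

// hostnameAllowed mimics the two relevant constraints: the value must not parse as
// an IP address and must be a valid DNS-1123 subdomain.
func hostnameAllowed(hostname string) error {
	if net.ParseIP(hostname) != nil {
		return fmt.Errorf("%q: must be a DNS name, not an IP address", hostname)
	}
	if errs := validation.IsDNS1123Subdomain(hostname); len(errs) > 0 {
		return fmt.Errorf("%q: %v", hostname, errs)
	}
	return nil
}

func main() {
	fmt.Println(hostnameAllowed("10.222.139.88"))         // rejected: it is an IP address
	fmt.Println(hostnameAllowed("10.222.138.185.nip.io")) // <nil>: a valid DNS name
}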

@jichenjc
Contributor

I think KEP-1860, mentioned above, will solve the problem, so this would be a temporary solution.
I guess it's reasonable to have it now and remove it once the KEP is implemented?

@bgagnon
Author

bgagnon commented Oct 30, 2020

I should have mentioned this earlier... we have a working patch in our internal fork of this project. We've been running it in production for a couple weeks now with success. It uses the .nip.io approach, controlled by an annotation:

loadbalancer.openstack.org/proxy-protocol-hostname-suffix: nip.io

It then changes the status object based on that:

	// If the load balancer is using the PROXY protocol, expose its IP address via
	// the Hostname field to prevent kube-proxy from injecting an iptables bypass.
	// This is a workaround for kubernetes#66607 that imitates a similar patch done
	// on the Digital Ocean cloud provider.
	if useProxyProtocol {
		// By default, the universal nip.io service is used to provide a valid domain that is guaranteed
		// to resolve to the LB's IP address. This can be changed via the following (optional) annotation.
		hostnameSuffix := getStringFromServiceAnnotation(apiService, ServiceAnnotationLoadBalancerProxyHostnameSuffix,
			defaultProxyHostnameSuffix)
		hostnameSuffix = strings.TrimPrefix(hostnameSuffix, ".")
		// Validate the suffix before using it to build the hostname.
		if errs := validation.IsDNS1123Subdomain(hostnameSuffix); len(errs) > 0 {
			joinedErrors := strings.Join(errs, ",")
			return status, fmt.Errorf("invalid value \"%s\" for annotation %s: %s", hostnameSuffix,
				ServiceAnnotationLoadBalancerProxyHostnameSuffix, joinedErrors)
		}
		hostname := fmt.Sprintf("%s.%s", status.Ingress[0].IP, hostnameSuffix)
		status.Ingress = []corev1.LoadBalancerIngress{{Hostname: hostname}}
	}

I just wanted to validate the design with the community before submitting a PR. Some improvements I can think of:

  • the behavior does not need to be directly linked to PROXY protocol
  • like Digital Ocean, the ability to specify a fully qualified domain name instead of a suffix

The reason we wanted the .nip.io approach here is that we need to communicate the IP to another controller that is watching the service/status object. Using a magic domain, the receiving side can either resolve the A record or strip the suffix and get the same answer. An unaware component can simply treat the Hostname normally and resolve it.

Depending on community feedback for the points above, I can submit the PR as is or make a few improvements.
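
To illustrate the recovery path described above, here is a hedged sketch (names are illustrative, not part of any provider API) of how a receiving controller can get the IP back from the magic hostname, either by stripping the known suffix or by resolving the name:

package main

import (
	"fmt"
	"net"
	"strings"
)

// ipFromMagicHostname strips a known suffix such as "nip.io" and checks that the
// remainder parses as an IP address.
func ipFromMagicHostname(hostname, suffix string) (net.IP, bool) {
	trimmed := strings.TrimSuffix(hostname, "."+strings.TrimPrefix(suffix, "."))
	ip := net.ParseIP(trimmed)
	return ip, ip != nil
}

func main() {
	if ip, ok := ipFromMagicHostname("10.222.138.185.nip.io", "nip.io"); ok {
		fmt.Println("stripped:", ip)
	}
	// An unaware component can simply resolve the A record instead; nip.io answers
	// with the embedded address.
	addrs, err := net.LookupHost("10.222.138.185.nip.io")
	fmt.Println("resolved:", addrs, err)
}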

@bgagnon
Author

bgagnon commented Oct 30, 2020

@jichenjc yes, ideally we go for a design that will gracefully evolve to accommodate KEP-1860 once it is implemented

@lingxiankong
Contributor

@zetaab thanks for testing, @bgagnon thanks for the PoC code. They are very helpful.

Instead of using loadbalancer.openstack.org/proxy-protocol-hostname-suffix: nip.io, can we just use something like loadbalancer.openstack.org/proxy-protocol-enable-hostname: "true" and use nip.io implicitly? I don't think the user has many other options to define the suffix.
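
For comparison, a hypothetical sketch of this simpler boolean variant (illustrative only; not necessarily what was merged in the end), with nip.io applied implicitly:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// maybeUseHostname rewrites IP ingress entries to x.x.x.x.nip.io hostnames when the
// single on/off annotation is set to "true".
func maybeUseHostname(annotations map[string]string, status *corev1.LoadBalancerStatus) {
	if annotations["loadbalancer.openstack.org/proxy-protocol-enable-hostname"] != "true" {
		return
	}
	for i, ing := range status.Ingress {
		if ing.IP != "" {
			status.Ingress[i] = corev1.LoadBalancerIngress{Hostname: fmt.Sprintf("%s.nip.io", ing.IP)}
		}
	}
}

func main() {
	st := &corev1.LoadBalancerStatus{Ingress: []corev1.LoadBalancerIngress{{IP: "10.222.138.185"}}}
	maybeUseHostname(map[string]string{"loadbalancer.openstack.org/proxy-protocol-enable-hostname": "true"}, st)
	fmt.Printf("%+v\n", st)
}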

@bgagnon
Author

bgagnon commented Dec 10, 2020

@lingxiankong there's nip.io, xip.io and it's also possible to run the nip.io service locally (and use a different domain). Not sure it's wise to use a public Internet-facing service as a default.

@bgagnon
Author

bgagnon commented Dec 10, 2020

Note that kubernetes/kubernetes#92312 contains the definitive fix for this -- a way for the cloud provider to communicate how the IP should be handled.

It is included in k8s 1.20 behind a feature gate: https://kubernetes.io/docs/setup/release/notes/#api-change-1

But it looks like it might have been reverted for 1.20 in the end... to be continued.

@lingxiankong
Contributor

> @lingxiankong there's nip.io, xip.io and it's also possible to run the nip.io service locally (and use a different domain). Not sure it's wise to use a public Internet-facing service as a default.

I know there are other options; I'm just wondering whether it makes much sense for the end user to choose a different one, given that all they need is a workaround to access a PROXY-protocol load balancer from within the cluster. If providing an annotation doesn't bring much value to the user, I'd rather we not add it.

Another consideration: since the k8s community will eventually solve this issue itself, we'd better not introduce an annotation that we all know will be deprecated and removed in the future.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2021
@sbueringer
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 11, 2021
@lingxiankong
Contributor

@bgagnon Hi, are you still working on solving this issue? If yes, could you please work on the master branch first?

@bgagnon
Author

bgagnon commented Mar 12, 2021

Since opening #1345 I have changed jobs and no longer have the ability to test on OpenStack. Maybe @davidovich or someone from his team wants to pick it up. Sorry about that!

@lingxiankong
Contributor

no worries @bgagnon, I will see how I can help.
