settings.network: add new proxy settings #1204
Conversation
Force-pushed 5d88a79 to f232a11.
I should look into supporting CIDR notations for the …
Force-pushed f232a11 to d3d804d.
Force-pushed d3d804d to 613af4b.
Force-pushed 613af4b to 327c650.
Push above and below addresses @tjkirch's comments.
Force-pushed 327c650 to 469c09c.
Updated PR description with additional ECS testing info.
Force-pushed 469c09c to c66edfb.
Push above addresses @tjkirch's comments.
packages/release/proxy-env (outdated excerpt):
```
HTTPS_PROXY={{settings.network.https-proxy}}
https_proxy={{settings.network.https-proxy}}
{{/if}}
NO_PROXY={{#each settings.network.no-proxy}}{{this}},{{/each}}localhost,127.0.0.1,169.254.169.254,169.254.170.2,/var/run/docker.sock{{#if settings.kubernetes.api-server}},{{host settings.kubernetes.api-server}},.svc,.default,.local,.cluster.local{{/if}}
```
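For context on the `{{#if settings.kubernetes.api-server}}` branch above: the `host` template helper trims the configured API server URL down to just its host portion so it can be appended to `NO_PROXY`. A minimal sketch of the setting it reads, with a made-up endpoint:

```toml
# Hypothetical value for illustration only; the template's `host` helper would
# emit just the host portion of this URL into NO_PROXY.
[settings.kubernetes]
api-server = "https://0123456789ABCDEF.gr7.us-west-2.eks.amazonaws.com"
```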
My only remaining concern is that we list /var/run/docker.sock but not /run/docker.sock, even though our docker.socket unit lists /run/docker.sock. Is there a chance we wouldn't proxy requests because we're using one instead of the other, even though they're symlinked? Do we need to list both?
All I know is that the ECS agent explicitly dials /var/run/docker.sock. How that gets resolved is unclear to me. I do know from testing that the ECS variant works with what's currently there.
@samuelkarp any thoughts?
I haven't seen any problems with using /var/run/docker.sock on other platforms with symlinks (like Amazon Linux 2).
We should have test coverage for the ECS agent in the agent's test suite, so I'm comfortable relying on that for validation.
I know that we're seeing the basics work. My concern is that there could be other components called by Docker (or whatever) that assume /run/docker.sock, and they've just worked for us because of the symlink, but would suddenly fail (or do horrible things) because they're being proxied. (I don't know of anything, and we can fix it if we find it, hence my ship-it.)
README.md (outdated excerpt):
```
* [host-containerd.service](packages/host-ctr/host-containerd.service)
* For Kubernetes variants:
  * [kubelet.service](packages/kubernetes-1.18/kubelet.service)
* For ECS variants:
```
Since there's only one ECS variant, should we say "For the ECS variant"?
README.md (outdated excerpt):
```
* `settings.network.http-proxy`: The HTTP proxy server to be used by services listed above.
* `settings.network.https-proxy`: The HTTPS proxy server to be used by services listed above.
```
Do we want these to be separately-configurable at this point?
I can't really think of a use-case for configuring `http_proxy` and `https_proxy` differently. But then again, there really is no harm in letting them be separately configurable.
It's more overhead for someone who wants to enable a proxy; now they have to configure both. My preference is to have just one right now and we can always add a second one later if it becomes desirable.
Would we still be setting the `HTTPS_PROXY` env var? Setting `HTTPS_PROXY` to the same contents as what's being set for `HTTP_PROXY`?
> Would we still be setting the `HTTPS_PROXY` env var? Setting `HTTPS_PROXY` to the same contents as what's being set for `HTTP_PROXY`?
Yep.
Perhaps the hardest question - what to call the setting? :P
Yeah, same value for both. I'd probably call the setting "https-proxy" to imply that the proxy needs to be able to transparently handle TLS traffic.
What if the user wants to use an HTTP proxy that can't handle HTTPS traffic, though? Wouldn't `https-proxy` be misleading and discourage the user from using the setting?
If we're using the same setting for both, the proxy needs to be able to handle HTTPS traffic. I think the naming should reflect that.
Most of the traffic that will be generated from the software in Bottlerocket (pulling container images, log forwarding, talking to API endpoints) is going to be over HTTPS. cri-containerd and Docker both require HTTPS when pulling images (without manual overrides that we don't enable in Bottlerocket), and the Kubelet and ECS agent talk to their respective API endpoints over HTTPS as well.
Force-pushed c66edfb to edee0a9.
```
@@ -10,6 +10,7 @@ StartLimitIntervalSec=60s

[Service]
Type=notify
EnvironmentFile=/etc/network/proxy.env
```
I just realized that if docker is now depending on settings, we need to restart it on changes, i.e. make a `docker` service in the model.
I guess the service has to be defined separately in the aws-dev and aws-ecs-1 defaults-overrides, since we probably shouldn't have it in the defaults..?
It should probably be restarted before ecs, in the ecs model, though, and we don't have dependencies between services. Maybe in the ecs model, docker is just a restart-command in the ecs service, rather than being its own?
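A rough sketch of what such a service entry could look like in a variant's defaults, assuming the existing `[services.*]` convention with `configuration-files` and `restart-commands`; the names and command below are illustrative, not taken from this PR:

```toml
# Sketch only: restart docker when the rendered proxy environment file changes.
# Key and file names here are assumptions for illustration.
[services.docker]
configuration-files = ["proxy-env"]
restart-commands = ["/bin/systemctl try-restart docker.service"]
```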
> It should probably be restarted before ecs, in the ecs model, though, and we don't have dependencies between services.
There's maybe a race while you're configuring/restarting, but it shouldn't really matter? The ECS agent should tolerate Docker going down and coming back up already. There's also a dependency listed in the unit itself, though I know that our model isn't reading that.
Ugh, this also applies to containerd, host-containerd, and kubelet, I believe. We haven't dynamically affected the configuration of {host-,}containerd before. kubelet we technically could, and it seems like it wouldn't have worked without a reboot, but they're mostly dynamic settings generated at boot that you wouldn't change. (Maybe labels/taints are the worst offender?)
@samuelkarp would containerd be similarly safe to restart? No lost containers, logs, etc.?
@bcressey found that restarting host-containerd kills SSH sessions, implying that the admin container is stopped. Not sure if it's a necessary kill, or if our setup is wrong.
> would containerd be similarly safe to restart? No lost containers, logs, etc.?
Containerd when used either directly through its API or with Docker does no networking and wouldn't need to be restarted.
Containerd when used with Kubernetes (through cri-containerd) does networking to pull images, and would need to be restarted.
By default, containerd will not stop containers when it exits, but systemd might be killing them.
Force-pushed edee0a9 to e1b380c.
Push above adds restart commands for all services ingesting the templated proxy.env file.
Here's how I tested the changes for Kubernetes:
For testing ECS, I did the following:
Let me know if there's anything else I should test.
Adds a new rendering-template helper that trims a fully qualified URL setting down to just its host portion.
Force-pushed e1b380c to 60ebdda.
Push above rebases onto develop.
Force-pushed 5ea2642 to 97bf5a6.
Pushes above removes …; PR description updated with the latest testing information.
Force-pushed 97bf5a6 to b4983cd.
Push above addresses #1204 (comment) and #1204 (comment).
Add new settings for configuring HTTPS_PROXY/https_proxy and NO_PROXY/no_proxy environment variables. These environment variables will only be exposed to a limited set of services that need them.
Force-pushed b4983cd to e60b2fb.
Push above recovers the restart commands for containerd and host-containerd.
🦠
Issue number:
Addresses #1178
Description of changes:
Testing done:
Set up a simple HTTPS proxy server with tinyproxy, using mostly default settings.
Built an aws-k8s-1.17/arm64 AMI and launched several instances with the new proxy settings in my userdata.
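As an illustration, proxy settings in userdata take roughly this TOML shape (the host and list values are made up, not the ones used in this test):

```toml
# Illustrative values only.
[settings.network]
https-proxy = "proxy.example.com:3128"
no-proxy = ["internal.example.com"]
```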
Host containers, both admin and control, come up fine.
All of the nodes came up fine in my EKS cluster. Can pull images and run pods.
Checked the HTTPS proxy server access logs and observed traffic from the instances I launched.
Ran Kubernetes certified-conformance tests and all of them pass.
To make sure that no_proxy'ing the Kubernetes API server works, I limited API endpoint access to within the cluster's private VPC; the nodes fail to connect to the cluster as expected if I do not no_proxy the API server endpoint. Once I do (by rendering the proxy-env template with settings.kubernetes.api-server), the node comes up fine.
Launched the ECS variant with https-proxy set to my HTTPS proxy server. Started a simple nginx task and it runs fine. I'm able to curl localhost:80 and get the nginx welcome page.
Observed traffic on my HTTPS proxy server.
Launched an instance without specifying proxy settings.
After the host comes up, change the proxy settings to route traffic through my HTTPS proxy server:
```
apiclient -u /settings -m PATCH -d '{"network": {"https-proxy": "myhttpsserver:9898", "no-proxy":[".local"]}}'
apiclient -u /tx/commit_and_apply -m POST
```
Observed that containerd, kubelet, host-containerd all restarted successfully and pod workloads continued without problems.
Please let me know if there is anything else that needs to be tested.
Note: The migration for the new settings will fast-follow this PR.
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.