overlay network cannot be applied when host is behind a proxy #136
Comments
Hi @senthilrch, thank you for filing this issue. As for your question about installing weave net, you can read more about it here. |
Yep, as @alejandrox1 noted, the base64 encoding is from their guide. The reason for this is to pass it as an HTTP query parameter to weave so that their site can serve the appropriate weave version based on your Kubernetes version. In the future we might use fixed weave versions, but this is the correct and normal way to install it per their upstream documentation.
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
It would be this verbatim, but we need to specify the admin kubeconfig location. Regarding the failure, is this happening reliably, or did it just happen once? |
It is happening repeatedly. I'll get the debug log gist and add it here. |
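The base64 step discussed above can be sketched in isolation. This is a minimal illustration, not kind's code: `weave_manifest_url` is a hypothetical helper name, and it reads the version text from stdin so it can be tried without a cluster. GNU `base64` wraps its output at 76 columns, which is why the newlines are stripped before the value is used as a query parameter.

```shell
#!/bin/sh
# Hypothetical helper: build the weave manifest URL from version text on stdin.
# base64 wraps long output across lines, so tr removes the newlines
# to keep the query parameter on a single line.
weave_manifest_url() {
    encoded=$(base64 | tr -d '\n')
    printf 'https://cloud.weave.works/k8s/net?k8s-version=%s\n' "$encoded"
}

# Example with a fixed version string instead of live `kubectl version` output:
# printf '%s' 'v1.12.2' | weave_manifest_url
```

In the real command the input is the full multi-line output of `kubectl version`, not a bare version string, so the encoded value is much longer than in this example.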
https://gist.github.com/senthilrch/70eb56cfeee38e311c13f6898791121a The host in which I am creating the kind cluster is behind a proxy. Perhaps that's the reason it fails. Will kind honor http_proxy and https_proxy env variables set on the host? |
Ah, that's almost definitely it!
We can either try to get these packed into the image ahead of time (which is probably quite doable, and possibly desirable, but maybe a little tricky), or we can try to make this step respect proxy information on the host machine. It looks like Both approaches are probably worth doing. I'll update this issue to track. |
/kind bug |
Within the last 1-2 weeks Kind broke for me with the same error (I believe).
I didn't change anything on my system and simply do a |
@metalmatze is it possible that you're behind a proxy as well? we've not fixed that yet. |
I don't think so. |
hmm. I don't think we've made any functional changes to this step in that time frame. FWIW making this step not depend on the internet is very high on my todo 😕 other known issues I've seen that can cause similar problems:
|
Pulling the latest master now fixed KinD for me again. I'm not entirely sure what happened. I can't see any changes related to my problem. I'm on the same machine and the same WiFi as first reported from. Additionally my machine was suspended most of the weekend and I didn't run any updates during the time (like updating Docker for example).
302bb7d...4a348e0 |
Huh. I can't spot anything relevant in there 🤔 the plot thickens 🙃
I think this week I'll take a stab at pre-loading the CNI images and using a fixed manifest which should help avoid this sort of issue entirely 🤞 |
I'm facing the same issue. In my case apply overlay network fails because cloud.weave.works is not resolvable from |
Upgraded docker from 18.09 to 18.09.1 and problem went away 🎉. |
Huh, I wonder if there was a regression in docker somehow. What docker distribution are you using? |
Interesting. For me it works since 10 days ago and I just checked that I'm on Docker 18.09.1 as well. I should have checked the version when it didn't work. 18.09.1 was pushed on Jan 10th: |
@BenTheElder , I was running dind container |
Thanks for confirming, I'm going to file another issue to create a "known issues" section in our docs and highlight this as one of the first ones! |
+1 adding the option to pre-bake the overlay network and also provide air-gapped support will help users that don't want their kind cluster to talk to the internet. for the rest we might have to still expose the proxy env vars. |
i wonder what was fixed. |
so docker itself supports HTTP_PROXY / HTTPS_PROXY https://docs.docker.com/network/proxy/ 🤔 |
it makes sense, especially if it's a fix. |
@BenTheElder I really need this as my company has a corporate proxy... will you be working on it, or should I jump in? |
@BenTheElder Security context is set to allow privileged execution. I am using official docker:dind as a base image and docker itself is running in the container. I did not have to mount anything when running it locally and it was working correctly. Only when running in a k8s environment there is an issue. Here is my test yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: test-floreks
---
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  namespace: test-floreks
  name: dind
spec:
  selector:
    matchLabels:
      app: dind
  replicas: 1 # tells deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: dind
    spec:
      containers:
      - name: dind
        image: floreks/dind-with-kind:v1.0.0
        securityContext:
          privileged: true |
so our actual podspec is ~ the contents of the
apiVersion: prow.k8s.io/v1
kind: ProwJob
metadata:
  annotations:
    prow.k8s.io/job: ci-kubernetes-kind-conformance
  creationTimestamp: null
  labels:
    created-by-prow: "true"
    preset-bazel-remote-cache-enabled: "true"
    preset-bazel-scratch-dir: "true"
    preset-dind-enabled: "true"
    preset-service-account: "true"
    prow.k8s.io/id: bc7c7a72-2b06-11e9-8fd7-0a580a6c037c
    prow.k8s.io/job: ci-kubernetes-kind-conformance
    prow.k8s.io/type: periodic
  name: f8f7ed86-2b0d-11e9-bfc2-0a580a6c0297
spec:
  agent: kubernetes
  cluster: default
  job: ci-kubernetes-kind-conformance
  namespace: test-pods
  pod_spec:
    containers:
    - args:
      - --job=$(JOB_NAME)
      - --root=/go/src
      - --repo=k8s.io/kubernetes=master
      - --repo=sigs.k8s.io/kind=master
      - --service-account=/etc/service-account/service-account.json
      - --upload=gs://kubernetes-jenkins/logs
      - --scenario=execute
      - --
      - ./../../sigs.k8s.io/kind/hack/ci/e2e.sh
      env:
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /etc/service-account/service-account.json
      - name: E2E_GOOGLE_APPLICATION_CREDENTIALS
        value: /etc/service-account/service-account.json
      - name: TEST_TMPDIR
        value: /bazel-scratch/.cache/bazel
      - name: BAZEL_REMOTE_CACHE_ENABLED
        value: "true"
      - name: DOCKER_IN_DOCKER_ENABLED
        value: "true"
      image: gcr.io/k8s-testimages/kubekins-e2e:v20190205-d83780367-master
      name: ""
      resources:
        requests:
          cpu: "2"
          memory: 9000Mi
      securityContext:
        privileged: true
      volumeMounts:
      - mountPath: /lib/modules
        name: modules
        readOnly: true
      - mountPath: /sys/fs/cgroup
        name: cgroup
      - mountPath: /etc/service-account
        name: service
        readOnly: true
      - mountPath: /bazel-scratch/.cache
        name: bazel-scratch
      - mountPath: /docker-graph
        name: docker-graph
    dnsConfig:
      options:
      - name: ndots
        value: "1"
    volumes:
    - hostPath:
        path: /lib/modules
        type: Directory
      name: modules
    - hostPath:
        path: /sys/fs/cgroup
        type: Directory
      name: cgroup
    - name: service
      secret:
        secretName: service-account
    - emptyDir: {}
      name: bazel-scratch
    - emptyDir: {}
      name: docker-graph
  type: periodic
status:
  startTime: "2019-02-07T19:24:22Z"
  state: triggered |
#275 just merged to pass through HTTPS_PROXY and HTTP_PROXY from the host to the nodes, thanks @pablochacin! We should be getting the 0.2 release soon with this change, but right now you can obtain it by building from the current master branch sources. hopefully this should resolve this issue, I am finalizing the design for handling CNIs as well, plan to bring up at the next meeting. We've additionally uncovered #284 which may affect some configurations. |
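The thread does not show how #275 forwards the variables, so the following is only a rough shell sketch of the general idea: collect whichever of HTTP_PROXY and HTTPS_PROXY are set on the host and turn them into `--env` arguments for a container. The function name `proxy_env_flags` is hypothetical and this is not kind's implementation.

```shell
#!/bin/sh
# Hypothetical helper, not kind's code: print one "--env NAME=value" line
# for each proxy variable that is set and non-empty in the current shell.
proxy_env_flags() {
    for name in HTTP_PROXY HTTPS_PROXY; do
        # Indirect lookup of the variable named by $name (names are fixed above).
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s %s=%s\n' '--env' "$name" "$value"
        fi
    done
}

# Usage: inspect what would be forwarded from this shell.
# proxy_env_flags
```

Unset or empty variables are skipped, so a host without a proxy produces no output.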
Thanks @BenTheElder I now have a new issue: You can find the full debug log here. |
@matthyx from the log I see that the proxy has been set to |
Ok, I feel so stupid indeed... so after setting the proxy to something reachable from containers, I get the following (full log here): |
@matthyx you set the proxy to My suggestion is that you either test it disabling your firewall (at your own risk ;-) ) or try with a proxy running on a public address. |
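The actual proxy values are elided from this thread, so the check below is only an illustration of one common way a proxy ends up unreachable from containers: a loopback address. Inside a container, 127.0.0.1 and localhost refer to the container's own loopback interface, not the host's. The helper name `is_loopback_proxy` is hypothetical, and the sketch does not handle IPv6 literals such as [::1].

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the proxy URL points at a
# loopback host, which a container would resolve to itself.
is_loopback_proxy() {
    host=${1#*://}      # drop the scheme, if any
    host=${host%%/*}    # drop any path
    host=${host##*@}    # drop user:password@, if any
    host=${host%%:*}    # drop the port
    case "$host" in
        localhost|127.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
# if is_loopback_proxy "$HTTP_PROXY"; then
#     echo "proxy is on a loopback address; containers cannot reach it" >&2
# fi
```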
@pablochacin thanks for the suggestion, I have just checked and my proxy works from inside docker, as confirmed by a small Dockerfile like:
|
I'm not following here @matthyx. This is a Dockerfile, right? It is applied at build time, while the issue you have is at run time. I'm not sure these two situations are comparable. What I suggest is that you start a container with ubuntu, and from inside the container try an update:
|
Yes, this works:
However, I have reached out to IT and it seems our corporate proxy (which requires a local cntlm for AD authentication) uses an old protocol for the man in the middle... and for this reason we cannot upgrade our Docker past 18.06.1-ce |
Some updates on this. I have the privilege to work with extremely bright people here, and the problem seems to lie in TLS negotiation (although not 1.3) because our proxy policy hasn't been updated in a while, and none of the algorithms proposed by the go tls client are supported atm... We're working with network and security to update this policy, and I will keep you posted if that solves our problem! |
Just to confirm the problem from my side persists even after Docker upgrade (I don't have any HTTP proxy): I get this error with Docker 18.06.1 from the official Ubuntu 18.04 LTS repository:
The problem persists for me after upgrading to
|
the next release will contain this fix, but in the meantime it can be installed from the current source 😬 |
Should be actually fixed now, additionally new node images do not require pulling the overlay image at all. |
I will test on Monday since I don't have our corporate proxy at home... thanks for the update! |
@BenTheElder doesn't seem to work better... I did You can read the debug logs here. |
hey @matthyx, can you run with I suspect this is something else with your environment, at the latest source zero internet connectivity should be required after pulling the "node" image. (which I and one other user have been able to verify). |
Should I open another issue once I have the logs? |
that would be good, thanks! |
I think this is good now... looking at the logs before sending them, I have noticed that:
And so I decided to give it a try by unsetting all my *_proxy env variables and suddenly it worked! I can finally enjoy kind on my pro workstation. Thanks a lot @pablochacin and @BenTheElder ! |
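For anyone wanting to try the same thing without changing their login environment, a small wrapper can remove the proxy variables for a single command. This is a sketch under assumptions: `without_proxy` is a hypothetical name, and the list covers only the common lower-case and upper-case variants, which may not match every setup.

```shell
#!/bin/sh
# Hypothetical helper: run a command with the common proxy variables
# removed from its environment, leaving the calling shell untouched.
without_proxy() {
    env -u http_proxy -u https_proxy -u no_proxy \
        -u HTTP_PROXY -u HTTPS_PROXY -u NO_PROXY "$@"
}

# Usage:
# without_proxy kind create cluster
```

Because `env -u` only affects the child process, the variables remain set for other tools that still need the proxy.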
Hi @BenTheElder, I'm also interested in helping with this issue as it relates to support for air gapped testing. I'm currently in-flight back to Austin, and I thought I'd get some Kind-based dev-work done. However, without a good internet connection things are just not working. I finally got past the above error, but now, due to a flaky network, the node is never ready due to the inability to initialize CNI. |
hey @akutz -- on the latest code in master airgapped clusters should work, the CNI does not need to be pulled, is it possible you're using an older version? |
Environment
Host OS: RHEL 7.4
Host Docker version: 18.09.0
Host go version: go1.11.2
Node Image: kindest/node:v1.12.2
kind create cluster
The code below in pkg/cluster/context.go tries to extract the k8s version using the kubectl version command in order to download the version-specific weave net.yaml. The code is not OK:
Why is the output of the kubectl version command base64 encoded?