Weave with AWS EKS is not working #3335
Comments
Thanks for this report, @redi-vinogradov. Could you add a little more detail on "cannot interact with each other"? What did you try? One thing I'm aware of is that pods on the Weave network cannot talk to the Kubernetes api-server, because it is on an EKS-specific network. And since kube-dns is in that set, it cannot resolve any service addresses, which will break lots of things. Happy to receive tips or PRs to let us connect in the api-server(s). |
A few examples:
Pod is trying to connect to kubernetes-dashboard service via port 443:
Pod is trying to connect to kubernetes-dashboard directly (via pod's IP):
Not sure why EKS-specific network should be an issue since you have to define it first (before EKS cluster creation) and in our case worker nodes are on the same network as EKS master nodes. |
@errordeveloper Any idea how to access the api-server? |
@redi-vinogradov How did you bypass the default use of amazon-vpc-cni-k8s as the CNI for EKS? I could not find a way in the guides. |
Related eksctl-io/eksctl#109 |
@murali-reddy Not sure if that was a correct way of doing this but basically |
@redi-vinogradov @murali-reddy yes, deleting |
@brb @bboreham so from my most recent findings (eksctl-io/eksctl#109), that doesn't seem to be an issue any more. Although I only did limited testing, so there may be conditions under which the API server becomes inaccessible (we should be able to clarify that by talking to the AWS team). As you can see from eksctl-io/eksctl#109, what is certainly broken is the DNS, so I suggest we get to the bottom of that first, and then test more extensively. To be clear, I'd like to see Weave Net as an eksctl add-on; otherwise the process of swapping out the network does not seem simple enough for someone to do manually. |
+1 While deleting the DS will result in the deletion of |
I agree, it is very likely that the iptables rules linger around after the native network is removed. |
Thank you, gents. I had to reset the iptables rules and reboot the worker nodes. Now I can confirm that pods are able to communicate with each other; however, pods still cannot communicate with services (no route to host error). |
How exactly did you reset the iptables? |
I am able to consistently run Weave on EKS. I am following the steps below. Can someone please try them and confirm whether they work?
I am able to test the scenarios below:
EDIT: Note that the api-server for your cluster will not be connected to Weave Net (it runs elsewhere, managed by EKS) so will not be able to connect to pods. |
@murali-reddy thanks a lot, great to see it didn't require too crazy work-arounds! As your list seems to exclude it, I have to ask - did you test egress to internet, and have you looked into whether pods can connect to the API server using default in-cluster endpoint ( |
Should have added that. Yes, connection to the API server works fine. Both kube-dns and weave pods using service cluster IP of To me everything seems to be working fine. Just need someone to try it out and either confirm or report any issue. |
Thank you @murali-reddy! I followed your steps and can confirm that the following is working:
Not sure how to run a pod-service to pod-IP connectivity test, but presumably that works too. |
@redi-vinogradov thanks for confirming it works. Note that these are fairly easy steps to automate. If you start with just the master nodes provisioned by EKS and perform the relevant steps once (skipping the steps needed on the nodes, as there are none yet), then there is nothing to be done on newer nodes, since new nodes start straight away with Weave as the CNI. |
Is the /etc/cni/net.d/10-aws.conflist written by the default driver at runtime?
|
Yes, EKS's |
@murali-reddy what was the source you specified for the TCP and UDP SG changes? I was testing with access from a node in the SG itself and it didn't appear to like that. I resorted to allowing access from anywhere, which is too open IMO. Any tips? |
@christianberg you mean this step? I picked the security group that applies to the nodes and added an inbound rule of type custom TCP with port range 6783-6784 and a custom source set to the same security group. Yes, anywhere would be too open; restrict it to the nodes only. |
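For anyone scripting that rule, here is a minimal sketch of a self-referencing ingress rule set. It assumes Weave Net's documented ports (TCP 6783 for control, UDP 6783-6784 for data) and only builds plain dicts in the shape that boto3's `authorize_security_group_ingress(GroupId=..., IpPermissions=...)` accepts; it makes no AWS calls. The function name and the security group ID are made up for illustration.

```python
def weave_ingress_rules(node_sg_id):
    """Build ingress rules whose source is the node security group itself."""

    def rule(protocol, from_port, to_port):
        return {
            "IpProtocol": protocol,
            "FromPort": from_port,
            "ToPort": to_port,
            # Self-reference: only members of the same group may connect,
            # instead of opening the ports to 0.0.0.0/0.
            "UserIdGroupPairs": [{"GroupId": node_sg_id}],
        }

    return [
        rule("tcp", 6783, 6783),  # Weave Net control connections
        rule("udp", 6783, 6784),  # Weave Net data plane
    ]


# Hypothetical security group ID, for illustration only.
rules = weave_ingress_rules("sg-0123456789abcdef0")
for r in rules:
    print(r["IpProtocol"], r["FromPort"], r["ToPort"])
```

Passing `rules` as `IpPermissions` together with the same group ID as `GroupId` would mirror the console steps described above.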
Interesting, that's what I had. I'll give it another shot. Thanks @murali-reddy |
Hi, I'm setting up an EKS cluster with the Terraform EKS module. Pod communication seems to work nicely after removing the aws-node ds, applying the Weave ds and recycling the nodes. However, I cannot access services through kubectl proxy, for example http://localhost:8001/api/v1/namespaces/mynamespace/services/my-nginx/proxy/. I get the error below:
The problem seems to be that the AWS-managed api-server cannot reach pods in the overlay network. Should I be able to access the pods with this proxy method at all when using Weave as the CNI? |
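For reference, the proxy URL used above follows a fixed pattern. This is a small sketch of how it is assembled; the helper name is hypothetical, and the default host assumes `kubectl proxy` is listening on its default address, localhost:8001. The optional `scheme:name:port` form is the api-server's service proxy naming convention.

```python
def service_proxy_url(namespace, service, scheme="", port="", host="localhost:8001"):
    """Return the api-server proxy URL for a Service, as served by kubectl proxy."""
    if scheme or port:
        # Explicit form: <scheme>:<service>:<port name or number>
        target = f"{scheme}:{service}:{port}"
    else:
        target = service
    return f"http://{host}/api/v1/namespaces/{namespace}/services/{target}/proxy/"


print(service_proxy_url("mynamespace", "my-nginx"))
# → http://localhost:8001/api/v1/namespaces/mynamespace/services/my-nginx/proxy/
```

A request to such a URL is forwarded by the api-server itself to a backing pod, which is why it depends on the api-server having a route into the pod network.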
I am afraid you won't be able to. The Weave overlay network only extends to the non-master nodes. |
+- we don't know a way to tell the api-server how to route to the Weave Network. ("Address is not allowed" is interesting - I haven't seen that before. Could it be an ICMPv6 Destination Unreachable code 5?) |
Ok, thanks! I believe this proxy access currently works only with the native aws-cni, and not with any other CNI. |
I tried to get to the root of this message, and the closest I found was part of the Kubernetes proxy implementation: https://github.com/kubernetes/kubernetes/blob/a3ccea9d8743f2ff82e41b6c2af6dc2c41dc7b10/staging/src/k8s.io/apimachinery/pkg/util/proxy/transport.go#L103 which calls the RoundTripper interface of Go's net/http package (https://github.com/golang/go/blob/fdefabadf0a2cb99accb2afe49eafce0eaeb53a7/src/net/http/roundtrip.go). But I could not find a trace of the message in the Go library or in the Linux kernel sources. |
Thanks to @murali-reddy - everything works, and especially multicast, which is key for us! |
So this means if you deploy Net on EKS you just won't have access to |
@christopherhein not really. You will only have a problem in cases where the API server/master node needs to access a pod IP directly. I am not sure in which cases direct access to pod IPs is needed. For port-forward, exec, logs etc., requests go through the kubelet on the node, so there should not be any problem. I am able to successfully perform exec, logs, port-forward etc. with Weave CNI on EKS. |
@murali-reddy it seems I hit this issue when deploying metrics-server on top of EKS + Weave Net. The API server tries to access the metrics-server pod IP directly instead of the metrics-server cluster IP: and this is what I have in the logs:
10.32.0.5 is the pod address of metrics-server. Is there any way to change the API server's behaviour? |
@alec-v IMO the Kubernetes control plane/master not being able to reach pod IPs is not necessarily a bad thing from a security perspective, and it makes sense for a hosted Kubernetes solution. But it does seem to have an impact on any extension API that uses the aggregation layer. Please see this comment you should be able to use |
I am closing this bug. The instructions provided seem to work for most cases. Please reopen if you feel this issue needs to be addressed. |
Hello, this is still an issue and the instructions are not fully working.
But if I rerun the same command, it works after about 30 tries. Then, after about 5 minutes, it stops working again. It seems to alternate: working, then not, then working, then not, and so on.
|
@BlackBsd it looks like you are having a stability problem, please open another issue with details of logs that you are seeing and make sure to check if the weave-net pod is restarting. |
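When reporting intermittent behaviour like this, it can help to attach numbers rather than "about 30 tries". Below is a rough sketch of a probe that repeats a check and summarises the outcome. The function names, the commented-out target address and the demo sequence are all assumptions made for illustration; it is not part of Weave Net or kubectl.

```python
import socket
import time


def tcp_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(check, attempts=30, interval=1.0):
    """Run check() repeatedly and summarise how often it succeeded."""
    results = []
    for i in range(attempts):
        results.append(bool(check()))
        if interval and i < attempts - 1:
            time.sleep(interval)
    first_ok = results.index(True) + 1 if True in results else None
    return {"attempts": attempts, "ok": sum(results), "first_ok": first_ok}


# Deterministic demo with a canned sequence instead of a real network call.
outcomes = iter([False, False, False, True, False])
summary = probe(lambda: next(outcomes), attempts=5, interval=0)
print(summary)  # → {'attempts': 5, 'ok': 1, 'first_ok': 4}

# Real use would look something like this (address is hypothetical):
# probe(lambda: tcp_connect("10.32.0.5", 443), attempts=30, interval=1.0)
```

Running it alongside a check of the weave-net pod restart counts would show whether the failures line up with restarts.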
It seems that I am having issues with the API server being able to talk to pods in my EKS cluster since I installed Weave Net. The issue I am seeing is: I start a proxy:
I try to access the installed kubernetes-dashboard instance at the following url: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/ I get the following displayed on the page:
Seems that the issue is that the API server does not know how to access The strange thing is that I am able to access the dashboard (and other services) if I use Install details:
The install works and all of my pods are running as I would expect. I know that they are able to communicate out of the cluster and that pod to pod connectivity is functioning. I do not have any pods trying to access the API server, so I don't know if that works or not. |
@jwenz723 yes, this does not work. As you figured, the master nodes are not in the Weave overlay, so they cannot connect.
https://kubernetes.io/docs/concepts/architecture/master-node-communication/#master-to-cluster In the case of port-forward, the API server goes through the kubelet and then to the pod. |
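To summarise the distinction made in this thread, here is a small lookup of which requests go through the kubelet and which need the api-server to reach a pod IP directly. The table only restates what commenters reported here (exec, logs and port-forward working; service proxy and metrics-server failing); the names are made up for the sketch and the table is not exhaustive.

```python
# Paths a request can take from the api-server to a pod.
VIA_KUBELET = "api-server -> kubelet (node IP) -> pod"
DIRECT = "api-server -> pod IP (needs a route into the Weave network)"

REQUEST_PATHS = {
    "exec": VIA_KUBELET,
    "logs": VIA_KUBELET,
    "port-forward": VIA_KUBELET,
    "proxy": DIRECT,           # kubectl proxy .../services/<name>/proxy/
    "aggregated-api": DIRECT,  # e.g. metrics-server behind the aggregation layer
}


def works_without_overlay_access(operation):
    """True if the operation does not need the api-server to reach pod IPs."""
    return REQUEST_PATHS[operation] == VIA_KUBELET


for op in REQUEST_PATHS:
    print(f"{op}: {'ok' if works_without_overlay_access(op) else 'fails on EKS + Weave'}")
```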
Is it expected that a service of type I am getting the following error on my service when I try to set the type to
|
This particular error has nothing to do with Weave Net, and service type LoadBalancer should work when using EKS. Please check that you are following the guidelines at https://docs.aws.amazon.com/eks/latest/userguide/network_reqs.html |
@jwenz723 This issue has nothing to do with your error. P.S. You should either select some public subnets for your EKS control plane or use the special annotation in your service definition to create an internal LB: https://docs.aws.amazon.com/eks/latest/userguide/load-balancing.html |
Thanks, adding the proper tags on my subnets and vpc fixed my loadbalancer issue. |
For the most part this works, but a problem comes up when Istio is installed; take a look at istio/istio#16434
What you expected to happen?
EKS pods are able to interact with each other via the Weave network
What happened?
Deployed Weave daemonset on a new AWS EKS cluster with updated CIDR range. Pods are able to get proper IP from Weave but cannot interact with each other.
How to reproduce it?
Create an AWS EKS cluster and set the IPALLOC_RANGE environment variable in your daemonset file to 172.20.0.0/16. Apply the Weave daemonset. Now pods are able to get an IP but can't interact.
Anything else we need to know?
AWS EKS, no internet access, using proxy for external access.
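When choosing a custom IPALLOC_RANGE, one quick sanity check is to confirm that it does not overlap other ranges already in use by the cluster. This is a minimal sketch using Python's standard `ipaddress` module. The "existing" ranges below are invented for the example and are not taken from this report; 10.32.0.0/12 is Weave Net's default allocation range.

```python
import ipaddress


def overlapping_ranges(weave_range, other_ranges):
    """Return the names of the ranges that overlap the Weave allocation range."""
    weave = ipaddress.ip_network(weave_range)
    return [
        name
        for name, cidr in other_ranges.items()
        if weave.overlaps(ipaddress.ip_network(cidr))
    ]


# Hypothetical ranges for illustration; substitute your own VPC and service CIDRs.
existing = {"vpc": "192.168.0.0/16", "services": "172.20.0.0/16"}

print(overlapping_ranges("172.20.0.0/16", existing))  # → ['services']
print(overlapping_ranges("10.32.0.0/12", existing))   # → []
```

An empty list only rules out address overlap; it says nothing about routing or leftover iptables rules.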
Versions:
EKS v1.10
Logs:
Network: