Can't reach a pod from an eks node when using security group for pods #1260

Closed
scardena opened this issue Oct 13, 2020 · 2 comments
scardena commented Oct 13, 2020

What happened:
I am testing security groups for pods, following the tutorial here: https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html
The deployment succeeded: my pods are annotated with vpc.amazonaws.com/pod-eni:[eni etc], and the AWS console shows that a new ENI was created with the same private IP as the pod, with the selected security group attached to it.

For testing purposes, this security group accepts all traffic, so all my pods can reach each other on any port. DNS resolution also works from any pod, and I can reach services outside AWS (e.g. curl google / facebook etc). My only problem is that I can't reach the pod from the node it is running on (on any port). The weird part is that I can reach the pod from any other node (so in a 3-node EKS cluster, if pod "pod-A" runs on node1, I can reach "pod-A" from node2 and node3, but not from node1).

This is a problem because the kubelet on that node fails all the HTTP liveness/readiness checks, so my statefulset never comes up (I assume this would also be a problem for deployments, although I haven't tried).
Like I said, security groups for pods deployed successfully, but I am having a hard time understanding why I can't reach a pod from its own node even though I have set All Traffic for that security group.

What you expected to happen:
Since the security group for pods accepts all traffic, I should be able to reach the pod from the node it's running on.

How to reproduce it (as minimally and precisely as possible):

  • install the podsecurity crd
  • apply the following sgp:
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: sd-sg-pods
  namespace: some-ns
spec:
  podSelector:
    matchLabels: {}
  securityGroups:
    groupIds:
      - sg-xxxxxxx # where this security group has access to All traffic.
  • run a single pod like the following:
apiVersion: v1
kind: Pod
metadata:
  name: netuitls-sd1
  namespace: some-ns
spec:
  containers:
    - name: my-node
      image: amouat/network-utils
      command:
      - /bin/sh
      - -c
      - |
        tail -f /dev/null
  • Get into the pod and listen for any connection on any port:
kubectl exec -it netuitls-sd1 -n some-ns -- bash
nc -nlvp 9999
  • ssh into the EKS node that runs this pod, and try to reach this pod:
nc -z ip-of-the-netutils-sd1-pod 9999

This will hang forever or time out, depending on the nc version running on the node. Note that it succeeds from any other node, i.e. any node where netuitls-sd1 is not running.
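Since the behaviour of nc differs between versions, here is a minimal sketch of the same check with an explicit timeout. The `probe` function is a hypothetical helper (not part of any tool), and it assumes `bash` and coreutils `timeout` are available on the node. The pod name, namespace, and port are the ones from the steps above.

```shell
# Hypothetical helper: report whether a TCP port accepts a connection
# within a timeout. Uses bash's /dev/tcp so the result does not depend
# on which nc variant is installed.
probe() {
  host="$1"; port="$2"; secs="${3:-5}"
  if timeout "$secs" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "reachable"
  else
    echo "unreachable"
  fi
}

# From a machine with cluster access, look up the pod IP and its node:
#   kubectl get pod netuitls-sd1 -n some-ns -o jsonpath='{.status.podIP}'
#   kubectl get pod netuitls-sd1 -n some-ns -o jsonpath='{.spec.nodeName}'
# Then, on each node (with nc still listening inside the pod):
#   probe <pod-ip> 9999
```

With the problem described here, the node that hosts the pod would report "unreachable" while the other nodes report "reachable".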

Anything else we need to know?:
It's worth mentioning that if I remove the sgp resource from k8s, the problem goes away, so this is very likely a problem with security groups for pods. I don't run into this problem when using regular network policies.

Environment:

eks version: eks.3
running managed node group:  1.17.11-20201002 
kubernetes version: 1.17
cni: amazon-k8s-cni-init:v1.7.4
      amazon-k8s-cni:v1.7.4
@scardena scardena added the bug label Oct 13, 2020
Contributor

SaranBalaji90 commented Oct 13, 2020

Sorry for the confusion @scardena. #1221 was released a few days ago to address this concern. I will make sure our docs get updated ASAP. To mitigate the issue, you have to install 1.7.3 and set DISABLE_TCP_EARLY_DEMUX to true for the init container.
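One way to apply the init container setting might look like the sketch below. It assumes the CNI runs as the `aws-node` daemonset in `kube-system`, that its init container is named `aws-vpc-cni-init`, and that the variable is spelled `DISABLE_TCP_EARLY_DEMUX`. Please verify those names against your own manifest first; `PATCH` and `apply_patch` are illustrative names only.

```shell
# Strategic-merge patch that adds the env var to the CNI init container.
# Assumption: the init container is named aws-vpc-cni-init.
PATCH='{"spec":{"template":{"spec":{"initContainers":[{"name":"aws-vpc-cni-init","env":[{"name":"DISABLE_TCP_EARLY_DEMUX","value":"true"}]}]}}}}'

# Hypothetical wrapper so the patch is only applied when called explicitly.
# Assumption: the daemonset is aws-node in the kube-system namespace.
apply_patch() {
  kubectl patch daemonset aws-node -n kube-system -p "$PATCH"
}
```

Calling `apply_patch` changes the pod template, so the aws-node pods should be rolled by the daemonset; the liveness/readiness probes can then be re-checked once they are back up.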

@scardena
Author

that solved it @SaranBalaji90 thanks!
