First scheduled pod starts using the wrong CNI #219

Open
2ZZ opened this issue Jan 28, 2021 · 2 comments

2ZZ commented Jan 28, 2021

Hi,

The first pods scheduled on a new node sometimes get an IP from the non-default CNI.
I think this happens because the CNI-Genie daemonset pod starts too late, possibly due to delays in pulling its image.
I added the system-node-critical priorityClass to the CNI-Genie daemonset, but it has not helped.

Example timeline:

  • Nginx app pod triggers cluster scale up
  • AWS-CNI, Calico and CNI-Genie daemonsets are scheduled on the new node
  • Nginx pod starts up before the CNI-Genie pod has finished starting, so the CNI-Genie config is not yet in /etc/cni/net.d
  • Nginx pod gets an IP from AWS-CNI instead of the default set in CNI-Genie
  • Future pods on that new node are given correct IPs once CNI-Genie has started
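
The timeline above is consistent with how the kubelet chooses a CNI config: when several config files exist in the conf directory, it uses the one that comes first by name in lexicographic order. Below is a minimal Python sketch of that selection rule, not the kubelet's actual code. The file names `10-aws.conflist` and `00-genie.conf` are assumptions about what the two plugins install, and `first_cni_config` and `demo` are hypothetical helpers written for illustration only.

```python
import os
import tempfile

# File extensions commonly treated as CNI config files.
CONF_EXTENSIONS = (".conf", ".conflist", ".json")


def first_cni_config(conf_dir):
    """Return the lexicographically first CNI config file name, or None."""
    names = sorted(
        name for name in os.listdir(conf_dir)
        if name.endswith(CONF_EXTENSIONS)
    )
    return names[0] if names else None


def demo():
    """Simulate the race: the AWS config exists before the Genie config."""
    with tempfile.TemporaryDirectory() as conf_dir:
        # Assumed file name for the AWS VPC CNI config.
        open(os.path.join(conf_dir, "10-aws.conflist"), "w").close()
        before = first_cni_config(conf_dir)

        # Assumed file name for the CNI-Genie config, written later.
        open(os.path.join(conf_dir, "00-genie.conf"), "w").close()
        after = first_cni_config(conf_dir)
        return before, after
```

Under these assumptions, a pod sandbox created in the window before the Genie file is written would be wired up by the AWS config, and any pod created afterwards would go through CNI-Genie.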

Setup:
Cluster: Amazon EKS 1.18
Calico version: 3.16.3
CNI-Genie version: latest

@shinebayar-g

That sounds like a possible scenario. We're using CNI-Genie to run Cilium + AWS VPC.

Fortunately, we haven't observed this issue when upgrading worker versions.

2ZZ commented Mar 18, 2021

Hi, I modified the AWS CNI to wait until CNI-Genie is present in /etc/cni, and I haven't seen this issue since.
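
The actual change was not shared in this thread, so the following is only a rough Python sketch of the general idea of waiting for a config file, not the patch itself. The function name `wait_for_cni_config`, the `*genie*` glob pattern, and the timeout and interval values are all assumptions made for illustration.

```python
import glob
import os
import time


def wait_for_cni_config(conf_dir, pattern="*genie*", timeout=120.0, interval=1.0):
    """Poll conf_dir until a file matching pattern appears.

    Returns the first matching path, or None if the timeout expires.
    The pattern is a hypothetical guess at the CNI-Genie config name.
    """
    deadline = time.monotonic() + timeout
    while True:
        matches = sorted(glob.glob(os.path.join(conf_dir, pattern)))
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

A startup script could call something like `wait_for_cni_config("/etc/cni/net.d")` before writing its own CNI config, and decide for itself whether a `None` result should abort or fall through.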
