
cache pause, vpc-cni, and kube-proxy images in the AMI #938

Merged (1 commit) on Nov 18, 2022

Conversation

@bwagner5 (Contributor) commented on Jun 3, 2022:

Issue #, if available:
#1034
#990
#1099

Description of changes:

  • Cache the following images to improve node bootstrapping time (a minimal pre-pull sketch follows this list):
    • pause
    • aws-node images (init and aws-node)
    • kube-proxy images (minimal and normal)
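
A minimal sketch of the pre-pull step at AMI build time (the registry account, region, and image tags below are illustrative placeholders; the real values are resolved by the build scripts):

#!/usr/bin/env bash
# Sketch only: pre-pull the bootstrap-critical images into containerd's k8s.io
# namespace while building the AMI, so nodes boot with them already present.
set -euo pipefail

ECR_REGISTRY="602401143452.dkr.ecr.us-west-2.amazonaws.com"
ECR_PASSWORD="$(aws ecr get-login-password --region us-west-2)"

CACHED_IMAGES=(
  "${ECR_REGISTRY}/eks/pause:3.5"
  "${ECR_REGISTRY}/amazon-k8s-cni-init:v1.11.4-eksbuild.1"
  "${ECR_REGISTRY}/amazon-k8s-cni:v1.11.4-eksbuild.1"
  "${ECR_REGISTRY}/eks/kube-proxy:v1.22.12-eksbuild.1"
  "${ECR_REGISTRY}/eks/kube-proxy:v1.22.12-minimal-eksbuild.1"
)

for img in "${CACHED_IMAGES[@]}"; do
  # pulled layers land under /var/lib/containerd and are baked into the AMI
  sudo ctr --namespace k8s.io image pull "${img}" --user "AWS:${ECR_PASSWORD}"
done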

kubelet logs with image pull timings before caching (v=4):

Oct  7 17:05:07 ip-192-168-101-146 pull-sandbox-image.sh: done: 45.698568ms
Oct  7 17:05:17 ip-192-168-101-146 kubelet: I1007 17:05:17.995376    4028 event.go:291] "Event occurred" object="kube-system/kube-proxy-z7s74" kind="Pod" apiVersion="v1" type="Normal" reason="Pulled" message="Successfully pulled image \"602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.22.12-minimal-eksbuild.1\" in 1.712432387s"
Oct  7 17:05:20 ip-192-168-101-146 kubelet: I1007 17:05:20.322544    4028 event.go:291] "Event occurred" object="kube-system/aws-node-c24g9" kind="Pod" apiVersion="v1" type="Normal" reason="Pulled" message="Successfully pulled image \"602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.11.4-eksbuild.1\" in 3.960642579s"
Oct  7 17:05:26 ip-192-168-101-146 kubelet: I1007 17:05:26.913619    4028 event.go:291] "Event occurred" object="kube-system/aws-node-c24g9" kind="Pod" apiVersion="v1" type="Normal" reason="Pulled" message="Successfully pulled image \"602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.11.4-eksbuild.1\" in 1.390664746s"

The total latency in these serial image pulls is ~7 seconds.

After caching, all of the images are already present on the local host:

Oct  7 17:05:09 ip-192-168-97-141 kubelet: I1007 17:05:09.666150    4912 event.go:291] "Event occurred" object="kube-system/aws-node-ldvfh" kind="Pod" apiVersion="v1" type="Normal" reason="Pulled" message="Container image \"602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.11.4-eksbuild.1\" already present on machine"
Oct  7 17:05:12 ip-192-168-97-141 kubelet: I1007 17:05:12.375142    4912 event.go:291] "Event occurred" object="kube-system/aws-node-ldvfh" kind="Pod" apiVersion="v1" type="Normal" reason="Pulled" message="Container image \"602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.11.4-eksbuild.1\" already present on machine"
Oct  7 17:05:12 ip-192-168-97-141 kubelet: I1007 17:05:12.493594    4912 event.go:291] "Event occurred" object="kube-system/kube-proxy-79tp5" kind="Pod" apiVersion="v1" type="Normal" reason="Pulled" message="Container image \"602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.22.12-minimal-eksbuild.1\" already present on machine"

Since this PR caches the sandbox image (pause), we can also enable containerd to start on boot rather than starting it from the bootstrap script. Taking containerd startup out of the bootstrap path reduces the time until kubelet starts by about 3 seconds (P50).
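
For illustration, the change amounts to something like this at AMI build time (a sketch, assuming the containerd and sandbox-image systemd units already present in the AMI):

# Sketch only: because the pause image is already on disk, containerd can be
# enabled at AMI build time instead of being started by bootstrap.sh at boot.
sudo systemctl enable containerd

# sandbox-image.service can still run at boot to (re)pull the sandbox image,
# but it no longer needs to gate containerd/kubelet startup.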

[Screenshot: 2022-10-07 14:23:42]

This graph shows the combined reduction in node time-to-Ready, first with kube-proxy & pause cached and then with the VPC CNI images added to the cache (P50):

[Screenshot: 2022-10-07 14:17:08]

Size of the containerd image cache with these images cached:

du -h /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs
...
1.1G	/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/

Size of the containerd image cache on a running node without any pre-cached images (i.e., only the minimal set of images pulled to start the node):

du -h /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/
...
470M	/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@bwagner5 marked this pull request as draft on June 3, 2022 23:49
@cartermckinnon (Member):

I'm interested in where this is headed, but what kind of speedup are we talking about? These images are fairly small, and they're coming from within the region. I'm seeing <5 seconds total for image pulls in us-west-2:

> sudo ctr --namespace k8s.io image pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.1-eksbuild.1 --user AWS:$ECR_PASSWORD
602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.1-eksbuild.1:            resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:1cb4ab85a3480446f9243178395e6bee7350f0d71296daeb6a9fdd221e23aea6:    done           |++++++++++++++++++++++++++++++++++++++| 
manifest-sha256:234b8785dd78afc0fbb27edad009e7eb253e5685fb7387d4f0145f65c00873ac: done           |++++++++++++++++++++++++++++++++++++++| 
config-sha256:106a8e54d5eb3f70fcd1ed46255bdf232b3f169e89e68e13e4e67b25f59c1315:   done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:41d8806bd3d23e1ffb7e9825fa56a0c2e851dfeeb405477ab1d6bc3a34bc0da2:    done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 0.4 s                                                                    total:  1.2 Ki (3.1 KiB/s)                                       
unpacking linux/amd64 sha256:1cb4ab85a3480446f9243178395e6bee7350f0d71296daeb6a9fdd221e23aea6...
done

> sudo ctr --namespace k8s.io image pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.11.2 --user AWS:$ECR_PASSWORD
602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni:v1.11.2:              resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:9da3824d4b058462912d6e781714c4faa30be0a306cc48790cc042f51ca70651:    done           |++++++++++++++++++++++++++++++++++++++| 
manifest-sha256:5778a70db82f9ed9fb3ed3cb1d882f7211637c064589cd894c261759ea77265a: done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:85f9b0d1a1fdbd43edbe8f98bbbf3b11d250fcfb05155e43fcb21d0d04822b0d:    done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:8de5b65bd171294b1e04e0df439f4ea11ce923b642eddf3b3d76d297bfd2670c:    done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:f18cfec559f5efc0bc336da4bb168e5a850082ad64e090db084e04f6cd65f7c8:    done           |++++++++++++++++++++++++++++++++++++++| 
config-sha256:4e9a8bf255bb85e99421a7108d8306ba0cacf43145478e5ab9d07e60ba9177ec:   done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:1b862f5f9fa1a496dc2cb43edfeedd44e25a2c227d8b8796e223910853d3339b:    done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 2.8 s                                                                    total:  94.1 M (33.6 MiB/s)                                      
unpacking linux/amd64 sha256:9da3824d4b058462912d6e781714c4faa30be0a306cc48790cc042f51ca70651...
done

> sudo ctr --namespace k8s.io image pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.11.2 --user AWS:$ECR_PASSWORD
602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon-k8s-cni-init:v1.11.2:         resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:aa5bd2e6b21e7f4167e9db19c239f2288159ca171558ec3a60a7c5f9ce0650d3:    done           |++++++++++++++++++++++++++++++++++++++| 
manifest-sha256:c9fab251a20bf364315f54d4e43708e8a839d6e2d24156d70462beea0567e756: done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:edc3ce4cb2326060f5e32631030f0ce65c688b745b40e51b0f18effb7fc2618a:    done           |++++++++++++++++++++++++++++++++++++++| 
config-sha256:5abf9331a3a190e0d8c5cde96d540e39b737e63e4d89d864694717c8ab9607eb:   done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:8de5b65bd171294b1e04e0df439f4ea11ce923b642eddf3b3d76d297bfd2670c:    exists         |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:e5481bb799cb068f202c75d6d276f186821d31b7783b5e3e8627362b1d4ced02:    done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:85ab13e61b22b5dd72a416bcd42bf052d8a1d4565b3229b551043ba74343d7f5:    done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 0.8 s                                                                    total:  34.6 M (43.2 MiB/s)                                      
unpacking linux/amd64 sha256:aa5bd2e6b21e7f4167e9db19c239f2288159ca171558ec3a60a7c5f9ce0650d3...
done

> sudo ctr --namespace k8s.io image pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.22.6-eksbuild.1 --user AWS:$ECR_PASSWORD
602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/kube-proxy:v1.22.6-eksbuild.1:   resolved       |++++++++++++++++++++++++++++++++++++++| 
index-sha256:c8abb4b8efc94090458f34e5f456791d9f7f57b5c99517b6b4e197305c1f10f6:    done           |++++++++++++++++++++++++++++++++++++++| 
manifest-sha256:0256f9a055e60fb40bccb5e2f743eb0eea13dcf38952058ceaba8e1fd1e6c096: done           |++++++++++++++++++++++++++++++++++++++| 
config-sha256:c8c9982c9d03789fe4d09993eaa54b11acd7b9bc6ebbfa60a223696172ca7507:   done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:20b09fbd30377e1315a8bc9e15b5f8393a1090a7ec3f714ba5fce0c9b82a42f2:    done           |++++++++++++++++++++++++++++++++++++++| 
layer-sha256:8e20184c86519fabc2c1c075723074fc9c3fd780fb4c3fa2722b99db746b6db1:    done           |++++++++++++++++++++++++++++++++++++++| 
elapsed: 0.7 s                                                                    total:  30.6 M (43.6 MiB/s)                                      
unpacking linux/amd64 sha256:c8abb4b8efc94090458f34e5f456791d9f7f57b5c99517b6b4e197305c1f10f6...
done

Maybe this is more of a benefit for Outposts?

@bwagner5 (Contributor, Author) commented on Jun 13, 2022:

This provides a marginal speed-up of around 8-10 seconds, since not all images are pulled concurrently. Caching the images on the node also helps with large scale-outs, where a container registry may start throttling depending on the number of nodes, in-flight pulls, and how many layers the images have. The images cached in this PR span 10 layers, so with ECR's 3,000 TPS GetDownloadUrlForLayer limit, a 300 node scale-up would result in throttling (300 nodes x 10 layers = 3,000 layer requests if they land in the same second), and likely a smaller scale-up would too, since the actual images that still need to be pulled would also be taking up that TPS. Since we're fairly confident (although not guaranteed in the case of the vpc-cni) about which images will be used in the AMI, I think it makes sense to cache these directly.

@cartermckinnon (Member):

> with ECR's 3,000 TPS GetDownloadUrlForLayer limit, a 300 node scale-up would result in throttling and likely a smaller scale-up since the actual images that need to be pulled would also be taking up that TPS

This limit is per second, though -- it seems unlikely that 300 nodes would attempt to download all 10 of these layers within the same second, right? containerd also only downloads 3 layers in parallel by default. Users might configure a higher concurrency, but they can also request an increased quota from ECR, too.
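
(For reference, that concurrency knob is the CRI plugin's max_concurrent_downloads in /etc/containerd/config.toml; a rough sketch of raising it, not something this PR changes:)

# Sketch only: raise the per-node pull concurrency from the default of 3.
# Merge the following into the existing CRI plugin section of
# /etc/containerd/config.toml (don't append a duplicate table), then restart:
#
#   [plugins."io.containerd.grpc.v1.cri"]
#     max_concurrent_downloads = 6
#
sudo systemctl restart containerd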

Anyway, not against the general idea; we just need to make sure we're caching the "right" image(s) for this to benefit most users.

@bwagner5 (Contributor, Author):

> > with ECR's 3,000 TPS GetDownloadUrlForLayer limit, a 300 node scale-up would result in throttling and likely a smaller scale-up since the actual images that need to be pulled would also be taking up that TPS
>
> This limit is per second, though -- it seems unlikely that 300 nodes would attempt to download all 10 of these layers within the same second, right? containerd also only downloads 3 layers in parallel by default. Users might configure a higher concurrency, but they can also request an increased quota from ECR, too.
>
> Anyway, not against the general idea; we just need to make sure we're caching the "right" image(s) for this to benefit most users.

It might be unlikely that they'd all be downloaded in the same second, but it's not impossible. As scale-outs get larger and faster (which is what we're trying to do with Karpenter), things will start to bottleneck.

[Resolved review threads on scripts/install-worker.sh, files/bootstrap.sh, and files/pull-image.sh]
@cartermckinnon (Member):

Can you add the size of the image cache after this change?

@bwagner5 (Contributor, Author):

> Can you add the size of the image cache after this change?

du -h /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs
...
1.1G	/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/

@bwagner5 force-pushed the cache-images branch 4 times, most recently from 55a28e9 to f2928f9 on November 5, 2022 17:10
@armenr commented on Nov 8, 2022:

Folks - I stumbled onto this Issue, personally, because I've been working on improving "NodeReady Latency" time on my own personal cluster as well. I want to get as far below 60s as possible, since our use-case is pretty latency-sensitive, and because Fargate pricing doesn't suit our use-case.

I have an AMI (non-BottleRocket/non-AL2/non-Ubuntu) that boots from "Instance Create" --> kernel --> user space --> fully "booted" in ~20 seconds (+/- 1.5 seconds)

That includes dynamically bootstrapping the additional configs/limits/info that the bootstrap.sh script goes through, but skipping any reliance on cloud-init.

Note: traditional (Python-based) cloud-init is a GIANT time-suck for instance boot times.

Full transparency: I cannibalized the bootstrap script and worked on optimizing a few additional bits. I'm happy to share my findings/strategies. They are stable, and have been repeatably reliable in testing.

Using Karpenter, I'm able to schedule a new Pod (inflate test pod) that goes from Pending state (waiting for additional compute/new EKS worker node) to Running in about 58 seconds (comparable with BottleRocket boot/bootstrap times).

Suggestion

One additional area of improvement is to relocate the pause/sandbox image to a public ECR Repo.

The reason? You lose about 5-6 seconds JUST waiting for retrieval of an ECR credential/token, best-case; worst-case is around 10-11 seconds of waiting, based on my testing... and I've been testing like a mad-man.

Check this out:

root@ip-10-0-11-210.us-west-2.compute.internal~ # systemd-analyze blame
13.462s sandbox-image.service

I added a few silly echo <where_in_the_script_are_we> lines to the pull-sandbox-image.sh script to get a sense of what the hold-up is.

On this run, it looks like we spent almost 10 seconds (~9.5-ish seconds) just waiting for the ECR password/token.

Nov 08 08:00:26 clr-ec2d84812bc92bcab8f37c572fae8f06 systemd[1]: Starting pull sandbox image defined in containerd config.toml...
Nov 08 08:00:26 clr-ec2d84812bc92bcab8f37c572fae8f06 pull-sandbox-image.sh[197]: FETCHING PASSWORD
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[197]: FETCHED PASSWORD
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[197]: Pulling Image
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal sudo[453]:     root : PWD=/ ; USER=root ; COMMAND=/sbin/ctr --namespace k8s.io image pull 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5 --user
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal sudo[453]:     root : (command continued) AWS:<SOME_LONG_TOKEN_STRING>
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal sudo[453]:     root : (command continued) <SOME_LONG_TOKEN_STRING_CONTINUED>
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal sudo[453]:     root : (command continued) <SOME_LONG_TOKEN_STRING_PART_THREE>
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal sudo[453]: pam_unix(sudo:session): session opened for user root by (uid=0)
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5:                       resolved       |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: index-sha256:529cf6b1b6e5b76e901abc43aee825badbd93f9c5ee5f1e316d46a83abbce5a2:    exists         |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: manifest-sha256:666eebd093e91212426aeba3b89002911d2c981fefd8806b1a0ccb4f1b639a60: exists         |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: elapsed: 0.1 s                                                                    total:   0.0 B (0.0 B/s)
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: 602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5:                       resolved       |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: index-sha256:529cf6b1b6e5b76e901abc43aee825badbd93f9c5ee5f1e316d46a83abbce5a2:    exists         |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: manifest-sha256:666eebd093e91212426aeba3b89002911d2c981fefd8806b1a0ccb4f1b639a60: exists         |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: layer-sha256:0692f38991d53a0c28679148f99de26a44d630fda984b41f63c5e19f839d15a6:    done           |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: config-sha256:6996f8da07bd405c6f82a549ef041deda57d1d658ec20a78584f9f436c9a3bb7:   done           |++++++++++++++++++++++++++++++++++++++|
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: elapsed: 0.2 s                                                                    total:   0.0 B (0.0 B/s)
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: unpacking linux/amd64 sha256:529cf6b1b6e5b76e901abc43aee825badbd93f9c5ee5f1e316d46a83abbce5a2...
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[454]: done: 7.90311ms
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal sudo[453]: pam_unix(sudo:session): session closed for user root
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[197]: In BREAK
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal systemd[1]: sandbox-image.service: Deactivated successfully.
Nov 08 08:00:39 ip-10-0-11-210.us-west-2.compute.internal systemd[1]: Finished pull sandbox image defined in containerd config.toml.

The troublesome bit from the logs above ^^

Nov 08 08:00:26 clr-ec2d84812bc92bcab8f37c572fae8f06 pull-sandbox-image.sh[197]: FETCHING PASSWORD
Nov 08 08:00:35 ip-10-0-11-210.us-west-2.compute.internal pull-sandbox-image.sh[197]: FETCHED PASSWORD

My Point

Aside from caching (which I attempted in my own setup as well, btw), I'd suggest maybe considering:

  1. putting whatever is possible into a public, read-only ECR repo and skipping the credential wait times altogether (see the rough sketch after this list)
  2. or at least decoupling the fetching of ECR credentials from the first-boot sequence of the node by utilizing the locally cached/warmed images, and fetching ECR credentials in a non-blocking/non-gating way, in the background somewhere.
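
To make the difference concrete, here's a rough comparison; the public repository path below is hypothetical (no such mirror exists today), while the second form is roughly what pull-sandbox-image.sh does now:

# Hypothetical: anonymous pull from a public registry, no token round-trip.
time sudo ctr --namespace k8s.io image pull \
  public.ecr.aws/some-public-mirror/eks/pause:3.5   # placeholder repository

# Current path: fetching the ECR token alone is the ~5-10s measured above...
time ECR_PASSWORD="$(aws ecr get-login-password --region us-west-2)"
# ...and only then can the authenticated pull start.
sudo ctr --namespace k8s.io image pull \
  602401143452.dkr.ecr.us-west-2.amazonaws.com/eks/pause:3.5 \
  --user "AWS:${ECR_PASSWORD}"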

@bwagner5 (Contributor, Author) commented on Nov 8, 2022:

> My Point
>
> Aside from caching (which I attempted in my own setup as well, btw), I'd suggest maybe considering:
>
>   1. putting whatever is possible into a public, read-only ECR repo and skipping the credential wait times altogether
>   2. or at least decoupling the fetching of ECR credentials from the first-boot sequence of the node by utilizing the locally cached/warmed images, and fetching ECR credentials in a non-blocking/non-gating way, in the background somewhere.

This PR mitigates the credential fetching by utilizing the cached pause image. The sandbox-image service still runs but it is not a blocking operation for containerd.

Also, check out this PR that removes the cloud-init package-update module, which blocks user-data execution: #1074. The updates were dangerous anyway, since you could have a cluster of instances with the same AMI ID but running different software versions.

I experimented with removing cloud-init altogether, but it was difficult to justify for an image distributed this widely. I found that removing the package updates eliminated the majority of the cloud-init latency. There are still a few seconds of overhead from cloud-init, but in my measurements it's only 2-3 seconds.

Using this PR, #1074, and some VPC CNI startup improvements, Karpenter is able to bring a node to "Ready" in 30 seconds at P50, with some nodes going Ready in 25 seconds. These timings also account for the launch latency in the EC2 control plane, which isn't visible from node creation time with Karpenter, since Karpenter creates the node resource after the EC2 Fleet call returns and the instance hostname is looked up via DescribeInstances.
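
(For anyone reproducing these measurements, a rough way to compute creation-to-Ready latency from the Kubernetes API; note this only starts the clock when Karpenter creates the Node object, so it does not capture the EC2 launch latency described above:)

# Sketch only: seconds between Node object creation and its Ready transition.
NODE="ip-192-168-101-146.us-west-2.compute.internal"   # example node name
CREATED=$(kubectl get node "$NODE" -o jsonpath='{.metadata.creationTimestamp}')
READY=$(kubectl get node "$NODE" \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].lastTransitionTime}')
echo "$(( $(date -d "$READY" +%s) - $(date -d "$CREATED" +%s) ))s to Ready"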

I'd be curious about any other optimizations you implemented where you think we might be able to get the EKS-optimized AL2 launch time lower!

@armenr commented on Nov 9, 2022:

@bwagner5 - Thank you for the reply! I'll experiment with the referenced PR. Let me gather up some notes/things I had done in my repo, and see what might be relevant. :) Happy to share.

Additionally, you mentioned VPC CNI startup improvements - I'm kinda intrigued by this. Would you be able to point me in the direction of any info, or maybe share what you mean?

Thank you! :)

@bwagner5 (Contributor, Author):

> @bwagner5 - Thank you for the reply! I'll experiment with the referenced PR. Let me gather up some notes/things I had done in my repo, and see what might be relevant. :) Happy to share.
>
> Additionally, you mentioned VPC CNI startup improvements - I'm kinda intrigued by this. Would you be able to point me in the direction of any info, or maybe share what you mean?
>
> Thank you! :)

  • There's this PR, which speeds up the VPC CNI by about 2 seconds just by removing some unnecessary sleeps: Reduce startup latency by removing some unneeded sleeps aws/amazon-vpc-cni-k8s#2104
  • I'm working on speeding up the VPC CNI's init-container-to-main-container switch-over latency, which in my tests is about 3 seconds. I'm experimenting with removing the init container, running it as a regular container in the pod, and synchronizing the init stage with a shared emptyDir volume (rough sketch below). This allows the container start times to be parallelized and reduces any switching latency within the kubelet.
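
The synchronization idea, roughly (container commands and the sentinel path below are illustrative, not the actual VPC CNI implementation):

# Sketch only: both containers start in parallel; completion of the one-time
# setup is signalled through a shared emptyDir instead of an init container.

# container replacing the init container: do the setup, then drop a sentinel
do-cni-init-work                     # illustrative command
touch /shared/init-complete          # /shared is the shared emptyDir mount

# main aws-node container: wait for the sentinel, then start the CNI daemon
until [ -f /shared/init-complete ]; do sleep 0.1; done
exec start-cni-daemon                # illustrative command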

@armenr commented on Nov 10, 2022:

@bwagner5 - In order to respect the purpose of this thread (which is a PR, and not even an "Issue"), would there be any way for us to connect via Discord or Slack...to keep the exchange of info/ideas + discussion going?

I recently spent 2 years @ AWS (US Startups Org) as an SA Manager. As a fellow (former) Amazonian, I'd be down to dig into this a bit with you... this is me: https://www.linkedin.com/in/armenr/

@stevehipwell (Contributor):

@bwagner5 I'm also interested in this discussion; could the VPC CNI start-up speed be addressed by using an eBPF pattern, either directly or via chaining another CNI such as Cilium?

@bwagner5 (Contributor, Author):

@stevehipwell @armenr I started a group message with you both on the Kubernetes Slack; we can continue the discussion there. If anyone else who stumbles across this issue is interested, ping me there (Slack username: brandon.wagner).

@maximethebault:
Startup latency is of major interest to our organization as well. I'm not sure I would bring much to the conversation, but being kept up to date on the different initiatives and "tuning tips" to make node startup faster is definitely something we'd appreciate.

@bwagner5 (Contributor, Author):

> Startup latency is of major interest to our organization as well. I'm not sure I would bring much to the conversation, but being kept up to date on the different initiatives and "tuning tips" to make node startup faster is definitely something we'd appreciate.

Created this issue to track and get feedback: #1099

[Resolved review thread on README.md]
Labels: enhancement