Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

static-pods: write manifest to tempfile before persisting it #1409

Merged

Conversation

etungsten
Copy link
Contributor

@etungsten etungsten commented Mar 24, 2021

Issue number:
Fixes #1396

Description of changes:

Author: Erikson Tung <etung@amazon.com>
Date:   Tue Mar 23 16:52:49 2021 -0700

    static-pods: write manifest to tempfile before persisting it
    
    When writing static pod manifest, create them as temporary files and
    finish writing to them before persisting them in the destination file
    path.

Testing done:
Built aws-k8s-1.19 image and launched 4 instances

For each instance I was able to set 3 static pods repeatedly like so:

[ec2-user@ip ~]$ for pod in a b c ; do
apiclient set \
   "kubernetes.static-pods.${pod}.manifest"="$(base64 -w 0 manifests/$pod.yaml)" \
   "kubernetes.static-pods.${pod}.enabled"=true
done

Then sudo sheltied to check kubelet logs. I no longer see any error logs pertaining to empty config files. All static pods gets created successfully and runs on the node. I repeated the for loop a few more times to make sure.

Note that the three static pods I created uses a generic manifest for nginx pods, that's why the image is docker.io/library/nginx:1.14.2

bash-5.0# cat /etc/kubernetes/static-pods/*  
apiVersion: v1
kind: Pod
metadata:
   name: a
spec:
   containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
           - containerPort: 80
apiVersion: v1
kind: Pod
metadata:
   name: b
spec:
   containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
           - containerPort: 82
apiVersion: v1
kind: Pod
metadata:
   name: c
spec:
   containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
           - containerPort: 81
bash-5.0# ctr -a /run/dockershim.sock -n k8s.io containers ls
CONTAINER                                                           IMAGE                                                                                                                                      
18a40cad51210af34b8f6d98920980b9eb5ca487c7b69d8c7c762deaf2b9faa8    docker.io/library/nginx:1.14.2                                                                                                         
280cdd55cc9fdd60fd3b145880480d65953427f0d7870a75c909ee51604e8530    docker.io/library/nginx:1.14.2                                                                                                         
f1f31d9d25523bfb110fd1f050fa7feaa9689762a119831f11b5306e43f266d6    docker.io/library/nginx:1.14.2            
...

I also tried launching instances with static pods specified in the userdata, those pods also get created successfully.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@etungsten
Copy link
Contributor Author

etungsten commented Mar 24, 2021

One thing to note, kubelet is emitting error logs for static pods created after the node becomes Ready:

E0324 00:46:36.333789    3307 kubelet.go:1690] Failed creating a mirror pod for "c-ip-192-168-23-17.us-west-2.compute.internal_default(18df3f29b7ba9816f25786dcb540583a)": admission webhook "mpod.vpc.k8s.aws" denied the request: Webhood encountered error to Get or List object from k8s cache.
E0324 00:46:43.330796    3307 kubelet.go:1690] Failed creating a mirror pod for "a-ip-192-168-23-17.us-west-2.compute.internal_default(7a59cd048f86ed600cb60ce139d6c6a7)": admission webhook "mpod.vpc.k8s.aws" denied the request: Webhood encountered error to Get or List object from k8s cache.
E0324 00:47:13.324773    3307 kubelet.go:1690] Failed creating a mirror pod for "b-ip-192-168-23-17.us-west-2.compute.internal_default(c642be10819ecae0d36bdd4f6efe8dbf)": admission webhook "mpod.vpc.k8s.aws" denied the request: Webhood encountered error to Get or List object from k8s cache.

This has no impact on the static-pods themselves. They are just not visible via the kubectl client. The pods are actually running fine on the node (as shown in the containerd tasks list). kubectl get events also shows the static pods starting up:

LAST SEEN   TYPE      REASON                   OBJECT                                                                   MESSAGE
20m         Normal    Pulling                  pod/a-ip-192-168-3-162.us-west-2.compute.internal                        Pulling image "nginx:1.14.1"
19m         Normal    Pulled                   pod/a-ip-192-168-3-162.us-west-2.compute.internal                        Successfully pulled image "nginx:1.14.1"
19m         Normal    Created                  pod/a-ip-192-168-3-162.us-west-2.compute.internal                        Created container nginx
19m         Normal    Started                  pod/a-ip-192-168-3-162.us-west-2.compute.internal                        Started container nginx
20m         Normal    Pulling                  pod/b-ip-192-168-3-162.us-west-2.compute.internal                        Pulling image "nginx:1.14.1"
19m         Normal    Pulled                   pod/b-ip-192-168-3-162.us-west-2.compute.internal                        Successfully pulled image "nginx:1.14.1"
19m         Normal    Created                  pod/b-ip-192-168-3-162.us-west-2.compute.internal                        Created container nginx
19m         Normal    Started                  pod/b-ip-192-168-3-162.us-west-2.compute.internal                        Started container nginx
20m         Normal    Pulling                  pod/c-ip-192-168-3-162.us-west-2.compute.internal                        Pulling image "nginx:1.14.1"
19m         Normal    Pulled                   pod/c-ip-192-168-3-162.us-west-2.compute.internal                        Successfully pulled image "nginx:1.14.1"
19m         Normal    Created                  pod/c-ip-192-168-3-162.us-west-2.compute.internal                        Created container nginx
19m         Normal    Started                  pod/c-ip-192-168-3-162.us-west-2.compute.internal                        Started container nginx

This happens for images before this fix as well. But I do not remember seeing this error when I was testing the static-pods feature for the 1.0.6 release.

The weird thing is, static-pods created via user-data are fine; kubelet is able to create a mirror pod for them without problem. This and the fact that the logs say "mpod.vpc.k8s.aws" denied the request makes me think a recent version of the VPC CNI plugin is causing this. I'm not entirely sure why though.

@etungsten etungsten requested a review from bcressey March 24, 2021 01:00
sources/api/static-pods/src/static_pods.rs Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
@etungsten etungsten force-pushed the static-pod-file-creation branch from 4645ccb to eb41bc5 Compare March 24, 2021 18:06
@etungsten
Copy link
Contributor Author

Push above addresses @tjkirch 's comments. Tested the changes, results are the same as described in the PR testing description.

@etungsten etungsten requested a review from tjkirch March 24, 2021 18:07
Comment on lines 63 to 74
// Create the pod manifest file as a temporary file first and finish writing to it
// before swapping any files out
let mut temp_manifest_file = NamedTempFile::new_in(dir).context(error::CreateTempfile)?;
temp_manifest_file
.write(manifest)
.context(error::ManifestWrite { name })?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the case where a manifest exists that kubelet is watching, but it seems like it has the same potential problem for a newly created manifest: kubelet could try to load it before we've written any content, or before we've finished writing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It loads any file in that directory regardless of name? I was assuming it'd look for at least some kind of pattern. If that's the case, we could put the temp file in a parent directory, or a new temp directory we create next to the target directory..?

@etungsten etungsten force-pushed the static-pod-file-creation branch from eb41bc5 to c9793f4 Compare March 24, 2021 19:42
@etungsten
Copy link
Contributor Author

Push above addresses @bcressey 's comment.
Now static pod manifests are first created and written to in an adjacent tmp directory before getting persisted in the target file path.

Tested with the same steps in the PR description, and things work as expected.

@etungsten etungsten requested a review from bcressey March 24, 2021 19:43
When writing static pod manifest, create them as temporary files and
finish writing to them before persisting them in the destination file
path.
@etungsten etungsten force-pushed the static-pod-file-creation branch from c9793f4 to bf996cb Compare March 24, 2021 21:27
@etungsten
Copy link
Contributor Author

Push above addresses #1409 (comment)

Tested things and they work as described in the PR description still. The temporary directory gets cleaned-up correctly when it goes out of scope.

@etungsten etungsten merged commit 003021a into bottlerocket-os:develop Mar 24, 2021
@etungsten etungsten deleted the static-pod-file-creation branch March 24, 2021 22:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

concurrent creation of static pods is error prone
4 participants