document how to run kind in a kubernetes pod #303
Comments
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
This came up again in #677, and again today in another deployment.
See #717 (comment) about what are possibly inotify watch limits on the host, and a workaround. That issue may also apply to other Linux hosts (non-Kubernetes).
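For context, the workaround referenced there amounts to raising the host's inotify limits. A minimal sketch, run on the host or from a privileged container; the values are the commonly suggested ones, not mandates:

```console
$ sudo sysctl fs.inotify.max_user_watches=524288
$ sudo sysctl fs.inotify.max_user_instances=512
```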
For future reference, here's a working pod spec for running kind in a pod. (That being said, there should also be documentation for the other points raised in this thread.)
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dind-k8s
spec:
  containers:
    - name: dind
      image: <image>
      securityContext:
        privileged: true
      volumeMounts:
        - mountPath: /lib/modules
          name: modules
          readOnly: true
        - mountPath: /sys/fs/cgroup
          name: cgroup
        - name: dind-storage
          mountPath: /var/lib/docker
  volumes:
    - name: modules
      hostPath:
        path: /lib/modules
        type: Directory
    - name: cgroup
      hostPath:
        path: /sys/fs/cgroup
        type: Directory
    - name: dind-storage
      emptyDir: {}
```
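Assuming `<image>` ships a running Docker daemon plus the `kind` binary (which the spec above does not guarantee by itself), a quick smoke test might look like this (hypothetical commands):

```console
$ kubectl apply -f dind-k8s.yaml
$ kubectl exec -it dind-k8s -- kind create cluster
$ kubectl exec -it dind-k8s -- kind get clusters
```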
Make sure you do …
That's pretty sane. As @howardjohn notes, please make sure you clean up the top-level containers in that pod (i.e. …
It depends on your setup; with these mounts, IIRC, the processes/containers can leak. Don't do this. Have an exit handler; deleting the containers should happen within the grace period.
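One way to follow that advice is a trap in the pod's entrypoint so the inner cluster and any stray containers are removed before the termination grace period runs out. A rough sketch, assuming `kind` and `docker` are on the PATH; the cluster name is illustrative:

```sh
#!/usr/bin/env bash
set -euo pipefail

cleanup() {
  # Delete the inner cluster first, then any remaining containers.
  kind delete cluster --name ci || true
  docker ps --all --quiet | xargs --no-run-if-empty -- docker rm --force || true
}
trap cleanup EXIT TERM INT

kind create cluster --name ci
# ... run tests against the inner cluster ...
```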
You shouldn't need this in CI, kind clusters should be ephemeral. Please, please use them ephemerally. There are a number of ways kind is not optimized for production long lived clusters. For temporary clusters used during a test this is a non-issue. Also note that turning on disk eviction risks your pods being evicted based on the disk usage of the host. There's a reason this is off by default. Eventually we will ship an alternative to make long lived clusters better, but for now it's best to not depend on long lived clusters or image GC.
DNS (see above). Your outer cluster's in-cluster DNS servers are typically on a clusterIP which won't necessarily be visible to the containers in the inner cluster. Ideally configure the "host machine" Pod's DNS to your preferred upstream DNS provider (see above).
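A sketch of what that can look like on the "host machine" pod; the resolver addresses are placeholders, substitute your own upstream:

```yaml
spec:
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 8.8.8.8   # example upstream resolver; use your own
      - 1.1.1.1
```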
@BenTheElder thank you for pointing me to this issue - I am trying to see how we would fit @radu-matei's example into the testing automation we are introducing for our Kubernetes project. Right now we want to trigger the creation of the cluster, and the commands within that cluster, from within a pod. I've tried creating a container that has docker and kind installed. I've tried creating a pod with the instructions provided above, but I still can't seem to run the …
For testing I am currently creating the container, running … The current pod specification I have is the following:
For explicitness, the way that I am installing Kind in the Dockerfile is as follows:
For explicitness, the way that I am installing Kubectl in the Dockerfile is as follows:
For explicitness, the way that I am installing Docker in the Dockerfile is as follows:
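(The snippets themselves did not survive the thread formatting above. Purely for illustration, a typical Debian-based Dockerfile covering all three tools might look roughly like this; the base image, versions, and URLs are assumptions, not the commenter's actual setup.)

```dockerfile
FROM debian:bullseye

# Docker engine + CLI; the convenience script is one common route for dind images.
RUN apt-get update && apt-get install -y curl ca-certificates \
    && curl -fsSL https://get.docker.com | sh

# kind: a pinned release binary from the official GitHub releases.
RUN curl -fsSLo /usr/local/bin/kind \
      https://github.com/kubernetes-sigs/kind/releases/download/v0.11.1/kind-linux-amd64 \
    && chmod +x /usr/local/bin/kind

# kubectl: a pinned release binary from the official download site.
RUN curl -fsSLo /usr/local/bin/kubectl \
      https://dl.k8s.io/release/v1.21.2/bin/linux/amd64/kubectl \
    && chmod +x /usr/local/bin/kubectl
```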
What should I make sure I take into account to make this work?
Just wondering: won't mounting cgroups affect the pod's memoryRequests/memoryLimits?
Also, would not https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy …
UPDATE: I'm running with …
Another question that I have: would cgroups v2 help to alleviate these requirements, especially mounting `/sys/fs/cgroup`?
Again, another question: would the regular docker cleanup be sufficient, i.e.:

```console
$ docker ps --all --quiet | xargs --no-run-if-empty -- docker rm --force
$ docker system prune --all --volumes
```

I'm planning to set this up as a post-hook for all my dind pods (because I can't control what people do inside them, so I can't force them to call …). Also, running …

EDIT: yes, it should be sufficient, since this is what is being done in Kubernetes CI.
@BenTheElder by your last sentence, do you mean that even though changes are in place for the …
I would also mention another countermeasure, which is required in my case: unsetting the `KUBERNETES_*` environment variables that the kubelet injects for the outer cluster.
If using bash in the entrypoint, this can be achieved with `unset "${!KUBERNETES_@}"`.
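A minimal entrypoint sketch showing where that unset fits; the exec'd command is illustrative:

```sh
#!/usr/bin/env bash
# Drop the env vars the kubelet injects for the outer cluster so tooling
# inside this pod doesn't accidentally target it.
unset "${!KUBERNETES_@}"

exec "$@"
```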
DNS `Default` is a good choice.

cgroups v2 should have actual nesting, but we've not had the opportunity to test for any kind issues nesting in this way. I still don't particularly recommend Kubernetes in Kubernetes.

We have layers, including a cleanup hook to delete docker resources and an inner hook to delete the cluster. Deleting the clusters should generally be sufficient, but the other hook is more general than kind. You can test deletion behaviors locally and observe pod removal.

Yes re: dind. We disable service account automount for CI pods. I highly recommend this.
So please add `automountServiceAccountToken: false` to the issue description. And still, these environment variables (e.g. `KUBERNETES_SERVICE_HOST`, `KUBERNETES_SERVICE_PORT`) get injected no matter what (and there isn't a way to disable it). So, I still wonder: can't they still cause conflicts?
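For completeness, the field sits at the pod spec level, e.g.:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dind-k8s
spec:
  automountServiceAccountToken: false
  # ... containers, volumes, etc. as in the spec earlier in the thread ...
```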
Ok, funnily enough, if I add @jieyu's …
If I understood well, as long as we properly clean up any leftover containers/clusters before deleting the main pod, we don't need such an adjustment. Right?

In either case, I would really appreciate hearing if anyone has anything to say about the error I mentioned. The exact same thing also happens if I don't mount … I discarded kubernetes/kubernetes#72878 because I'm running with kernel 5.4. But there is kubernetes/kubeadm#2335, which mentions fixes included in K8s 1.21. It could be the case, as I'm running K8s 1.20 (but I'm using the default 1.21 node image with kind 0.11.1 inside of the pod).
As explained in kubernetes-sigs/kind#303 (comment).
It's not necessary to run kind in Kubernetes; however, I recommend it for running any sort of CI in Kubernetes, if you're going to do that. YMMV depending on your exact use case for running kind in Kubernetes.
No, not as long as the service account credentials are not mounted at the well-known path.
As explained in kubernetes-sigs/kind#303 (comment).
- Use `s6-setuidgid` instead of building a custom entrypoint with `shc` to use suid
- By dropping the custom entrypoint, we also drop the automatic removal of `KUBERNETES_` variables, which [should not be needed](kubernetes-sigs/kind#303 (comment))
Has anyone gotten this running on a host that uses cgroups v2?
EDIT: will post more details in a few minutes, but I think I have a solution.
I have to say I switched to k3d, which required no additional volume mounts nor any specific workarounds to work. I'm still using cgroups v1 in the nodes, but maybe it's worth trying k3d on a cgroups v2 one.
It shouldn't require any additional mounts these days; this issue is quite old. It still remains a huge pile of footguns to run any sort of nested Kubernetes, kind, k3d or otherwise:
... |
That's what I do: https://github.com/felipecrs/jenkins-agent-dind#:~:text=Then%20you%20can%20use%20the%20following%20Pod%20Template%3A (in my case the docker data dir is remapped to the agent workspace). Additionally, I use https://github.com/cndoit18/lxcfs-on-kubernetes to fake …
/help
We should start putting together a new site under the docs where we can at least clearly keep this updated, and just put a warning note at the top about the security implications, aside from the footguns. We can iterate better on a markdown page than on this thread; it's long overdue.
@BenTheElder:
Guidelines
Please ensure that the issue body includes answers to the following questions:
For more details on the requirements of such an issue, please see here and ensure that they are met. If this request no longer meets these requirements, the label can be removed.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I can give it a shot if you guys don't mind. I can start next week, and I will mostly need some help reviewing. Is that ok? @BenTheElder
@howardjohn we found some problems with cgroup v2-only nodes: kubernetes/kubernetes#119853
NOTE: We do NOT recommend doing this if it is at all avoidable. We don't have another option so we do it ourselves, but it has many footguns.
xref: #284
Additionally, these mounts are known to be needed: a read-only hostPath for `/lib/modules` and a hostPath for `/sys/fs/cgroup` (see the pod spec earlier in the thread).
thanks to @maratoid
/kind documentation
/priority important-longterm
We probably need a new page in the user guide for this.
EDIT: Additionally, for any docker-in-docker usage the docker storage (typically `/var/lib/docker`) should be a volume. A lot of attempts at using kind in Kubernetes seem to miss this one. Typically an `emptyDir` is suitable for this.

EDIT2: you also probably want to set a pod DNS config to some upstream resolvers, so as not to have your inner cluster pods trying to talk to the outer cluster's DNS, which is probably on a clusterIP and not necessarily reachable.
EDIT3: Loop devices are not namespaced; follow #1248 to find our current workaround.