Cluster doesn't restart when docker restarts #148
Comments
Isn't this standard behavior with docker containers (remaining stopped)? kind does not run any daemon to manage the cluster; the commands create / delete "nodes" (containers) and run some tasks in them (like
What is the use case for this? These are meant to be transient test clusters, and it's probably not a good idea to restart the host daemon during testing. "Restarting" a cluster is probably going to just look like delete + create of the cluster. I'd consider supporting this more of a feature than a bug; "node" restarts are not really intended functionality currently. |
+1 to this question. docker restart in this case will act like a power grid restart on a bunch of bare metal machines. |
I've been using kind locally (using Docker for Mac) and when docker reboots or stops, the cluster has to be deleted and recreated. I'm perfectly fine with it, just thought this might be something we should look into. The use case was to keep the cluster around even after I reboot or shut down my machine / docker. |
Thanks for clarifying - this is certainly a hole in the usability but I'd hoped that clusters would be cheap enough to [create, use, delete] regularly. This might be a little non-trivial to resolve but is probably do-able. |
@BenTheElder: Please ensure the request meets the requirements listed here. If this request no longer meets these requirements, the label can be removed In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I think I know how we can do this effectively, but I have no idea what to call the command that will fit with the rest of the CLI 🙃 cc @munnerz Something like |
|
It should roughly be:
It'll look similar to create but skip a lot of steps and swap creating the containers for list & {re}start We can also eventually have a very similar command like |
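A minimal sketch of the "list & {re}start" step described above, written in Python around the docker CLI. This is not a kind command and not how kind implements anything; it only lists a cluster's node containers and starts them, and it skips any fix-up steps a real implementation might need. The label key `io.x-k8s.kind.cluster` is an assumption about how node containers are tagged (older releases may use a different key), and `restart_cluster` is a hypothetical helper name.

```python
import subprocess

# Assumption: kind tags node containers with this label key.
# Check with `docker inspect <node>` before relying on it.
CLUSTER_LABEL = "io.x-k8s.kind.cluster"


def list_nodes_cmd(cluster="kind"):
    """Build the docker command that lists all node containers (running or stopped)."""
    return [
        "docker", "ps", "-a",
        "--filter", f"label={CLUSTER_LABEL}={cluster}",
        "--format", "{{.Names}}",
    ]


def start_node_cmd(node):
    """Build the docker command that starts one stopped node container."""
    return ["docker", "start", node]


def restart_cluster(cluster="kind", run=subprocess.run):
    """List the cluster's node containers, start each one, and return their names.

    `run` is injectable so the flow can be exercised without a docker daemon.
    """
    listing = run(list_nodes_cmd(cluster), check=True, capture_output=True, text=True)
    nodes = [line.strip() for line in listing.stdout.splitlines() if line.strip()]
    for node in nodes:
        run(start_node_cmd(node), check=True)
    return nodes
```

Calling `restart_cluster("kind")` on a machine with docker would attempt to start every container carrying that label; whether the Kubernetes components inside come back healthy is a separate question that this sketch does not address.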
I like that approach, and the node restart also sounds nice and could cover other use cases. |
/remove-kind bug |
@BenTheElder I want to try it. /assign |
@tao12345666333: GitHub didn't allow me to assign the following users: tao12345666333. Note that only kubernetes-sigs members and repo collaborators can be assigned. In response to this:
|
/lifecycle active |
for the impatient, this seems to work for now after docker restarts:
|
The make shared may not be required anymore; those are related to the mount propagation functionality in kubelet / storage. It looks like with a tweak to how docker runs on the nodes we might not need those. We should check with
I've also been thinking about ways we can make things like kind/pkg/cluster/nodes/node.go Lines 177 to 181 in 6991cdc
|
👍 for the new |
The |
I will send a PR next week. (
Looking forward! Is there any ticket for that, for tracking purposes? |
Not yet. I will post progress updates here. |
@BenTheElder -- Many thanks! This will make our lives easier! I was troubleshooting a weird Azure issue for the last couple of weeks, so I had no time for anything else. But this is awesome news. |
Do I understand correctly that I'll have to |
No, that's not the case. kind v0.8.0+ supports restarts for single-node clusters (#148 (comment)). I've had clusters for months across many restarts. There is no data loss. There's a different tracking issue for problems around multi-node clusters: #1689 |
I have Kind 0.11.1 but a 3-node cluster (1 control-plane and 2 workers). It's not a HA cluster in the sense that it has only 1 control-plane node, but it's not single-node either. Should it survive a reboot? |
I can't figure out a way to restart my cluster:
╰─λ kind --help
kind creates and manages local Kubernetes clusters using Docker container 'nodes'

Usage:
  kind [command]

Available Commands:
  build       Build one of [node-image]
  completion  Output shell completion code for the specified shell (bash, zsh or fish)
  create      Creates one of [cluster]
  delete      Deletes one of [cluster]
  export      Exports one of [kubeconfig, logs]
  get         Gets one of [clusters, nodes, kubeconfig]
  help        Help about any command
  load        Loads images into nodes
  version     Prints the kind CLI version

Flags:
  -h, --help              help for kind
      --loglevel string   DEPRECATED: see -v instead
  -q, --quiet             silence all stderr output
  -v, --verbosity int32   info log verbosity, higher value produces more output
      --version           version for kind

Use "kind [command] --help" for more information about a command.
I am on Arch Linux. |
As I understand it, the project has never supported restarting multi-node clusters (only single-node ones), but the documentation should state this clearly so that we don't spend a lot of time on complex multi-node work only to find it doesn't survive a reboot or a restart of docker. #1689 (comment) |
Was the "restart" functionality ever shipped? I am using version 0.14.0 and don't see a "restart" option in the help message. |
Multi-node restart has major fixes and should hopefully work fine in v0.15.0 ~soon when we get a release cut. #1689 is still open for a different subset problem around multiple control plane nodes specifically, which needs further investigation. You can test it now by using kind v0.14.0 with a recent image https://github.com/kubernetes-sigs/kind/pull/2874/files#diff-643b1e9d9e446aa30da4407354de0098f24c947ac985213a06f73188c3e8e3fcR21 or by installing kind from HEAD. We're still working on one or two remaining unrelated fixes in flight before dropping another tagged release
I think that's a misunderstanding. The functionality is that if you restart docker / your host, the cluster should restart (as in this issue). There's no need for a command; docker handles starting the containers again. Experimental Podman support does not have this capability and likely won't, because podman lacks the related functionality due to design differences. |
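Since the restart is left to docker, one way to see what docker has been told to do is to read the container's restart policy. The sketch below is only an illustration: it assumes the default node name `kind-control-plane` and uses the `.HostConfig.RestartPolicy` field that `docker inspect` reports; the helper names are hypothetical. It reports the configured policy and does not predict whether a given cluster will actually come back healthy.

```python
import json
import subprocess


def inspect_policy_cmd(container):
    """Build the docker command that prints a container's restart policy as JSON."""
    return ["docker", "inspect", "--format", "{{json .HostConfig.RestartPolicy}}", container]


def parse_restart_policy(raw):
    """Return (policy name, maximum retry count) from docker's JSON output."""
    policy = json.loads(raw)
    return policy.get("Name", ""), policy.get("MaximumRetryCount", 0)


def has_restart_policy(name):
    """Docker's default policy is "no", meaning the container is never restarted for you."""
    return name not in ("", "no")


def node_restart_policy(container="kind-control-plane", run=subprocess.run):
    """Query docker for one node container's restart policy."""
    result = run(inspect_policy_cmd(container), check=True, capture_output=True, text=True)
    return parse_restart_policy(result.stdout)
```

If `has_restart_policy` returns False for a node container, docker has no instruction to bring it back, which matches the behavior originally reported in this issue.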
Same for:
This issue was to track:
And it predates the limited podman support, which again is limited by a lack of functionality in podman (proper hostname / domain name resolution for containers on the network, any way to specify a container restart policy). #2715 would be the best place to keep up with discussion of manually triggering a restart, though I suspect most potential use cases for manually causing a restart (e.g. testing your application during disruptions) have better alternatives than restarting the cluster. |
FTR: The latest releases should have clusters that come back up on docker restart, always, including multi node. |
On Kind with rootless podman, I don't have access to my cluster after I restart my computer either. |
This issue predates podman support, which is still considered experimental due to this feature gap and other stability issues. #2272 has more context on podman reboot. |
Along with the "restart" functionality, could we also have a "suspend" capability that serializes the cluster and its state to what is effectively a hibernation file? That seems more effective than trying to patch up a cluster that may have had the rug pulled from under its feet. The serialization could be placed in a directory like ~/.local/share/containers/kind. Velero backup can do this: it saves the running state of not just the cluster, but also the workloads running on it. It may work as a workaround... |
#2715 is the issue for stopping / starting. this issue is complete and closed issues are not closely monitored. |
When docker restarts or is stopped and started (for any reason), the kind node containers remain stopped and aren't restarted properly. When I tried to run
docker restart <node container id>
the cluster didn't start either. The only solution at this point seems to be to recreate the cluster.
/kind bug