-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to checkpoint and restore into external network namespaces #1849
Conversation
This fixes issues when the container is running in the host namespace right? |
I guess no. If starting a container with
the container is running in the specified network namespace. If I checkpoint and restore the container it is no longer running in that specified network namespace, but in some network namespace CRIU created. So if the network namespace is configured before starting (or restoring) runc the restored container is running somewhere else and all the settings in the specified network namespace are not used. According to the code (I have not verified it) CRIU can restore the network adapter correctly in the network namespace, but it will be a different namespace than defined in The use case is, I am setting up a network namespace, and starting a container. After restore the container is still running in exactly that namespace and not some random namespace created by CRIU. I also had a look how LXC does it and it seems LXC does not really care about the name of the network namespace. But as runc has the option to specify an external network namespace it makes sense for the restored container to run again in that network namespace. Is that a better explanation? |
Please do not merge yet, we are still discussing the details on the CRIU level. |
I force-pushed the final version. Sorry for the confusion. All necessary patches are now part of CRIU and as soon as runc updates its CRIU version the new test case will be run. Depending on the outstanding reviews this is now ready to be merged. CC @avagin (as he was heavily involved in getting the interface between CRIU and runc right. Thanks.) |
ping @avagin |
libcontainer/container_linux.go
Outdated
if err != nil { | ||
return err | ||
} | ||
criuExternal := fmt.Sprintf("net[%d]:%s", netns.Ino, path.Base(nsPath)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not use path.Base(nsPath) as a key, it doesn't contain any useful information and it probably can be changed, if we will restore a container on another host. I think we can use a constant string for this. For example, it can be "extRootNetNS".
libcontainer/container_linux.go
Outdated
netns, err := os.Open(nsPath) | ||
defer netns.Close() | ||
if err != nil { | ||
logrus.Error("If a specific network namespace is defined it must exist.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can print an error in this message
logrus.Error("If a specific network namespace is defined it must exist: %s", err)
libcontainer/container_linux.go
Outdated
} | ||
inheritFd := new(criurpc.InheritFd) | ||
inheritFd.Key = proto.String(path.Base(nsPath)) | ||
inheritFd.Fd = proto.Int32(int32(4 + len(extraFiles))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to add a comment here, which explains what this 4 means. 0-2 is the standard file descriptors, 3 is a swrk socket.
Using CRIU to checkpoint and restore a container into an existing network namespace is not possible. If the network namespace is defined like { "type": "network", "path": "/run/netns/test" } there is the expectation that the restored container is again running in the network namespace specified with 'path'. This adds the new CRIU 'external namespace' feature to runc, where during checkpointing that specific namespace is referenced and during restore CRIU tries to restore the container in exactly that namespace. This breaks/fixes current runc behavior. If, without this patch, runc restores a container with such a network namespace definition, it is ignored and CRIU recreates a network namespace without a name. With this patch runc uses the network namespace path (if available) to checkpoint and restore the container in just that network namespace. Restore will now fail if a container was checkpointed with a network namespace path set and if that network namespace path does not exist during restore. runc still falls back to the old behavior if CRIU older than 3.11 is installed. Fixes opencontainers#1786 Related to containers/podman#469 Thanks to Andrei Vagin for all the help in getting the interface between CRIU and runc right! Signed-off-by: Adrian Reber <areber@redhat.com>
This adds a new CRIU based checkpoint/restore test to check if the restored container runs in the same network namespace as before. Signed-off-by: Adrian Reber <areber@redhat.com>
@avagin With your suggested changes force pushed. |
With the initial CRIU patches merged to support checkpointing and restoring a container into a external defined network namespace, these are the necessary runc changes to honour an external network namespace (something like
"path": "/run/netns/test"
).Details about the existing problem are described at #1786
There are still two CRIU patches under review to complete CRIU's RPC interface which is used by runc.
Once all the necessary patches are merged and a new CRIU release is available with these patches the CRIU version in runc's travis test can be updated and from that point on the newly added test case in
checkpoint.bats
should be automatically activated. Currently it detects the missing CRIU functionality and is skipped. (@avagin FYI)