Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to checkpoint and restore into external network namespaces #1849

Merged
merged 2 commits into from
Aug 23, 2018

Conversation

adrianreber
Copy link
Contributor

With the initial CRIU patches merged to support checkpointing and restoring a container into a external defined network namespace, these are the necessary runc changes to honour an external network namespace (something like "path": "/run/netns/test").

Details about the existing problem are described at #1786

There are still two CRIU patches under review to complete CRIU's RPC interface which is used by runc.

Once all the necessary patches are merged and a new CRIU release is available with these patches the CRIU version in runc's travis test can be updated and from that point on the newly added test case in checkpoint.bats should be automatically activated. Currently it detects the missing CRIU functionality and is skipped. (@avagin FYI)

@crosbymichael
Copy link
Member

This fixes issues when the container is running in the host namespace right?

@adrianreber
Copy link
Contributor Author

I guess no. If starting a container with

	{
		"type": "network",
		"path": "/run/netns/test"
	},

the container is running in the specified network namespace. If I checkpoint and restore the container it is no longer running in that specified network namespace, but in some network namespace CRIU created. So if the network namespace is configured before starting (or restoring) runc the restored container is running somewhere else and all the settings in the specified network namespace are not used.

According to the code (I have not verified it) CRIU can restore the network adapter correctly in the network namespace, but it will be a different namespace than defined in config.json.

The use case is, I am setting up a network namespace, and starting a container. After restore the container is still running in exactly that namespace and not some random namespace created by CRIU.

I also had a look how LXC does it and it seems LXC does not really care about the name of the network namespace. But as runc has the option to specify an external network namespace it makes sense for the restored container to run again in that network namespace.

Is that a better explanation?

@adrianreber
Copy link
Contributor Author

Please do not merge yet, we are still discussing the details on the CRIU level.

@adrianreber
Copy link
Contributor Author

I force-pushed the final version. Sorry for the confusion. All necessary patches are now part of CRIU and as soon as runc updates its CRIU version the new test case will be run. Depending on the outstanding reviews this is now ready to be merged.

CC @avagin (as he was heavily involved in getting the interface between CRIU and runc right. Thanks.)

@crosbymichael
Copy link
Member

crosbymichael commented Aug 21, 2018

LGTM

Approved with PullApprove

@crosbymichael
Copy link
Member

ping @avagin

if err != nil {
return err
}
criuExternal := fmt.Sprintf("net[%d]:%s", netns.Ino, path.Base(nsPath))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not use path.Base(nsPath) as a key, it doesn't contain any useful information and it probably can be changed, if we will restore a container on another host. I think we can use a constant string for this. For example, it can be "extRootNetNS".

netns, err := os.Open(nsPath)
defer netns.Close()
if err != nil {
logrus.Error("If a specific network namespace is defined it must exist.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can print an error in this message
logrus.Error("If a specific network namespace is defined it must exist: %s", err)

}
inheritFd := new(criurpc.InheritFd)
inheritFd.Key = proto.String(path.Base(nsPath))
inheritFd.Fd = proto.Int32(int32(4 + len(extraFiles)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add a comment here, which explains what this 4 means. 0-2 is the standard file descriptors, 3 is a swrk socket.

Using CRIU to checkpoint and restore a container into an existing
network namespace is not possible.

If the network namespace is defined like

	{
		"type": "network",
		"path": "/run/netns/test"
	}

there is the expectation that the restored container is again running in
the network namespace specified with 'path'.

This adds the new CRIU 'external namespace' feature to runc, where
during checkpointing that specific namespace is referenced and during
restore CRIU tries to restore the container in exactly that
namespace.

This breaks/fixes current runc behavior. If, without this patch, runc
restores a container with such a network namespace definition, it is
ignored and CRIU recreates a network namespace without a name.

With this patch runc uses the network namespace path (if available) to
checkpoint and restore the container in just that network namespace.

Restore will now fail if a container was checkpointed with a network
namespace path set and if that network namespace path does not exist
during restore.

runc still falls back to the old behavior if CRIU older than 3.11 is
installed.

Fixes opencontainers#1786

Related to containers/podman#469

Thanks to Andrei Vagin for all the help in getting the interface between
CRIU and runc right!

Signed-off-by: Adrian Reber <areber@redhat.com>
This adds a new CRIU based checkpoint/restore test to check if
the restored container runs in the same network namespace as before.

Signed-off-by: Adrian Reber <areber@redhat.com>
@adrianreber
Copy link
Contributor Author

@avagin With your suggested changes force pushed.

@crosbymichael
Copy link
Member

crosbymichael commented Aug 22, 2018

LGTM

Approved with PullApprove

1 similar comment
@avagin
Copy link
Contributor

avagin commented Aug 23, 2018

LGTM

Approved with PullApprove

@crosbymichael crosbymichael merged commit 459bfae into opencontainers:master Aug 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants