Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Kata Containers networking support #817

Merged
merged 2 commits into from
Feb 16, 2022

Conversation

sameo
Copy link

@sameo sameo commented Feb 15, 2022

This PR defines a new annotation label for getting the networking namespace to use when calling the CNI plugins.
When set by the calling runtime (through spec.State.Annotations), this label takes precedence over the PID based networking namespace resolution.

This fixes the Kata Containers networking support, when running Kata with the following PR:
kata-containers/kata-containers#3670

// spec.State.Pid.
// This is mostly used for VM based runtime, where the spec.State PID does not
// necessarily lives in the created container networking namespace.
NetworkNamespace = Prefix + "network-namespace"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this doesn't work with nerdctl run --restart=always containers when the host OS rebooted, because the specified netns no longer exists

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not familiar with that part of the code base, so please correct me if/where I'm wrong:

After a reboot, nerdctl will create a new containerd task, which will trigger kata to create a new netns and update the spec.State.Annotations accordingly. The OCI hooks will be called with that newly created netns, and nerdctl will use it to run the CNI plugins.

What am I missing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nerdctl will create a new containerd task

The new task is created by the restart monitor, not by nerdctl (unless user executes nerdctl manually): https://github.com/containerd/containerd/tree/v1.6.0-rc.4/runtime/restart/monitor

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you tested it with the restart monitor and it works, this PR LGTM, but the annotation probably shouldn't be defined in pkg/labels, as it seems pure OCI annotation, not containerd label.

Copy link
Author

@sameo sameo Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started a container with --restart always and verified that it was up and running after a reboot, with networking working as well.

the annotation probably shouldn't be defined in pkg/labels, as it seems pure OCI annotation, not containerd label.

You mean I should use a org.opencontainers prefixed annotation?
AFAICS there are not runtime-spec annotations that I could re-use, and I must not use this namespace as I wish.

Copy link
Member

@AkihiroSuda AkihiroSuda Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started a container with --restart always and verified that it was up and running after a reboot, with networking working as well.

👍

the annotation probably shouldn't be defined in pkg/labels, as it seems pure OCI annotation, not containerd label.

You mean I should use a org.opencontainers prefixed annotation? AFAICS there are not runtime-spec annotations that I could re-use, and I must not use this namespace as I wish.

No, I’m talking about containerd labels vs OCI annotations.
nerdctl propagates containerd labels to OCI annotations, but not vice versa.

nerdctl/cmd/nerdctl/run.go

Lines 916 to 920 in f1aab17

func propagateContainerdLabelsToOCIAnnotations() oci.SpecOpts {
return func(ctx context.Context, oc oci.Client, c *containers.Container, s *oci.Spec) error {
return oci.WithAnnotations(c.Labels)(ctx, oc, c, s)
}
}

The proposed annotation is not a containerd label, so it shouldn’t be defined in pkg/labels package, but should be probably defined in pkg/ocihook package.

The nerdctl/ prefix is fine, and it must NOT be org.opencontainers/, but io.katacontainers/ might be even better?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed annotation is not a containerd label, so it shouldn’t be defined in pkg/labels package, but should be probably defined in pkg/ocihook package.

The nerdctl/ prefix is fine, and it must NOT be org.opencontainers/, but io.katacontainers/ might be even better?

Thanks for the explanation, that makes sense now.
Since the main and single consumer of this annotation is nerdctl, I'd rather keep the nerdctl/ prefix. I'll move the definition to the ocihook package.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the first commit to move the annotation key under ocihook.go. Let me know if that's fine with you.

Samuel Ortiz added 2 commits February 15, 2022 17:49
With VM based runtimes (e.g. Kata), the spec.State.Pid value does not
necessarily runs in the container/pod networking namespace, but can also
be in the host one.

With those runtime, using the passed Pid to resolve the networking
namespace for OCI plugins to be used result in setting the container
network entirely in the host namespace. We add a
"nerdct/network-namespace" for runtimes to explictly tell nerdctl which
netns path to use instead of deriving it from the runtime PID.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
When a runtime specify the labels.NetworkNamespace annotation, we use it
over the passed Pid. That allows VM based runtimes to explictly use a
networking namespace path they create.

Fixes containerd#787

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@sameo
Copy link
Author

sameo commented Feb 15, 2022

@AkihiroSuda There's one rootless test that's failing. Should I be worried about it?

@AkihiroSuda
Copy link
Member

@AkihiroSuda There's one rootless test that's failing. Should I be worried about it?

Probably unrelated

@AkihiroSuda
Copy link
Member

@liubin LGTY?

@AkihiroSuda AkihiroSuda added this to the v0.17.0 milestone Feb 16, 2022
@AkihiroSuda AkihiroSuda merged commit 655f213 into containerd:master Feb 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants