Fix Kata Containers networking support #817
fc60055 to a44730f
pkg/labels/labels.go
Outdated
// spec.State.Pid.
// This is mostly used for VM based runtime, where the spec.State PID does not
// necessarily lives in the created container networking namespace.
NetworkNamespace = Prefix + "network-namespace"
IIUC, this doesn't work with nerdctl run --restart=always containers when the host OS is rebooted, because the specified netns no longer exists.
I am not familiar with that part of the code base, so please correct me if/where I'm wrong:
After a reboot, nerdctl will create a new containerd task, which will trigger kata to create a new netns and update the spec.State.Annotations accordingly. The OCI hooks will be called with that newly created netns, and nerdctl will use it to run the CNI plugins.
What am I missing?
> nerdctl will create a new containerd task
The new task is created by the restart monitor, not by nerdctl (unless the user executes nerdctl manually): https://github.com/containerd/containerd/tree/v1.6.0-rc.4/runtime/restart/monitor
If you tested it with the restart monitor and it works, this PR LGTM, but the annotation probably shouldn't be defined in pkg/labels, as it seems to be a pure OCI annotation, not a containerd label.
I started a container with --restart always and verified that it was up and running after a reboot, with networking working as well.
> the annotation probably shouldn't be defined in pkg/labels, as it seems pure OCI annotation, not containerd label.
You mean I should use an org.opencontainers prefixed annotation?
AFAICS there are no runtime-spec annotations that I could re-use, and I must not use this namespace as I wish.
> I started a container with --restart always and verified that it was up and running after a reboot, with networking working as well.
👍
>> the annotation probably shouldn't be defined in pkg/labels, as it seems pure OCI annotation, not containerd label.
> You mean I should use a org.opencontainers prefixed annotation? AFAICS there are not runtime-spec annotations that I could re-use, and I must not use this namespace as I wish.
No, I’m talking about containerd labels vs OCI annotations.
nerdctl propagates containerd labels to OCI annotations, but not vice versa.
Lines 916 to 920 in f1aab17
    func propagateContainerdLabelsToOCIAnnotations() oci.SpecOpts {
        return func(ctx context.Context, oc oci.Client, c *containers.Container, s *oci.Spec) error {
            return oci.WithAnnotations(c.Labels)(ctx, oc, c, s)
        }
    }
The proposed annotation is not a containerd label, so it shouldn’t be defined in the pkg/labels package, but should probably be defined in the pkg/ocihook package.
The nerdctl/ prefix is fine, and it must NOT be org.opencontainers/, but io.katacontainers/ might be even better?
> The proposed annotation is not a containerd label, so it shouldn’t be defined in pkg/labels package, but should be probably defined in pkg/ocihook package.
> The nerdctl/ prefix is fine, and it must NOT be org.opencontainers/, but io.katacontainers/ might be even better?
Thanks for the explanation, that makes sense now.
Since the main and single consumer of this annotation is nerdctl, I'd rather keep the nerdctl/ prefix. I'll move the definition to the ocihook package.
I changed the first commit to move the annotation key under ocihook.go. Let me know if that's fine with you.
a44730f to ede470f
With VM based runtimes (e.g. Kata), the spec.State.Pid process does not necessarily run in the container/pod networking namespace; it can also be in the host one. With those runtimes, using the passed Pid to resolve the networking namespace for the CNI plugins results in setting up the container network entirely in the host namespace. We add a "nerdctl/network-namespace" annotation for runtimes to explicitly tell nerdctl which netns path to use instead of deriving it from the runtime PID.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>

When a runtime specifies the labels.NetworkNamespace annotation, we use it over the passed Pid. That allows VM based runtimes to explicitly use a networking namespace path they create.
Fixes containerd#787
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
ede470f to 17d8c00
Thanks
@AkihiroSuda There's one rootless test that's failing. Should I be worried about it?
Probably unrelated
@liubin LGTY?
This PR defines a new annotation label for getting the networking namespace to use when calling the CNI plugins.
When set by the calling runtime (through spec.State.Annotations), this label takes precedence over the PID based networking namespace resolution.
This fixes the Kata Containers networking support, when running Kata with the following PR: kata-containers/kata-containers#3670
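For readers skimming the thread, the precedence rule described above can be sketched roughly as follows. This is not the code from the PR: the State struct is a trimmed-down, hypothetical stand-in for the OCI runtime state, resolveNetNSPath is an illustrative name, and the /proc/<pid>/ns/net fallback is an assumption about what PID based resolution looks like. Only the annotation key ("nerdctl/" prefix plus "network-namespace") comes from the discussion.

```go
package main

import "fmt"

// NetworkNamespace is the annotation key discussed in the PR
// ("nerdctl/" prefix + "network-namespace").
const NetworkNamespace = "nerdctl/network-namespace"

// State is a hypothetical, minimal stand-in for the OCI runtime state
// passed to hooks; only the fields needed for this sketch are included.
type State struct {
	Pid         int
	Annotations map[string]string
}

// resolveNetNSPath returns the netns path to hand to the CNI plugins.
// An explicit annotation set by the runtime wins; otherwise the path is
// derived from the PID (assumed here to be /proc/<pid>/ns/net).
func resolveNetNSPath(s State) (string, error) {
	// Reading from a nil map is safe in Go and yields the empty string.
	if p := s.Annotations[NetworkNamespace]; p != "" {
		return p, nil
	}
	if s.Pid <= 0 {
		return "", fmt.Errorf("no %q annotation and invalid pid %d", NetworkNamespace, s.Pid)
	}
	return fmt.Sprintf("/proc/%d/ns/net", s.Pid), nil
}

func main() {
	// VM based runtime: the annotation points at the netns it created.
	vm := State{Pid: 42, Annotations: map[string]string{NetworkNamespace: "/run/netns/example"}}
	// Regular runtime: no annotation, so the PID is used.
	plain := State{Pid: 42}

	for _, s := range []State{vm, plain} {
		p, err := resolveNetNSPath(s)
		fmt.Println(p, err)
	}
}
```

With a sketch like this, a runtime that sets no annotation keeps the existing PID based behaviour, which is why the change should be invisible to non-VM runtimes.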