This repository has been archived by the owner on Dec 7, 2023. It is now read-only.

Detect available containerd-shim versions defaulting to legacy linux runtime #411

Merged: 3 commits, Sep 10, 2019


stealthybox
Contributor

closes #390
/kind bug

Docker ships with containerd, but even the newest versions of docker-ce ship containerd.io-1.2.6, which lacks the containerd-shim-runc-v1 binary that plugin.RuntimeRuncV1 needs.
This client creation code computes the shim binary name for each of our supported runtimes and picks the newest supported runtime whose shim binary exists in the ignite host's PATH, falling back to the legacy linux runtime otherwise. The presence of the shim binary is used as a heuristic for that runtime actually working.
It also adds support for the upcoming plugin.RuntimeRuncV2 which supports multiple containers per shim.

This fixes a bug where our previous hard-coded default of RuncV1 caused ignite to fail to start a VM when using containerd packages that do not ship the matching shim binary:

sudo ignite-0.6.0 run weaveworks/ignite-ubuntu
INFO[0000] Created VM with ID "1dbc72beaced7e96" and name "delicate-firefly" 
FATA[0000] failed to start container for VM "1dbc72beaced7e96": runtime "io.containerd.runc.v1" binary not installed "containerd-shim-runc-v1": file does not exist: unknown 

When the heuristic fails, we consider this a non-fatal error -- containerd may be running with a different PATH and mount namespace.
The UX for that failure mode as of this patch looks like this:

sudo ignite run weaveworks/ignite-ubuntu
INFO[0000] Created VM with ID "ec8371f59d595017" and name "sparkling-wave" 
INFO[0001] Networking is handled by "cni"               
INFO[0001] Started Firecracker VM "ec8371f59d595017" in a container with ID "ignite-ec8371f59d595017" 

sudo mv /usr/bin/containerd-shim{,.disabled}

sudo ignite run weaveworks/ignite-ubuntu
ERRO[0000] a containerd-shim could not be found for runtimes: [io.containerd.runc.v2 io.containerd.runc.v1], io.containerd.runtime.v1.linux 
INFO[0000] Created VM with ID "5ee35502c3736f02" and name "dark-firefly" 
FATA[0000] failed to start container for VM "5ee35502c3736f02": failed to start shim: exec: "containerd-shim": executable file not found in $PATH: unknown 

Future Work:

  • Functions to check the runtimes should be added to containerd libraries to prevent coupling clients to containerd's filesystem and environment dependencies
  • A pre-flight check using code from preflight before start operation #360 could wrap this error.
  • A user-facing config struct for the containerd runtime string and options could be added.

@chanwit
Member

chanwit commented Sep 8, 2019

ignite still does not run out of the box with Docker's containerd somehow.
I think we also need to detect whether it's vanilla containerd or Docker's containerd, so that we point to the correct socket.

$ sudo bin/ignite run weaveworks/ignite-ubuntu
FATA[0010] failed to dial "/run/containerd/containerd.sock": context deadline exceeded 
$ sudo service containerd start
$ sudo bin/ignite run weaveworks/ignite-ubuntu
FATA[0010] failed to dial "/run/containerd/containerd.sock": context deadline exceeded 
$ sudo bin/ignite version
FATA[0010] failed to dial "/run/containerd/containerd.sock": context deadline exceeded 
$ docker version
Client: Docker Engine - Community
 Version:           19.03.1
 API version:       1.40
 Go version:        go1.12.5
 Git commit:        74b1e89e8a
 Built:             Thu Jul 25 21:21:35 2019
 OS/Arch:           linux/amd64
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       6a30dfc
  Built:            Thu Aug 29 05:26:54 2019
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc5
  GitCommit:        
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

@chanwit
Member

chanwit commented Sep 8, 2019

Here's Docker's containerd config:

[grpc]
  address = "/var/run/docker/containerd/containerd.sock"
  tcp_address = ""
  tcp_tls_cert = ""
  tcp_tls_key = ""
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

@chanwit chanwit mentioned this pull request Sep 8, 2019
@stealthybox
Contributor Author

stealthybox commented Sep 9, 2019

@chanwit Are we using the same docker-ce package?
What OS are you using?
What config file is that?

We can write a heuristic by statting common containerd socket locations, but I'm curious how your system got configured with that path.

Here's socket info from my systemd units on Ubuntu 19.04.
You can see dockerd is just configured with the --containerd flag:

find /etc/systemd/ | egrep 'docker|container' | xargs cat | grep /run
ListenStream=/var/run/docker.sock
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

apt list docker-ce
Listing... Done
docker-ce/disco,disco,now 5:19.03.2~3-0~ubuntu-disco amd64 [installed]

@stealthybox stealthybox force-pushed the shim-detect branch 3 times, most recently from 21761c8 to 600f8eb Compare September 9, 2019 20:04
@chanwit chanwit requested review from chanwit and removed request for twelho September 10, 2019 12:05
@chanwit chanwit added kind/bug Categorizes issue or PR as related to a bug. kind/enhancement Categorizes issue or PR as related to improving an existing feature. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Sep 10, 2019
@chanwit chanwit added this to the v0.6.1 milestone Sep 10, 2019
@chanwit
Member

chanwit commented Sep 10, 2019

Note from what we have discussed.

If docker-ce was upgraded from a very old version, containerd is likely to be at a different location than with the newer versions.

pkg/runtime/containerd/client.go Outdated Show resolved Hide resolved
pkg/runtime/containerd/client.go Show resolved Hide resolved
@chanwit
Member

chanwit commented Sep 10, 2019

Got blocked by preflight checks. Working on it.

@chanwit
Member

chanwit commented Sep 10, 2019

@stealthybox shim detection is working very well on my machine. I think this is ready to be merged.
LGTM

@chanwit chanwit merged commit f58bdbc into weaveworks:master Sep 10, 2019
@morph027

quick test with master build on linux:

19:51 $ sudo ignite version
Ignite version: version.Info{Major:"0", Minor:"6+", GitVersion:"v0.6.0-44+f58bdbc9433830", GitCommit:"f58bdbc943383050f3256785421bb0463cfbc807", GitTreeState:"clean", BuildDate:"2019-09-10T17:15:49Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Firecracker version: v0.18.0
Runtime: containerd
19:51 $ sudo ignite run weaveworks/ignite-ubuntu 
INFO[0001] Created VM with ID "3b36e896c2a8983c" and name "restless-pine" 
FATA[0001] [ERROR ]: Bin unmount is not in your PATH
diff --git a/pkg/constants/dependencies.go b/pkg/constants/dependencies.go
index ed76dc9..64ad872 100644
--- a/pkg/constants/dependencies.go
+++ b/pkg/constants/dependencies.go
@@ -2,7 +2,7 @@ package constants
 
 var Dependencies = [...]string{
        "mount",
-       "unmount",
+       "umount",
        "tar",
        "mkfs.ext4",
        "e2fsck",

should make it work ;)

@morph027

morph027 commented Sep 10, 2019

my patched version fails with:

$ sudo ignite run weaveworks/ignite-ubuntu:latest
INFO[0001] Created VM with ID "b1216d96876e020c" and name "spring-bird" 
INFO[0001] Pulling image "weaveworks/ignite:dev"...     
FATA[0002] failed to resolve reference "docker.io/weaveworks/ignite:dev": docker.io/weaveworks/ignite:dev: not found 

@twelho
Contributor

twelho commented Sep 10, 2019

Try running docker save weaveworks/ignite:dev | sudo ctr -n firecracker image import -
That should export the development image from Docker to containerd.

@morph027

Okay, thanks, that seems to have worked, as another error is popping up ;)

FATA[0001] failed to start container for VM "b4e0f539cc6beecc": io.containerd.runc.v2: failed to connect: dial unix: missing address
: exit status 1: unknown

@morph027

morph027 commented Sep 10, 2019

containerd was the one from the docker-ce packages. I fetched the latest RC, containerd-1.3.0-rc.0.linux-amd64.tar.gz, and just dirty-unpacked it into /usr/bin/ (stopped containerd before, started it afterwards), and it works now.

@stealthybox
Contributor Author

@morph027 I'm guessing you had containerd-shim-runc-v2 on your PATH while you were using the containerd.io@1.2.6 package (from docker-ce deps).

The runc v1 and v2 shims are not normally present with that package version, and the v2 shim is not compatible. If you want to use the apt package for containerd.io@1.2.6 again, just delete the v2 shim :)

@stealthybox stealthybox deleted the shim-detect branch September 10, 2019 23:16
@stealthybox
Contributor Author

RE: #411 (comment)
Makefile updates addressing this in #417

Development

Successfully merging this pull request may close these issues.

Keep docker as the default runtime for a while
4 participants