Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[question] Right value for capabilities.FSIsolation in containerd-driver #9790

Closed
shishir-a412ed opened this issue Jan 13, 2021 · 10 comments
Closed

Comments

@shishir-a412ed
Copy link
Contributor

Hi,

We recently received a request on our containerd-driver repo to change filesystem isolation from None to Image

It seems like if I change FSIsolation from FSIsolationNone to FSIsolationImage, I start running into this error

I believe the reason for that is containerd-driver is responsible for mounting /alloc, /local and /secrets from the host filesystem into the container rootfs. If I change FSIsolation from FSIsolationNone to FSIsolationImage, containerd-driver no longer has access to host filesystem and I start seeing this error:

root@ip-10-102-12-129:~/go/src/github.com/Roblox/nomad-driver-containerd/tests# ctr task start 53d70354_cocky_mestorf0
ctr: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/secrets\\\" to rootfs \\\"/run/containerd/io.containerd.runtime.v2.task/nomad/53d70354_cocky_mestorf0/rootfs\\\" at \\\"/secrets\\\" caused \\\"stat /secrets: no such file or directory\\\"\"": unknown

Questions:

  1. Is FSIsolationNone the right value? It seems to be working fine for us, and IIUC only the containerd-driver will have access to the host filesystem and not the allocations (containers). Is my understanding correct?

  2. The docker driver seems to be setting it to FSIsolationImage. How does the docker driver have access to the host filesystem?

TIA

@shishir-a412ed
Copy link
Contributor Author

ping @tgross @notnoop

@tgross
Copy link
Member

tgross commented Jan 13, 2021

Hi @shishir-a412ed! The FSIsolation field refers to the isolation of the running task, not the task driver. From Nomad's perspective it controls the behavior of the task directory build in the task_dir_hook. So if the containerd plugin creates the bind mounts then I'd expect it would want FSIsolationImage as well, similar to the docker and podman drivers. (Although I'm now seeing that the task driver docs for podman haven't been updated with the image isolation 😊. Will fix.)

From the plugin docs (ref):

The file system isolation options are:

  • FSIsolationImage: The task driver isolates tasks as machine images.
  • FSIsolationChroot: The task driver isolates tasks with chroot or pivot_root.
  • FSIsolationNone: The task driver has no filesystem isolation.

@shishir-a412ed
Copy link
Contributor Author

shishir-a412ed commented Jan 13, 2021

So if the containerd plugin creates the bind mounts then I'd expect it would want FSIsolationImage as well, similar to the docker and podman drivers.

@tgross Thanks for the clarification. So when I switch FSIsolation to FSIsolationImage it starts blowing up (See error above).
When I switch back to FSIsolationNone it goes back to ✅

The secrets mountpoint is setup here in the containerd-driver. I will continue debugging and see what's going on. Please let me know if you have any insights, or see something out of the blue that might be causing this error.

@tgross
Copy link
Member

tgross commented Jan 13, 2021

@shishir-a412ed I'm looking at this line containerd.go#L169 and I think the issue is that the source for the bind mount doesn't exist. The client creates the allocation working directory and the various child directories (secrets, task, alloc), and then those directories are used as the source of the bind by the driver.

It's worth comparing the containerd implementation to how it's done for the docker or podman driver.

@shishir-a412ed
Copy link
Contributor Author

@tgross I see, maybe some more changes are needed in the containerd-driver to support FSIsolationImage mode.

SM-smahajan628:example smahajan$ git diff
diff --git a/containerd/driver.go b/containerd/driver.go
index 72d46ae..ade1b99 100644
--- a/containerd/driver.go
+++ b/containerd/driver.go
@@ -118,8 +118,8 @@ var (
        capabilities = &drivers.Capabilities{
                SendSignals: true,
                Exec:        true,
-               FSIsolation: drivers.FSIsolationNone,
-               //FSIsolation:       drivers.FSIsolationImage,
+               //FSIsolation: drivers.FSIsolationNone,
+               FSIsolation:       drivers.FSIsolationImage,
                NetIsolationModes: []drivers.NetIsolationMode{drivers.NetIsolationModeGroup, drivers.NetIsolationModeTask},
        }
 )

Just switching the flag results in the error

Recent Events:
Time                       Type             Description
2021-01-13T10:32:30-08:00  Killing          Sent interrupt. Waiting 5s before force killing
2021-01-13T10:32:28-08:00  Alloc Unhealthy  Unhealthy because of failed task
2021-01-13T10:32:28-08:00  Not Restarting   Error was unrecoverable
2021-01-13T10:32:28-08:00  Driver Failure   rpc error: code = Unknown desc = Error in creating task: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"/secrets\\\" to rootfs \\\"/run/containerd/io.containerd.runtime.v2.task/nomad/77f1ddff_gracious_mcclintock4/rootfs\\\" at \\\"/secrets\\\" caused \\\"stat /secrets: no such file or directory\\\"\"": unknown
2021-01-13T10:32:27-08:00  Task Setup       Building Task Directory
2021-01-13T10:32:27-08:00  Received         Task received by client

@shishir-a412ed
Copy link
Contributor Author

Just to be clear 🙂 It works perfectly fine when the mode is set to FSIsolationNone (with the same codebase for containerd-driver).

From within the allocation (container)

root@ip-10-102-98-114:/# ls local secrets alloc -lt
secrets:
total 16
-rw-r--r-- 1 root root 83 Jan 13 18:55 secret-filename
-rw-r--r-- 1 root root 43 Jan 13 18:55 secret-filename.env
-rw-r--r-- 1 root root 64 Jan 13 18:55 secret-filename-policy
-rw-r--r-- 1 root root 26 Jan 13 18:55 vault_token

local:
total 0

alloc:
total 12
drwxrwxrwx 2 nobody nogroup 4096 Jan 13 18:55 logs
drwxrwxrwx 2 nobody nogroup 4096 Jan 13 18:55 data
drwxrwxrwx 2 nobody nogroup 4096 Jan 13 18:55 tmp
root@ip-10-102-98-114:/# mountpoint local
local is a mountpoint
root@ip-10-102-98-114:/# mountpoint secrets
secrets is a mountpoint
root@ip-10-102-98-114:/# mountpoint alloc
alloc is a mountpoint

@tgross
Copy link
Member

tgross commented Jan 13, 2021

I can verify I see the same error building from current HEAD of master on the Vagrant box provided with the following patch:

diff --git a/containerd/driver.go b/containerd/driver.go
index 336452b..22a3e10 100644
--- a/containerd/driver.go
+++ b/containerd/driver.go
@@ -26,7 +26,6 @@ import (
        "github.com/containerd/containerd"
        "github.com/containerd/containerd/namespaces"
        "github.com/hashicorp/consul-template/signals"
-       "github.com/hashicorp/go-hclog"
        log "github.com/hashicorp/go-hclog"
        "github.com/hashicorp/nomad/client/stats"
        "github.com/hashicorp/nomad/drivers/shared/eventer"
@@ -118,7 +117,7 @@ var (
        capabilities = &drivers.Capabilities{
                SendSignals:       true,
                Exec:              true,
-               FSIsolation:       drivers.FSIsolationNone,
+               FSIsolation:       drivers.FSIsolationImage,
                NetIsolationModes: []drivers.NetIsolationMode{drivers.NetIsolationModeGroup, drivers.NetIsolationModeTask},
        }
 )
@@ -363,7 +362,7 @@ func (d *Driver) StartTask(cfg *drivers.TaskConfig) (*drivers.TaskHandle, *drive

        driverConfig.setVolumeMounts(cfg)

-       d.logger.Info("starting task", "driver_cfg", hclog.Fmt("%+v", driverConfig))
+       d.logger.Info("starting task", "driver_cfg", log.Fmt("%+v", driverConfig))
        handle := drivers.NewTaskHandle(taskHandleVersion)
        handle.Config = cfg

But the paths that the driver gets are different for FSIsolationNone than they are for other drivers -- FSIsolationNone is for things like raw_exec. The paths are set in the task environment in task_dir_hook.go#L81-L91, and that task environment gets passed into buildTaskConfig, which in turns gets handed to the driver. So that's why you'll find the docker and podman drivers anchor the source paths on the task directory -- the driver is not being passed /var/nomad/alloc/:alloc_id/local, but just /local.

Unrelated, but I notice that secrets, local, and alloc directories are being bind-mounted read-only? Is that intentional?

@shishir-a412ed
Copy link
Contributor Author

@tgross Thanks for all the explanation and the links!
You are right! When I set capabilities.FSIsolation to FSIsolationNone I get the full paths.

Jan 14 17:11:32 vagrant nomad[28337]:     2021-01-14T17:11:32.877Z [INFO]  client.driver_mgr.containerd-driver: HELLO HELLO START TASK: driver=containerd-driver @module=containerd-driver timestamp=2021-01-14T17:11:32.877Z
Jan 14 17:11:32 vagrant nomad[28337]:     2021-01-14T17:11:32.877Z [INFO]  client.driver_mgr.containerd-driver: SECRETS DIR: /tmp/NomadClient210331364/142c46c6-20df-16b9-7079-ec74508f2d7a/redis-task/secrets
Jan 14 17:11:32 vagrant nomad[28337]: : driver=containerd-driver @module=containerd-driver timestamp=2021-01-14T17:11:32.877Z
Jan 14 17:11:32 vagrant nomad[28337]:     2021-01-14T17:11:32.877Z [INFO]  client.driver_mgr.containerd-driver: TASK DIR: /tmp/NomadClient210331364/142c46c6-20df-16b9-7079-ec74508f2d7a/redis-task/local
Jan 14 17:11:32 vagrant nomad[28337]: : driver=containerd-driver @module=containerd-driver timestamp=2021-01-14T17:11:32.877Z
Jan 14 17:11:32 vagrant nomad[28337]:     2021-01-14T17:11:32.877Z [INFO]  client.driver_mgr.containerd-driver: ALLOC DIR: /tmp/NomadClient210331364/142c46c6-20df-16b9-7079-ec74508f2d7a/alloc

However, when FSIsolation is set to FSIsolationImage I only get /secrets, /alloc, and /local which is causing the problem.

Jan 14 17:09:24 vagrant nomad[28035]:     2021-01-14T17:09:24.560Z [INFO]  client.driver_mgr.containerd-driver: HELLO HELLO START TASK: driver=containerd-driver @module=containerd-driver timestamp=2021-01-14T17:09:24.560Z
Jan 14 17:09:24 vagrant nomad[28035]:     2021-01-14T17:09:24.560Z [INFO]  client.driver_mgr.containerd-driver: SECRETS DIR: /secrets
Jan 14 17:09:24 vagrant nomad[28035]: : driver=containerd-driver @module=containerd-driver timestamp=2021-01-14T17:09:24.560Z
Jan 14 17:09:24 vagrant nomad[28035]:     2021-01-14T17:09:24.560Z [INFO]  client.driver_mgr.containerd-driver: TASK DIR: /local
Jan 14 17:09:24 vagrant nomad[28035]: : driver=containerd-driver @module=containerd-driver timestamp=2021-01-14T17:09:24.560Z
Jan 14 17:09:24 vagrant nomad[28035]:     2021-01-14T17:09:24.561Z [INFO]  client.driver_mgr.containerd-driver: ALLOC DIR: /alloc

I am using the podman driver as a reference and working on a patch, that should fix the problem.

Unrelated, but I notice that secrets, local, and alloc directories are being bind-mounted read-only? Is that intentional?

This seems to be a bug, and it should not be ro. I will change it rw. Thanks for the catch!

@tgross
Copy link
Member

tgross commented Jan 14, 2021

Good to hear. I'm going to close this issue but please let us know if you run into any more issues!

@github-actions
Copy link

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants