Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require the mount destination to be created before hand #591

Closed
wants to merge 1 commit into from

Conversation

mrunalp
Copy link
Contributor

@mrunalp mrunalp commented Oct 18, 2016

This will allow the rootfs to be truly read only (like a NFS read only mount for e.g.)

cc: @rhvgoyal @philips
Signed-off-by: Mrunal Patel mrunalp@gmail.com

Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
@mrunalp
Copy link
Contributor Author

mrunalp commented Oct 18, 2016

Another alternative would be to add a flag to CLI API to make this optional.

@@ -47,6 +47,7 @@ The parameters are similar to the ones in [the Linux mount system call](http://m

* **`destination`** (string, REQUIRED) Destination of mount point: path inside container.
For the Windows operating system, one mount destination MUST NOT be nested within another mount (e.g., c:\\foo and c:\\foo\\bar).
The destination MUST be present in the root filesystem. Runtime MUST NOT create the destination.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“Runtime” → “The runtime”.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, one sentence per line.

@wking
Copy link
Contributor

wking commented Oct 18, 2016 via email

@@ -47,6 +47,7 @@ The parameters are similar to the ones in [the Linux mount system call](http://m

* **`destination`** (string, REQUIRED) Destination of mount point: path inside container.
For the Windows operating system, one mount destination MUST NOT be nested within another mount (e.g., c:\\foo and c:\\foo\\bar).
The destination MUST be present in the root filesystem. Runtime MUST NOT create the destination.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we just make this a restriction which only applies if readonly is set to true for the rootfs. I get why you'd want a sweeping statement like this, but as @wking mentioned it does make nested tmpfs mounts quite hard -- and in fact would make runC's method of handling cgroups not really possible anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the "MUST be present in the root filesystem" means that you can't really have nested tmpfs mounts anyway. Because /tmp/a/b must already have existed in the host root filesystem in order for you to mount it (which would require making the directory). Maybe we should make it something like:

The destination MUST already be present if the destination mountpoint is on the root filesystem.
Runtimes MUST NOT modify the container's root filesystem to fulfil destination requirements.

@rhatdan
Copy link
Contributor

rhatdan commented Oct 19, 2016

I believe the goal here is that the base rootfs is readonly. Mounting multiple directories on top of a tmpfs should be allowed. Getting the wording correct on this is paramount.
We want to use runc on the same rootfs for multiple containers, so it can not be scribbling all over the rootfs.

@cyphar
Copy link
Member

cyphar commented Oct 19, 2016

@rhatdan What do you think of my proposed wording in #591 (comment)?

@rhvgoyal
Copy link

May be make it per mount property where callers can specify whether to create destination file/dir or not if it is not present. That should be generic enough to take care of wide variety of use cases.

@rhatdan
Copy link
Contributor

rhatdan commented Oct 19, 2016

@cyphar I like your wording.

@wking
Copy link
Contributor

wking commented Oct 19, 2016

Can we get more details on how this would break runC's cgroup
handling 1? It would be unfortunate to completely break runC
containers where the rootfs is readonly and the only mount.

On Wed, Oct 19, 2016 at 05:33:48AM -0700, Vivek Goyal wrote:

May be make it per mount property where callers can specify whether
to create destination file/dir or not if it is not present.

I like this better than @cyphar's wording 2, because it lets folks
safely share filesystems besides their root one (e.g. maybe they want
to mount the same /etc into a bunch of containers). Even
Windows uses ‘ro’ to represent read-only for other mounts [3,4], so we
could lean on that if we do end up tying this restriction to
read-only-ness.

@mrunalp
Copy link
Contributor Author

mrunalp commented Oct 19, 2016

I think making it a per mount config may give most flexibility as @rhvgoyal suggested.

@crosbymichael
Copy link
Member

I don't like having the option. Why can the runtime not just set to create it if it does not exist and if it cannot it errors. That still leaves it up to the high level to create these directories if its going to have an RO rootfs and if it does not the runtime gives a error.

I think an additional field per mount just adds more complexity than just erroring if it does not exist and it cannot be created.

@crosbymichael
Copy link
Member

Besides, you are still not solving the real problem of having a shared rootfs with volumes when volumes are a container level config and the rootfs is a shared object across containers. Even if you create these in some "init" layer, containers without a volume at the same path will still have these dirs/files showing up in other containers because the rootfs+init layer is shared.

@rhvgoyal
Copy link

@crosbymichael I will not create directories for read-only containers and expect them to be part of the image. That way applications will not have any surprise.

I think this requires little thought process shift. directories being mounted on need to be part of the image for read-only containers.

Alternatively, we users could think of mounting a tmpfs and creating directories in that tmpfs and doing all the volume mounting there for read-only containers?

@crosbymichael
Copy link
Member

@rhvgoyal ok, that makes more sense. If that is the case I don't see why we need a spec change as it raises more questions about nested mounts.

The runtimes can stay the same and when you mount from the filesystem's you just make it readonly. If the path does not exist the runtime tries to create the dir and errors as a read-only filesystem error and the image author should know that they need to create these dirs or they need to remove a volume mount.

Modifying the spec or adding a new field to tell the runtime to create the dir or not is just splitting the information in two places. As an image/container author I have to both, create a dir in the image. Tell the spec, dont create this dir. It should just work as expected. I don't see a spec change adding any value to solving this problem.

@rhvgoyal
Copy link

@crosbymichael so you are suggesting that caller can leave rootfs read-only so that if runc tries to create a file/dir it fails? We can try it. Hopefully it works with pivot_root(".", ".") stuff.

@crosbymichael
Copy link
Member

I mean, if we had this spec MUST how are you supposed to validate or what type of error are you going to return anyways? It would basically be the same with or without.

@mrunalp
Copy link
Contributor Author

mrunalp commented Oct 19, 2016

The runtimes can stay the same and when you mount from the filesystem's you just make it readonly. If the path does not exist the runtime tries to create the dir and errors as a read-only filesystem error and the image author should know that they need to create these dirs or they need to remove a volume mount.

This should work. It may require some changes to runc like the pivot_root change.

@crosbymichael
Copy link
Member

@mrunalp ya, and doing a stat before a mkdirall. I cannot remember if mkdirall returns an error if the underlaying filesystem is RO but the dir exists. Just something to double check

@mrunalp
Copy link
Contributor Author

mrunalp commented Oct 19, 2016

Closing this per the discussion on OCI weekly call today. We can handle this in the runtime (runc). Thanks!

@mrunalp mrunalp closed this Oct 19, 2016
@wking wking mentioned this pull request May 24, 2017
wking added a commit to wking/ccon that referenced this pull request Feb 26, 2018
This makes it possible to mount content inside a tmpfs.  For example,
systemd wants a tmpfs at /dev [1] and you may want to also mount
/dev/pts, /dev/shm, etc.  Or you may want to put a tmpfs at
/sys/fs/cgroup and a cgroup filesystem at /sys/fs/cgroup/systemd.
There's also some background discussion in [2].

[1]: https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/
[2]: opencontainers/runtime-spec#591
     Subject: Require the mount destination to be created before hand
wking added a commit to wking/ccon that referenced this pull request Feb 26, 2018
This makes it possible to mount content inside a tmpfs.  For example,
systemd wants a tmpfs at /dev [1] and you may want to also mount
/dev/pts, /dev/shm, etc.  Or you may want to put a tmpfs at
/sys/fs/cgroup and a cgroup filesystem at /sys/fs/cgroup/systemd.
There's also some background discussion in [2].

[1]: https://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/
[2]: opencontainers/runtime-spec#591
     Subject: Require the mount destination to be created before hand
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants