-
Notifications
You must be signed in to change notification settings - Fork 882
sandbox/app-add: fix mount targets with absolute symlink targets #3490
Conversation
Hmm. |
// https://github.com/systemd/systemd/blob/v231/src/core/mount-setup.c#L392 | ||
// We set up the temporary playground as a slave bind mount to avoid this | ||
// limitation. | ||
func addMountStage0( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reasoning behind moving this out of common
? It really seems like library code, not main
code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used only in app-add and very specific to systemd stage1.
@squeed I don't quite understand the question. What do you mean by rewrite? The problem is that the stage2 rootfs may contain absolute symlinks so we have to chroot into that environment. |
Let me make sure I understand the problem: You bind mount something with a destination of |
@squeed that is the problem, yes and So for the previous example we actually have a bug in I can change this PR to use |
Regarding your comment to do a double This is not so easily possible since the incoming directory inside stage1 is of type |
I need to reiterate on this code anyways, and we still are discussing alternative implementations. |
I don't mean a double |
case 1: | ||
err = appAddStage1(appName, uuid, flagTarget) | ||
default: | ||
log.Fatalf("unkown stage %d", flagStage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unknown
|
||
mounts, err := stage1init.GenerateMounts(ra, p.Manifest.Volumes, stage1init.ConvertedFromDocker(imageManifest)) | ||
if err != nil { | ||
return errwrap.Wrapf("Cou not generate mounts", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could
// symlinks will be wrong. | ||
mntPath, err := stage1init.EvaluateSymlinksInsideApp(appRootfs, m.Mount.Path) | ||
if err != nil { | ||
return errwrap.Wrap(fmt.Errorf("Could not evaluate path %v", m.Mount.Path), err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrapf, could
|
||
err = mnt.Mount(src, pg.Playground(), "bind", syscall.MS_BIND, "") | ||
if err != nil { | ||
return errwrap.Wrapf("mount move src to rkt.propagate.stage1/mount failed", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't really a move yet?
6f1f972
to
e82e721
Compare
e82e721
to
b87c707
Compare
// this function behaves like `readlink -m`. | ||
// | ||
// Unlike `readlink` EvalSymlinksAlways might return a relative path. | ||
func EvalSymlinksAlways(path string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concern from my side: This is from the Go stdlib: https://godoc.org/path/filepath#EvalSymlinks. The problem with the stdlib-EvalSymlinks method is, that it err's in case the target link doesn't exist (readlink -e
behavior).
We need readlink -m
behavior, where path components do not need to exist, because we create the directories after all.
We could introduce a mkdir -p
behavior in the symlink evaluation method before invoking Go's stdlib EvalSymlinks
, but this would require yet more bigger refactorings, because currently rkt implements those two concerns "evaluating symlinks" and "ensuring the target exists" separately, hence this is (currently) the lesser evil.
gah, I hit some regression :-( |
Fortunately not a regression, two quite old unit tests TestAppToNspawnArgsRecursive and TestAppToNspawnArgsOverridesImageManifestReadOnly are broken now, because:
Violation of Invariant 2. was easy to fix but 1. not. Both tests check if nspawn args were generated correctly using a regex: https://github.com/coreos/rkt/blob/v1.22.0/stage1/init/common/pod_test.go#L171 I made a quick pass on the functional test suite and saw we do some read-only mount tests there and I think the unit tests above are questionable per se. I disabled the tests in a separate commit to assert, if there are more failures. |
@@ -778,3 +758,61 @@ func protectKernelTunables(opts []*unit.UnitOption, appName types.ACName, system | |||
|
|||
return opts | |||
} | |||
|
|||
// chroot is the struct that represents a chroot environment | |||
type chroot struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe move this to pkg
too?
Oh well, we introduced sandbox mount tests, and they apparently fail consistently on semaphore:
I can't reproduce this locally so I'll have to research on semaphore directly :-/ |
case dir == "": | ||
newpath, _, err := walkLink(file, linksWalked) | ||
return newpath, err | ||
case file == "": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hah. I hate this behavior of Split
so much.
link = filepath.Join(link, p) | ||
continue | ||
} | ||
chroot, err := newChroot(appRootfs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it make sense to defer chroot.Escape()
here? So our root is where we want it to be if we error out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, I was thinking of it myself. But if Escape()
fails, we're in a limbo state, no?
make clean target failures: I guess my absolute links to |
ee0246a
to
e48306a
Compare
This adds a new method for resolving symlinks. This is needed for rkt#3483
Currently resolving symlinks in stage1 is broken for a couple of edge cases, most prominently absolute symlinks which escape the stage2 rootfs. This fixes it by evaluating symlinks in chroot'ed environment in the stage2 rootfs. Fixes rkt#3483
This renames the above function to be more idiomatic.
Since EvaluateSymlinksInsideApp requires chroot now, these tests fail. This disables them.
3b52941
to
91cc9cf
Compare
current strategy: disable the Semaphore doesn't obey my env var, hence investigating. Maybe there's a missing |
91cc9cf
to
6720ad8
Compare
This test fails on semaphore due to an old kernel and thus is disabled there via an env variable.
6720ad8
to
2eb98e5
Compare
if !strings.HasPrefix(link, appRootfs) { | ||
return "", fmt.Errorf("symlink %q escapes app's root with value %q", next, target) | ||
} | ||
if err := chroot.escape(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to @squeed concern above, better to move this right up after EvalSymlinksAlways
and do error checking and early returning only after it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I move this right after EvalSymlinksAlways
, then filepath.Abs
will be invoked in the outer rootfs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then nevermind, my previous comment is wrong. i thought FS-related operation were confined to EvalSymlinksAlways
.
LGTM. I re-triggered jenkins for the flake. |
The debian-testing, coreos failure has been seen before in #3224 (comment). The logs for the current failure are here. Relevant output:
I believe this is unlreated to this PR since it was already green in a previous run for this flavor, but I retriggered the build. |
retriggered build is green, hence merging. |
Currently the sandbox operates on the stage2 rootfs within the
stage1 mount namespaces.
In cases where the stage2 rootfs includes absolute symlink targets (i.e.
/var/run -> /run in Debian images) this scheme currently fails.
This introduces another step in the mount propagation to stage2. In
addition to performing a move mount to stage1, another bind mount and
chroot into stage2 has to be done in order to retain absolute symlinks.
Fixes #3483
TODOs: