Make the sandboxed file system more strict #7313

emusand · 2019-01-31T11:26:18Z

Description of the problem / feature request:

Make it possible to configure the sandbox to whitelist local directories. The sandbox will have read access to only these directories (and its execroot). No other local directories will be available.

Today it is possible to blacklist directories with option --sandbox_block_path=<directory>. This feature request adds the possibility to whitelist directories instead.

Feature requests: what underlying problem are you trying to solve with this feature?

The current sandbox has read permissions to its execroot and almost everything in /. If a rule reads a file with absolute path, bazel assumes it is a file provided by the operating system. Bazel will not rebuild the target if this file is updated.

My work group needs more hermetic builds. We have bad experience from a previous build system (IBM ClearCase) which did not track file accesses outside of the workspace (VOB). This is almost exactly the same limitation as in the current sandbox; rules can read any file on our distributed file systems with an absolute path, but the target will not be rebuilt if this file is updated. This limitation forced us to turn off the remote cache in ClearCase, and avoid using incremental builds in CI, since they were not reliable.

Any other information, logs, or outputs that you want to share?

This has been discussed in the bazel-discuss Google group.

Design Document: Bazel Sandboxing 2.0 describes the current sandbox well, and the reason for allowing read access to everything in /.

My work group is willing to implement this feature.

The text was updated successfully, but these errors were encountered:

philwo · 2019-01-31T18:52:30Z

@benjaminp Could you share the link to your tweaked sandbox that can mount an image instead of the host / again? I remember that you posted it on an issue recently, but can't find it.

I wonder if that would already be enough to support this use case. We could spend some time to polish it and get it into Bazel mainline :)

benjaminp · 2019-01-31T19:17:16Z

#6994 (comment) has the details.

emusand · 2019-02-05T11:49:48Z

I think that this feature request and mounting an image solve two different use cases.

As a first step we want to use parts the local system, but we want to make sure that no absolute paths to our distributed filesystems are available. This means that our builds will not be fully hermetical. That's OK for us. Our assumption is that the Linux distributions in our environment are similar enough, and that the differences in the local systems do not matter.

However, in the long run we want to migrate to fully hermetical builds. Then the possibility to mount a Docker image as / sounds very interesting.

benjaminp · 2019-02-05T16:02:40Z

Even with a rootfs, you can mount whatever you like with --sandbox_add_mount_pair.

emusand · 2019-02-05T18:19:14Z

Aha, I see. With your tweaked sandbox we can mount a dummy image as /, and then use --sandbox_add_mount_pair to mount our white-listed local directories. This will solve our use case!

emusand · 2019-02-06T10:03:39Z

It would be great to get the --sandbox_rootfs option into bazel baseline. This problem basically stops my work group from adopting bazel.

Would it be possible to clean it up and create a pull request? If so, do you have any rough estimate of how long time this will take?

philwo · 2019-02-06T10:10:35Z

I would be happy to get this into Bazel mainline and review a PR. :)

benjaminp · 2019-02-11T22:54:38Z

While the rootfs is a useful feature, I think It's worth taking a step back and deciding what the strategy for (Linux) sandboxing is before adding it. Maybe Bazel should switch to using a real container runtime like runc instead of slowly reinventing all container features in linux-sandbox. We'd also want to make sure whatever we do is compatible with sandboxfs.

emusand · 2019-02-12T16:50:16Z

We just want to use --sandbox_rootfs to avoid mounting /. A --[no]sandbox_mount_root would work just as fine for us.

Do you think that we can add the latter option instead? I agree that the number of sandbox tweaking options start becoming many, and that we should settle on a long-term strategy for Linux sandboxing, but perhaps a --[no]sandbox_mount_root option is acceptable anyway.

jmmv · 2019-03-03T13:27:29Z

I agree with @benjaminp on rethinking how sandboxing should work if we are going to make it more strict. I can't tell yet if adding an extra option is a good idea though, but maybe it's fine in the interim. From what @emusand says in the last comment, it sounds "simple", so if you could share a PR, maybe we could take it from there? :)

emusand · 2019-03-03T13:45:13Z

Ok, then we will implement the --[no]sandbox_mount_root option and share a PR!

Globegitter · 2019-04-03T20:40:47Z

@emusand any update on this option?

emusand · 2019-04-03T22:14:02Z

I hope to get time to implement the option in the next few weeks.

perwestling · 2019-07-15T14:10:25Z

emusand has been a bit buzzy the last couple of months, but others in the team have started to have a look at this. So maybe we will have something ready during August.

emlrsua · 2019-07-19T06:35:02Z

I'm the one who is looking at this, at least for the next week while the others are away on vacation.

Please correct me if I am mistaken, but it looks like the agreed upon solution is to implement the --[no]sandbox_mount_root option in order to avoid mounting /. We'd then use a whitelisting option, perhaps the pre-existing --sandbox_add_mount_pair, to add visibility to the tools/directories of our choice.

Are the changes to implement --[no]sandbox_mount_root likely to be limited to the files in the sandbox directory (such as LinuxSandboxedSpawnRunner.java and AbstractSandboxSpawnRunner.java)? Or are there other files I need to know about?

Would examining the code for the older (blacklisting) sandbox be helpful, and is a good representation of that release 0.5.2?

Where is / mounted? Is that done, for instance, at the start of the AbstractSandboxSpawnRunner.getWritableDirs method (I'm looking at release 0.25.0)? Does the local variable sandboxExecRoot contain the reference to /? If not, what holds that?

philwo · 2019-07-19T09:35:48Z

You can't just "not mount root" - you have to mount something on /. :) The question is only what do you mount there - your real "/" from the host, or an empty directory (this will usually not work, because shared libraries and certain tools are just assumed to exist), or a chroot that contains a minimum set of files that you need.

Next, you will have to completely change how the mounting is done in the linux-sandbox code. What we currently do is just remount things in-place to read-only except for the paths that we want to write to, then optionally bind mount whatever the user specified via --sandbox_add_mount_pair=source:target.

What this would have to look like if you want to implement mounting an alternate root is that you create a container directory, then mount all paths that are needed there, then pivot_root or chroot correctly into the container directory and run the command.

The Linux sandboxing code that does the actual mounts is implemented in C. The relevant code is here:

This builds the command-line to call the linux-sandbox binary:
https://source.bazel.build/bazel/+/master:src/main/java/com/google/devtools/build/lib/sandbox/LinuxSandboxedSpawnRunner.java;l=164
This is the linux-sandbox: https://source.bazel.build/bazel/+/master:src/main/tools/linux-sandbox.cc;l=209?q=linux-sandbox&ss=bazel

It's not much code and should be easy to understand.

The old sandbox code is here (this is the last revision before the rewrite): https://source.bazel.build/bazel/+/774553eea688338caae754c49fbfc66d9a3475b7:src/main/tools/linux-sandbox.c;bpv=;bpt=0

The old sandbox did use pivot_root and the container directory approach. It might be interesting to look at how it did it.

As mentioned in earlier comments on this issue, it might be a better approach to investigate adding a new SandboxedStrategy that uses "runc" instead.

emusand · 2019-07-19T11:09:50Z

Hi Mark,

Thank you for giving this a shot.

As @philwo explained in his excellent reply, the task is a bit more intricate than my previous posts might have indicated.

The old sandbox tried to give read access to the local system but nothing else, by only mounting a hard-coded list of local directories. Then users complained that bazel did not find tools installed in other directories. The new sandbox gives read access to everything under /.

For us the old sandbox worked better than the new. With the new sandbox, a user can add an implicit dependency to a file in our distributed file systems, for instance by adding a genrule(...) that reads a file in /proj with an absolute path. Bazel will not rebuild if this file is updated. Note that we still can, and need to, read files in /proj, but we should add these files to a bazel workspace and avoid reading them explicitly with absolute paths.

My idea was to add a --[no]sandbox_mount_root option, which basically toggles between the old and new sandbox behavior. If false, the sandbox would just mount a minimal part of the local system, like the old sandbox did. If true, everything will be mounted as read-only, like the new sandbox does. This means that we will have to bring back most of the code from the old linux-sandbox.c file that @philwo linked to, and place it in linux-sandbox-pid1.cc or in a separate file.

frazze-jobb · 2020-12-10T15:26:11Z

Hi everyone,

After a long pause on our efforts regarding this issue, we are picking it up again.
Has work been done in this area since last 2 years? @philwo @benjaminp

We want to extend the linux-sandbox to makes it more hermetic, by allowing users to whitelist directories to be bind mounted (such as /bin) and don't expose other directories in the sandbox.

frazze-jobb · 2021-02-15T16:10:34Z

We have started prototyping on an implementation and we would appreciate to hear your opinions. 😃

Our proposed solution is to:

Base it on the current linux-sandbox and the old sandbox code.
Add a flag to toggle the hermetic sandbox (yet to be named)
Create hard-links for files outside the bazel sandbox when the actual files and the sandbox resides in the same filesystem.
Copy files when the files and the sandbox does not reside in the same filesystem.
Utilizing bind mount for white listed directories, such as /usr/bin. Using --sandbox_add_mount_pair.
Isolate the sandbox with pivot_root.

We aim to create a pull request in a month or so.

philwo · 2021-02-15T16:29:44Z

Hi @frazze-jobb,

that sounds really cool! I'm happy to review the code when it's ready.

One question: If I understand correctly, you want to hard-link (or fallback to copy) input files of actions into the sandboxed execution root, in contrast to the current strategy of symlinking them. How do you want to prevent sandboxed actions from accidentally modifying the contents of hard-linked input files?

frazze-jobb · 2021-02-15T18:14:38Z

Hi @philwo,

We are planning to infer --sandbox_fake_username, hopefully that prevents it.

ulfjack · 2021-02-17T14:35:26Z

As far as I am aware, Linux currently does not have an acceptable high-performance mechanism to prevent modifying contents of hard-linked input files. I did research this problem a lot, but nothing I tried worked out. The closest I was able to get is by running the action under a different user id. The problem with that is that the process outside the sandbox can't read / delete the output files in all cases unless it has root privileges. I tried using acls which mostly works, but not quite. The fake user id used by the sandbox maps to the original user, and doesn't prevent file system modification as that user.

mafanasyev-tri · 2021-02-17T16:51:52Z

Suggestion: one method which can work for bazel would be to try to detect modifications instead. Once the action is done, re-run the stat on each file, and fail the action if ctime has changed.

philwo · 2021-02-18T20:46:33Z

I can think of two ways how to handle input files. The ideal way for local sandboxing would be to use a filesystem that supports copy-on-write copies like btrfs, XFS or APFS. That way you can make actual copies of files that do not take up any storage and the operation is basically instant, because only metadata has to be written (like a hard-link). On macOS it would probably be fine to just assume that everyone uses APFS by now, so we could use the feature there. On Linux, with ext4 still being the default filesystem for many new installations, it is probably not easily feasible.

Another idea was to let Bazel manage a content-addressed storage folder with all inputs and outputs, mount it read-only into the sandbox and then do symlinks to files in it.

<sandbox>/hello.cc -> /cas/<sha256 of hello.cc>
<sandbox>/hello.h -> /cas/<sha256 of hello.h>

Because the files in the /cas/ directory are in a flat namespace and their names are only hash digests, an action that follows the symlink and "looks around" would be much less likely to be able to introduce non-hermeticity compared to our current symlinking approach, where you end up in the actual workspace tree. However, this solution would come with a performance and storage overhead, as effectively all input and output files would have to be stored twice. :/

Edit: Output files would only have to be stored once (in the CAS) as you could just create symlinks in the bazel-out/ tree pointing into it.

frazze-jobb · 2021-02-19T19:20:15Z

Thanks for the input! It's highly appreciated. As you already mentioned using --sandbox_fake_username did not help us in this case.

We need to rethink how we implement support for hardlinks. We have an ongoing discussion internally in our organization that we may be moving to XFS on our linux systems in the future. So that copy-on-write feature sounds interesting.

Would you accept an --experimental_hermetic_hardlink_sandbox? With the disclaimer that its unsafe to use and may ruin your files.

You could probably clean away all your input-file modifications if you have the detection mechanism and then be able to use the hardlink-sandbox relatively safely.
It seems that some kind of detection of writing on the input files are already present.
WARNING: /.../execroot/cwd/bazel-out/host/.../somefile was modified during execution

frazze-jobb · 2021-03-31T07:13:16Z

Hi again,

I have now created a PR. I followed @mafanasyev-tri suggestion to detect modifications. Once the action is done, I re-run the stat on each file, and fail the action if mtime has changed because ctime gets modified when you increase link reference counter.

ulrfa · 2021-03-31T08:49:12Z

Excellent work @frazze-jobb! The PR also resolves #7091 and bazelbuild/rules_python#382.

Mathiasdm · 2021-05-11T07:13:45Z

Out of curiosity, any reason not to use sandboxfs for such a hermetic sandbox? Or would that give problems with the chroot?

frazze-jobb · 2021-05-11T17:43:44Z

Before we started working with this implementation, we had benchmarked sandboxfs and it seemed to be significantly slower than linux-sandbox, so we did not proceed with sandboxfs. And it was also noted at the time that "the Bazel community is uncertain if this is something to be used in the future or not" (Before June 2020).
At our site we are seeing a slight performance increase compared to linux-sandbox with the PR.
I don't know if we would get into any problems by implementing it in sandboxfs as I haven't digged in to it.

psigen · 2021-10-18T03:00:28Z

@frazze-jobb : is this issue resolved with your PR here getting merged? 🤞
#13279

frazze-jobb · 2021-10-18T07:06:28Z

Yes

philwo · 2021-10-19T12:39:13Z

Thank you, then I'll close this issue! 😊

dslomov added team-Local-Exec Issues and PRs for the Execution (Local) team type: feature request untriaged labels Jan 31, 2019

jmmv added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels Mar 3, 2019

jmmv added this to the sandboxfs milestone Mar 14, 2019

Globegitter mentioned this issue Apr 3, 2019

Block access to files/paths outside of the sandbox #7936

Closed

This comment has been minimized.

Sign in to view

jmmv changed the title ~~Feature request: Make sandbox more strict~~ Make the sandboxed file system more strict May 13, 2020

divanorama mentioned this issue Feb 8, 2021

[request] robust way to specify LD_LIBRARY_PATH for java/javac #12978

Closed

benjaminp mentioned this issue Feb 18, 2021

Hermeticity bug: Package setup can find host libraries (leading to import failures) dropbox/dbx_build_tools#31

Open

frazze-jobb mentioned this issue Mar 30, 2021

Implement Hermetic sandbox with support for hardlinks #13279

Closed

philwo closed this as completed Oct 19, 2021

Make the sandboxed file system more strict #7313

Make the sandboxed file system more strict #7313

Comments

emusand commented Jan 31, 2019

Description of the problem / feature request:

Feature requests: what underlying problem are you trying to solve with this feature?

Any other information, logs, or outputs that you want to share?

philwo commented Jan 31, 2019

benjaminp commented Jan 31, 2019

emusand commented Feb 5, 2019

benjaminp commented Feb 5, 2019

emusand commented Feb 5, 2019

emusand commented Feb 6, 2019

philwo commented Feb 6, 2019 • edited Loading

benjaminp commented Feb 11, 2019

emusand commented Feb 12, 2019

jmmv commented Mar 3, 2019

emusand commented Mar 3, 2019

Globegitter commented Apr 3, 2019

emusand commented Apr 3, 2019

perwestling commented Jul 15, 2019

emlrsua commented Jul 19, 2019 • edited Loading

philwo commented Jul 19, 2019

emusand commented Jul 19, 2019 • edited Loading

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

frazze-jobb commented Dec 10, 2020

frazze-jobb commented Feb 15, 2021

philwo commented Feb 15, 2021

frazze-jobb commented Feb 15, 2021

ulfjack commented Feb 17, 2021

mafanasyev-tri commented Feb 17, 2021

philwo commented Feb 18, 2021 • edited Loading

frazze-jobb commented Feb 19, 2021 • edited Loading

frazze-jobb commented Mar 31, 2021

ulrfa commented Mar 31, 2021

Mathiasdm commented May 11, 2021

frazze-jobb commented May 11, 2021

psigen commented Oct 18, 2021

frazze-jobb commented Oct 18, 2021

philwo commented Oct 19, 2021

philwo commented Feb 6, 2019 •

edited

Loading

emlrsua commented Jul 19, 2019 •

edited

Loading

emusand commented Jul 19, 2019 •

edited

Loading

philwo commented Feb 18, 2021 •

edited

Loading

frazze-jobb commented Feb 19, 2021 •

edited

Loading