do we need to run buildah containers always with BUILDAH_ISOLATION = chroot #5818
Comments
In many cases, a container that's run using the image will not be given enough privileges for …
Thanks @nalind for your reply.
Yes, the container image has the environment variable set in it to override the compiled-in default.
I have a similar need to run buildah in Kubernetes with better isolation.
What privileges are those? How can I check if the environment provides them?
For handling …
Some of these operations can also be denied by the seccomp filter, or by the SELinux policy (or other mandatory access control rules), and it's entirely possible that I'm still forgetting some things. For me, it tends to be a trial-and-error process.
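Since checking for the needed privileges tends to be trial and error, a small probe script can shortcut some of it. This is my own sketch, not from the thread: it assumes a Linux environment with `/proc` and util-linux `unshare`; capability bit numbers are taken from `linux/capability.h`.

```shell
#!/bin/sh
# Probe a few things buildah's oci isolation typically depends on.
# Diagnostic aid only, not an exhaustive check.

# cap_bit_set MASK BIT -> prints 1 if capability bit BIT is set in hex MASK
cap_bit_set() {
  mask="$1"; bit="$2"
  echo $(( (0x$mask >> bit) & 1 ))
}

# Effective capability mask of this process, as a hex string
capeff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)

# CAP_SYS_ADMIN is bit 21, CAP_SETFCAP is bit 31 (linux/capability.h)
echo "CAP_SYS_ADMIN: $(cap_bit_set "$capeff" 21)"
echo "CAP_SETFCAP:   $(cap_bit_set "$capeff" 31)"

# Can we create an unprivileged user namespace at all?
if unshare --map-root-user true 2>/dev/null; then
  echo "user namespaces: ok"
else
  echo "user namespaces: denied"
fi

# Is /dev/fuse available (needed for fuse-overlayfs)?
[ -e /dev/fuse ] && echo "/dev/fuse: present" || echo "/dev/fuse: missing"
```

The capability masks are also visible from outside with `getpcaps` or `capsh --decode`, but parsing `/proc/self/status` works in minimal images.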
I've had some time to play with it. I ended up with a Pod definition that seemingly makes nested containerization possible.

buildah-pod.yaml:

```yaml
apiVersion: v1
kind: Pod
metadata:
  generateName: buildah-
  labels:
    buildah-isolation-test: "true"
  annotations:
    # /dev/fuse fixes:
    #
    #   fuse: device not found, try 'modprobe fuse' first
    #
    # Wouldn't be needed with STORAGE_DRIVER=vfs
    io.kubernetes.cri-o.Devices: /dev/fuse
spec:
  restartPolicy: Never
  volumes:
    - name: workdir
      emptyDir: {}
  initContainers:
    - name: create-dockerfile
      image: quay.io/containers/buildah:v1.38.1
      volumeMounts:
        - name: workdir
          mountPath: /workdir
      workingDir: /workdir
      command: ["bash", "-c"]
      args:
        - |-
          cat << EOF > Dockerfile
          FROM docker.io/library/alpine:latest
          RUN echo "hello world"
          EOF
  containers:
    - name: buildah
      image: quay.io/containers/buildah:v1.38.1
      volumeMounts:
        - name: workdir
          mountPath: /workdir
      workingDir: /workdir
      env:
        - name: BUILDAH_ISOLATION
          value: oci
        - name: STORAGE_DRIVER
          value: overlay
      command: ["bash", "-c"]
      # unshare fixes:
      #
      #   error running container: from /usr/bin/crun ... opening file `/sys/fs/cgroup/cgroup.subtree_control` for writing: Read-only file system
      #
      # --mount fixes:
      #
      #   Error: mount /var/lib/containers/storage/overlay:/var/lib/containers/storage/overlay, flags: 0x1000: operation not permitted
      #
      # --map-root-user fixes:
      #
      #   unshare: unshare failed: Operation not permitted
      #
      # --net=host fixes:
      #
      #   error running container: from /usr/bin/crun ...: open `/proc/sys/net/ipv4/ping_group_range`: Read-only file system
      #
      # --pid=host fixes:
      #
      #   error running container: from /usr/bin/crun ...: mount `proc` to `proc`: Operation not permitted
      args:
        - |-
          # can also add --pid --fork to unshare
          unshare --map-root-user --mount -- buildah build --net=host --pid=host .
      securityContext:
        capabilities:
          add:
            # SETFCAP fixes:
            #
            #   unshare: write failed /proc/self/uid_map: Operation not permitted
            - SETFCAP
        seLinuxOptions:
          # container_runtime_t fixes:
          #
          #   error running container: from /usr/bin/crun ...: mount `devpts` to `dev/pts`: Permission denied
          type: container_runtime_t
```

Test with:

```shell
kubectl delete pod -l buildah-isolation-test=true
kubectl create -f buildah-pod.yaml
sleep 5
kubectl logs -l buildah-isolation-test=true --tail=-1 --follow
```

@nalind could you share your thoughts on the security implications of the settings I had to use:
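To make the role of `unshare --map-root-user` in the Pod's build command more concrete: it succeeds when it can write a mapping line like `0 <uid> 1` into `/proc/self/uid_map`. The helper below is my own sketch (not from the thread); it reads a uid_map and reports whether uid 0 inside the namespace is mapped at all.

```shell
#!/bin/sh
# is_root_mapped: read uid_map lines ("inside outside count") on stdin
# and report whether uid 0 inside the namespace falls into any range.
is_root_mapped() {
  while read -r inside outside count; do
    [ -z "$inside" ] && continue
    if [ "$inside" -le 0 ] && [ $((inside + count)) -gt 0 ]; then
      echo yes
      return 0
    fi
  done
  echo no
  return 1
}

# A root-mapped namespace, as created by `unshare --map-root-user`:
echo "0 1000 1" | is_root_mapped              # yes
# A namespace that only maps unprivileged ids:
echo "100000 100000 65536" | is_root_mapped   # no

# Against the live namespace inside the Pod, the same check would be:
#   unshare --map-root-user cat /proc/self/uid_map | is_root_mapped
```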
When attempting to nest a container, the "host" namespaces are those being used by the container. If it runs, great.
Here's what I'm using as a one-liner reference example that seems to work rootless right now (it will also work rootful, but I think it needs more security for that; see below):
Now, of these alone, having to do … As far as security, I'd emphasize from my PoV: as long as the outer container is invoked with a user namespace (as Nalin mentions, …
Thanks for the suggestions! Adding … I tried removing the unshare command and adding … Unfortunately, mounting …
I found the labeling scary as well. It seems to work with …
Wait, maybe that's just the config in the quay.io/containers/buildah image …
The image contains an …
That helped! 🎉 Any thoughts on having to set …
Outside of a container, it's usually labeled …
Yes, I think using …
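For readers following the labeling discussion: an SELinux context string has the form `user:role:type:level`, and the type field (e.g. `container_t` vs. `container_runtime_t`, which the Pod above sets) is what policy rules match on. A small helper, my own sketch for illustration:

```shell
#!/bin/sh
# ctx_type: print the type field (third component) of a SELinux
# context string such as "system_u:system_r:container_t:s0:c1,c2".
ctx_type() {
  echo "$1" | cut -d: -f3
}

ctx_type "system_u:system_r:container_t:s0:c1,c2"   # container_t

# On an SELinux-enabled host, the live process context is readable via:
#   cat /proc/self/attr/current
# and file labels via:
#   ls -Z /dev/fuse
```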
Derp, I don't think it did anything at all. The cluster I was using to test probably doesn't enable the … I'll try to get a cluster with user namespaces actually enabled. In any case, it seems it's possible to make …
That still seems preferable to using …
Hi,
I have a buildah container image (quay.io/buildah/stable:latest) running in Kubernetes with the default setting of BUILDAH_ISOLATION=chroot. However, I am wondering: is this really required to run buildah as a container?
Can someone please explain this:
https://github.com/containers/buildah/blob/main/docs/buildah-build.1.md
> **--isolation** *type*
>
> Controls what type of isolation is used for running processes as part of RUN instructions. Recognized types include oci (OCI-compatible runtime, the default), rootless (OCI-compatible runtime invoked using a modified configuration, with --no-new-keyring added to its create invocation, reusing the host's network and UTS namespaces, and creating private IPC, PID, mount, and user namespaces; the default for unprivileged users), and chroot (an internal wrapper that leans more toward chroot(1) than container technology, reusing the host's control group, network, IPC, and PID namespaces, and creating private mount and UTS namespaces, and creating user namespaces only when they're required for ID mapping).
>
> Note: You can also override the default isolation type by setting the BUILDAH_ISOLATION environment variable. `export BUILDAH_ISOLATION=oci`
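The quoted docs imply a precedence: an explicit `--isolation` flag beats `BUILDAH_ISOLATION`, which beats the compiled-in default. The helper below is my own illustration of that precedence, not buildah's actual code, and it simplifies by treating `oci` as the only default (per the docs, unprivileged users actually default to `rootless`):

```shell
#!/bin/sh
# effective_isolation FLAG: model which isolation type wins.
# FLAG is the value passed to --isolation, or "" if the flag was not given.
effective_isolation() {
  flag="$1"
  if [ -n "$flag" ]; then
    echo "$flag"                   # explicit flag wins
  elif [ -n "$BUILDAH_ISOLATION" ]; then
    echo "$BUILDAH_ISOLATION"      # env var overrides the default
  else
    echo "oci"                     # compiled-in default (simplified)
  fi
}

BUILDAH_ISOLATION=chroot
effective_isolation ""     # chroot: env var overrides the default
effective_isolation "oci"  # oci: explicit flag overrides the env var
```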