Missing examples / tests for --userns2 option #542

Open
Sacred-Salamander opened this issue Dec 11, 2022 · 9 comments

Comments

@Sacred-Salamander

Trying to get a grasp on the nested namespaces and how to enter those with --userns2
There are no examples or tests I can find

For example when I run

bwrap --unshare-user --dev-bind / / --tmpfs /tmp /bin/bash

I believe this creates a nested user namespace: inside the sandbox, lsns shows a complete namespace set (time, cgroup, uts, ipc, net, user, mnt, pid),
while lsns on the host shows only 2 namespaces for the process (user, mnt).

How can I make this work?

bwrap --userns 11 --userns2 12 --dev-bind / / /bin/bash 11</proc/80157/ns/user 12</proc/???/ns/user

I tried with

bwrap --userns 11 --userns2 12 --dev-bind / / /bin/bash 11</proc/80157/ns/user 12</proc/80157/ns/user

but it results in this error

bwrap: Setting userns2 failed: Invalid argument

What should I feed into the file descriptor here?

I'm also wondering about this text in the bwrap manual about the option

This is useful because sometimes bubblewrap itself creates nested user namespaces (to work around some kernel issues) and --userns2 can be used to enter these.

Can anyone fully explain how this works, and when bubblewrap creates nested namespaces and when it doesn't?
What are the kernel issues being worked around? Are there any upstream mailing list conversations about them?

@rusty-snake
Contributor

75c2d94

Add support for --userns and --userns2

This allows you to reuse an existing user namespace to set up all the
other namespaces, entering that instead of creating a new one. The
reason you want to do this is that you can then also reuse other
namespaces that are owned by the user namespace. Typically you use
this to partially re-enter a previously created bubblewrap sandbox.

This also adds --userns2 which is similar to --userns, but this is
switched into at the end instead of the start. Bubblewrap sometimes
creates nested such user namespaces[1], and to be able to reuse such a
setup we need to similarly reuse both namespaces via --userns2.

Technically using setns() is probably safe even in the privileged
case, because we got passed in a file descriptor to the namespace, and
that can only be gotten if you have ptrace permissions against the
target, and then you could do whatever to the namespace
anyway. However, for practical reasons this isn't useable for bwrap,
because (as described in a comment in acquire_privs()) setuid mode
causes root to own the namespaces that it creates. So as you will not
be able to access these namespaces for reuse anyway, it's best to
disable it (in case of unexpected security issues).

[1] This is to work around an issue with mounting devpts without uid 0
mapped in the user namespace, where the outer namespace owns all the
other namespaces but the inner one has the right mappings.


bwrap: Setting userns2 failed: Invalid argument

Reason:

EINVAL The caller attempted to join the user namespace in which it is already a member.

You need to pass a different userns to --userns2 that is a child of --userns because:

EINVAL The caller tried to join an ancestor (parent, grandparent, and so on) PID namespace.
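
One way to check for the first EINVAL case before calling bwrap is to compare what the two /proc links point at: if both name the same namespace, they cannot be used as --userns and --userns2 together. This is only a rough sketch, assuming a Linux /proc; the helper names ns_id and same_ns are made up for illustration and are not part of bubblewrap.

```shell
# Print the identity of a namespace link, e.g. "user:[4026533021]".
# /proc/PID/ns/* entries are symlinks whose target names the namespace.
ns_id() {
    readlink "$1"
}

# Succeed (exit 0) if both links point at the same namespace.
same_ns() {
    [ "$(ns_id "$1")" = "$(ns_id "$2")" ]
}

# Example (PIDs are placeholders):
#   same_ns /proc/PID_A/ns/user /proc/PID_B/ns/user && echo "same userns"
```

Passing the same PID twice, as in the failing command, trivially makes same_ns succeed, which matches the error seen.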

@Sacred-Salamander
Author

Sacred-Salamander commented Dec 11, 2022

Can you please give an example? I also found most of that information, but it is not clear how I find or pass the child of the first namespace, since the two are either effectively the same pid or the nested namespace belongs to pid 1.

Would it be like /proc/80157/root/proc/80157/ns/user?

@rusty-snake
Contributor

rusty-snake commented Dec 11, 2022

I don't have a (full, working) example. But I'm questioning whether you understand user namespaces and how they nest.

  1. How do you create the user namespaces you want to pass?
  2. How did you determine the pid the userns belongs to? (Which technically doesn't make sense, since a user namespace cannot belong to a process.)

edit: This was the reason for two user namespaces:

bubblewrap/bubblewrap.c

Lines 3008 to 3011 in bb7ac13

/* This is a bit hacky, but we need to first map the real uid/gid to
0, otherwise we can't mount the devpts filesystem because root is
not mapped. Later we will create another child user namespace and
map back to the real uid */
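
To see which of the two namespaces has uid 0 mapped, one can look at /proc/PID/uid_map for a process inside each of them. Each line of that file has the form "first-id-inside first-id-outside length". The following is a minimal sketch; the function name has_root_mapped is hypothetical, and it takes the path of the map file as its argument.

```shell
# Succeed (exit 0) if uid 0 is covered by some line of the given uid_map file.
# Each line is: <first id inside the ns> <first id outside the ns> <length>
has_root_mapped() {
    awk '$1 == 0 && $3 > 0 { found = 1 } END { exit !found }' "$1"
}

# Example (PID is a placeholder):
#   has_root_mapped /proc/PID/uid_map && echo "uid 0 is mapped"
```

Going by the comment above, the outer namespace would be expected to pass this check and the inner one (mapped back to the real uid) to fail it, unless the real uid is itself 0.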

@Sacred-Salamander
Author

I think I mostly understand it, but correct me if I don't make sense

  1. Bubblewrap creates the namespaces when I run the example
bwrap --unshare-user --dev-bind / / --tmpfs /tmp /bin/bash
  2. I can determine the pid of the process in several ways: lsns on the host, lsns in the bash shell I launched, and echo $$ in the example shell all report the same pid

For example lsns on the host says this

$ lsns
        NS TYPE   NPROCS     PID USER COMMAND
4026533021 user        2   80157 user /bin/bash
4026533022 mnt         1   80157 user /bin/bash

And inside the example shell

$ lsns
        NS TYPE   NPROCS   PID USER COMMAND
4026531834 time        2 80157 user /bin/bash
4026531835 cgroup      2 80157 user /bin/bash
4026531836 pid         2 80157 user /bin/bash
4026531838 uts         2 80157 user /bin/bash
4026531839 ipc         2 80157 user /bin/bash
4026531992 net         2 80157 user /bin/bash
4026533021 user        2 80157 user /bin/bash
4026533022 mnt         2 80157 user /bin/bash

I'm sorry, you are correct that a namespace doesn't belong to a process. Is it correct to say that the process has attached namespaces?

@rusty-snake
Contributor

rusty-snake commented Dec 11, 2022

Finally I have an ugly, working PoC

run in first terminal:

unshare --map-root-user --fork sh -c "echo \$\$ >/tmp/pid1 && unshare -U --fork sh -c \"echo \\\$\\\$ >/tmp/pid2 && sleep 10m && true\" && true"

run in second terminal:

bwrap --userns 3 3</proc/$(cat /tmp/pid1)/ns/user --userns2 4 4</proc/$(cat /tmp/pid2)/ns/user --dev-bind / / ls
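
The second-terminal command can also be wrapped in a small function to make it clearer which namespace goes where: the outer one (from the first unshare, where root is mapped) is passed as --userns, and its child as --userns2. This is only a sketch; bwrap_reenter is a made-up name, and it assumes both namespace paths are readable by the caller.

```shell
# Re-enter two nested user namespaces with bwrap.
#   $1:   path to the outer namespace (passed as --userns)
#   $2:   path to the inner, child namespace (passed as --userns2)
#   rest: remaining bwrap arguments and the command to run
bwrap_reenter() {
    outer_ns=$1
    inner_ns=$2
    shift 2
    # fds 3 and 4 are opened by the shell and inherited by bwrap
    bwrap --userns 3 --userns2 4 "$@" 3<"$outer_ns" 4<"$inner_ns"
}

# Usage, with the pid files written by the first terminal:
#   bwrap_reenter "/proc/$(cat /tmp/pid1)/ns/user" "/proc/$(cat /tmp/pid2)/ns/user" --dev-bind / / ls
```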

> I'm sorry, you are correct that a namespace doesn't belong to a process. Is it correct to say that the process has attached namespaces?

The man page describes it as:

> A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.

> Bubblewrap creates the namespaces when I run the example

At this point an fd to the --userns user namespace is only reachable via ioctl, I guess?

@Sacred-Salamander
Author

Sacred-Salamander commented Dec 11, 2022

Great PoC

I guess that answers how to use the option

But is it possible to use it with the example I made? As far as I can tell, bubblewrap also creates a nested namespace for me there; is that fundamentally different? Can I not get the intermediate pid from the host?

> At this point an fd to the --userns user namespace is only reachable via ioctl, I guess?

I don't know; can it not be reached through procfs?

@rusty-snake
Contributor

> Can I not get the intermediate pid from the host?

If you share the pidns there is no intermediate pid, because there is no need to fork twice.

> I don't know; can it not be reached through procfs?

I don't know of a way, since we don't know of any process that is a direct member of this userns.

@Sacred-Salamander
Author

Would it be possible with this example when I also unshare the pidns?

bwrap --unshare-user --unshare-pid --dev-bind / / --proc /proc --tmpfs /run --tmpfs /tmp /bin/bash

@Sacred-Salamander
Author

About this part:

Bubblewrap sometimes creates nested such user namespaces[1], and to be able to reuse such a setup we need to similarly reuse both namespaces via --userns2.

I think this is what the example/test should show: how to enter the nested namespaces when they were created by bubblewrap rather than by the unshare program.

You also said that there is no intermediate pid when the pidns is shared, because bubblewrap doesn't fork twice. I'm not fully following this. Does it mean that my first example does not create a nested user namespace at all, or just one that I can't see in any way? Does it mean my latest example from the post above creates it but I still can't access it? When and how is it possible to use --userns2 with bubblewrap as the initiator, as the commit message suggests?
