Rosetta emulation mishandles /proc/<pid>/cmdline in 4.25.0 #7058

Closed
gwynne opened this issue Nov 4, 2023 · 19 comments

@gwynne

gwynne commented Nov 4, 2023

Description

After updating to Docker Desktop for Mac 4.25.0, Ubuntu 22.04 images running under Rosetta 2 emulation began to show corrupted /proc/<pid>/cmdline contents. Specifically, the actual contents of argv[0] are appended as an argument following the full executable path which previously represented argv[0]. Intermittently, /rosetta/rosetta is additionally prepended. This wreaks havoc with processes (such as Apple's Swift compiler toolchain) which rely on the contents of /proc/self/cmdline for convenient access to argc and argv. The actual argc and argv values passed to main() are unchanged.
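
For readers who have not looked at the file format: /proc/<pid>/cmdline is a sequence of NUL-terminated strings, which is why the commands below pipe it through `xargs -0`. A minimal sketch of the parsing step in Python follows; the helper names `split_cmdline` and `read_cmdline` are made up for illustration and do not come from any tool mentioned in this thread.

```python
from pathlib import Path


def split_cmdline(raw: bytes) -> list[str]:
    """Split the NUL-separated bytes of a /proc/<pid>/cmdline file."""
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the NUL that terminates the last argument
    if not raw:
        return []
    # surrogateescape keeps undecodable bytes instead of raising
    return [part.decode("utf-8", "surrogateescape") for part in raw.split(b"\0")]


def read_cmdline(pid: str = "self") -> list[str]:
    """Read and split /proc/<pid>/cmdline (only works on Linux)."""
    return split_cmdline(Path(f"/proc/{pid}/cmdline").read_bytes())


# Bytes equivalent to what the 4.25.0 reproduction below reports for PID 1.
observed = split_cmdline(b"/rosetta/rosetta\0/bin/bash\0/bin/bash\0")
print(observed)
```

A program that treats this list as its own argv will see extra leading entries whenever an emulator leaves them in place.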

Reproduce

  1. Configure Docker Desktop for Mac on an Apple Silicon host with Rosetta 2 emulation enabled.
  2. docker run -ti --pull always --platform=linux/amd64 ubuntu:latest
  3. ps -axjww
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
    0     1     1     1 pts/0       17 Ss       0   0:00 /rosetta/rosetta /bin/bash /bin/bash
    1    17    17     1 pts/0       17 R+       0   0:00 /usr/bin/ps ps axjww
  4. cat /proc/1/cmdline | xargs -0 echo
/rosetta/rosetta /bin/bash /bin/bash

Expected behavior

The output of both commands should closely match the results when running Docker Desktop 4.24.2:

$ docker run --rm -ti --pull always --platform=linux/amd64 ubuntu:latest
root@7f9275602dcf:/# ps axjww
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
    0     1     1     1 pts/0       18 Ss       0   0:00 /rosetta/rosetta /bin/bash
    1    18    18     1 pts/0       18 R+       0   0:00 /usr/bin/ps axjww
root@7f9275602dcf:/# cat /proc/1/cmdline | xargs -0 echo
/rosetta/rosetta /bin/bash

docker version

Client:
 Cloud integration: v1.0.35+desktop.5
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep  4 12:28:49 2023
 OS/Arch:           darwin/arm64
 Context:           desktop-linux

Server: Docker Desktop 4.25.0 (126437)
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep  4 12:31:36 2023
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:        8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    24.0.6
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.11.2-desktop.5
    Path:     /Users/gwynne/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.23.0-desktop.1
    Path:     /Users/gwynne/.docker/cli-plugins/docker-compose
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     /Users/gwynne/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.20
    Path:     /Users/gwynne/.docker/cli-plugins/docker-extension
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.9
    Path:     /Users/gwynne/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/gwynne/.docker/cli-plugins/docker-sbom
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     /Users/gwynne/.docker/cli-plugins/docker-scan
  scout: Docker Scout (Docker Inc.)
    Version:  v1.0.9
    Path:     /Users/gwynne/.docker/cli-plugins/docker-scout

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 2
 Server Version: 24.0.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
 runc version: v1.1.8-0-g82f18fe
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.4.16-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 10
 Total Memory: 23.44GiB
 Name: linuxkit-660f35817eeb
 ID: ff62ed3c-f4d4-4485-b887-163b8db7bd14
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: daemon is not using the default seccomp profile

Diagnostics ID

CEC81793-5002-4A5E-9D37-5B165C9B772A/20231104060157

Additional Info

No response

@dgageot
Member

dgageot commented Nov 17, 2023

Hello @gwynne, unfortunately, this is not something that can be fixed.

With Rosetta enabled, you'll see something like:

cat /proc/1/cmdline | xargs -0 echo
/rosetta/rosetta /bin/bash /bin/bash

With Qemu, it'll be:

cat /proc/1/cmdline | xargs -0 echo
/usr/bin/qemu-x86_64 /bin/bash /bin/bash

It's actually a feature of the Linux kernel to preserve this argv[0] element. The Swift compiler toolchain should rely on its argv rather than on what's in /proc/self/cmdline.

dgageot closed this as completed Nov 17, 2023
@gwynne
Author

gwynne commented Nov 17, 2023

@dgageot As it happens, I agree with regard to the Swift compiler (and as you can see in the issue I filed against it, I've made that argument to them). But the behavior of Rosetta versus Qemu changed between 4.24.2 and 4.25.0: Rosetta exhibits the problem in 4.25.0, Qemu does not. Neither exhibits it in 4.24.2.

@gwynne
Author

gwynne commented Nov 17, 2023

(The issue still exists in 4.25.1 as well)

@dgageot
Member

dgageot commented Nov 17, 2023

Yes, it's in 4.25.1 and it'll stay this way for the foreseeable future. Very sorry about that.
Enabling this feature that preserves argv[0] is required for Rosetta or Qemu to work properly.
Past versions of Docker Desktop or macOS may have used a different setup, but that setup was incorrect for 99% of workflows.

@norio-nomura

This issue can be fixed with the following steps, which remain in effect until you exit Docker Desktop.

$ docker run -it --rm --platform=linux/amd64 ubuntu cat /proc/self/cmdline|xargs -0 echo
/usr/bin/cat cat /proc/self/cmdline
$ docker run -it --rm --privileged --pid=host justincormack/nsenter1
~ # mount -t binfmt_misc none /proc/sys/fs/binfmt_misc
~ # echo -1 >/proc/sys/fs/binfmt_misc/rosetta
~ # echo ':rosetta:M::\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x3e\x00:\xff\xff\xff\xff\xff\xfe\xfe\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/run/rosetta/rosetta:OCF' >/proc/sys/fs/binfmt_misc/register
~ # exit
$ docker run -it --rm --platform=linux/amd64 ubuntu cat /proc/self/cmdline|xargs -0 echo
/usr/bin/cat /proc/self/cmdline
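
The line written to the register file above follows the binfmt_misc layout `:name:type:offset:magic:mask:interpreter:flags`. As a rough sketch of how the pieces fit together, here is the same string assembled in Python; `binfmt_registration` is a hypothetical helper, and the magic, mask, interpreter path and flags are copied verbatim from the command above. The magic and mask are raw strings because the kernel, not Python, interprets the `\x..` escapes.

```python
# Values copied from the echo command in this comment.
MAGIC = r"\x7fELF\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x3e\x00"
MASK = r"\xff\xff\xff\xff\xff\xfe\xfe\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff"


def binfmt_registration(name, magic, mask, interpreter, flags, offset=""):
    """Build a magic-matching (type M) line for binfmt_misc/register."""
    fields = ["", name, "M", offset, magic, mask, interpreter, flags]
    return ":".join(fields)


# Flags "OCF" as in the workaround; the important part is that "P" is absent.
line = binfmt_registration("rosetta", MAGIC, MASK, "/run/rosetta/rosetta", "OCF")
print(line)
```

Writing the resulting line to /proc/sys/fs/binfmt_misc/register still needs the privileged shell shown above; the sketch only builds the text.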

@norio-nomura

> This issue can be fixed with the following steps, which remain in effect until you exit Docker Desktop.

I have created re-register-rosetta that performs this fix.
usage:

docker run --rm --privileged ghcr.io/norio-nomura/re-register-rosetta

example:

$ docker run --rm --platform=linux/amd64 ubuntu cat /proc/self/cmdline|xargs -0 echo
/usr/bin/cat cat /proc/self/cmdline
$ docker run --rm --privileged ghcr.io/norio-nomura/re-register-rosetta
Rosetta is not correctly registered. Re-registering rosetta...
Successfully re-registered Rosetta.
$ docker run --rm ghcr.io/norio-nomura/re-register-rosetta
It looks like Rosetta is correctly registered.
$ docker run --rm --platform=linux/amd64 ubuntu cat /proc/self/cmdline|xargs -0 echo
cat /proc/self/cmdline

@norio-nomura

I assume the root cause of this issue is that Docker Desktop for Mac specifies P - preserve-argv[0] when registering to binfmt_misc. Quoted from the documentation below:
https://github.com/torvalds/linux/blob/54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478/Documentation/admin-guide/binfmt-misc.rst?plain=1#L52-L66

  • flags
    is an optional field that controls several aspects of the invocation
    of the interpreter. It is a string of capital letters, each controls a
    certain aspect. The following flags are supported:

    P - preserve-argv[0]
    Legacy behavior of binfmt_misc is to overwrite
    the original argv[0] with the full path to the binary. When this
    flag is included, binfmt_misc will add an argument to the argument
    vector for this purpose, thus preserving the original argv[0].
    e.g. If your interp is set to /bin/foo and you run blah
    (which is in /usr/local/bin), then the kernel will execute
    /bin/foo with argv[] set to ["/bin/foo", "/usr/local/bin/blah", "blah"].
    The interp has to be aware of this so it can execute
    /usr/local/bin/blah with argv[] set to ["blah"].
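
To make the quoted paragraph concrete, here is a small Python model of the argument vector the interpreter receives with and without the P flag. It only mirrors the behaviour described in the documentation excerpt above; `binfmt_argv` is an illustrative name, not a kernel or Docker API.

```python
def binfmt_argv(interp, binary_path, argv, preserve_argv0):
    """Model the argv that binfmt_misc hands to the interpreter.

    interp         -- registered interpreter, e.g. "/bin/foo"
    binary_path    -- full path of the binary being executed
    argv           -- argv the caller passed to exec
    preserve_argv0 -- True when the registration includes the P flag
    """
    if preserve_argv0:
        # P: the original argv[0] is kept as an extra argument.
        return [interp, binary_path, *argv]
    # Legacy: the original argv[0] is overwritten by the full path.
    return [interp, binary_path, *argv[1:]]


with_p = binfmt_argv("/bin/foo", "/usr/local/bin/blah", ["blah"], True)
without_p = binfmt_argv("/bin/foo", "/usr/local/bin/blah", ["blah"], False)
print(with_p)
print(without_p)
```

With P, the interpreter has to strip two leading entries rather than one before handing arguments to the emulated program.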

What my earlier post does is simply re-register Rosetta without the P flag.
I use the registration method used by lima as a reference: https://github.com/lima-vm/lima/blob/752afc08689e06d00293b3261817d5be70cb0148/pkg/cidata/cidata.TEMPLATE.d/boot/05-rosetta-volume.sh#L33-L34

My fix is undone when Docker Desktop for Mac enters Resource Saver Mode.
@dgageot I hope this issue will be fixed in Docker Desktop for Mac itself.

@al45tair

I don't think this is really a Docker bug. Either we (Swift) shouldn't be using /proc/self/cmdline, or qemu and Rosetta should both be using prctl(2) with PR_SET_MM_ARG_START to point the command line at the one they hand to the emulated code. Whether the latter makes sense probably depends on your views about whether or not the fact that qemu or Rosetta are being used should show up in places like ps.

@al45tair

Hmmm. Looking into this a bit further, with every combination I've tried except Rosetta, reading /proc/self/cmdline gets the results you'd hope (that is, they match the argv passed to main). When I enable "Use Rosetta for x86/amd64 emulation on Apple Silicon", I actually don't see the /run/rosetta/rosetta that gets prepended, but I do see an extra copy of the program's path:

argc = 4, argv = 0x7ffffffc3d68, envp = 0x7ffffffc3d90
environ = 0x7ffffffc3d90
argv[0] = 0x7ffffffc3f62 "/cmdline/tst"
argv[1] = 0x7ffffffc3f6f "one"
argv[2] = 0x7ffffffc3f73 "two"
argv[3] = 0x7ffffffc3f77 "three"
cmdline[0] = "/cmdline/tst"
cmdline[1] = "/cmdline/tst"
cmdline[2] = "one"
cmdline[3] = "two"
cmdline[4] = "three"

This makes me think that /run/rosetta/rosetta, whatever it is, tries to remove itself from the arguments but doesn't account for the fact that it's being passed the path of the program in addition to the argv[0] value. Where does /run/rosetta/rosetta come from? Since it's clearly trying to adjust the command line, that suggests that there's a bug in the logic it's using to do so.
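
The patterns reported so far in this thread can be told apart mechanically. The following Python sketch classifies a /proc/self/cmdline listing against the real argv; the function name `diagnose` and its category labels are invented here for illustration.

```python
def diagnose(argv, cmdline, program_path):
    """Classify how a cmdline listing differs from the real argv."""
    if cmdline == argv:
        return "match"
    if cmdline == [program_path, *argv]:
        return "duplicated-path"  # extra copy of the program's path
    if cmdline[1:] == argv:
        return "interpreter-prefixed"  # only the emulator is prepended
    if cmdline[1:] == [program_path, *argv]:
        return "interpreter-and-path"  # emulator plus extra path copy
    return "other"


# Values taken from the test program output above.
argv = ["/cmdline/tst", "one", "two", "three"]
cmdline = ["/cmdline/tst", "/cmdline/tst", "one", "two", "three"]
print(diagnose(argv, cmdline, "/cmdline/tst"))
```

The original report (`/rosetta/rosetta /bin/bash /bin/bash`) falls into the "interpreter-and-path" case, and the 4.24.2 output into "interpreter-prefixed".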

@al45tair

I guess the rosetta binary is an Apple-supplied binary.

@gwynne
Author

gwynne commented Feb 14, 2024

> I guess the rosetta binary is an Apple-supplied binary.

Yeah, I've wondered if the problem was at least partially Rosetta's fault. (I also never got a chance to check whether the issue shows up in Ventura or only in Sonoma...)

@al45tair

From here: https://developer.apple.com/documentation/virtualization/running_intel_binaries_in_linux_vms_with_rosetta

Important
When using Rosetta in macOS 13, set the preserve option to no.

That's certainly interesting. I'll have to investigate this a bit further at my end.

@norio-nomura

> From here: https://developer.apple.com/documentation/virtualization/running_intel_binaries_in_linux_vms_with_rosetta
>
> Important
> When using Rosetta in macOS 13, set the preserve option to no.
>
> That's certainly interesting. I'll have to investigate this a bit further at my end.

I used that information to create a tool in #7058 (comment) to re-register with preserve=no.

@al45tair

> I used that information to create a tool in #7058 (comment) to re-register with preserve=no.

Indeed. I'm asking the Rosetta folks here about it, because — depending on what they say — I might change my mind about this being a bug in Docker Desktop. Though it would probably be better if Rosetta supported preserve too.

@al45tair

al45tair commented Feb 19, 2024

Update: this isn't a bug in Swift or Docker Desktop. Apparently, reading /proc/self/cmdline is supposed to work properly under both qemu and Rosetta regardless of the setting of preserve. There is an issue causing it to not work properly under Rosetta right now (rdar://123115850).

@jmarrec

jmarrec commented Apr 2, 2024

I have come to the same conclusion: /proc/self/cmdline was duplicating the program's path, and that was throwing off my CLI, which indirectly made use of cmdline. @norio-nomura's tool fixed it for me (thanks a lot!)

There's no way to see the status of rdar://123115850 (I believe that's Apple's internal tracking tool), is there?

@al45tair

al45tair commented Apr 2, 2024

There is not, sorry. In the case of Swift, we changed to not using /proc/self/cmdline in order to avoid having to wait for an OS release with a fixed version of Rosetta. You can see what we did here (note: if it doesn't make you scream, you're probably reading it wrong).

Swift 5.10 has the fix, if you're a Swift user reading this.

@jmarrec

jmarrec commented Apr 3, 2024

🤯 Holy moly. That's one terrible piece of code.
Strong 👍 on the comments though, it's one of those cases where there should not be any arguing over whether comments have any place in code.

I'm gonna go wash my eyes with bleach now to try to unsee that.

@al45tair

al45tair commented Apr 3, 2024

> 🤯 Holy moly. That's one terrible piece of code.

🤣 To be fair, it is really only relying on the Linux ABI (which is well defined) and the fact that environ in both musl and Glibc initially points to the ABI-specified environment pointer array. The C library start-up code is actually quite similar, the main difference being that it starts out with a pointer it knows points at argc, whereas we need to validate our assumption about environ so we're a bit more defensive (and might fail to locate the arguments in some hopefully unusual cases).
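
For anyone curious what "validating the assumption about environ" could look like, here is a toy Python model. It is not the Swift runtime's code. It assumes only the initial process stack layout from the Linux ABI (argc, then the argv pointers, a NULL, then the envp pointers, a NULL) and represents the stack as a list of integers; `locate_args` and the sample addresses are made up for illustration.

```python
def locate_args(stack, envp_index):
    """Walk back from envp to argc in a modelled initial stack.

    stack      -- list of machine words (ints), lowest address first
    envp_index -- index of the first environment pointer
    Returns (argc_index, argc), or None if the layout does not check out.
    """
    if envp_index < 2 or stack[envp_index - 1] != 0:
        return None  # argv's NULL terminator is not where we assumed
    count = 0
    i = envp_index - 2
    while i >= 0:
        # argc is the word whose value equals the number of pointers
        # between it and the NULL terminator.
        if stack[i] == count:
            return i, count
        count += 1
        i -= 1
    return None


# argc = 4, four argv pointers, NULL, one envp pointer, NULL
stack = [4, 0x7FFFFFFC3F62, 0x7FFFFFFC3F6F, 0x7FFFFFFC3F73,
         0x7FFFFFFC3F77, 0, 0x7FFFFFFC3F7D, 0]
print(locate_args(stack, 6))
```

The walk gives up when no word matches the running count, which corresponds to the "might fail to locate the arguments" caveat above.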
