Distro shut down even with running processes #8854

Open
1 of 2 tasks
bradwilson opened this issue Sep 22, 2022 · 65 comments

@bradwilson

bradwilson commented Sep 22, 2022

Version

Microsoft Windows [Version 10.0.22621.521]

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.15.62.1

Distro Version

Ubuntu 20.04

Other Software

Docker Engine version 20.10.18

Repro Steps

  1. Create fresh Ubuntu 20.04 image
  2. Enable systemd
  3. Install Docker into Ubuntu (Ansible example)
  4. Run nginx in Docker (docker run -d -p 80:80 nginx)
  5. Verify nginx is running in your browser (http://localhost/)
  6. Exit the shell

Expected Behavior

Since nginx is running, the distro should not shut down.

Prior to upgrading to 0.67.6 w/ systemd support, this was the (presumably correct, but definitely desired) behavior.

Actual Behavior

After a short period of time, the distro is forcefully shut down by the system.

Logging back in, you can see it was forcefully shut down:

 sh$ docker ps -a
CONTAINER ID   IMAGE     COMMAND                  CREATED          STATUS                       PORTS                               NAMES
ad20314283c5   nginx     "/docker-entrypoint.…"   14 minutes ago   Exited (255) 4 minutes ago   0.0.0.0:80->80/tcp, :::80->80/tcp   admiring_bardeen

Diagnostic Logs

No response

@cerebrate

Unfortunately, this is the new documented behavior, per blog post:

It is also important to note that with these change, systemd services will NOT keep your WSL instance alive.

You've currently got to have something running as a child of init - the Microsoft init, not pid 1 - to keep the distro running. Not my preference, but.

@NotTheDr01ds

This is (fairly) explicitly stated in Craig's DevBlog announcement:

It is also important to note that with these change, systemd services will NOT keep your WSL instance alive. Your WSL instance will stay alive in the same way it did before, which you can read more about here.

That was actually surprising to me, as Windows 11's wsl.conf boot.command used to keep services running after exiting the shell. It was only in the Preview releases that it was changed to terminate when all processes started interactively had terminated. Hence I filed #8661, thinking this was a regression. Based on the text above, it seems that this might have been an intentional change.

Perhaps we should have a feature-request for a /etc/wsl.conf flag to allow the instance to continue running when any background task is still running.

@cerebrate

cerebrate commented Sep 22, 2022

Perhaps we should have a feature-request for a /etc/wsl.conf flag to allow the instance to continue running when any background task is still running.

Or just "until systemctl poweroff", which is pretty much how the above should behave when systemd support is enabled.

@benhillis
Member

Fair points - I think an option to keep distros running (and not idle terminate them) is a good idea.

@benhillis benhillis self-assigned this Sep 22, 2022
@bradwilson
Author

systemd services will NOT keep your WSL instance alive

Is a process launched by Docker considered to be a "systemd service"? Docker itself is, obviously, but things launched by Docker, I would've expected to keep the distro alive.

It seems to me like they implemented one thing (always shut it down when there's no interactive shell) but documented something else (don't let systemd services keep it alive).

@bradwilson
Author

(BTW, hearing the description of "we only stay alive for things rooted on Microsoft's init" helps me understand the exact implementation. It's unfortunate that the blog post described it in a way that doesn't seem to match the actual implementation. Why not just describe it in the technically correct way?)

@bradwilson
Author

Fair points - I think an option to keep distros running (and not idle terminate them) is a good idea.

But what does this mean for systemd, then? I mean, ps aux on a systemd-enabled distro has, as expected, a ton of stuff running. How would the proposed flag change the behavior? Would it basically leave systemd distros always running, because of all those systemd processes? Or would the implementation better match the blog post description ("systemd services will NOT keep your WSL instance alive" but anything else will...?)

@NotTheDr01ds

Fair points - I think an option to keep distros running (and not idle terminate them) is a good idea.

@benhillis

Just for clarity, it would be great if that option still allowed distros to terminate if no other processes were running, similar to the way it worked with the [boot].command section in the GA Windows 11 release. I'd only expect/want it to stay running if there's some background process, whether it was started through:

Basically, I think the flag would say "Work like it did in Windows 11 GA" ;-)

@NotTheDr01ds

NotTheDr01ds commented Sep 22, 2022

@bradwilson

It's unfortunate that the blog post described it in a way that doesn't seem to match the actual implementation. Why not just describe it in the technically correct way?

I'm not sure they realized they had ever changed it to work any other way. The way I read the description, it seems Craig thought that was the way it always worked.

But what does this mean for systemd, then?

I would expect that any background process would count as a keep-alive mechanism. And yes, that's one of the unfortunate things about Systemd - It typically starts a lot. How would you tell the difference between the System dbus instance vs. Nginx, for instance?

@NotTheDr01ds

Answering my own question above, I guess it could be a "whitelist process" implementation, where /etc/wsl.conf had a keep-alive (or something named better) option with a list of processes that would prevent the distro from self-terminating.

Just throwing it out there - I'm not tied to it by any means.

@bradwilson
Author

Basically, I think the flag would say "Work like it did in Windows 11 GA" ;-)

I have doubts about this being what I want, though I recognize that may be where this falls back to (see my question above about systemd services in general keeping things alive with this new proposed flag).

Having Docker keep the distro alive in Win11 GA is acceptable but not ideal. It's something that I've come to accept, and I have quick commands to kill the things that are most likely to be "daemonized" in a pre-systemd world (notably, Docker and the GPG agent). In a world where we have systemd (and I have Docker's life cycle owned by systemd), I wouldn't want Docker itself to be something that would keep the distro alive, just the case of running Docker containers.

@NotTheDr01ds

@bradwilson Yes, our replies are getting out of sync ;-) - I saw that you posted that concern above and posted a separate reply (actually two now) regarding it.

@bradwilson
Author

bradwilson commented Sep 22, 2022

How would you tell the difference between the System dbus instance vs. Nginx, for instance?

That's why I'm asking. A mode where "any background process keeps the distro alive" + systemd means the distro will never shut itself down, and I'll be running a lot of wsl -t distro to keep things clean. In that case, I'm better off without systemd (despite it ostensibly providing other value that I might want).

That means it would also be critical to ensure that systemd is something I can opt out of. Microsoft makes it sound like it will eventually become the default, which is fine, so long as the wsl.conf flag to disable it doesn't disappear.

@cerebrate

To throw another point into the decimal, there may also be use for the option value that says "never terminate this distro unless explicitly instructed".

It's not something I have to care about now myself, given systemd, but there is a mode of working when you don't necessarily want to work from a WSL prompt, but also enjoy the ability to throw out WSL commands in the current working directory with

$ do-something-linuxy

(using RunInWSL, but you get the idea). If you do this all day, stopping and restarting the distro every time you take too long between entering WSL-handled commands is just a lot of wasted overhead.

On the above discussion, for my money, if you're using systemd support, you're pretty much declaring that your distro is system-like, not throwaway-shell-like, so I'd argue for keeping the distro alive as long as systemd is running; i.e., until you explicitly systemctl poweroff. Especially since, given that the same announcement points out:

Additional modifications had to be made to ensure a clean shutdown (as that shutdown is controlled by systemd now)

WSL is presumably quietly running systemctl poweroff every time it idles out.

@NotTheDr01ds

That means it would also be critical to ensure that systemd is something I can opt out of. Microsoft makes it sound like it will eventually become the default, which is fine, so long as the wsl.conf flag to disable it doesn't disappear.

Definitely agree (and was just about to post that). I definitely want it opt-in regardless, since I think I'll prefer to run without it most of the time (and I have some distros that absolutely don't use Systemd at all).

But what about the "whitelist" option I mentioned above? Would that work for the scenarios you envision?

@cerebrate

cerebrate commented Sep 22, 2022

You're always going to want distros without systemd, because Linux namespaces (which separate the distros) don't cover everything; things systemd does in one distro will and do leak to the others. For that reason alone, I can't see it ever becoming compulsory.

(This is why I put an explicit mention in the systemd-genie docs that running more than one distro at a time with systemd was unsupported. You don't want them fighting over whose sysctl values, whose loaded kernel modules, whose AppArmor profiles, or whose network configuration is the validest. That way lies many subtle and impossible-to-diagnose issues.)

@benhillis
Member

benhillis commented Sep 23, 2022

@cerebrate - linux namespaces really should cover everything, maybe someday they will (looking at you binfmt!).

WSL is interesting in some ways because we run each distro as a privileged container (root is actually root). This makes the user powerful enough to be harmful to other tenants of the VM. But since they're all owned by the same Windows user, we've decided that's ok.

@cerebrate

Having mentioned this elsewhere, for anyone who finds it of interest - the current workaround for this is to create a process that runs forever and detach it from the calling terminal such that it will persist (as a child of the init) even when the terminal is closed. So you can create a script, call it wait-forever.sh:

#!/bin/sh
while true
do
  sleep 1h
done

then make it executable and run it from the directory it lives in with

chmod +x wait-forever.sh
nohup ./wait-forever.sh > /dev/null &

and then when you exit the terminal that script will keep running as a child of init, and that will keep the instance open.

(Make sure you don't run it inside a tmux session or anything else that might kill its children in ways nohup won't stop.)

@ghost

ghost commented Nov 1, 2022

Or use https://linux.die.net/man/1/daemonize to accomplish this? I'd recommend this.

@cerebrate

cerebrate commented Nov 1, 2022

@pmartincic

Have you tested this one? And if so, what did you use as the process to daemonize?

Because as far as I know, the criteria for "keeps the instance open" requires that a process be a child of the Microsoft init to do so, and since daemonize (unlike nohup) disassociates the daemonized process from its process group, it stops being a child of its init and becomes a child of pid 1 (systemd), which shouldn't work to keep the instance running.

And when I tested it, sure enough, it didn't.

@ghost

ghost commented Nov 1, 2022

Hmmmm. I tested it once upon a time when I was fixing an issue relating to interop and shell lifetimes. But haven't touched it since then. I can check again. I don't remember what all I tested with.

@cerebrate

FWIW, I think it may have worked in the past (pre-systemd-support), as I used daemonize to run my bottled systemd when writing genie, and the instance didn't go away then.

But when using the current native systemd support, daemonized processes don't appear to keep instances around.

@ghost

ghost commented Nov 1, 2022

Thank you, I'll look into that. It shouldn't have changed because of systemd support.

Edit: Oh no... I'll update after I look into it but I might know why.

@bradwilson
Author

The reason seems fairly straightforward: when you enable systemd, nothing is going to keep the instance alive except for an interactive shell, and this is by design.

@NotTheDr01ds

NotTheDr01ds commented Nov 2, 2022

My personal preference at the moment to keep a distro running is to:

  • Install Funtoo keychain (available in most Linux distributions), which I use anyway (e.g. sudo apt install keychain). Keychain is a front-end that creates a singleton (per-user) ssh-agent that is then attached to (via environment variables) in each shell instance. It's nice for this purpose because of the singleton, background-process design.

  • Create /etc/profile.d/wsl_keepalive.sh with the following:

    #!/usr/bin/env sh
    eval $(keychain -q)
    

That ensures that even if you start the distribution as root, the keep-alive code will still run. Note that it only works if your shell is POSIX compliant and is loaded as a login shell.

To terminate the keep-alive ssh-agent process, run keychain -k all (for each user that has started one; typically only your default user, but it could be root as well).

@ghost

ghost commented Nov 2, 2022

Lol! By design. Those are my words to utter :), or more specifically Ben's.

My smoke tests with daemonize $(which sleep) n and detaching tmux seem to work as expected as I test it now. Both are reparented to relay init. The console is closed and relay init persists so long as the children are alive. Once relay init dies cleanup commences after a brief delay. I'm running 0.70.5.0

Perhaps I don't see what you see?

image
image

@SashiRin

SashiRin commented Apr 2, 2024

My workaround is to spawn a sleeping Windows process from a systemd service. I use the built-in Windows command waitfor to keep the distro alive.

Update: signal first in order to clean up the previous waitfor command

[Unit]
Description=Keep Distro Alive

[Service]
ExecStartPre=/windows/path/to/waitfor.exe /si MakeDistroAlive #cleanup the waiting signal
ExecStart=/windows/path/to/waitfor.exe MakeDistroAlive

[Install]
WantedBy=multi-user.target

It will not shut down Ubuntu as long as the waitfor process is alive.

Another way is to block the stdin of the choice command. See https://github.com/firejox/hang-stdin#keep-wsl-distro-alive

Thanks for your script.
It worked when I removed the ExecStartPre line; it seems that line caused some errors.

Any idea about that?
OK, I found that the comment on the ExecStartPre line was the culprit. The whole script worked fine once I deleted the comments.

 The unit keep-wsl-distro-alive.service has entered the 'failed' state with result 'exit-code'.
Apr 02 15:51:48 systemd[1]: keep-wsl-distro-alive.service: Service will not restart (restart setting)
Apr 02 15:51:48 systemd[1]: keep-wsl-distro-alive.service: Changed start-pre -> failed
Apr 02 15:51:48 systemd[1]: keep-wsl-distro-alive.service: Job 588 keep-wsl-distro-alive.service/start finished>
Apr 02 15:51:48 systemd[1]: Failed to start Keep WSL Distro Alive.
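
For context on the failure above: systemd only treats a line as a comment when it begins with # or ;, so text after a # on an Exec line is handed to the command as extra arguments. The sketch below writes the same unit with the comment moved to its own line. The file name keep-distro-alive.service and the scratch output directory are assumptions for illustration, and /windows/path/to/waitfor.exe is the placeholder path from the post, not a real location.

```shell
#!/bin/sh
# Write the keep-alive unit with the comment on its own line.
# Assumption: unit_dir defaults to a scratch directory so this can be tried
# safely; a real install would use /etc/systemd/system and need root.
unit_dir=${unit_dir:-$(mktemp -d)}
unit_file=$unit_dir/keep-distro-alive.service

# The quoted 'EOF' keeps the shell from expanding anything inside the unit text.
cat > "$unit_file" <<'EOF'
[Unit]
Description=Keep Distro Alive

[Service]
# Signal first to clean up a previous waitfor instance.
ExecStartPre=/windows/path/to/waitfor.exe /si MakeDistroAlive
ExecStart=/windows/path/to/waitfor.exe MakeDistroAlive

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit_file"
```

Running it with unit_dir unset only produces a file to inspect; nothing is installed or started.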

@biggestsonicfan

I just nuked my wsl instance thinking there was a problem with my distro, as a recent update seemed to have killed KDE's gui. I use wsl to persist a btrfs drive partition, and since redoing my wsl distro, the connection is severed frequently due to wsl shutting down. There was never a custom service needed. I also run SUSE, not ubuntu.

@aki-k

aki-k commented Apr 9, 2024

@biggestsonicfan If you start the WSL 2 instance like this, it will keep on running even if you close all the shells:

C:\>wsl.exe --distribution Ubuntu-22.04 --exec dbus-launch true & wsl.exe --distribution Ubuntu-22.04

@firejox

firejox commented Apr 9, 2024

@biggestsonicfan You can check whether the systemd feature is enabled in your distro. When it is enabled, the init process will be /sbin/init instead of /init. That causes GUI programs to be killed, because they get reparented to /sbin/init. After you close the terminal, there is no /init process left in your running distro, and WSL shuts the distro down automatically because no /init process remains.

@aki-k

aki-k commented Apr 9, 2024

@firejox The WSL 2 instance automatic shutdown is happening because Microsoft decided that's how they wanted it to behave by default. Closing your shells has no effect on the init process. WSL 2 instances run systemd by default now.

@firejox

firejox commented Apr 9, 2024

@aki-k It does matter. In fact, the tmux solution does not work for me. When I ran wsl --exec some-gui, no GUI appeared. That is because the GUI program creates an orphan process; WSL just starts and then shuts down immediately. That is also why I wrote the custom systemd service to spawn a sleeping Windows process.

@aki-k

aki-k commented Apr 9, 2024

@firejox After installing x11-utils in the WSL 2 instance, you can run:

C:\>wsl.exe --distribution Ubuntu-22.04 --exec xev

and xev opens. But this doesn't make any change to how Microsoft has designed the WSL 2 instance auto-shutdown.

@firejox

firejox commented Apr 9, 2024

@aki-k xev proves nothing. I ran omnetpp and no GUI appeared. BTW, I am only explaining which situation leads to the auto-shutdown. My systemd service can be toggled, so it is easy to go back to the auto-shutdown mechanism.

@biggestsonicfan

@biggestsonicfan If you start the WSL 2 instance like this, it will keep on running even if you close all the shells:

C:\>wsl.exe --distribution Ubuntu-22.04 --exec dbus-launch true & wsl.exe --distribution Ubuntu-22.04

Thanks, but I don't want to start my instance like that.

@biggestsonicfan You can check whether the systemd feature is enabled in your distro. When it is enabled, the init process will be /sbin/init instead of /init. That causes GUI programs to be killed, because they get reparented to /sbin/init. After you close the terminal, there is no /init process left in your running distro, and WSL shuts the distro down automatically because no /init process remains.

Do I check the logs for that? ls -l /sbin/init shows lrwxrwxrwx 1 root root 22 Apr 3 00:04 /sbin/init -> ../lib/systemd/systemd

@aki-k

aki-k commented Apr 9, 2024

@biggestsonicfan firejox doesn't know how systemd in WSL 2 and the WSL 2 auto-shutdown work.

@firejox

firejox commented Apr 10, 2024

@biggestsonicfan It means systemd is enabled in your distro. There are several things you can try in your scenario.

  1. Disable systemd in wsl.conf.
    Because your scenario worked before the distro update, some processes were presumably children of /init.
  2. Start an invisible interactive shell on the Windows side.
    potchy's solution does this.
  3. Run a command that keeps an /init process in your system after the interactive shell closes.

For the third option, you can put the command in a profile script, which runs automatically when you start an interactive shell (see what NotTheDr01ds does). If you don't want to spawn a new process every time you start an interactive shell, you can use flock to control it. Here is an example.

# put below in your profile script ( /etc/profile, /etc/profile.d/ ...)
nohup flock -nxF / keep-wsl-alive-script >/dev/null &  # keep-wsl-alive-script can be replaced by `flock -xF / true` due to the double-locking mechanism.
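
To illustrate why the -n (non-blocking) exclusive lock prevents duplicate keep-alive processes, here is a small sketch of two attempts on the same lock. It locks a temporary file rather than / as in the line above; that swap and the variable names are assumptions made so the demo has no side effects.

```shell
#!/bin/sh
# Show that a non-blocking exclusive lock admits only one holder at a time.
lock_file=$(mktemp)

# First holder: open the file on descriptor 9 and take the lock on it.
exec 9>"$lock_file"
flock -n -x 9 && first=locked

# Second attempt: flock opens the file again, which conflicts with the
# lock held through descriptor 9, so -n makes it give up immediately.
if flock -n -x "$lock_file" true; then
    second=started
else
    second=skipped
fi

echo "first=$first second=$second"
```

In a profile script the same effect means that only the first login shell starts the keep-alive command, and later shells skip it.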

If you want to use a systemd service, I recommend spawning a sleeping Windows process from the service. A systemd service spawns its processes as children of systemd rather than /init, so it will not keep WSL alive unless the child process goes through /init.

@aki-k Please give helpful information. Throughout this discussion you keep saying this is by design without explaining how the design works. That matters to people who have the same problem and are looking for a solution.

@biggestsonicfan

biggestsonicfan commented Apr 11, 2024

@biggestsonicfan It means systemd enabled in your distro.

It is not. As you state, systemd requires the systemd=true flag in the [boot] section of /etc/wsl.conf. I do not have this. Further, as I previously stated, I had nuked my WSL instance in an attempt to fix GUI application issues. The issue turned out to be having systemd enabled. With systemd enabled, a few GUI applications could not launch and Dolphin could not open applications. All applications now launch, and Dolphin can open applications again; the kactivitymanagerd service could not run with systemd enabled.

Continuing on, I am running a SUSE distro. All three patterns of wsl_base, wsl_gui, and wsl_systemd are not installed.

To go the absolute furthest step, ps --no-headers -o comm 1 returns init(openSUSE-T, not systemd, and systemctl is-system-running returns offline.

I feel that the suggested "invisible shells" and "heartbeat checks" are only band-aids to a problem which needs a solution.

Unsubscribing from this as I don't think I'm going to get a real answer until I accidentally discover it as a feature pushed upstream.

@bradwilson
Author

Honestly the best way to know whether systemd is enabled is to run ps aux (or whatever the equivalent switches might be in your distro). My non-systemd distro only shows /init, plan9, and the shell I'm running in:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   2280  1524 ?        Sl   11:41   0:00 /init
root         8  0.0  0.0   2280     4 ?        Sl   11:41   0:00 plan9 --control-socket 5 --log-level 4 --server-fd 6 --pipe-fd 8 --log-truncate
root        11  0.0  0.0   2296   116 ?        Ss   11:41   0:00 /init
root        12  0.0  0.0   2296   124 ?        S    11:41   0:00 /init
bradwil+    13  0.0  0.0  15256  9416 pts/0    Ss   11:41   0:00 -bash
bradwil+  2772  0.0  0.0  12676  1552 pts/0    R+   17:14   0:00 ps aux

A systemd distro will show much more:

USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.5  0.0 167056 12780 ?        Ss   17:16   0:00 /sbin/init
root           2  0.0  0.0   2280  1304 ?        Sl   17:16   0:00 /init
root           9  0.0  0.0   2716   384 ?        Sl   17:16   0:00 plan9 --control-socket 6 --log-level 4 --server-fd 7 --pipe-fd 9 --log-truncate
root          38  0.0  0.0  39608 15168 ?        S<s  17:16   0:00 /lib/systemd/systemd-journald
root          56  0.1  0.0  21960  5944 ?        Ss   17:16   0:00 /lib/systemd/systemd-udevd
root          67  0.0  0.0   4492   168 ?        Ss   17:16   0:00 snapfuse /var/lib/snapd/snaps/bare_5.snap /snap/bare/5 -o ro,nodev,allow_other,suid
root          70  0.7  0.0   4756  1756 ?        Ss   17:16   0:00 snapfuse /var/lib/snapd/snaps/core22_864.snap /snap/core22/864 -o ro,nodev,allow_other,suid
root          74  0.0  0.0   4624   168 ?        Ss   17:16   0:00 snapfuse /var/lib/snapd/snaps/gtk-common-themes_1535.snap /snap/gtk-common-themes/1535 -o ro,nodev,allow_ot
root          76  2.1  0.0   4708  1724 ?        Ss   17:16   0:01 snapfuse /var/lib/snapd/snaps/snapd_20290.snap /snap/snapd/20290 -o ro,nodev,allow_other,suid
root          78  1.0  0.0   4956  1700 ?        Ss   17:16   0:00 snapfuse /var/lib/snapd/snaps/ubuntu-desktop-installer_1276.snap /snap/ubuntu-desktop-installer/1276 -o ro,
systemd+     118  0.0  0.0  25532 12728 ?        Ss   17:16   0:00 /lib/systemd/systemd-resolved
root         128  0.0  0.0   4304  2752 ?        Ss   17:16   0:00 /usr/sbin/cron -f -P
message+     130  0.0  0.0   8592  4732 ?        Ss   17:16   0:00 @dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
root         138  0.0  0.0  30100 18548 ?        Ss   17:16   0:00 /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
syslog       143  0.0  0.0 222400  5052 ?        Ssl  17:16   0:00 /usr/sbin/rsyslogd -n -iNONE
root         146  0.3  0.1 2205684 37280 ?       Ssl  17:16   0:00 /usr/lib/snapd/snapd
root         151  0.0  0.0  15324  7252 ?        Ss   17:16   0:00 /lib/systemd/systemd-logind
root         222  0.0  0.0   4780  3348 ?        Ss   17:16   0:00 /bin/bash /snap/ubuntu-desktop-installer/1276/bin/subiquity-server
root         224  0.0  0.0 107228 21368 ?        Ssl  17:16   0:00 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
root         233  0.0  0.0   3236  1116 hvc0     Ss+  17:16   0:00 /sbin/agetty -o -p -- \u --noclear --keep-baud console 115200,38400,9600 vt220
root         235  0.0  0.0   3192  1116 tty1     Ss+  17:16   0:00 /sbin/agetty -o -p -- \u --noclear tty1 linux
root         321  2.2  0.2 154372 69200 ?        Sl   17:16   0:01 /snap/ubuntu-desktop-installer/1276/usr/bin/python3.10 -m subiquity.cmd.server --use-os-prober --storage-ve
root         405  0.0  0.0   7532  4864 pts/1    Ss   17:16   0:00 /bin/login -f
bradwil+     434  0.0  0.0  17040  9260 ?        Ss   17:16   0:00 /lib/systemd/systemd --user
bradwil+     439  0.0  0.0 169976  4616 ?        S    17:16   0:00 (sd-pam)
bradwil+     470  0.0  0.0   6068  5052 pts/1    S+   17:16   0:00 -bash
root         485  1.1  0.1  44204 37536 ?        S    17:16   0:00 python3 /snap/ubuntu-desktop-installer/1276/usr/bin/cloud-init status --wait
root         509  0.0  0.0   2296   120 ?        Ss   17:17   0:00 /init
root         510  0.0  0.0   2296   128 ?        R    17:17   0:00 /init
root         511  0.0  0.0  21960  3388 ?        S    17:17   0:00 /lib/systemd/systemd-udevd
root         512  0.0  0.0  21960  3388 ?        S    17:17   0:00 /lib/systemd/systemd-udevd
bradwil+     513  0.5  0.0   6080  5032 pts/0    Ss   17:17   0:00 -bash
root         514  0.0  0.0  21960  3388 ?        S    17:17   0:00 /lib/systemd/systemd-udevd
root         515  0.0  0.0  21960  3388 ?        S    17:17   0:00 /lib/systemd/systemd-udevd
root         516  0.0  0.0  21960  3388 ?        S    17:17   0:00 /lib/systemd/systemd-udevd
bradwil+     529  0.0  0.0   7480  3236 pts/0    R+   17:17   0:00 ps aux

(These are both Ubuntu 22.04) As previously mentioned, seeing PID 1 being /init (non-systemd) vs. /sbin/init (systemd) is the tip off, even if the extra 30 processes weren't enough. 😂
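
A scripted version of the same check, as a rough sketch: it classifies PID 1 by its command name instead of eyeballing ps aux. The function name classify_init and the three labels are made up for illustration. The patterns are assumptions drawn from the outputs in this thread (systemd for a systemd distro, a name starting with init such as init(openSUSE-T otherwise), so adjust them if your system reports something else.

```shell
#!/bin/sh
# Classify an init process by its command name (comm).
# Assumption: systemd reports "systemd"; the WSL-supplied init reports a
# name beginning with "init", e.g. "init(openSUSE-T".
classify_init() {
    case $1 in
        systemd) echo "systemd" ;;
        init*)   echo "wsl-init" ;;
        *)       echo "unknown" ;;
    esac
}

# On Linux, /proc/1/comm holds the command name of PID 1.
if [ -r /proc/1/comm ]; then
    classify_init "$(cat /proc/1/comm)"
fi
```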

@aki-k

aki-k commented Apr 11, 2024

The issue turned out to be having systemd enabled. With systemd enabled, a few GUI applications could not launch

This is false.

Unsubscribing from this as I don't think I'm going to get a real answer until I accidentally discover it as a feature pushed upstream.

This is a temper tantrum.

@codeart1st

@firejox thanks! Do I put it in /etc/wsl.conf ?

It is a systemd service, so you need to save it as a [service name].service file and put the file under a systemd unit directory, e.g. /etc/systemd/system/. Then you can run these commands.

  • Start the systemd service
sudo systemctl start [service name].service
  • Make it start automatically at the next boot
sudo systemctl enable [service name].service

Autostart (enable) didn't work for me.

keep-distro-alive.service: Failed to execute /mnt/c/Windows/system32/waitfor.exe: Exec format error
keep-distro-alive.service: Failed at step EXEC spawning /mnt/c/Windows/system32/waitfor.exe: Exec format error

But the service itself is fine. Seems like the interop layer is not ready during wsl startup.

@firejox

firejox commented May 4, 2024

@codeart1st It looks like keep-distro-alive.service is loaded before wsl-binfmt.service or systemd-binfmt.service. You can check the startup order with systemd-analyze:

systemd-analyze plot > startup_order.svg

startup_order.svg will show the order in which systemd started all services.

If keep-distro-alive.service is loaded before those binfmt services, you can add this line to your service file under the [Unit] section.

After=wsl-binfmt.service systemd-binfmt.service

This makes keep-distro-alive.service start after the services that set up binfmt.

resources:
https://stackoverflow.com/questions/29309717/is-there-any-way-to-list-systemd-services-in-linux-in-the-order-of-they-were-l
https://stackoverflow.com/questions/21830670/start-systemd-service-after-specific-service

@codeart1st


image

Thank you, you're right.

@everson-plantae

@firejox Your solution works well when there is only one active distribution, but I am working with two; when I start the second one, the process in the first one is interrupted.

@firejox

firejox commented May 21, 2024

@everson-plantae That is because the [signal name] in waitfor.exe [signal name] is shared globally. You need to use a different signal name for each distribution. For example, if you have Fedora and Ubuntu in WSL, you could use waitfor.exe FedoraAlive for Fedora and waitfor.exe UbuntuAlive for Ubuntu. Alternatively, you can use choice.exe with a named pipe to keep the distro alive; choice.exe does not need a distinct [signal name] per distribution.
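
One way to avoid picking names by hand, as a sketch: WSL sets the WSL_DISTRO_NAME environment variable inside each distro, so a per-distro signal name can be derived from it. The helper name signal_name_for and the Alive suffix are illustrative, and stripping everything except letters and digits is a conservative assumption about which characters waitfor.exe accepts in a signal name.

```shell
#!/bin/sh
# Derive a per-distro signal name for waitfor.exe from a distro name.
# Assumption: only letters and digits are safe in a signal name, so every
# other character (dots, dashes, spaces) is dropped.
signal_name_for() {
    clean=$(printf '%s' "$1" | LC_ALL=C tr -cd 'A-Za-z0-9')
    printf '%sAlive\n' "$clean"
}

# Inside WSL, WSL_DISTRO_NAME is set; the fallback is only for trying this elsewhere.
signal_name_for "${WSL_DISTRO_NAME:-Ubuntu-22.04}"
```

With a distro named Ubuntu-22.04 this prints Ubuntu2204Alive, which could then replace MakeDistroAlive in that distro's unit file.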

@astroboylrx

@firejox It seems WSL2 has become more aggressive about shutting itself down.

I added the systemd service suggested above (with the After=wsl-binfmt.service systemd-binfmt.service line). However, after a night, WSL2 still shut down. When I checked Task Manager, the waitfor.exe process was still there... somehow it survived but didn't keep WSL2 alive...

Any suggestions would be greatly appreciated.

@cerebrate

Like I said way back in the thread, the only processes that will keep the WSL session from being auto-terminated are children of the Microsoft init (not the pid 1 init, the WSL-supplied init). No systemd service can be this, so you won't get anywhere with those.

To be clear, here's a pstree:

systemd─┬─Relay(1061)───wait-forever.sh───sleep
        ├─2*[agetty]
        ├─automount───5*[{automount}]
        ├─containerd───12*[{containerd}]
        ├─dbus-daemon
        ├─dockerd───14*[{dockerd}]
        ├─init-systemd(De─┬─SessionLeader───Relay(1936)───machinectl
        │                 ├─SessionLeader───Relay(332257)───sleep
        │                 ├─init───{init}
        │                 ├─sh
        │                 └─{init-systemd(De}
        ├─polkitd───3*[{polkitd}]
        ├─rpc.gssd───{rpc.gssd}
        ├─rpcbind
...etc....

What you are looking for is the Relay(number) parent. A process which has that above it in the tree (here, that would be wait-forever.sh, machinectl, and sleep) will keep the WSL instance running; one that doesn't, won't.
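To check a particular process against the rule above, one could walk its parent chain and look for a Relay ancestor. This is only a sketch: has_relay_ancestor and the PROC_ROOT override are hypothetical names, and it assumes the name pstree prints (for example Relay(1061)) is what /proc/<pid>/comm reports.

```shell
# Hypothetical helper: succeeds if any ancestor of the given PID has a
# process name starting with "Relay".
# PROC_ROOT is an illustrative override so the function can be tried
# against a fake process tree; it defaults to /proc.
has_relay_ancestor() {
    pid="$1"
    proc="${PROC_ROOT:-/proc}"
    while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
        # The "PPid:" line of the status file holds the parent PID.
        ppid=$(awk '/^PPid:/ {print $2}' "$proc/$pid/status" 2>/dev/null)
        if [ -z "$ppid" ]; then
            return 1
        fi
        name=$(cat "$proc/$ppid/comm" 2>/dev/null)
        case "$name" in
            Relay*) return 0 ;;
        esac
        pid="$ppid"
    done
    return 1
}
```

For example, has_relay_ancestor "$(pgrep -o dockerd)" would be expected to fail for the tree shown above, since dockerd hangs directly off systemd.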

nohup-ing or daemonize-ing a waitforever script is still the best option, I believe.
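A minimal sketch of what such a nohup-ed wait-forever process might look like follows. The function name keepalive_start and the pid file path are hypothetical, and tail -f /dev/null is just one of several ways to block forever. Per the explanation above, this would only help when launched from a wsl.exe session (so that it ends up under a Relay parent), not from a systemd unit.

```shell
# Hypothetical helper: start one detached "wait forever" process, unless the
# one recorded in the pid file is still alive.
keepalive_start() {
    pidfile="${1:-/tmp/wsl-keepalive.pid}"
    # Reuse the existing process if its PID is still running.
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        return 0
    fi
    # tail -f /dev/null never exits on its own; nohup detaches it from hangups.
    nohup tail -f /dev/null < /dev/null > /dev/null 2>&1 &
    echo "$!" > "$pidfile"
}
```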

@Parsifa1

I used to use dbus-launch true in my shell config to solve this problem, but since a recent update, WSL automatically closes after one night. Is there any plan to add "wsl never automatically closes" to wslconfig?

@aki-k

aki-k commented Jul 23, 2024

is there any plan to add "wsl never automatically closes" to wslconfig?

After following this saga since the beginning: "No."

@cerebrate

is there any plan to add "wsl never automatically closes" to wslconfig?

#8854 (comment), but since it's got such a simple workaround, I can't imagine it's anywhere near the top of the WSL team's list of possible new features.

@astroboylrx

nohup-ing or daemonize-ing a waitforever script is still the best option, I believe.

@cerebrate I tried nohup-ing and WSL2 (Ubuntu 22.04) still stopped after a night. Previously dbus-launch true was sufficient to keep the distro alive.
Feels like the shutdown did become more aggressive since a very recent update (I think maybe I'm seeing what @Parsifa1 sees).

WSL version: 2.3.11.0
Kernel version: 6.6.36.3-1
WSLg version: 1.0.63
MSRDC version: 1.2.5326
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.26100.1-240331-1435.ge-release
Windows version: 10.0.22635.3930

@darrenchang

Fair points - I think an option to keep distros running (and not idle terminate them) is a good idea.

Are there still plans to implement this?

@codeart1st

codeart1st commented Oct 9, 2024

Small update: on Windows 11 24H2, the waitfor.exe solution no longer works.

Job for keep-distro-alive.service failed because the control process exited with error code.
See "systemctl status keep-distro-alive.service" and "journalctl -xeu keep-distro-alive.service" for details.

I think it's a problem with the PreStart

/mnt/c/Windows/system32/waitfor.exe /si MakeDistroAlive
ERROR: Cannot send the specified signal.

I disabled the PreStart part for now.

@hasancemcerit

This workaround, inspired by other posts such as this one, has been working reliably for me.

I am running some dockerized application stack using compose on kali-linux.
▶ Prerequisite: systemd is enabled

  1. Create bash script keepalive.sh as below.
#!/bin/bash
# This script will run and make kali-linux distro running all the time.

# check systemd and wait until it's up & running
while :; do [[ ! $(systemctl is-system-running --wait 2> /dev/null) ]] && sleep 1 || break; done

# run your docker compose, or whatever you want to run on wsl
# docker compose -f docker-compose.yml --env-file docker.env -p your-project up --no-deps --detach

# check if tmux is in the background, if not start a new session
tmux has-session -t keepalive 2> /dev/null || tmux new -d -s keepalive > /dev/null 2>&1

# check if infinite tail is running, if not start
[[ ! $(ps aux | grep tail | grep -v grep) ]] && nohup sh -c "tail -f /dev/null &" < /dev/null > /dev/null 2> /dev/null
  2. Create a scheduled task that runs on Windows startup.
wsl.exe -d kali-linux --exec bash ./keepalive.sh
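The script's wait-for-systemd loop has no upper bound, so it would spin forever if systemd never comes up. A bounded variant could look roughly like the sketch below; wait_until is a hypothetical helper name and the one-second polling interval is an arbitrary choice.

```shell
# Hypothetical helper: retry a command once per second until it succeeds,
# giving up after the given number of seconds.
# Usage: wait_until SECONDS COMMAND [ARGS...]
wait_until() {
    timeout_s="$1"
    shift
    elapsed=0
    until "$@" > /dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Something like wait_until 60 systemctl is-system-running could then replace the loop. Note that is-system-running exits non-zero for states such as degraded, so a looser probe may fit better depending on the setup.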

Enjoy that your app lives 💓 and will keep on living.

Test it and see for yourself from windows 👀
$❯ wsl -l -v

  NAME          STATE           VERSION
* kali-linux    Running         2

$❯ wsl -d kali-linux --exec tmux ls

keepalive: 1 windows (created Wed Oct  9 08:33:48 2024)
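The state column of the wsl -l -v listing above can also be checked from a script, for example in a monitoring task. The sketch below reads the listing on stdin; distro_state is a hypothetical name, and stripping NUL bytes and carriage returns assumes wsl.exe emits UTF-16LE text with Windows line endings when piped.

```shell
# Hypothetical helper: print the STATE column for one distro.
# stdin: output of `wsl -l -v`; $1: distro name.
# Exits non-zero when the distro is not listed.
distro_state() {
    # Assumption: piped wsl.exe output is UTF-16LE with CRLF line endings,
    # so NUL bytes and carriage returns are removed first.
    tr -d '\000\r' | awk -v name="$1" '
        { sub(/^[* ]+/, "") }              # drop the default-distro marker
        $1 == name { print $2; found = 1 }
        END { exit found ? 0 : 1 }'
}
```

From a WSL shell this might be called as wsl.exe -l -v | distro_state kali-linux.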

From wsl terminal:
$❯ docker ps -a

NAMES      STATE     IMAGE                           NETWORKS    PORTS
app        running   ---                             host
...        running   ---                             app_net   
postgres   running   postgres:17.0-alpine3.20        app_net   
...        running   ---                             app_net      
portainer  running   portainer/portainer-ce:alpine   bridge      
