"Stop Debugging" variant for terminating entire process tree #230850

Open
Birch-san opened this issue Oct 9, 2024 · 7 comments
Assignees
Labels
debug Debug viewlet, configurations, breakpoints, adapter issues feature-request Request for new features or functionality

Comments

@Birch-san

Birch-san commented Oct 9, 2024

image

I'd really like for "Stop Debugging" (or a variant thereof) to terminate the entire process tree.

Debugging multi-process applications is common in Python / PyTorch, but to "Stop" my application I need to press Stop once per subprocess.

I have to press stop over 10 times to kill a realistic program — because each time I press Stop, it then takes me to the next subprocess, and we have subprocesses for dataloader workers, per-GPU model workers, model compilation workers… and it moves my focus every time. I get further and further from the line of code or problem that made me want to Stop the program, and by the end of the carousel my focus is left inside the torchrun wrapper script, which is never where I'm doing development.

the consequence right now is that I avoid as far as possible running my application in a realistic way. I reduce processes until it's down to just 3, so that I only have to press stop 3 times to terminate the program.

pressing "Restart" basically doesn't work, because a new process cannot be started until all processes from the current run have been killed. so for my most common task "run until I hit a problem, hit restart": I have to hit Stop 3 times then Run, when what I really want is to just hit restart once.

possible solutions:

  • additional stop button (super stop, etc)
  • little chevron next to stop button, which expands with a dropdown of other types of stop (which can be otherwise accessed via commands / keyboard shortcuts)
  • no GUI, just a command that I could look for / keybind
  • the same applies to the "restart" button

I think I'd want gentle termination of the subprocesses, but to disconnect the debugger so I don't have to watch the shutdown procedure and all the exceptions that get raised along the way.

thanks for any consideration you can give this!

@Birch-san
Author

Birch-san commented Oct 9, 2024

one challenge could be "how do you kill a process tree" / "is it sufficient to just kill the parent".

on POSIX systems (macOS, Linux, etc) I think this is just:

  • send a SIGTERM to the parent
    • if (and only if) it's well-behaved, it'll send SIGTERM to its children and wait for them to terminate
    • in practice we should be prepared for a less-well-behaved process
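  • if parent dies before children: children get orphaned and re-parented (normally to PID 1); they keep running until they exit or are signalled
  • reaping orphaned processes once they exit is the responsibility of PID 1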
  • a normal POSIX OS should have an init service running on PID 1, which will reap orphaned processes
  • a container does not necessarily run an init service on PID 1, but it is best-practice to do so (e.g. via docker run --init or installing something like tini or s6-overlay as the container's ENTRYPOINT)
    • I'm not sure who runs VSCode in a container; I'd count this as an edge-case
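A minimal Python sketch of the POSIX approach described above, under one added assumption: the debuggee is launched in its own session (`start_new_session=True`), so its PID doubles as a process group ID and a signal can reach the children even when the parent does not forward it. The names `signal_group` and `terminate_tree` are hypothetical helpers for illustration, not part of VS Code or debugpy.

```python
import os
import signal
import subprocess
import time


def signal_group(pgid: int, sig: int) -> bool:
    """Send `sig` to every process in the group; False if none could be signalled."""
    try:
        os.killpg(pgid, sig)
        return True
    except (ProcessLookupError, PermissionError):
        return False


def terminate_tree(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask a whole process group to exit with SIGTERM, then SIGKILL stragglers.

    Assumption: `proc` was started with start_new_session=True, so proc.pid
    is also the ID of the process group that its descendants inherit.
    """
    pgid = proc.pid
    deadline = time.monotonic() + grace_seconds

    # Gentle request first, delivered to the parent and its children alike.
    signal_group(pgid, signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        pass

    # Signal 0 only probes for existence; give remaining members the rest
    # of the grace period to finish shutting down.
    while time.monotonic() < deadline and signal_group(pgid, 0):
        time.sleep(0.05)

    # Anything still in the group is not well-behaved: force it.
    signal_group(pgid, signal.SIGKILL)
    return proc.wait()
```

Usage would look like `proc = subprocess.Popen(cmd, start_new_session=True)` followed by `terminate_tree(proc)`. A child that moves itself into a different session or process group escapes this approach, so it is a sketch of the common case rather than a complete answer.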

on Win32 systems emulating POSIX semantics via Cygwin/MSYS2, killing process trees gently (i.e. emulating POSIX's SIGTERM semantics) is a bit different… I took some notes on this a while ago (.pptx, .pdf):

  • Non-Cygwin application run in mintty terminal experiences problems
    • Ctrl+C (SIGINT) is equivalent to ConHost ConsoleCtrl event
    • SIGKILL can be emulated by calling TerminateProcess() on each process in tree
    • MSYS2 injects remote threads into processes, invokes such calls from within
    • SIGTERM has no equivalent on Windows… necessary for atexit() handlers
    • ExitProcess() on each process in the tree is close, but missed by some
    • Kernel32.dll!CtrlRoutine() is what they’re expecting. Non-exported symbol!
    • Address of private symbol found by inspecting stack when Ctrl+Cing a console
    • Send CtrlRoutine to non-Cygwin processes in tree, Cygwin SIGTERM otherwise
  • 2018 April: finally figured out how to emulate Ctrl+C for native apps

and found these to be useful references regarding "how do you emulate SIGTERM on Windows":

https://stackoverflow.com/questions/48199794/winpty-and-git-bash
mintty/mintty#56
mintty/mintty#376
msys2/MINGW-packages#2645 (comment)
https://github.com/mintty/mintty/wiki/Tips#inputoutput-interaction-with-alien-programs
https://stackoverflow.com/questions/44788982/node-js-ctrl-c-doesnt-stop-server-after-starting-server-with-npm-start/51078011#51078011
nodejs/node#16103
git-for-windows/MSYS2-packages@f4fda0f

but maybe I'm overthinking this, or maybe a simple solution can be developed for POSIX systems before confronting "what to do on Windows". you also have access to the terminal, which probably emulates these semantics already; you could initiate a termination from the terminal, then detach the debugger from all processes.

@rebornix rebornix assigned connor4312 and unassigned roblourens Dec 16, 2024
@rebornix rebornix added the debug Debug viewlet, configurations, breakpoints, adapter issues label Dec 16, 2024
@connor4312 connor4312 added the feature-request Request for new features or functionality label Dec 16, 2024
@vs-code-engineering vs-code-engineering bot added this to the Backlog Candidates milestone Dec 16, 2024

This feature request is now a candidate for our backlog. The community has 60 days to upvote the issue. If it receives 20 upvotes we will move it to our backlog. If not, we will close it. To learn more about how we handle feature requests, please see our documentation.

Happy Coding!

@connor4312
Member

The Python debugger can also set lifecycleManagedByParent on the debug options of child sessions to have them all behave as a tree, cc @eleanorjboyd

@Birch-san
Author

thanks for that clue; it looks like there was a previous effort to implement that, but it seems issue 1320 never got resolved.
microsoft/vscode-python#20376
microsoft/debugpy#1320

@eleanorjboyd
Member

Hi! As you are debugging you should be able to click into the call stack (in the run and debug panel), find the main process and if you cancel that one it should cancel all of the subprocesses too. Does that work for you?

@Birch-san
Author

Birch-san commented Jan 7, 2025

hi @eleanorjboyd, I've just now tried that and it does appear to work, so that's nice. and even the re-run button on the main process seems to work well.
whereas if I don't focus the main process, terminate has to be clicked 4 times.

but it's a bit annoying focusing the main process since the list keeps moving (the torch compiler gradually creates subprocesses, one each second) and depending on how large the pane is, it can require scrolling.

Image

from an accessibility point of view, it involves aiming at a small target which isn't always in the same place (as it may be scrolled out of view). there are a few steps required to interact with it, but this is an interaction I will do 100% of the time I run my program.

it could be nice for the playback palette here to have two rows of buttons, one which communicates with the main process and one which communicates with the focused thread:
Image

because as it stands: the playback palette's UX has been designed so you can put it in a known, convenient location. I'm sure this was done because it's important! but if its functionality doesn't actually terminate my process, I don't get that convenience. a workaround that makes me click something harder to reach defeats the point.

====

and it'd also be nice to have some kind of process filtering, for me to tell VSCode which python subprocesses I want the debugger attached to.
for example, I usually don't want to attach a debugger to a compiler worker or to a dataloader worker.
and sometimes I don't want to attach it to the main process either.
torchrun is a wrapper script that creates a subprocess of my program per GPU that I have; I usually only want debuggers attached to these direct-child subprocesses of torchrun, as that's where the code I wrote resides. in fact I often only want debugger attached to rank 0 of those!
a way to specify "which subprocesses we do/don't attach the debugger to" would be really helpful!
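Until something like that exists, one possible workaround is to make the decision inside the program. The sketch below is an assumption-laden illustration, not an existing VS Code or debugpy feature: it relies on the `RANK` environment variable that torchrun sets for each worker, and the helper names `should_attach_debugger` and `maybe_listen`, the base port 5678, and the one-port-per-rank scheme are all hypothetical. Because child processes inherit `RANK`, it should be called once from the training script's entry point rather than at import time.

```python
import os
from typing import Collection, Mapping


def should_attach_debugger(
    environ: Mapping[str, str],
    ranks: Collection[int] = frozenset({0}),
) -> bool:
    """Return True only for a torchrun worker whose RANK is in `ranks`.

    A process without a usable RANK (for example the torchrun wrapper
    itself) is never selected.
    """
    try:
        rank = int(environ["RANK"])
    except (KeyError, ValueError):
        return False
    return rank in ranks


def maybe_listen(base_port: int = 5678) -> None:
    """Open a debugpy server in selected workers and wait for an attach.

    Assumptions: debugpy is installed, and each selected rank listens on
    base_port + rank so that ports do not collide.
    """
    if not should_attach_debugger(os.environ):
        return
    import debugpy  # imported lazily so unselected processes never need it

    rank = int(os.environ["RANK"])
    debugpy.listen(("127.0.0.1", base_port + rank))
    debugpy.wait_for_client()  # blocks until a debugger attaches
```

This only controls which process opens a debug server; it does not by itself stop a debugger that follows subprocesses from attaching to children of the selected worker.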

@eleanorjboyd
Member

@connor4312, what are your thoughts on this UX? Do you see others face this for js/ts debugging?

Projects
None yet
Development

No branches or pull requests

5 participants