Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Console process order changed in 15063's v2 console making npm harder to close #2170

Closed
rprichard opened this issue May 29, 2017 · 7 comments

Comments

@rprichard
Copy link

In 15063's v1 console, and any console in Windows 10.0.14393 and older, Windows seems to order the processes from newest-to-oldest for GetConsoleProcessList (probably the opposite of the attachment order). When a console is closed, Windows considers each process, in order, and delivers it a CTRL_CLOSE_EVENT event. Windows gives the process 5 seconds to exit, after which point it terminates the process.

In 15063's v2 console, the order of the process list has reversed and is now oldest-to-newest. Maybe the change was intentional? Regardless, it affects npm (the node package manager), as detailed here: microsoft/vscode#26807 (comment).

The test case given on microsoft/vscode#26807 was fairly easy for me to reproduce, and it doesn't require VSCode. Installing node.js adds node.bat and npm.cmd to the PATH. Run the commands in an ordinary console, and instead of "pressing the garbage bin button", try to close the console window. In my experience, the first click leaves the youngest node.exe process running, and the second click finishes the job. (VSCode's "garbage bin button" eventually turns into winpty posting a WM_CLOSE message, which kills winpty-agent.exe. The console window survives, but it's hidden, so the node.exe is leaked.)

Starting in Windows 8, when the console is delivering CTRL_CLOSE_EVENT events, it seems to abort the operation if a process exits on its own accord. Windows 7, on the other hand, skips over dead processes.

The combination of the 15063 v2 change and the Win8 change breaks npm, which has a process tree like so:

  • A: node.exe (npm run, owns Job1)
    • B: cmd.exe (assigned to Job1)
      • C: node.exe (tsc --watch)

Windows delivers CTRL_CLOSE_EVENT to A(node.exe). node.exe turns it into a libuv-emulated SIGHUP signal, which is sent to Node's main thread and ignored. After 5 seconds, Windows kills A(node.exe). Killing A(node.exe) destroys Job1, which destroys B(cmd.exe). Apparently the console then aborts the close operation because B(cmd.exe) is missing.

Previously, Windows signaled C(node.exe) first, and after C exited, the other two processes would definitely exit.

Also: There seems to be a race condition between process cleanup (in the kernel?) and the close event signaling (in conhost.exe?), and sometimes the console close operation is able to skip over already-dead processes.

I wrote a test program demonstrating things discussed in this report -- closetest.cc, https://gist.github.com/rprichard/7ec3fe1b199f513bee82fea196a82a79. It has notes at the top of the file and a --help option.

  • To see the change in signaling order (EXAMPLE 1 in closetest.cc), run Sysinternals dbgview.exe, then closetest.exe without arguments.

  • To create a console that's hard to close (EXAMPLE 2), run closetest.exe -d alternate --gap -n N. In Windows 8 and up, you'll have to close the console N + 1 times. In Vista or Windows 7, just once is enough.

  • To see the race condition (EXAMPLE 3), try:

    • closetest.exe -d [backward/forward] -m job -n 10 --alloc 1, or
    • closetest.exe -d [backward/forward] -m job -n 50

    On a v1 or <= 14393 console, use -d backward. On the v2 15063 console, use -d forward.

  • The closest analog to npm's issue is closetest.exe -d forward -m job -n 2 --alloc 100. With the 15063 v2 console, this command usually takes two Close clicks to destroy the console.

Windows build number: 10.0.15063.332

@miniksa
Copy link
Member

miniksa commented Jul 3, 2017

This wasn't intentional and I'm making a change to put the process order back to how it was.

Basically, what happened was that I decided to use std::list instead of NT Driver LIST_ENTRY for the process list.

Somehow the "InsertHeadList" command from the NT LIST_ENTRY structures got converted into a .push_back instead of a .push_front in the new std::list usage. Probably my fault. Probably my brain screwed up. Sorry about that.

The fix should make it into Insiders builds within the next few weeks. Thanks for your report!

@miniksa miniksa self-assigned this Jul 3, 2017
@miniksa
Copy link
Member

miniksa commented Jul 7, 2017

Hey @rprichard,

I'm interested in using your code sample above at https://gist.github.com/rprichard/7ec3fe1b199f513bee82fea196a82a79 as a part of our test suites to prevent further regressions in this area, but I didn't see any licensing information. Would you be able to provide the code under an open source license? Generally we prefer a permissive license such as MIT, but please choose whatever license fits best for you. You can take a look at https://choosealicense.com/ if you need help choosing a license.

Thanks,
--Michael

@rprichard
Copy link
Author

@miniksa That sounds good to me. I'm traveling at the moment, but I'll add an MIT license file once I have the chance.

@rprichard
Copy link
Author

@miniksa I updated the Gist by adding an MIT License block at the front (revisions 2 and then 3).

@miniksa
Copy link
Member

miniksa commented Jul 10, 2017

@rprichard Thank you very much!

@benhillis
Copy link
Member

Fixed in 16257.

@bitcrazed
Copy link
Contributor

This issue was moved to microsoft/terminal#95

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants