child_process kill() swallows synchronously known problems (does not error out) #30668
Comments
For your own reference, here is the code for At least one thing that jumps out is that
Thanks @cjihrig. I modified my repro to check for the return value. In my repro
It's a bit sad that the implementation does not expose the error specifics ( Given the current implementation we can say that
It's not hitting the
Gotcha. For the record, that confirms what I wrote above:
That is, I can now state my proposal more specifically: raise an explicit Error in that case, indicating to the programmer that there is no child process to manage (nothing to call kill() on).
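For illustration, the proposal can be approximated in user code today. The following is only a sketch, assuming the boolean return value of subprocess.kill() discussed in this thread; killOrThrow and the command name are hypothetical and not part of Node.js.

```javascript
'use strict';
const { spawn } = require('child_process');

// Hypothetical helper (not Node.js API): turn a `false` return value from
// subprocess.kill() into an exception instead of ignoring it.
function killOrThrow(child, signal = 'SIGTERM') {
  const delivered = child.kill(signal);
  if (!delivered) {
    throw new Error(
      `kill(${signal}) was not delivered: no child process to signal ` +
      `(pid: ${child.pid})`
    );
  }
  return delivered;
}

// Demo: spawn a command that is assumed not to exist, then try to kill it
// once the startup error has been reported through the 'error' event.
const child = spawn('no-such-command-30668');
child.on('error', (err) => {
  console.log('startup error:', err.code);
  try {
    killOrThrow(child);
  } catch (e) {
    console.log('kill failed loudly:', e.message);
  }
});
```

A wrapper like this only surfaces what kill() already reports. It cannot say why the signal was not delivered, which is the missing detail lamented above.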
This commit documents the return value from subprocess.kill().
PR-URL: nodejs#30669
Refs: nodejs#30668
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment. For more information on how the project manages feature requests, please consult the feature request management document.
There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment. For more information on how the project manages feature requests, please consult the feature request management document.
See https://gist.github.com/jgehrcke/ab4656353c1155173d2dde5ffceb0d0b for repro code. The output when executing this:
The log output with millisecond resolution shows the chronological order of events.
In the repro code I use an asynchronous startup error detection technique, using the "error" event. In the log output we can see that this event is being caught, indicating ENOENT (command/executable not found, as desired in this repro). That's great.
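The asynchronous detection technique described above could look roughly like the following. This is a minimal sketch, not the code from the linked gist; trackStartupError and the command name are hypothetical names chosen for illustration.

```javascript
'use strict';
const { spawn } = require('child_process');

// Hypothetical helper (not Node.js API): remember the first 'error' event
// so that later code can check whether startup failed.
function trackStartupError(child) {
  const state = { error: null };
  child.on('error', (err) => {
    if (state.error === null) {
      state.error = err;
    }
  });
  return state;
}

// Demo with a command name that is assumed not to exist on this machine.
const child = spawn('no-such-command-30668');
const startup = trackStartupError(child);

// The 'error' event is emitted asynchronously, so inspect the state later.
setTimeout(() => {
  if (startup.error) {
    console.log('startup failed:', startup.error.code);
  } else {
    console.log('no startup error seen so far');
    child.kill();
  }
}, 100);
```

The helper takes anything that emits 'error' events, so it can be exercised with a plain EventEmitter in tests.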
After this, one can still call the kill() method on the process. The call "succeeds", silently swallowing a problem. The problem, as I would put it into words: "there is no process that you can kill here". That problem should not be silently swallowed, because it makes it too easy to write code with race conditions.
I believe that this should be considered a bug in Node.js: this is a programmer error, i.e. this should throw an Error, as it would in other programming environments. If unhandled, this should crash the code, indicating to the programmer that their assumption that the process is alive was wrong. The programmer should be required to explicitly handle an error thrown by kill().
It's not necessary to demonstrate the struggle, but maybe it makes it easier to understand: the repro code calls kill() another time, about 1 second after the startup error had happened. That kill() also swallows the problem.
I don't know if internally the kill() method is just a noop in this case (where the runtime knows that there is no PID to issue a kill() system call to), or if it actually calls the system call and then swallows the ENOENT. But in both cases it makes a conscious choice, knows about the absence of the process, and hides the erroneous attempt from the programmer.
There could be two ways to check for the error synchronously:
- The kill() system call does fail with ENOENT, if it is even executed.
- If the kill() system call is not executed, then the runtime seems to have internal state about the fact that the process is not there (I am pretty sure that this is what's happening, see my code comments in the repro code). That state could be used.
It is documented that
If this is meant to be the only reliable, documented way to find out that a kill() failed (not quite clear from the documentation), then I think the repro code is also quite insightful: in the repro code that event handler is not called after the erroneous kill(). That's in fact proven by the additional 1-second wait, during which I would expect that handler to be called.
In the repro code comments I have pointed out that the runtime magically detects that the child process is gone, and it terminates the code prematurely, to prevent an indefinitely long wait from happening.
That is, I think this issue mainly reveals a rather mean inconsistency where the runtime sometimes seems to consider the fact that there is no process, and sometimes it doesn't, making it difficult to write robust child process management code.
Note: currently there does not seem to be a documented synchronous way to detect a child process startup error (upon common system call errors such as ENOENT and EACCES), and no documented synchronous way to check for process "liveness", although both could be done synchronously with "fast" system calls. This might be strongly related to this topic here. Related: eclipse-theia/theia#3447 and nodejs/help#1191.
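As a rough user-land approximation of such a synchronous check, one can inspect public ChildProcess properties. This is only a heuristic sketch: it assumes that subprocess.pid is undefined when spawning failed and that subprocess.exitCode and subprocess.signalCode are null while the child has not been seen to exit. looksAlive and the command name are hypothetical, and the answer can be stale by the time the caller acts on it.

```javascript
'use strict';
const { spawn } = require('child_process');

// Hypothetical helper (not Node.js API). Heuristic only: it reads public
// ChildProcess properties and does not issue any system call itself.
function looksAlive(child) {
  return child.pid !== undefined &&
         child.exitCode === null &&
         child.signalCode === null;
}

// Demo with a command name that is assumed not to exist on this machine.
const child = spawn('no-such-command-30668');
child.on('error', (err) => {
  console.log('async error event:', err.code);
});

// Checked synchronously, before the 'error' event has been emitted.
console.log('looks alive right after spawn():', looksAlive(child));
```

This does not remove the race described in the issue: a child can exit between the check and a later kill().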