Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shutdown node in-flight #21283

Merged
merged 1 commit into from
Mar 16, 2019
Merged

Conversation

gireeshpunathil
Copy link
Member

An identified use case for embedders is their ability to tear down Node while it is still running (event loop contain pending events)

Here the assumptions are that:

  • embedders do not wish to resort to JS routines to initiate shutdown
  • embedders have the Environment handle handy
  • embedders stop Node through a second thread.

Fixes: #19365
Refs: nodejs/user-feedback#51

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot
Copy link
Collaborator

@gireeshpunathil sadly an error occured when I tried to trigger a build :(

@nodejs-github-bot nodejs-github-bot added the c++ Issues and PRs that require attention from people who are familiar with C++. label Jun 12, 2018
@gireeshpunathil gireeshpunathil added shared_lib Issues and PRs related to use of Node.js as a shared library. embedding Issues and PRs related to embedding Node.js in another project. labels Jun 12, 2018
@addaleax
Copy link
Member

Why is this dependent on __POSIX__ or Node being a shared library?

Also: Do we want to enable breaking out of the currently execution JS code? It might be nice to work towards something similar to Worker::Exit()

src/node.cc Outdated Show resolved Hide resolved
@gireeshpunathil
Copy link
Member Author

@addaleax - thanks for the review.

Why is this dependent on POSIX or Node being a shared library?

On an assumptions that (a) only embedders need this feature, (b) embedders always use or want to use node as a shared library, and (c) the shared library is only available on POSIX platforms.

Reviewing the code, at least (c) is not true, but unsure on (a) and (b). What do you propose?

opening this up on the regular path has the side effect of hosting an extra async handle in the event loop all the time. But your below suggestion may invalidate that I guess?

It might be nice to work towards something similar to Worker::Exit()…

Sure, thanks - I will see that.

@addaleax
Copy link
Member

@gireeshpunathil b) isn’t true either. If you’re an embedder, at this point you’re most likely compiling a custom version of node yourself and linking it statically (e.g. electron).

I don’t know about a), but then again I’m not sure we need to restrict it to that?

@gireeshpunathil
Copy link
Member Author

@addaleax - thanks for the guidance, let me go back and do more study on this - to set the semantic straight.

@addaleax
Copy link
Member

@gireeshpunathil Thinking about it a bit more … right now, the event loop stopping mechanism for workers piggybacks on the message passing mechanism, grep around for e.g. stop_event_loop_. Maybe refactoring that into its own uv_async_t/HandleWrap would be a good start, so that we could reuse it for the main loop?

@gireeshpunathil
Copy link
Member Author

@addaleax - some gaps in my understanding, please help with fill those:

  1. uv_stop - I see it as just setting a flag which a subsequent loop iteration will take notice of and exit the loop. Is it thread safe? the current design is to call it from the owner thread of the loop, right? In worker case too, it is only invoked by the owner thread. If we are forcing Node to stop in the middle, we don't have a handle to the thread to invoke uv_stop, it is right now running arbitrary code sequence, am I making sense here?

  2. Your suggestion to refactor stop_event_loop_ based mechanism to HandleWrap mechanism: This is how I am seeing it, let me know if we are in sync:

  • When a worker is created, create and start a HandleWrap object too, in its event loop.
  • Cache this Handle in a shared space so that main (or any) thread can access this.
  • When some one wants to stop a worker, just send an interrupt through the handlewrap object.
  • Convert this mechanism into an API so that main thread can re-use it too.

Please let me know.

@jasnell
Copy link
Member

jasnell commented Sep 10, 2018

What's the status on this one?

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Sep 10, 2018
@gireeshpunathil
Copy link
Member Author

@jasnell - some open questions that I will need to find answers. stalled label looks good for now.

@gireeshpunathil
Copy link
Member Author

I was looking at the possibility of uv_stop semantics to force-close the loop, but realized it is not possible:

The main loop never shuts down while there are active handles in it.
The uv_stop only breaks out of the libuv for the current iteration, with the live handles in tact.
Once it comes out, it resets the stop_flag:
https://github.com/libuv/libuv/blob/8865e72e25fc8aba94da7c9e18c3b5629d288ecc/src/unix/core.c#L386-L387
Back in node::Start, this does not have any effect and the loop continues.

node/src/node.cc

Lines 794 to 797 in 7182aca

more = uv_loop_alive(env.event_loop());
if (more)
continue;

Workers indeed breaks out fully; but leaves the handles open:

node/src/node_worker.cc

Lines 206 to 214 in 7182aca

if (is_stopped()) break;
uv_run(&loop_, UV_RUN_DEFAULT);
if (is_stopped()) break;
platform->DrainTasks(isolate_);
more = uv_loop_alive(&loop_);
if (more && !is_stopped())
continue;

So in effect:

  • uv_stop does not causse exit of the main event loop
  • uv_stop (if modified to exit fully) causes handle leak, if re-enterd by embedders

may be we should revisit uv_walk and uv_xx_close or uv_stop followed by uv_close ? the former was earlier assessed to be unsafe.

@addaleax
Copy link
Member

addaleax commented Feb 8, 2019

@gireeshpunathil What do you mean by handle leak? All handles associated with an Environment are cleaned up when the Environment is released, so that should be okay, right?

@gireeshpunathil
Copy link
Member Author

@addaleax - o, yes. I see!

So that means the only issue is to gracefully shutdown the main loop. In theory, we can exit if we use a flag that the stopper thread sets, and the main thread sees in the run loop to break out. But may be there is a better way?

@addaleax
Copy link
Member

addaleax commented Feb 8, 2019

@gireeshpunathil I still think refactoring out the approach we currently use for Workers is the best way here (I think I mentioned that above); we use a combination of an uv_async_t, whose handler calls uv_stop(), and isolate->TerminateExecution().

@gireeshpunathil
Copy link
Member Author

sure @addaleax, will give it a try

@mhdawson
Copy link
Member

Just a note that this is waiting on #26099

gireeshpunathil added a commit to gireeshpunathil/node that referenced this pull request Mar 1, 2019
The current mechanism of uses two async handles, one owned by the
creator of the worker thread to terminate a running worker,
and another one employed by the worker to interrupt its creator on its
natural termination. The force termination piggybacks on the message-
passing mechanism to inform the worker to quiesce.

Also there are few flags that represent the other thread's state /
request state because certain code path is shared by multiple
control flows, and there are certain code path where the async
handles may not have come to life.

Refactor into a LoopStopper abstraction that exposes routines to
install a handle as well as to save a state.

Refs: nodejs#21283
addaleax pushed a commit that referenced this pull request Mar 1, 2019
The current mechanism of uses two async handles, one owned by the
creator of the worker thread to terminate a running worker,
and another one employed by the worker to interrupt its creator on its
natural termination. The force termination piggybacks on the message-
passing mechanism to inform the worker to quiesce.

Also there are few flags that represent the other thread's state /
request state because certain code path is shared by multiple
control flows, and there are certain code path where the async
handles may not have come to life.

Refactor into an AsyncRequest abstraction that exposes routines to
install a handle as well as to save a state.

PR-URL: #26099
Refs: #21283
Reviewed-By: Anna Henningsen <anna@addaleax.net>
@gireeshpunathil gireeshpunathil removed the stalled Issues and PRs that are stalled. label Mar 1, 2019
@gireeshpunathil
Copy link
Member Author

resuming from stalled state. I would rebase this, but before that would link to have a thought on the logistics:

  • where do we accommodate the AsyncRequest object, that can be accessed by a foreign thread - Environment? it makes sense as we are talking about tearing the main environment here.
  • Process ? this also make sense, but may be that object is already an overloaded assortment?
  • NodePlatform ? but then is there an easy way for the embedder to reach out to it?
    /cc @addaleax @nodejs/embedders

@addaleax
Copy link
Member

addaleax commented Mar 1, 2019

@gireeshpunathil I think Environment makes the most sense, yes.

Copy link
Member

@mhdawson mhdawson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with @richardlau 's comments.

@gireeshpunathil gireeshpunathil force-pushed the nodestop branch 2 times, most recently from 4bcdb27 to c495f43 Compare March 16, 2019 06:11
@gireeshpunathil
Copy link
Member Author

@gireeshpunathil
Copy link
Member Author

This commit introduces a `node::Stop()` API.

An identified use case for embedders is their ability to tear down
Node while it is still running (event loop contain pending events)

Here the assumptions are that (i) embedders do not wish to resort to
JS routines to initiate shutdown (ii) embedders have the Environment
handle handy. (iii) embedders stop Node through a second thread.

Fixes: nodejs#19365
Refs: nodejs/user-feedback#51

PR-URL: nodejs#21283
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Michael Dawson <Michael_Dawson@ca.ibm.com>
@gireeshpunathil
Copy link
Member Author

landed as d35af56

addaleax added a commit to addaleax/node that referenced this pull request Mar 17, 2019
This test was broken by d35af56.

Refs: nodejs#21283
Fixes: nodejs#26712
richardlau pushed a commit that referenced this pull request Mar 18, 2019
This test was broken by d35af56.

Refs: #21283
Fixes: #26712

PR-URL: #26713
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
targos pushed a commit to targos/node that referenced this pull request Mar 27, 2019
This commit introduces a `node::Stop()` API.

An identified use case for embedders is their ability to tear down
Node while it is still running (event loop contain pending events)

Here the assumptions are that (i) embedders do not wish to resort to
JS routines to initiate shutdown (ii) embedders have the Environment
handle handy. (iii) embedders stop Node through a second thread.

Fixes: nodejs#19365
Refs: nodejs/user-feedback#51

PR-URL: nodejs#21283
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Michael Dawson <Michael_Dawson@ca.ibm.com>
targos pushed a commit to targos/node that referenced this pull request Mar 27, 2019
This test was broken by d35af56.

Refs: nodejs#21283
Fixes: nodejs#26712

PR-URL: nodejs#26713
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
targos pushed a commit that referenced this pull request Mar 27, 2019
This commit introduces a `node::Stop()` API.

An identified use case for embedders is their ability to tear down
Node while it is still running (event loop contain pending events)

Here the assumptions are that (i) embedders do not wish to resort to
JS routines to initiate shutdown (ii) embedders have the Environment
handle handy. (iii) embedders stop Node through a second thread.

Fixes: #19365
Refs: nodejs/user-feedback#51

PR-URL: #21283
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Michael Dawson <Michael_Dawson@ca.ibm.com>
targos pushed a commit that referenced this pull request Mar 27, 2019
This test was broken by d35af56.

Refs: #21283
Fixes: #26712

PR-URL: #26713
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
targos added a commit that referenced this pull request Mar 28, 2019
Notable changes:

* crypto
  * Allow deriving public from private keys (Tobias Nießen)
    [#26278](#26278).
* events
  * Added a `once` function to use `EventEmitter` with promises
    (Matteo Collina) [#26078](#26078).
* tty
  * Added a `hasColors` method to `WriteStream` (Ruben Bridgewater)
    [#26247](#26247).
  * Added NO_COLOR and FORCE_COLOR support (Ruben Bridgewater)
    [#26485](#26485).
* v8
  * Added `v8.getHeapSnapshot` and `v8.writeHeapSnapshot` to generate snapshots
    in the format used by tools such as Chrome DevTools (James M Snell)
    [#26501](#26501).
* worker
  * Added `worker.moveMessagePortToContext`. This enables using MessagePorts in
    different vm.Contexts, aiding with the isolation that the vm module seeks to
    provide (Anna Henningsen)
    [#26497](#26497).
* C++ API
  * `AddPromiseHook` is now deprecated. This API was added to fill an use case
    that is served by `async_hooks`, since that has `Promise` support
    (Anna Henningsen) [#26529](#26529).
  * Added a `Stop` API to shut down Node.js while it is running
    (Gireesh Punathil) [#21283](#21283).
* meta
  * [Gireesh Punathil](https://github.com/gireeshpunathil) is now a member of
    the Technical Steering Committee
    [#26657](#26657).
  * Added [Yongsheng Zhang](https://github.com/ZYSzys) to collaborators
    [#26730](#26730).

PR-URL: #26949
targos added a commit that referenced this pull request Mar 28, 2019
Notable changes:

* crypto
  * Allow deriving public from private keys (Tobias Nießen)
    [#26278](#26278).
* events
  * Added a `once` function to use `EventEmitter` with promises
    (Matteo Collina) [#26078](#26078).
* tty
  * Added a `hasColors` method to `WriteStream` (Ruben Bridgewater)
    [#26247](#26247).
  * Added NO_COLOR and FORCE_COLOR support (Ruben Bridgewater)
    [#26485](#26485).
* v8
  * Added `v8.getHeapSnapshot` and `v8.writeHeapSnapshot` to generate snapshots
    in the format used by tools such as Chrome DevTools (James M Snell)
    [#26501](#26501).
* worker
  * Added `worker.moveMessagePortToContext`. This enables using MessagePorts in
    different vm.Contexts, aiding with the isolation that the vm module seeks to
    provide (Anna Henningsen)
    [#26497](#26497).
* C++ API
  * `AddPromiseHook` is now deprecated. This API was added to fill an use case
    that is served by `async_hooks`, since that has `Promise` support
    (Anna Henningsen) [#26529](#26529).
  * Added a `Stop` API to shut down Node.js while it is running
    (Gireesh Punathil) [#21283](#21283).
* meta
  * [Gireesh Punathil](https://github.com/gireeshpunathil) is now a member of
    the Technical Steering Committee
    [#26657](#26657).
  * Added [Yongsheng Zhang](https://github.com/ZYSzys) to collaborators
    [#26730](#26730).

PR-URL: #26949
BethGriggs pushed a commit that referenced this pull request Apr 5, 2019
Notable changes:

* crypto
  * Allow deriving public from private keys (Tobias Nießen)
    [#26278](#26278).
* events
  * Added a `once` function to use `EventEmitter` with promises
    (Matteo Collina) [#26078](#26078).
* tty
  * Added a `hasColors` method to `WriteStream` (Ruben Bridgewater)
    [#26247](#26247).
  * Added NO_COLOR and FORCE_COLOR support (Ruben Bridgewater)
    [#26485](#26485).
* v8
  * Added `v8.getHeapSnapshot` and `v8.writeHeapSnapshot` to generate snapshots
    in the format used by tools such as Chrome DevTools (James M Snell)
    [#26501](#26501).
* worker
  * Added `worker.moveMessagePortToContext`. This enables using MessagePorts in
    different vm.Contexts, aiding with the isolation that the vm module seeks to
    provide (Anna Henningsen)
    [#26497](#26497).
* C++ API
  * `AddPromiseHook` is now deprecated. This API was added to fill an use case
    that is served by `async_hooks`, since that has `Promise` support
    (Anna Henningsen) [#26529](#26529).
  * Added a `Stop` API to shut down Node.js while it is running
    (Gireesh Punathil) [#21283](#21283).
* meta
  * [Gireesh Punathil](https://github.com/gireeshpunathil) is now a member of
    the Technical Steering Committee
    [#26657](#26657).
  * Added [Yongsheng Zhang](https://github.com/ZYSzys) to collaborators
    [#26730](#26730).

PR-URL: #26949
@gireeshpunathil gireeshpunathil mentioned this pull request Dec 3, 2019
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. c++ Issues and PRs that require attention from people who are familiar with C++. embedding Issues and PRs related to embedding Node.js in another project. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ability to shutdown Node in-flight
8 participants