Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process: add capture function for uncaught exceptions #17159

Closed

Conversation

addaleax
Copy link
Member

Introduce process.shouldAbortOnUncaughtException to control --abort-on-uncaught-exception behaviour, and implement some of the domains functionality on top of it.

Refs: #17143

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

process

@addaleax addaleax added domain Issues and PRs related to the domain subsystem. process Issues and PRs related to the process subsystem. semver-minor PRs that contain new features and should be released in the next minor version. labels Nov 20, 2017
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. labels Nov 20, 2017
@addaleax
Copy link
Member Author

apapirovski
apapirovski previously approved these changes Nov 20, 2017
Copy link
Member

@apapirovski apapirovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, as long as CI is happy 👍

If the `--abort-on-uncaught-exception` flag is not set, this value is ignored.

*Note*: If your code uses the deprecated [`domain`][] built-in module,
this flag will be set by that module on whenever domain state changes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's an extra 'on' here. Could just be '[...]module whenever[...]'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks for catching!

Copy link
Member

@AndreasMadsen AndreasMadsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh, I don't really like this. Basically, this allows users to overwrite the behavior of --abort-on-uncaught-exception. That could introduce some really hard to debug bugs.

I really like that the complexity is now moved to JavaScript, but we should at least discuss alternatives (or why there are none) for the public API.

edit: only using the ❌ , to indicate that this needs further discussion.

when the `--abort-on-uncaught-exception` flag was passed on the command line
or set through [`v8.setFlagsFromString()`][]:

If the flag was passed to Node and `process.shouldAbortOnUncaughtException`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the default value is true. But it is not clear from the documentation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@apapirovski
Copy link
Member

Basically, this allows users to overwrite the behavior of --abort-on-uncaught-exception.

It's already allowed via domains though... ?

lib/domain.js Outdated
class Domain extends EventEmitter {
function updateShouldAbort() {
process.shouldAbortOnUncaughtException =
stack.every((domain) => domain.listenerCount('error') === 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, realize this isn't necessarily perf sensitive code but we could just abstract this to a separate top scope function like const checkErrorListenerCount = (domain) => domain.listenerCount('error') === 0. Take it or leave it, this is a very, very minor nit.

(The main impact of a change like this would be on future contributors more than practical impacts on this specific code.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the perf to be the same, this function doesn’t actually close over any outside variables.

Copy link
Member

@apapirovski apapirovski Nov 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested a similar situation before and it's about 10% difference. I would think it's because the arrow function has to be created separately each time (I don't imagine this is something that V8 can optimize) but it's possible there's another reason that I overlooked.

But as I said, it's not perf sensitive so this was more about steering future contributors in the right direction as far as these type of things go.

Anyway, just ignore me... this took up more time than I intended it to. 😆

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Nov 20, 2017

It's already allowed via domains though... ?

Yes. But domain is guaranteed to be well behaved. What I fear is that process.shouldAbortOnUncaughtException is set to false when it should be false true. Thus creating a difficult-to-debug bug.

Also, as I understand it, this will still emit events to uncaughtException which is not what domain does. So it doesn't actually overwrite --abort-on-uncaught-exception because the error never reaches uncaughtException.

@addaleax
Copy link
Member Author

What I fear is that process.shouldAbortOnUncaughtException is set to false when it should be false.

Sorry, I think one false is supposed to be a true? Which one?

Also, as I understand it, this will still emit events to uncaughtException which is not what domain does.

Uh, I’m not sure which scenario that would be – if shouldAbortOnUncaughtException is true, then it flat out aborts. If it is false, then yes, uncaughtException is called.

So it doesn't actually overwrite --abort-on-uncaught-exception because the error never reaches uncaughtException.

It does when --abort-on-uncaught-exception is passed and shouldAbortOnUncaughtException is false. Was/is that unclear? (This is about the only thing I added a new test for 😄)

@apapirovski
Copy link
Member

@addaleax Could we add a test that confirms the following?

If the --abort-on-uncaught-exception flag is not set, this value is ignored.

@addaleax
Copy link
Member Author

@addaleax Could we add a test that confirms the following?

If the --abort-on-uncaught-exception flag is not set, this value is ignored.

@apapirovski The existing tests for uncaught exceptions should cover that, since the default value is true; right?

@apapirovski
Copy link
Member

apapirovski commented Nov 20, 2017

Yeah, I think I'm just being overcautious. Was trying to account for a situation where enough changes occur that other tests aren't representative anymore but I don't think that's actually possible.

That said, I feel like we should at least add a check to confirm the default value of process.shouldAbortOnUncaughtException both with the flag and without. That way we can catch any weird regressions. (Especially since there's technically no actual test for the getter.)

@@ -2,7 +2,7 @@
'use strict';
const common = require('../common');

// This should make the process not crash even though the flag was passed.
// This should make the process does not crash even though the flag was passed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the change might've turned this into a typo? I think the previous version was just referring to the next line in which case it was grammatically correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol yes i just noticed myself 😄

@addaleax
Copy link
Member Author

@apapirovski Okay, I’ve added some more explicit tests about those facts

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Nov 20, 2017

Sorry, I think one false is supposed to be a true? Which one?

Sorry. Fixed it. Yes, what you guessed. When process.shouldAbortOnUncaughtException is false but supposed to be true.

Uh, I’m not sure which scenario that would be – if shouldAbortOnUncaughtException is true, then it flat out aborts. If it is false, then yes, uncaughtException is called.

Exactly. When process.shouldAbortOnUncaughtException is false uncaughtException is called. However, in the domain case uncaughtException isn't called thus the current approach doesn't actually overwrite --abort-on-uncaught-exception.

It does when --abort-on-uncaught-exception is passed and shouldAbortOnUncaughtException is false. Was/is that unclear?

Yes. But not in the domain case.


I must express myself very poorly :/ I will try and extract the important parts of this conversation:

@AndreasMadsen:
Sigh, I don't really like this. Basically, this allows users to overwrite the behavior of --abort-on-uncaught-exception. That could introduce some really hard to debug bugs.

@apapirovski:
It's already allowed via domains though... ?

No, it isn't because domain prevents uncaughtException from being emitted. Thus it doesn't overwrite --abort-on-uncaught-exception.

I don't like this because --abort-on-uncaught-exception is an opt-in option, that we are now allowing the user to opt-out of after.

@apapirovski
Copy link
Member

@AndreasMadsen Makes sense. I don't have a particularly strong opinion on this. It seems like a satisfactory implementation change might be to use a Symbol instead?

@addaleax
Copy link
Member Author

No, it isn't because domain prevents uncaughtException from being emitted. Thus it doesn't overwrite --abort-on-uncaught-exception.

@AndreasMadsen I don't quite follow, tbh... when a domain error listener is there, that prevents uncaughtException from being emitted, yes. Is your point that domains give a third option other than aborting and uncaughtException? (If so: Why does the difference between domain.on('error') and process.on('uncaughtException') matter?).

I don't like this because --abort-on-uncaught-exception is an opt-in option, that we are now allowing the user to opt-out of after.

It might be helpful if you could explain why this is a bad thing in itself? This is a low-level primitive, and if you think it's a good idea we can certainly be more explicit about that in the docs.

Also: Coming to think of that, you can already kind of override it anyway, v8.setFlagsFromString('--no-abort-on-uncaught-exception'); does the trick, although it does so a bit more globally (for all isolates, if there are multiple ones).

It seems like a satisfactory implementation change might be to use a Symbol instead?

@apapirovski Use a Symbol for what? If we want to not expose this flag we can just not do that, but I would really prefer to keep working towards a domain impl that could be outsourced to userland...

@apapirovski
Copy link
Member

Use a Symbol for what? If we want to not expose this flag we can just not do that, but I would really prefer to keep working towards a domain impl that could be outsourced to userland...

I mean, we would still want an internal way to set shouldAbortOnUncaughtToggle from domain. Symbol would be the lowest effort change (search and replace), that is all. Trying to asses how far apart you & @AndreasMadsen are.

(I don't have a strong opinion on whether users should change this or not, so my questions/comments are not meant to lean one way or another.)

@addaleax
Copy link
Member Author

@apapirovski If we want this to be internal, we could always just expose it on require('internal/process') instead or so

@apapirovski
Copy link
Member

apapirovski commented Nov 20, 2017

@addaleax Yeah, I just wondered (aloud) if the lowest-effort change would be enough to bridge the gap or if there are other objections to this implementation. Not suggesting it's an optimal way to do it (or isn't).

And the main reason I'm chiming in in the first place is that a straight up red X was used here immediately which seems like an indicator of pretty strong opposition. Given that most PRs have 48 hours to land, it would be nice if conversations could be had without resorting to ❌

@@ -1920,10 +1944,12 @@ cases:
[`process.exitCode`]: #process_process_exitcode
[`process.kill()`]: #process_process_kill_pid_signal
[`promise.catch()`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/catch
[`process.on('uncaughtException')`]: process.html#process_event_uncaughtexception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit: out of ABC order)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh - it's o after c, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one before it is promise not process. Tripped me up the first few times I read it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

process.kill
process.on
promise.catch ?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yup. They look so similar …

is `true`, the process will abort when enountering an uncaught exception
(regardless of any [`process.on('uncaughtException')`][] listeners).

If the flag was passed to Node and `process.shouldAbortOnUncaughtException`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Node/Node.js

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!


If the `--abort-on-uncaught-exception` flag is not set, this value is ignored.

*Note*: If your code uses the deprecated [`domain`][] built-in module,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please void the use of your in the docs :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@@ -412,6 +412,23 @@

return caught;
};

// This is a typed array for faster communication with JS.
const shouldAbortOnUncaughtToggle = process._shouldAbortOnUncaughtToggle;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we think there might be more of these kinds of config style toggles, it may be worthwhile creating a single TypedArray that can hold multiple that it exposed via process.binding('config') rather than introducing one per instance. For now, this is fine, I just would like to avoid having a growing number of one off TypedArray instances.

v8.setFlagsFromString('--abort-on-uncaught-exception');
// This should make the process not crash even though the flag was passed.
process.shouldAbortOnUncaughtException = false;
process.once('uncaughtException', common.mustCall());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question would be whether this violates user expectations. It might be worthwhile emitting a process warning here so that, at the very least, the user can be aware of what is happening. It is entirely possible and likely that some userland module deep in the dependency tree could set this property and a user would have no idea why --abort-on-uncaught-exception is not working.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m -1 on a process warning; if there is a shouldAbortOnUncaughtException = false statement, that’s not an accident.

It is entirely possible and likely that some userland module deep in the dependency tree could set this property

This flag isn’t really relevant for non-diagnostics tooling, so I would be surprised if that were true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...so I would be surprised if that were true

I've been surprised many times by what I've found in userland ;-)

If we're not going to warn on this, then the documentation for the --abort-on-uncaught-exception flag should be updated to clearly indicate that the argument can be short circuited by this new process property. If nothing else, that gives users something to go look for before filing the bug report here that the flag isn't working ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sure. 👍

@amaramth
Copy link

god, this is a disaster. do y'all not have anyone doing quality review any more?

@addaleax
Copy link
Member Author

@alice0meta If you have feedback that addresses the substance of this PR, I’d really like to hear it – this was not an API that anybody ever really wanted, it’s a means to an end, and it’s young enough to be considered not set in stone.

But otherwise I’m not sure what exactly you’re trying to tell us with that comment? What exactly do you think went wrong here?

@amaramth
Copy link

nod. sorry, um. i found this api by looking at the table of contents in https://nodejs.org/api/process.html. but it's a hack whose purpose is to make it easier to use certain legacy systems (right?).

this slows down and/or confuses people going to the node api to find out how to handle uncaught exceptions. but this could be largely remedied if, say, it was moved to the end & automatically hidden, like
image
plus a clickable arrow to show, to indicate that this api is disrecommended

i don't have experience with this kind of layout, though, so i imagine the right way to show this to the user might be somewhat different than this idea. i just think it at least needs to be expressed in the documentation in a way that doesn't slow the user down - it's part of the intended understanding of the function, right?

vsemozhetbyt pushed a commit that referenced this pull request Jun 27, 2018
setUncaughtExceptionCapture -> setUncaughtExceptionCaptureCallback

process.setUncaughtExceptionCaptureCallback and its docs were originally
added in #17159

PR-URL: #21523
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Vse Mozhet Byt <vsemozhetbyt@gmail.com>
targos pushed a commit that referenced this pull request Jun 28, 2018
setUncaughtExceptionCapture -> setUncaughtExceptionCaptureCallback

process.setUncaughtExceptionCaptureCallback and its docs were originally
added in #17159

PR-URL: #21523
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Trivikram Kamat <trivikr.dev@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Vse Mozhet Byt <vsemozhetbyt@gmail.com>
@misterdjules
Copy link

I just stumbled on this PR while reading #17143. A few questions and comments came up to my mind while trying to understand the goal(s) of this PR and its impact on users:

  1. The original comment mentions what this PR does, but there is no description of why this PR is useful or necessary. It makes it difficult to determine what use cases are enabled, and what other use cases are impacted in any way. If I understand correctly, the reason why those changes were proposed was to remove as much of the domains implementation as possible from the rest of the exception handling code, so that eventually removing domains or swapping it out for another mechanism would be as non-disruptive as possible. Is that correct?

  2. It seems that this PR introduces a new mechanism that would allow any userland module to completely turn off the behavior that --abort-on-uncaught-exception provides. It seems the argument to dismiss that concern was that there was another mechanism provided by v8.setFlagsFromString that had already been available for a while and that also allowed disabling that behavior. From the perspective of users of post-mortem debugging, that is one more obstacle in trying to use a runtime that behaves consistently.

  3. This PR didn't notify the appropriate teams (in that case @nodejs/post-mortem, @nodejs/diagnostics and @nodejs/domains ) about these changes. As a result it is possible that a significant number of people who have significant knowledge about all impacted use cases were unable to provide feedback. I know I would have asked a lot of questions if I had been notified of this PR.

  4. Point 2) really seems like a red flag to me, and it seems it might not be necessary for this PR to achieve its goals. It seems that maybe this could be implemented as a hook that would not be exposed publicly? If that is not possible, do you mind clarifying why?

I haven't read every single comment in this PR, and as result I may be missing the nuances of what actual impact it has on users. But if point 2) is true, I would like to express my strong opposition to this change in its current form, and I would like to know whether we could explore an alternative that would achieve the same goals without having the same impact on the --abort-on-uncaught-exception use case.

Thoughts?

@mcollina
Copy link
Member

From a process perspective not tagging the relative groups is unusual, but we do not have official code owners and enough collaborators LGTM'ed this. So, this is ok (we might want to change our processes but that's a different topic). Note that 3 @nodejs/diagnostics members gave a looking good. cc @nodejs/tsc

The why of this PR is to ultimately be able to implement something like Domain outside of core. There are a significant number of people using require('domain') and it has been deprecated (but there) for ages. This is a required behavior for require('domain'). There is a general agreement that we should enable the ecosystem to innovate and evolve.

From the perspective of users of post-mortem debugging, that is one more obstacle in trying to use a runtime that behaves consistently.

I would say that this is extremely easy to disable, just add a preload script and override that function. Currently it's the only way to use promises and async/await safely (see https://github.com/mcollina/make-promises-safe for more details).

I would recommend to do a node -r make-postmortem-viable server that sets the defaults that you need.

@AndreasMadsen
Copy link
Member

AndreasMadsen commented Aug 31, 2018

@misterdjules This PR replaces complexity that already existed in domains. It thus doesn't change anything about the uncaught-exception mechanics. I agree that it is very often a bad idea to use this API. But those reasons are the same as why using domain is a bad idea.

In the end, this PR doesn't add or remove anything of consequence. I think we have just made it more transparent how domains work and why it is often a bad idea.

@misterdjules
Copy link

@mcollina

From a process perspective not tagging the relative groups is unusual, but we do not have official code owners and enough collaborators LGTM'ed this. So, this is ok (we might want to change our processes but that's a different topic).

I'm aware that this PR followed the documented process for contributions and changes. My goal with that comment was to give feedback that socializing such potentially impactful changes outside of the circle of usual Node.js core reviewers is important.

#17143 has also been discussing radical changes to the domains module (including its removal from core) without notifying teams, and instead individuals. In general I think there is room for improvement on the communication aspects of those impactful PRs.

Note that 3 @nodejs/diagnostics members gave a looking good.

Not all members of the diagnostics group (which has a lot more members than 3) share the same perspective or have the same use cases. Moreover, being part of a given group doesn't mean that they knew about that PR because they're part of that group, or that they were providing feedback with the interests of that group in mind.

The why of this PR is to ultimately be able to implement something like Domain outside of core.

Ah, that's good to know! I definitely understand the broader use case better now, thanks for clarifying 👍

I would say that this is extremely easy to disable, just add a preload script and override that function.

My overall impression is that resorting to using such mechanisms to disable public and documented core APIs to ensure consistency of the runtime's behavior is a sign of potential design issues and leads to worse developer experience. It's difficult to measure and quantify though.

The v8:: SetAbortOnUncaughtExceptionCallback API seems to provide the same ability as this new API to override the behavior of the runtime, but that API is a V8 API on the Isolate class, and is documented as "used by embedders".

Practically it's not much of a difference, because anyone could build a native module that uses that API (like e.g. node-report does), but one could argue that the way that this API is documented and exposed creates a higher barrier to entry and describes a clearer use case.

It seems that userland module authors could still be using that existing API to experiment with alternatives to domains. Is that not the case?

@mcollina
Copy link
Member

It seems that userland module authors could still be using that existing API to experiment with alternatives to domains. Is that not the case?

It is not, relying on V8 APIs is hard across Node.js versions and runtimes. Moreover, a significant chunk of the Node.js users prefer JS-only modules. Exposing a V8 API seems fair, considering how much is domain still being used in the ecosystem.

@misterdjules
Copy link

@AndreasMadsen

It thus doesn't change anything about the uncaught-exception mechanics.

What it changes is that it significantly lowers the barrier to entry for completely overriding the behavior of the runtime when an exception is thrown and not caught.

The impact of that is difficult to measure and quantify, but I think this could have a negative impact on post-mortem debugging use cases.

If that's the conscious trade-off the project is willing to make to allow for easier innovation in the effort to come up with an alternative to domains, that's fine. However I'm not sure that this newly added API is necessary for that, unless the existing v8:: SetAbortOnUncaughtExceptionCallback cannot be used to achieve the same purpose.

@misterdjules
Copy link

@mcollina

It is not, relying on V8 APIs is hard across Node.js versions and runtimes.

I agree with that if we consider the whole set of V8 APIs used across all modules, but that API has been there for a while and never changed (unless I'm missing something), and although it wouldn't be reasonable to presume it will never change, I wouldn't be surprised if it'd turn out to be as stable as any other userland API.

Moreover, a significant chunk of the Node.js users prefer JS-only modules

I would agree with that 👍

Exposing a V8 API seems fair

What V8 API are you referring to?

@mcollina
Copy link
Member

v8::SetAbortOnUncaughtExceptionCallback?

@misterdjules
Copy link

@mcollina But that v8::SetAbortOnUncaughtExceptionCallback API is already exposed, isn't it? Or are you referring to exposing it as a JS API, which seems like it'd be pretty much equivalent to what this PR does?

@addaleax
Copy link
Member Author

addaleax commented Aug 31, 2018

The why of this PR is to ultimately be able to implement something like Domain outside of core.

Maybe to add to this, since it sounds like you see a risk of domains in core eventually being removed: I’m with you here. The important feature, as far as I am concerned, is orthogonality & the ability to implement domains outside of core – it could become e.g. an npm module that we vendor in and to which users can migrate.

It thus doesn't change anything about the uncaught-exception mechanics.

What it changes is that it significantly lowers the barrier to entry for completely overriding the behavior of the runtime when an exception is thrown and not caught.

I think that’s a positive feature – even if we expect it and want it to be used infrequently. It enables use cases that would previously only have been accessible through semi-private API (e.g. overriding process._fatalException, which I’d guess happens more than this API itself is being used).

The impact of that is difficult to measure and quantify, but I think this could have a negative impact on post-mortem debugging use cases.

You’re right, and the documentation should be more explicit about the facts that process.on('uncaughtException') is the preferable approach here and that it interferes with post-mortem debugging.

@misterdjules
Copy link

@addaleax

Maybe to add to this, since it sounds like you see a risk of domains in core eventually being removed: I’m with you here. The important feature, as far as I am concerned, is orthogonality & the ability to implement domains outside of core – it could become e.g. an npm module that we vendor in and to which users can migrate.

I'm not necessarily seeing a risk of domains being removed from core. It is my understanding based on the status of #17143 that this won't happen before a functionally equivalent alternative is made available (but please correct me if that's an incorrect assumption).

I'm more concerned about the semantics of --abort-on-uncaught-exception and being able to guarantee (as much as possible) that those semantics are not broken.

I think that’s a positive feature – even if we expect it and want it to be used infrequently. It enables use cases that would previously only have been accessible through semi-private API (e.g. overriding process._fatalException, which I’d guess happens more than this API itself is being used).

If the main use cases are to expose a hook into process._fatalException it seems like it could be done separately from the changes made to expose the uncaught exception callback used to determine the behavior of --abort-on-uncaught-exception.

If the main use case is to allow userland implementations of the domain core module though, then it makes sense to expose both at the same time, and at this point it becomes a question of whether we believe that allowing userland implementations of domain-like modules is more desirable than guaranteeing the behavior of --abort-on-uncaught-exception.

My opinion is that I'd rather have a runtime that guarantees the behavior of --abort-on-uncaught-exception, and that experimentation on an alternative to the domain core module is possible in core without the hooks provided by this PR, but I understand that this is subjective and debatable.

The overarching thought behind my comments is that continuing to decouple further the error handling model from the core runtime seems like this could lead to making it very difficult to reason about what the actual error handling model is at any point in time depending of what code has executed. I think that this is an area where we should optimize for consistency instead of flexibility.

@misterdjules
Copy link

@addaleax

I'm experimenting with implementing a user-land domain equivalent, and it seems that because process.setUncaughtExceptionCaptureCallback can only register one callback, a Node.js program that would use a that user-land alternative and the core domain module at the same time would throw either the ERR_DOMAIN_CALLBACK_NOT_AVAILABLE or the ERR_DOMAIN_CANNOT_SET_UNCAUGHT_EXCEPTION_CAPTURE error, depending in which order which module was loaded.

Is that correct? If so, is that intentional and is the intention to have users either use the user-land or the core module consistently for all their dependencies?

If that's not intentional, does it seem feasible to you to make that setUncaughtExceptionCaptureCallback able to register multiple callbacks so that any "domain-like" module could work within the same process?

@addaleax
Copy link
Member Author

addaleax commented Oct 5, 2018

Is that correct?

@misterdjules Yes, I think so.

If that's not intentional, does it seem feasible to you to make that setUncaughtExceptionCaptureCallback able to register multiple callbacks so that any "domain-like" module could work within the same process?

I’m not sure, but it’s definitely something that could be changed if we want to. The main difference between the capture callback and process.on('exit') is that in the latter case, there is no way for a single listener to avoid subsequent listeners from being called; if we gave the capture callback a way to selectively allow that, I guess multiple callbacks should work?

@misterdjules
Copy link

@addaleax

Thanks for the helpful response, I really appreciate it!

The main difference between the capture callback and process.on('exit') is that in the latter case, there is no way for a single listener to avoid subsequent listeners from being called; if we gave the capture callback a way to selectively allow that, I guess multiple callbacks should work?

Do you mean that we shouldn't allow multiple callbacks to be called on uncaught exception? Is there a problem with allowing that?

@addaleax
Copy link
Member Author

addaleax commented Oct 5, 2018

@misterdjules I might be wrong, but that seems to be the most sensible semantics? The domain module uses this callback because, unlike an process.on('uncaughtException') listener, it doesn’t just listen to errors, it takes control of some of them and doesn’t want the listeners to see the exception in that case

@misterdjules
Copy link

@addaleax

If we consider the core domain module as being the only domain-like module loaded at any given time, I agree that those semantics make sense.

However if we're thinking of enabling user-land domain-like modules, they would have to co-exist with the core domain module in the same process, otherwise their adoption would be conditional to having applications that use them and all their dependencies not require domain anymore, which seems could be a big barrier to entry.

Is the rationale that this use case would be rare and that applications would either use only the core domain module or the same userland alternative?

@addaleax
Copy link
Member Author

addaleax commented Oct 5, 2018

@misterdjules I guess the question is more, like: If people use both the built-in domain module and an external one, is it always possible to tell which one should be the one handling an uncaught exception when it occurs?

@misterdjules
Copy link

@addaleax

If people use both the built-in domain module and an external one, is it always possible to tell which one should be the one handling an uncaught exception when it occurs?

Right, that's the kind of questions I'm wondering about. Currently I'm thinking: why not both? But I'm not sure I settled on an answer yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. domain Issues and PRs related to the domain subsystem. lib / src Issues and PRs related to general changes in the lib or src directory. process Issues and PRs related to the process subsystem. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.