[Discuss] Resolving the unhandled rejections issue #22822

jasnell · 2018-09-12T15:26:24Z

@nodejs/tsc @nodejs/collaborators (yes, pinging @nodejs/collaborators is intentional and yes, I know it causes lots of people to get notifications... that's what I want because I want broad input into this conversation)

We currently have two PRs open that deal with unhandled promise rejections.

Both do or assume several things:

Both introduce APIs that users may use to observe Promise state.
Both make assumptions (albeit very different assumptions) about what the proper handling of unhandled rejections by Node.js should be.

These PRs have been generally treated as mutually exclusive by the authors and the discussion about the introduced APIs has been generally held up by disagreements in (a) the fundamental handling of unhandled rejections and (b) whether or not Node.js should warn users about potentially problematic coding patterns. Unfortunately, what I see happening now is that issues that should be separate threads of discussion are being lumped all in together. So I'd like to try to separate these things out.

Warn or Not Warn

Let's take the whole discussion about Promises out of the picture for now. There is a general philosophical/values discussion to be had about whether or not it is Node.js' responsibility to warn users about potentially problematic coding patterns that may exist. It has been argued by a couple of folks that it is not Node.js' place to "judge" someones code. This has been expanded to mean that when the user runs some code that is not ideal but otherwise runs, Node.js should just stay silent on the matter.

An example of this is resolving a Promise twice... e.g.

new Promise((resolve, reject) => {
  resolve('A')
  resolve('B')
});

or 

new Promise((resolve, reject) => {
  reject(new Error('A'));
  resolve('B');
});

In both cases, the resolve('B') is extraneous and will be swallowed by default by V8. The code will run but the second resolve is effectively a non-op. In a handful of cases, this situation could be intentional or could be a legitimate coding bug. Take the following, for instance... a forgotten break statement leads to the swallowed resolve:

let foo = getFoo()
new Promise((resolve, reject) => {
  switch (foo) {
    case 'A': resolve('A')
    case 'B': resolve('B')
  }
});

When we look at this, we can likely reason that this is clearly a programmer error, but the code will run just fine.

Thanks to an internal callback API provided to us by V8, Node.js can detect when this secondary resolve or rejection occurs. The question is, should Node.js, by default:

Emit a process warning letting the user know that this situation occurred, or
Invoke an optional callback API that the user must explicitly opt in to using to be notified about this situation
Both

Note that when the process.emitWarning() API was added back in the Node.js 6.0, I took the approach of allowing both. By default, process warnings are emitted to stderr. Usercode can, however, listen to the process.on('warning') event and use the --no-warnings command-line flag to disable the default handling and provide their own handling of the process warning events. Because runtime deprecations now use the process warning mechanism, runtime deprecation warnings can be handled in the same way.

For warnings about synchronous file system I/O, however, we take a different approach. Synchronous I/O can be a significant performance bottleneck in Node.js and is generally considered to be an antipattern, but there are perfectly legitimate use cases for it. Therefore, we do nothing by default when sync I/O is used but we provide an opt in command line flag --trace-sync-io that users can switch on to help identify where sync io may be happening within their applications.

Given precedent, I don't think we can argue that there is any single rule of thumb to be applied here. Whether or not we warn by default can really only be determined on a case by case basis determined by how useful, or how problematic, a given coding pattern may be. So with that, we need to determine if swallowed resolves/rejections are potentially problematic enough to warrant a warning by default.

Crash or Not Crash

When a Promise is rejected without a catch handler in place, what do we do? This has been a long raging discussion for several years now without any clear resolution. Some feel we should crash immediately, some feel we should ignore and just let it happen. Neither are wrong, neither are entirely correct. I would submit that this is something that Node.js should allow users to configure with a command line argument, but then the argument turns to what should a reasonable default among several options be. That's what we need to figure out here. For my money, crashing on garbage collection is not ideal but it's likely the best reasonable default option we have at this time so long as we also have the option to crash immediately or not at all available for users to select.

One API to Rule Them All

The two PRS currently open propose two different APIs users may use to get notified about Promise state.

The "swallowed resolves" PR (#22218) takes the approach of building on the existing process.on('unhandledRejection') event and adds a new process.on('multipleResolves') event.

For example:

function handler(type, p, reason, message) {
  // do something here
}
 process.on('multipleResolves', handler);

 new Promise((resolve, reject) => {
  resolve('A');
  resolve('B');
  reject(new Error('C'));
});

The "promise events API" PR (#21857) takes a more diagnostics API approach that adds a low level API for monitoring promise state. The basic idea would be to replace the existing process.on('unhandledRejection') event and to allow users to implement whatever handling they want:

const makeHook = (type) => (promise, value) => {
  // do something here
};
util.createPromiseHook({
  resolve: makeHook('resolve'),
  rejectWithNoHandler: makeHook('rejectWithNoHandler'),
  handlerAddedAfterReject: makeHook('handlerAddedAfterReject'),
  rejectAfterResolved: makeHook('rejectAfterResolved'),
  resolveAfterResolved: makeHook('resolveAfterResolved'),
}).enable();

const x = (async () => {
  throw new Error('A');
})();

Each of the APIs have their merits, and each have their warts. In the discussion threads for each of the PRs, the authors have argued their cases for why theirs in the right approach and we need to come to a conclusion on which API we want to move forward with. Given the disagreement thus far and the inability to come to resolution, the decision is likely to fall to the @nodejs/tsc unless the authors of the two PRs can come together and work through some compromise approach (which is what I would like to see).

Personally, I'm good with either of the two PRs. None of the arguments in favor of either has swayed me for or against either of them.

When discussing this issue, please separate these three individual aspects so that any one of them does not get lost in the discussion.

The text was updated successfully, but these errors were encountered:

mmarchini · 2018-09-12T15:40:28Z

Also pinging @nodejs/diagnostics and @nodejs/post-mortem since there are interested parties there who are not core collaborators.

mcollina · 2018-09-12T16:05:07Z

Both PRs are fine code-wise.

I am in favour of warning and crashing. I think warning adding that warning is good. I've seen plenty of code that had very hard to spot bugs because of that. It is a very tricky situation. It also encourages bad practices, like on('data', resolve) (this is hitting the resolve function for every chunk, while only the first is returned).

Node.js is built under the assumption that crashing in case of an uncaught exception is the right thing to do:

Note that 'uncaughtException' is a crude mechanism for exception handling intended to be used only as a last resort. The event should not be used as an equivalent to On Error Resume Next. Unhandled exceptions inherently mean that an application is in an undefined state. Attempting to resume application code without properly recovering from the exception can cause additional unforeseen and unpredictable issues.

Not doing this for promises is a real world problem in a lot of companies adopting Node.js for writing server-side applications. Note that the following code could introduce a memory or file descriptor leak in case of errors:

const express = require('express')
const controller = require('./controller')
const app = express()

app.get('/', async (req, res) => res.send(await controller(req)))

app.listen(3000, () => console.log('Example app listening on port 3000!'))

This is no different than:

const express = require('express')
const controller = require('./controller')
const app = express()

app.get('/', (req, res) => {
  controller((err, data) => {
    if (data) {
      res.send(data)
    }
  })
})

app.listen(3000, () => console.log('Example app listening on port 3000!'))

(this is not an express problem BTW, but I'm using express to showcase a very common pattern).

Considering how Node.js is built, crashing on 'unhandledRejection' is the safest thing to do by default to avoid leaks.

misterdjules · 2018-09-12T16:54:07Z

I've done some research internally at Netflix around this, and wrote a report about it. That survey doesn't map 100% to the questions in this issue but some of the conclusions do:

As a result of doing that survey my position is still that Node.js should provide a default of exiting on unhandled rejections (regardless of whether they are garbage collected) that is opt-out.

Please make sure to read the disclaimer before looking at this report.

I found collecting that data and doing interviews internally to be very helpful in trying to understand what is acceptable and unacceptable for users. I would really like if we could do a similar survey at the scale of the broader Node.js users community.

I know that nodejs/user-feedback#77 was started to do that, but it seems to struggle to get traction. I think if we went through that survey it could help us to get a better picture of what our users care about.

If there's a way I can help with that I'd be happy to continue contributing to that effort. /cc @nodejs/user-feedback

devsnek · 2018-09-12T17:45:30Z

@mcollina warnings in console don't work because they can't be revoked. you also can't revoke exiting the process. the unhandledRejection event makes no sense without the rejectionHandled event, and it annoys me that people ignore half of the API.

@misterdjules your survey is cool, but it still only represents 32 engineers at the same company working on the same infrastructure. We will definitely need more data to start drawing conclusions.

misterdjules · 2018-09-12T18:15:36Z

@devsnek

your survey is cool, but it still only represents 32 engineers at the same company working on the same infrastructure. We will definitely need more data to start drawing conclusions.

I think that's what the disclaimer that I linked to is saying, was that not clear when you read it?

I also mentioned that:

I would really like if we could do a similar survey at the scale of the broader Node.js users community.

So I think we're on the same page 👍

mcollina · 2018-09-12T18:27:09Z

@mcollina warnings in console don't work because they can't be revoked. you also can't revoke exiting the process. the unhandledRejection event makes no sense without the rejectionHandled event, and it annoys me that people ignore half of the API.

Over the last 2 years, in almost every consulting engagement I have been part of I had to fix significant production issues related to promise handling. This include potential DoS attack vectors. IMHO it is not safe to operate a Node.js application without a) a rigorous coding environment or b) an 'unhandledRejection' event handler that crashes the process.

I think Node.js should strive to be safer for newcomers.

It would be extremely easy for a developer to always .catch():

const express = require('express')
const controller = require('./controller')
const app = express()

app.get('/', wrap(async (req, res) => res.send(await controller(req))))

function wrap (func) {
  return function (req, res, next) {
    func(req, res, next).catch(next)
  }
}

app.listen(3000, () => console.log('Example app listening on port 3000!'))

The root for this problem is more deep, and it goes back to our callback model. Let's consider this little code:

stream.on('data', function (chunk) {
  doSomething(chunk, function onSomething (err) {
    if (err) {
      // do something with err!
      return
    }
    anotherFunction()
  })
})

Note that in that code, every exception that is thrown by anotherFunction() would crash the process by default.

Now a developer wants to migrate that to async/await and promises:

stream.on('data', async function (chunk) {
  try {
    await doSomething(chunk)
  }  catch (err) {
    // do something with err
    return
  }

  anotherFunction()
})

Note that in that code, every exception that is thrown by anotherFunction() is very likely to not clean up some system resources properly.

The problem is that a significant number of developers would "add" an async in front of a function without considering that it changes significantly how that function behaves. They are just doing it because they want to have await. I have seen this happening all the time, even by experience developers.

devsnek · 2018-09-12T19:22:21Z

@mcollina

I think Node.js should strive to be safer for newcomers.

I agree, of course. I want to make sure whatever we expose doesn't confuse newcomers any more than they already are by being inconsistent or breaking valid patterns of programming with JavaScript.

The problem is that a significant number of developers would "add" an async in front of a function without considering that it changes significantly how that function behaves.

So maybe we should be looking into developer education instead of putting training wheels on features by default. If I want to use a valid model of programming in JavaScript I shouldn't have to override node to do it.

We could even release an npm module to describe as many opinionated behaviours as we wanted. Honestly that would probably make the most sense, given how much of this is based around opinions around certain programming patterns. My or @BridgeAR's PRs would enable some pretty cool stuff to happen in userland, and I hope it would also fulfill your requirements, given how often you mention that things that can be done in userland should be done in userland.

Possible API:

require('@nodejs/promise-thing')({
  track: 'unhandled', // vs 'unhandled-on-gn'
  crash: true, // could emit an event or just log otherwise
});

P.S. only sith deal in absolutes 😉

Qard · 2018-09-12T19:44:20Z

I'm mostly in the fail-fast camp, thinking we should just crash when an unhandled exception occurs and provide a no-really-its-okay escape hatch for when you are really sure you know what you are doing. Approaching it the other way around just makes Node.js more and more a platform on which you are likely to shoot yourself in the foot. 🙀

ronkorving · 2018-09-13T01:48:11Z

On warnings:

Don’t make users opt into a hundred different events. With every new "bad pattern" that gets identified, we’re going to get a new event to listen to, and we all have to collectively update a million repositories to deal with them. Please, no. The single "warning" event is fine though, since you can opt-in to that, log the warning, and either ignore it or shutdown. Yes, the event is less specific, but people really don't want to write 20 lines of boilerplate to deal with a ton of different granular events, that just get bigger and bigger with every Node release.
ESLint is a thing, a very mature thing actually. You can detect use of Sync() functions with it for example, and I don’t see why we can’t detect double-resolve or resolve-after-reject patterns. People who care about proper, working, code, tend to use tools like ESLint already anyway. Why do this with runtime hooks when the whole thing can be avoided? So I say: don’t bother at runtime with "helpful" warnings like this, we have excellent tools (that can be improved if they lack a feature).

On crashing:

Why treat unhandled rejections any different from uncaught exceptions? I never understood this. Consistency would be progress. Allow a handler, but if there is none, crash.

devsnek · 2018-09-13T01:57:29Z

ESLint

I couldn't agree more. +100000000

Why treat unhandled rejections any different from uncaught exceptions?

Promises are designed such that they don't halt execution when they reject. A camp of users (myself included) feels that by node shouldn't be infringing on JavaScript design goals by default.

I of course agree that we should have good debugging utilities, but I don't think they need to be killing processes and cluttering my console by default. Tooling like ndb and such can provide all the stuff we want better than core can do by itself, so I don't really understand why we wouldn't want to focus on making things easier for tooling, instead of trying to provide strange defaults.

jasnell · 2018-09-13T03:11:41Z

With regards to eslint, unfortunately it doesn't catch every case. For instance, it would not reliably be able to catch Matteo's `on('data', resolve)` example unless it's able to determine reliably that the thing you're using is an event emitter and the specific event being handled can be emitted more than once. Linting is great, but it cannot reliably catch all of the problematic cases.

…

On Wed, Sep 12, 2018, 20:48 Ron Korving ***@***.***> wrote: On *warnings*: - Don’t make users opt into a hundred different events. With every new "bad pattern" that gets identified, we’re going to get a new event to listen to, and we all have to collectively update a million repositories to deal with them. Please, no. The single "warning" event is fine though, since you can opt-in to that, log the warning, and either ignore it or shutdown. Yes, the event is less specific, but people really don't want to write 20 lines of boilerplate to deal with a ton of different granular events, that just get bigger and bigger with every Node release. - ESLint is a thing, a very mature thing actually. You can detect use of Sync() functions with it for example, and I don’t see why we can’t detect double-resolve or resolve-after-reject patterns. People who care about proper, working, code, tend to use tools like ESLint already anyway. Why do this with runtime hooks when the whole thing can be avoided? So I say: don’t bother at runtime with "helpful" warnings like this, we have excellent tools. On *crashing*: - Why treat unhandled rejections any different from uncaught exceptions? I never understood this. Consistency would be progress. Allow a handler, but if there is none, crash. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#22822 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAa2eZ87Hg4wGIb1azLQoPkBxFjdZA4Zks5uablhgaJpZM4WlodJ> .

ronkorving · 2018-09-13T07:00:18Z

@jasnell Excellent point. I guess then the question becomes: is the tooling good enough? IMHO, it's good enough to avoid having to make the case for a new dedicated event. I'll end my argument there :)

mcollina · 2018-09-14T08:47:58Z

Promises are designed such that they don't halt execution when they reject. A camp of users (myself included) feels that by node shouldn't be infringing on JavaScript design goals by default.

@devsnek I'm talking about problems that our users are having today. I've seen this happening so many times in the trenches. It is not about having training wheels, it is about developer productivity and making things simpler to build. I want people to not waste days of work because they could not find the switch to make the most popular frameworks and modules to work well in production.
I think there are standards and specs, and there is software that is built with them. I value the individual developer more than standards and specs.

If I want to use a valid model of programming in JavaScript I shouldn't have to override node to do it.

I think this is a bold sentence. How come Node.js does not provide a valid model of programming JavaScript? It is used by millions of developers all around the world.
The majority of our current API do not work well in a world of "on error resume", does this makes Node.js not a valid model of programming in JavaScript?

devsnek · 2018-09-14T13:01:19Z

Node.js should be helpful and unopinionated by default. Save opinions for command line options.

rubys · 2018-09-14T15:25:26Z

Probably worth a separate issue, but I'm of the opinion that command line options aren't sufficient, particularly for options that affect whether or not the application runs correctly.

Rails has put a lot of thought in this, something I think we could benefit from. A rails generate command creates a directory with config files specifying configuration options (complete with comments!), and even has separate configuration options for different environments (development, test, production).

When there is a behavior change release to release, generally a new configuration option is created, allowing application developers to opt-in to the new behavior early if they desire, or to retain (temporarily or permanently) the old behavior after they upgrade.

The thought is that if we make configuration easy an discoverable and environment dependent, there will be less discussion over what the right defaults are. In this specific case, I feel a strong case can be made for the default behavior for unhandled rejections in a test environment to be something that alerts the user (and perhaps not just with a warning, but with an exception).

Node not being as monolithic and "batteries included" as Rails will make this a bit more difficult to pull off, but with NODE_ENV already having been popularized by express, if we document how configurations can be defined, perhaps generators such as Yeoman will follow.

devsnek · 2018-09-14T15:46:34Z

@rubys i like the direction but I think debugging and linting tools can do all that better than we can, with crazier tracing and tracking etc. our job is just to enable the tooling to do all that.

there are plenty of antipatterns in this amazing language and I don't see any good reason to special case this one. people can use tooling just like they do for everything else. node should not pass judgement on your code. we don't have built in leak warnings or double equals warnings or hasOwnProperty warnings, but people can build tools that handle all that.

boneskull · 2018-09-14T16:02:08Z

@mcollina @jasnell

While ESLint doesn't enable this type of checking (stream.on('data', resolve)) out-of-the-box, an arguably effective rule could be created.

ESLint might not know that stream is a ReadableStream, but it doesn't necessarily have to. It can warn or error on <someObject>.on('data', resolve) and the user can tell ESLint to make an exception via an inline directive--e.g., if someObject is not a ReadableStream at all--as appropriate.

Likewise, a similar rule could be created in the stream.on('data', async () => { /* .. */}) case.

Would this not be Good Enough Most of the Time?

benjamingr · 2018-09-16T21:11:26Z

Hey, just saying I am away on holiday until the end of the month so with very limited availability.

I'd like to echo what @mcollina @misterdjules and @Qard and @ronkorving has said. The fact we don't crash on unhandled rejection is something people find really surprising over and over again.

If we can get away with crashing on nextTick that's fine, otherwise crashing on GC is also great. Ideally we'd crash even sooner than nextTick.

We have also already been emitting a warning saying we're going to crash on nextTick for ~2 years. It is very easy to opt out of and will always be so.

As said in the PRs I am fine with both PRs.

devsnek · 2018-09-16T21:18:29Z

@benjamingr i'd imagine its also very easy to opt into opinionated behaviour, instead of out. Like i said above, we don't coerce any other behaviours in js that might be considered anti-patterns.

benjamingr · 2018-09-16T21:38:26Z

@benjamingr i'd imagine its also very easy to opt into opinionated behaviour, instead of out. Like i said above, we don't coerce any other behaviours in js that might be considered anti-patterns.

I'm pretty sure we do - Node exits on uncaught synchronous exceptions.

Also, we have special behaviour for errors in event emitters.

devsnek · 2018-09-16T21:55:58Z

Node exits on uncaught synchronous exceptions.

that's not node specifically... that's just how js works. if you throw an error and nothing catches it, all execution will stop. if there's no execution happening, node has nothing else to do and quits. if you want to change this behaviour, we provide hooks to restart execution, but nothing out of the ordinary happens by default.

Also, we have special behaviour for errors in event emitters.

i don't understand how this is related.

benjamingr · 2018-09-16T22:00:38Z

that's not node specifically... that's just how js works. if you throw an error and nothing catches it, all execution will stop.

This is not what other execution environments of JavaScript (browsers, for example) do. Nor is it what JavaScript does in some other environments typically (like microcontrollers). Nor is it typically what happens when embedding a JavaScript runtime as a scripting engine on top of another program.

It is an opinion Node has on how uncaught errors should be handled.

ljharb · 2018-09-16T22:19:31Z

It’s certainly allowed in the spec to do this with uncaught exceptions - but promise rejections are not uncaught exceptions, altho the latter may cause the former.

If we added to the spec a line that explicitly prohibited using Promise rejection hooks to exit, which would more clearly match the intent of these hooks (ie, solely for logging), would that resolve the issue? (to clarify, this is a hypothetical: for a number of reasons, I don’t think such a note would ever land in the spec, even if it matches our intentions)

benjamingr · 2018-09-16T22:36:03Z

If we added to the spec a line that explicitly prohibited using Promise rejection hooks to exit, which would more clearly match the intent of these hooks (ie, solely for logging), would that resolve the issue?

I would be very hesitant with TC39 dictating rather than specifying this behaviour - especially since consensus in Node.js does not tend towards the different behaviour between uncaught errors and uncaught errors in async functions at the moment.

mcollina · 2018-09-17T07:13:08Z

While ESLint doesn't enable this type of checking (stream.on('data', resolve)) out-of-the-box, an arguably effective rule could be created.

ESLint might not know that stream is a ReadableStream, but it doesn't necessarily have to. It can warn or error on <someObject>.on('data', resolve) and the user can tell ESLint to make an exception via an inline directive--e.g., if someObject is not a ReadableStream at all--as appropriate.

Likewise, a similar rule could be created in the stream.on('data', async () => { /* .. */}) case.

Would this not be Good Enough Most of the Time?

@boneskull no, for mainly two reasons. Back at Node.js Interactive 2015 in Austin, I suggested to use an eslint rule to solve the unsafe Buffer constructor problem, however it did prove highly ineffective in avoid the growth in usage of the unsafe Buffer constructors.

The same does not only applies to all event emitters, but also to all errorbacks. Here is another popular one:

myFunction(obj, async function (err, data) {
  // ...
})

Based on those two experiences, I do not think an eslint rule would be sufficient.

I came to the conclusion that it is not possible to mix-and-match errrobacks and EventEmitter and promises without some form of exit on unhandled rejection.

@rubys

The thought is that if we make configuration easy an discoverable and environment dependent, there will be less discussion over what the right defaults are. In this specific case, I feel a strong case can be made for the default behavior for unhandled rejections in a test environment to be something that alerts the user (and perhaps not just with a warning, but with an exception).

Typically developers do not test their error paths well and they rely on their frameworks/tools to do the heavy lifting. If they did, they would experience this problem in the form of a test that does not complete. In some form, testing already covers it.

@devsnek

that's not node specifically... that's just how js works. if you throw an error and nothing catches it, all execution will stop. if there's no execution happening, node has nothing else to do and quits.

This is not correct. Node.js does not quit if there is a server listening, or any other low-level resources being held (sockets, paused file streams, etc..). Node.js keeps going even if there is nothing to do, because that is the nature of servers. In case of an error, to prevent memory leaks and hard-to-debug situations, the process should exit. This pattern has proven extremely successful so far.

@ljharb

If we added to the spec a line that explicitly prohibited using Promise rejection hooks to exit, which would more clearly match the intent of these hooks (ie, solely for logging), would that resolve the issue? (to clarify, this is a hypothetical: for a number of reasons, I don’t think such a note would ever land in the spec, even if it matches our intentions)

No, I do not think so. The only net result would be create a situation where developers, in order to operate Node.js safely, would need to be violating the spec. In other form, it would put Node.js outside of the spec.

ljharb · 2018-09-17T07:18:51Z

I’m arguing that exiting on unhandled rejections puts node outside the spec now, even if it’s not explicit, and i don’t agree that “safely” requires it.

mcollina · 2018-09-17T07:48:51Z

I’m arguing that exiting on unhandled rejections puts node outside the spec now

In the last few years, in all production applications I worked on, I have always seen such a handler. I think you are arguing that all those applications are not JavaScript programs, which is very interesting. Most developers do not even know it is there, as it was added once in a startup script and never touched again. This practice is considered "good house keeping".

Side note, the "log and exit" approach is typical pattern for server-side production systems.

It might well be that the spec was designed not considering the usage of JavaScript in servers and embedded devices, but just browsers. The type of user-interaction is completely different.
If a behavior that is non-compliant to the spec is needed to operate JavaScript programs, isn't it a spec issue?

i don’t agree that “safely” requires it.

I have made several code examples of dangerous situations, and how much discipline it is required to avoid such problems. So far, the risk that those situations cause has not be challenged, and the two proposed solution for these issues are: a) education b) linting. Can you please articulate?

ljharb · 2018-09-17T07:54:26Z

We don’t use such a handler at airbnb, and we don’t run into these problems. Perhaps education and linting are addressing it for us, perhaps it’s luck, but i remain confused how this is such a common problem for folks (the lack of exiting; not unhandled rejections specifically).

mcdonnelldean · 2018-09-17T08:20:20Z

I'd just like to second Matteo's argument. Any software I have delivered (my role is a delivery architect) in the context of node server systems, we have had a handler to log and die. In my whole career across many languages, in a production context, this has been a pattern that was necessary for realistic post mortem and clean up

Generally die and stack-trace is good but not nearly enough data for a production system in many cases. There are contextual things that need to be done in some cases. For example we may,

Pop a message on a queue to have a cleaner sub system analysis any data issues as a result of the process dying. This is my primary use case. When something dies, in a lot of cases I have been involved with, sensitive systems are left in an undesirable state, something needs to know when this happens to help clean up.
Log more detailed meta around who the user was, what their perms were at the time, what action they were trying to perform (the stack is the immediate problem, not always the overarching one)
...Among others I would prefer not to talk about for sensitivity reasons.

It seems to me that there is some naivety in the statement that because x doesn't do it y doesn't need it. I've worked on systems as big as Air B&B that didn't use this pattern and smaller mission critical systems that would not be considered viable if they didn't 'log and die'.

I feel saying something is "outside of spec" without interrogating the use cases production systems are using it in is a dangerous way to dictate features.

At the very least I would assume the technical leadership would survey those in the community building production grade systems to understand the impact.

I am happy to give a more detailed response if needed but I would need to sanitise my words and examples more for confidentiality reasons.

mcdonnelldean · 2018-09-17T08:32:16Z

@ljharb Could it be that because airbnb own their whole system, as such, they can mitigate at a platform level?

My experience tends to be with highly integrated with external (and generally old and crusty) systems that don't produce anything overly useful in terms of logs or rollback-ability.

I've found it's not uncommon that the external system doesn't correctly clean itself up after the internal system died. We don't have access to modify the external system (in many cases we don't even own it).

Come to think about it, as I think of more examples, they are mostly times where we cannot control all pieces of the system .

benjamingr · 2018-09-17T08:46:08Z

@ljharb would it be possible to schedule a quick call (when I'm back from trekking after the 28th) where me (and other interested parties) can learn about what airbnb does differently from our own experience to better understand your perspective?

rvagg · 2018-09-17T12:33:13Z

Since this is likely to be a long-running argument, perhaps we should treat it as baby-steps? Introduce more obtrusive behaviour later if we can reach agreement.

Right now I think the priority should be to just get an avenue for insight and the ability to suggest to users a "best practice" setup that might use that insight to drive intrusive behaviour.

I echo the nearForm experience (noted by both @jasnell and @mcollina) and say that NodeSource has basically the same. Average devs on complex code bases get themselves into a lot of trouble with Promises and we don't currently have good tools to point them in better directions. The best we can do is educate but that doesn't scale very well.

So, with that in mind. I like @devsnek's approach in #21857 but I hate the API cause it seems to unidiomatic for Node.

util.createPromiseHook({
  resolve: makeHook('resolve'),
  rejectWithNoHandler: makeHook('rejectWithNoHandler'),
  handlerAddedAfterReject: makeHook('handlerAddedAfterReject'),
  rejectAfterResolved: makeHook('rejectAfterResolved'),
  resolveAfterResolved: makeHook('resolveAfterResolved'),
}).enable();

We have events for this kind of thing. What's wrong with just more process.on('resolveAfterResolved') style events? Events are "hooks", why do we need a new thing to grok? I'd love to be able to write best-practices documentation for users that suggest which events to listen to and what behaviour to introduce. Maybe even a best-practices.js that they could throw in their code that does a fatal on some actions, warnings on others, etc.

I understand that frustration with Node stomping around in Promise-purity land and can see this leading to years of discussions! So for now, how about low-touch, opt-in and we can use that as a way to experiment with "sensible defaults"?

Flarna · 2018-09-17T12:51:39Z

Maybe it's a little bit of topic but on point regarding events: I would prefer that registration of event handlers doesn't change the actual behavior. e.g. if you register for uncaughtException to just add more logging,... you have to take care to exit the process even there is no API to tell node to do this in the same way as if there is no listener installed.
Similar is valid for the various signal handlers which have a default action.

Besides that it should be allowed to install several listeners to avoid that frameworks conflict with debugging/Logging/deployment tools by overriding their listeners.

benjamingr · 2018-09-17T18:58:17Z

@rvagg

Thank you for weighing in. Regarding:

I understand that frustration with Node stomping around in Promise-purity land and can see this leading to years of discussions! So for now, how about low-touch, opt-in and we can use that as a way to experiment with "sensible defaults"?

As someone who was involved with promises back before they were part of the language here is a quick recap of our timeline:

We started bugging Node.js and io.js about unhandled rejections around 0.12 / first io.js release.
Petka added the unhandled rejection hook (to process, as I specified) to io.js which landed in io.js 1.3 - we also PRd the hook into all popular promise libraries since native promises were not popular then.
Me and James landed a PR which adds a warning to the 7.7 so Node.js 8+ all warn that unhandled rejections will crash the process.

I am personally scared about "crash after microtick" semantics - although a lot less now after hearing that my experience about crashing there is echoed by both NodeSource and nearForm. We've had an opt in way (with both you, me and Matteo have been using for a while now).

So, with that in mind. I like @devsnek's approach in #21857 but I hate the API cause it seems to unidiomatic for Node.

Isn't that precisely @BridgeAR's PR (the alternative) then?

misterdjules · 2018-09-17T23:04:05Z

@benjamingr

If we can get away with crashing on nextTick that's fine, otherwise crashing on GC is also great. Ideally we'd crash even sooner than nextTick.

Did we get consensus on "exit on GC"? I thought that wasn't the case.

devsnek · 2018-09-17T23:10:04Z

until GC actually solves the "promises rejected without being detected" problem (and it probably never will), there will not be any consensus on GC.

Fishrock123 · 2018-09-21T15:24:22Z

You can solve that problem by just logging out remaining unhandled promises on application exit.

benjamingr · 2018-09-25T19:26:04Z

Did we get consensus on "exit on GC"? I thought that wasn't the case.

We did at some point (at the summit) but we absolutely don't have to do that - again, I'm also absolutely fine with crashing on nextTick if that's acceptable. I think we should just present all options to the TSC and bring it to a vote at this point.

misterdjules · 2018-09-25T20:26:03Z

@benjamingr

Did we get consensus on "exit on GC"? I thought that wasn't the case.

We did at some point (at the summit) but we absolutely don't have to do that

OK, I had the impression that the public discussion at #20097 showed that there wasn't consensus on "exit on GC".

Fishrock123 · 2018-09-28T15:11:59Z

You can solve that problem by just logging out remaining unhandled promises on application exit.

@devsnek Please elaborate how this is 👎, it does most certainly fix that problem.

devsnek · 2018-09-28T15:30:43Z

@Fishrock123 because GC doesn't run on exit. We would either need to make new APIs to force the GC to run and then wait for that at every exit, or keep an additional list of rejected promises that was somehow weak. both options seems like a degraded experience and I can't imagine saying "yes this is a good way to debug promises" while being potentially spammed with random console warnings when my application exits.

devsnek · 2018-09-28T15:36:33Z

i would really like if we could try to think of other ideas besides GC. its an annoying box that leads to weird solutions that don't really make things better for users.

one quick idea: add a new section to ndb which lets you visually track promise rejection.

another idea: we already expose these events, so let userland go crazy, we can just focus on giving userland the stuff they need to go crazy.

another one: --trace-rejection checks at the end of a tick if a promise has rejected without any handlers and crashes.

i'm sure we can come up with nice stuff.

benjamingr · 2018-09-28T16:24:45Z

i would really like if we could try to think of other ideas besides GC. its an annoying box that leads to weird solutions that don't really make things better for users.

Would you please refrain from such statements? They are needlessly decisive and focus on what you don't want to happen ("GC on exit") rather than technical arguments for/against solutions.

We are certainly not "in a box" and all ideas and suggestions are welcome.

one quick idea: add a new section to ndb which lets you visually track promise rejection.

This was tried at some point but did not turn out to be useful enough (cc @paulirish - can you or someone else do a postmortem?)

another idea: we already expose these events, so let userland go crazy, we can just focus on giving userland the stuff they need to go crazy.

The discussion here is about the default behaviour if I understand correctly.

i'm sure we can come up with nice stuff.

👍 ❤️

benjamingr · 2018-09-28T16:27:05Z

until GC actually solves the "promises rejected without being detected" problem (and it probably never will), there will not be any consensus on GC.

We can do "warn on unhandled rejections on next tick and exit on GC" which is what we talked about in the summit - the problem is that some people are against exiting by default (like Jordan) and some are against waiting for GC (like Julien).

I'll be back from holiday in a week and would love to bring this up to the TSC and work on the survey - I was away for a month and wasn't in a great place personally before (family illness) the two months prior but I want to move forward on this at the second half of October.

Edit: to clarify that is not a problem with Jordan or Julien - the problem is that we don't have consensus on the end game - the discussion itself is constructive and I appreciate it.

misterdjules · 2018-09-28T16:48:12Z

@benjamingr

and some are against waiting for GC (like Julien).

Just to clarify, my concern with exit on GC is not the fact that there is a need to "wait", but the fact that it leads to nondeterministic exit codes/status.

I hope things are getting better @benjamingr, take care! ❤️

jasnell · 2018-10-25T21:02:14Z

This was resolved

misterdjules · 2018-11-07T18:31:25Z

@jasnell Do you mind elaborating on how this was resolved? I think it'd be helpful to mention it here for people who are not familiar with recent discussions on that topic.

My understanding is that the current default behavior will not change, and that additional opt-in behaviors (crash on next tick and silent mode) will be made available via command line options (from #20097). Is that accurate?

This is controversial, but I think our use case, and space requires erring on the side of caution. [See this discusion](nodejs/node#22822) for more perspective.

jasnell added discuss Issues opened for discussions and feedbacks. promises Issues and PRs related to ECMAScript promises. labels Sep 12, 2018

Trott mentioned this issue Sep 13, 2018

opt-out or opt-in for warnings etc. #22587

Closed

jasnell closed this as completed Oct 25, 2018

[Discuss] Resolving the unhandled rejections issue #22822

[Discuss] Resolving the unhandled rejections issue #22822

Comments

jasnell commented Sep 12, 2018

Warn or Not Warn

Crash or Not Crash

One API to Rule Them All

mmarchini commented Sep 12, 2018 • edited Loading

mcollina commented Sep 12, 2018

misterdjules commented Sep 12, 2018

devsnek commented Sep 12, 2018

misterdjules commented Sep 12, 2018

mcollina commented Sep 12, 2018

devsnek commented Sep 12, 2018

Qard commented Sep 12, 2018

ronkorving commented Sep 13, 2018 • edited Loading

devsnek commented Sep 13, 2018 • edited Loading

jasnell commented Sep 13, 2018 via email

ronkorving commented Sep 13, 2018

mcollina commented Sep 14, 2018

devsnek commented Sep 14, 2018

rubys commented Sep 14, 2018

devsnek commented Sep 14, 2018

boneskull commented Sep 14, 2018

benjamingr commented Sep 16, 2018 • edited Loading

devsnek commented Sep 16, 2018 • edited Loading

benjamingr commented Sep 16, 2018 • edited Loading

devsnek commented Sep 16, 2018 • edited Loading

benjamingr commented Sep 16, 2018 • edited Loading

ljharb commented Sep 16, 2018 • edited Loading

benjamingr commented Sep 16, 2018

mcollina commented Sep 17, 2018 • edited Loading

ljharb commented Sep 17, 2018

mcollina commented Sep 17, 2018 • edited Loading

ljharb commented Sep 17, 2018

mcdonnelldean commented Sep 17, 2018

mcdonnelldean commented Sep 17, 2018

benjamingr commented Sep 17, 2018

rvagg commented Sep 17, 2018

Flarna commented Sep 17, 2018

benjamingr commented Sep 17, 2018

misterdjules commented Sep 17, 2018

devsnek commented Sep 17, 2018 • edited Loading

Fishrock123 commented Sep 21, 2018

benjamingr commented Sep 25, 2018

misterdjules commented Sep 25, 2018

Fishrock123 commented Sep 28, 2018

devsnek commented Sep 28, 2018 • edited Loading

devsnek commented Sep 28, 2018 • edited Loading

benjamingr commented Sep 28, 2018

benjamingr commented Sep 28, 2018 • edited Loading

misterdjules commented Sep 28, 2018

jasnell commented Oct 25, 2018

misterdjules commented Nov 7, 2018

mmarchini commented Sep 12, 2018 •

edited

Loading

ronkorving commented Sep 13, 2018 •

edited

Loading

devsnek commented Sep 13, 2018 •

edited

Loading

benjamingr commented Sep 16, 2018 •

edited

Loading

devsnek commented Sep 16, 2018 •

edited

Loading

benjamingr commented Sep 16, 2018 •

edited

Loading

devsnek commented Sep 16, 2018 •

edited

Loading

benjamingr commented Sep 16, 2018 •

edited

Loading

ljharb commented Sep 16, 2018 •

edited

Loading

mcollina commented Sep 17, 2018 •

edited

Loading

mcollina commented Sep 17, 2018 •

edited

Loading

devsnek commented Sep 17, 2018 •

edited

Loading

devsnek commented Sep 28, 2018 •

edited

Loading

devsnek commented Sep 28, 2018 •

edited

Loading

benjamingr commented Sep 28, 2018 •

edited

Loading