Flaky test-async-wrap-uncaughtexception #16210
cc @nodejs/async_hooks @nodejs/platform-windows |
This is happening pretty much constantly as of about 12 hours ago. Yikes! I don't see anything obvious that has landed that would cause this. @nodejs/testing @nodejs/build |
I've had this issue in #15538 for a long time (6 days), where it also appears to be unrelated to my changes. Edit: fixed wrong link. |
@joaocgreis Taking a guess based on that: This might be because test.py sets up stdio differently than a plain run of Node from the command line, so the stdio objects in Node use different implementations – do you think you could try to verify/disprove that? |
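One way to look at this from inside Node is to classify what each stdio file descriptor is attached to. The following is only a minimal sketch: `describeFd` is a hypothetical helper name, not part of Node or of `test.py`, and `fs.fstatSync` on a pipe may behave differently across platforms and Node versions, which is why failures are reported as `'unknown'`.

```javascript
'use strict';
const fs = require('fs');
const tty = require('tty');

// Hypothetical helper: report what kind of object a file descriptor refers to.
function describeFd(fd) {
  if (tty.isatty(fd)) return 'tty';
  try {
    const stats = fs.fstatSync(fd);
    if (stats.isFile()) return 'file';
    if (stats.isFIFO()) return 'pipe';
    if (stats.isSocket()) return 'socket';
  } catch (err) {
    // Assumption: fstat can fail for some handle types on some platforms.
  }
  return 'unknown';
}

// Print the kind of stdout (fd 1) and stderr (fd 2) for the current process.
for (const [name, fd] of [['stdout', 1], ['stderr', 2]]) {
  console.log(`${name}: ${describeFd(fd)}`);
}
```

Running the script once from a terminal and once with both streams redirected should show whether the two setups give Node different stdio types.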
@addaleax verified. Redirecting both stdout and stderr to a file makes this happen in cmd. Thanks!
How to check stderr type for future reference:
Release\node.exe -e "console.log(util.inspect(process.stderr))"
Release\node.exe -e "console.log(util.inspect(process.stderr))" 2>&1 | cat
Release\node.exe -e "console.log(util.inspect(process.stderr))" > file 2>&1 & type file
Some more notes:
So, the error is that node/test/parallel/test-async-wrap-uncaughtexception.js Lines 14 to 19 in 006fdb2
Can anything there be scheduling more async work? |
I've changed the workspace directory of |
I think this is not specific to Windows: https://ci.nodejs.org/job/node-test-commit-linux/13653/nodes=centos7-64/console |
Also I vaguely remember that I have seen this on my macbook, but it does not reproduce much. |
Happened twice in a row on Windows 10 in CI: https://ci.nodejs.org/job/node-test-binary-windows/12364/COMPILED_BY=vcbt2015,RUNNER=win10,RUN_SUBSET=3/console
not ok 36 parallel/test-async-wrap-uncaughtexception
---
duration_ms: 0.188
severity: fail
stack: |-
Mismatched <anonymous> function calls. Expected exactly 1, actual 2.
at Object.exports.mustCall (c:\workspace\node-test-binary-windows\test\common\index.js:501:10)
at Object.<anonymous> (c:\workspace\node-test-binary-windows\test\parallel\test-async-wrap-uncaughtexception.js:14:33)
at Module._compile (module.js:617:30)
at Object.Module._extensions..js (module.js:628:10)
at Module.load (module.js:536:32)
at tryModuleLoad (module.js:479:12)
at Function.Module._load (module.js:471:3)
at Function.Module.runMain (module.js:658:10)
at startup (bootstrap_node.js:191:16) |
If we're pretty sure the being-emitted-twice thing is not actually a bug but dependent on the state of the machine, easiest fix might be to use |
It's possible for `beforeExit` to be emitted more than once if work is scheduled by its listeners. Accommodate this fact in the test. Fixes: nodejs#16210
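The behaviour described in that commit message can be reproduced in isolation. This is a minimal sketch, not the test from the repository: `countBeforeExit` is a hypothetical helper that runs a small child script and uses the child's exit code to report how many times `beforeExit` fired there.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Hypothetical helper: run a child Node process and return how many times
// 'beforeExit' was emitted in it. The count is passed back as the exit code.
function countBeforeExit(scheduleWork) {
  const script = `
    let count = 0;
    process.on('beforeExit', () => {
      count++;
      // Scheduling a timer from the first 'beforeExit' keeps the event loop
      // alive, so 'beforeExit' is emitted again once the timer has run.
      if (${scheduleWork} && count === 1) setTimeout(() => {}, 1);
    });
    process.on('exit', () => { process.exitCode = count; });
  `;
  const result = spawnSync(process.execPath, ['-e', script]);
  return result.status;
}

console.log('listener schedules no work:', countBeforeExit(false));
console.log('listener schedules a timer:', countBeforeExit(true));
```

On a current Node.js release the first call should report one emission and the second two. A test that asserts exactly one `beforeExit` call is therefore sensitive to anything that schedules work at that point.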
Fix (maybe?) in #16598 |
Seems like we now have a repro - https://ci.nodejs.org/job/node-stress-single-test/1475/ |
Here's three CI stress test failures in a row on that commit:
The immediately preceding commit is ab2c351. Here's three CI stress test successes in a row on that commit:
(ping @jasnell) |
@Trott that commit landed 6 days ago while this issue was opened 18 days ago...I think there could be more to it, might need to audit some crypto error PRs |
@joyeecheung Urm...hmmm... Maybe that commit caused the failures to happen on ubuntu 1604 which only started failing more recently? Maybe the bug has deeper roots for Windows? I did the stress testing on Ubuntu so it would go faster, but I guess I may need to take a deep breath, set aside several hours, and do it on Windows.... |
@Trott I looked into e567402 again and AFAICT it does not seem to touch the code path tested in |
Hmmm....after 12 hours of nothing but red (presumably due to this issue), Windows is coming back green/yellow/red now... |
Argh, nope, still a thing, now showing up on fedora24: https://ci.nodejs.org/job/node-test-commit-linux/13748/nodes=fedora24/console ...
not ok 136 parallel/test-async-wrap-uncaughtexception
---
duration_ms: 0.138
severity: fail
stack: |-
Mismatched <anonymous> function calls. Expected exactly 1, actual 2.
at Object.exports.mustCall (/home/iojs/build/workspace/node-test-commit-linux/nodes/fedora24/test/common/index.js:490:10)
at Object.<anonymous> (/home/iojs/build/workspace/node-test-commit-linux/nodes/fedora24/test/parallel/test-async-wrap-uncaughtexception.js:14:33)
at Module._compile (module.js:641:30)
at Object.Module._extensions..js (module.js:652:10)
at Module.load (module.js:560:32)
at tryModuleLoad (module.js:503:12)
at Function.Module._load (module.js:495:3)
at Function.Module.runMain (module.js:682:10)
at startup (bootstrap_node.js:191:16) |
BTW I have seen this a few times this week on my macbook, but have never been able to reproduce reliably. It only shows up when I run the tests in parallel. |
I'm worried about this one at Code + Learn next week because I can totally imagine a third of the room having the test fail locally or something like that.... |
FYI: #16692 |
Alternative: marking as flaky in PR #16694.
@jasnell RE: #16692: we see either a regression between 8.6.0 and 8.7.0 or some level of non-determinism, so IMHO we shouldn't force the test to pass and so hide the bug.
`parallel/test-async-wrap-uncaughtexception.js` has become flaky. At this time investigating the cause is still ongoing, but this issue has become prevalent. In order to restore CI status to be relevant, this marks the test as explicitly FLAKY. PR-URL: nodejs#16694 Refs: nodejs#16210 Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: Myles Borins <myles.borins@gmail.com>
I created a new project to track flakes on CI and put a variety of related issues to this test in a column. Please feel free to add more related issues / prs |
FWIW, moving this test to sequential does not solve the problem (tried it in #16733) so having too much other stuff happening on the system is unlikely to be the problem. |
I can replicate this if I use the inspector and step into or out of stuff. |
I'm starting to suspect that the extra |
@Trott The only V8 functionality I know of that will require the change is asynchronous WASM compilation, which is currently behind a flag even in V8 master. It might be related to f27b5e4, but I’d be surprised since all I can see V8 doing when the test is being executed is scheduling background tasks, no foreground or delayed foreground tasks… |
Do not let an internal handle keep the event loop alive. Fixes: nodejs#16210
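The fix itself is inside Node's internals and is not shown in this thread. As a user-land analogy only, the effect of a handle keeping the event loop alive (and of unreferencing it) can be sketched with a timer and `unref()`. `runChild` is a hypothetical helper, and the timeout values are arbitrary assumptions.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Hypothetical helper: start a child whose only pending work is an interval
// timer, optionally unref'd, and report whether the child exits on its own.
function runChild(unrefTimer, timeoutMs = 5000) {
  const script = `
    const timer = setInterval(() => {}, 1000);
    // An unref'd handle no longer keeps the event loop alive.
    if (${unrefTimer}) timer.unref();
  `;
  const result = spawnSync(process.execPath, ['-e', script], { timeout: timeoutMs });
  // spawnSync sets result.error when the child had to be killed on timeout.
  return result.error ? 'timed out' : 'exited';
}

console.log('referenced timer:', runChild(false, 1500));
console.log('unref\'d timer:', runChild(true));
```

A referenced handle keeps the loop running, which is also the kind of situation in which `beforeExit` can fire more than once.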
This has been fixed and can be closed now. |
I have seen multiple errors from this one on Windows this week. sample