Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

test: microtask-queue-run(-domain) should be made non-flaky #25607

Closed
joaocgreis opened this issue Jul 1, 2015 · 10 comments
Closed

test: microtask-queue-run(-domain) should be made non-flaky #25607

joaocgreis opened this issue Jul 1, 2015 · 10 comments

Comments

@joaocgreis
Copy link
Member

test-microtask-queue-run and test-microtask-queue-run-domain appear to be flaky (PR: #25606), failing very rarely. Testing 10k runs on the platforms I have available, I got failure rates as high as:

  • Linux x64: OK: 9964 NOT OK: 36
  • Windows 7 x64: OK: 9843 NOT OK: 157
  • SmartOS x86 (gcc 4.7): OK: 9904 NOT OK: 96

The results appear to be influenced by the machine load, I got much lower results on an otherwise idle machine.

Quick test:

time (OK=0; NOK=0; for i in `seq 10000`; do ./out/Release/node test/simple/test-microtask-queue-run.js && OK=$(($OK+1)) || NOK=$(($NOK+1)); echo "$i   OK: $OK   NOT OK: $NOK"; done)

Other tests from test-microtask-queue-* seem okay.

@joaocgreis
Copy link
Member Author

/cc @joyent/node-collaborators

@tunniclm
Copy link

tunniclm commented Jul 3, 2015

It looks like the callbacks are executing in a different order sometimes.
I reduced that order difference to this test:

var assert = require('assert');

setTimeout(function() { /* A */
        var called = false;
        process.nextTick(function() { called = true; }); /* B */
        setTimeout(function() { assert(called) }, 0); /* C */
}, 0);

This fails sometimes because the callback on line (C) fires before the callback on line (B). This looks to be the same thing that causes the microtask-queue-run.js test to sometimes fail.

Interestingly, this variant does not fail for me:

var assert = require('assert');

var called = false;
process.nextTick(function() { called = true; });
setTimeout(function() { assert(called) }, 0);

@tunniclm
Copy link

tunniclm commented Jul 3, 2015

Looking at the API doc it seems like process.nextTick() callbacks will run at the head of the event loop, setTimeout() callbacks will happen in the middle (with the other IO) and setImmediate() callbacks (from setImmediate) will run at the end. [EDIT: this was wrong, see a couple of comments down]

This test also doesn't fail for me:

var assert = require('assert');

setTimeout(function() {
        var called = false;
        process.nextTick(function() { called = true; });
        setImmediate(function() { assert(called) });
}, 0);

@tunniclm
Copy link

tunniclm commented Jul 3, 2015

Finally, this test does fail sometimes:

var assert = require('assert');

setTimeout(function() {
        var called = false;
        setImmediate(function() { called = true; });
        setTimeout(function() { assert(called) },0);
}, 0);

So it looks like setTimeout() can run on the same event loop iteration on which it was enqueued and that can cause it to occur first (before process.nextTick() or setImmediate() callbacks enqueued within the same loop iteration).

@tunniclm
Copy link

tunniclm commented Jul 3, 2015

Looks like I got the ordering slightly wrong. The setImmediate() API doc implies it's this:

  1. process.nextTick() callbacks
  2. IO callbacks
  3. setImmediate() callbacks
  4. setTimeout() / setInterval() callbacks

Also, note that setImmediate() callbacks will execute on the iteration after they were enqueued. (EDIT: This explains why changing the process.nextTick() to a setImmediate() had no effect on the failure behaviour).

@tunniclm
Copy link

tunniclm commented Jul 3, 2015

So, I'm not 100% sure of the intention of the original test, but it looks like it would be appropriate to change the setTimeout() on test/simple/test-microtask-queue-run.js:53 to be a setImmediate() instead.

@mhdawson
Copy link
Member

mhdawson commented Jul 6, 2015

This test seems to fail quite regularly in our internal IBM builds, @tunniclm you may be able to validate your changes there.

@misterdjules misterdjules modified the milestones: 0.12.6, 0.12.7, 0.12.8 Jul 6, 2015
whitlockjc added a commit to whitlockjc/node-v0.x-archive that referenced this issue Jul 24, 2015
Whenever a timer of a specific delay creates a new timer with the same
delay we currently end up processing the new timer immediately instead
of scheduling the timer to run in the future.  This commit basically
identifies the situation, halts processing the current timer chain and
reschedules the timer chain appropriately in the future.

Fixes nodejs#25607
whitlockjc added a commit to whitlockjc/node-v0.x-archive that referenced this issue Jul 24, 2015
Whenever a timer of a specific delay creates a new timer with the same
delay we currently end up processing the new timer immediately instead
of scheduling the timer to run in the future.  This commit basically
identifies the situation, halts processing the current timer chain and
reschedules the timer chain appropriately in the future.

Fixes nodejs#25607
@misterdjules
Copy link

@tunniclm @mhdawson Although I'm not the original author of the test, its intention seems to make sure that microtasks run at the appropriate time compared to timers' and process.nextTick's callbacks.

The reason why this test is flaky is described in one comment of the PR that fixes it.

whitlockjc added a commit to whitlockjc/node-v0.x-archive that referenced this issue Jul 24, 2015
Whenever a timer with a specific timeout value creates a new timer with
the same timeout, the newly added timer might be processed immediately
in the same tick of the event loop instead of during the next tick of
the event loop at the earliest.

Fixes nodejs#25607
@tunniclm
Copy link

@misterdjules I looked through that thread on Friday (and the original commit that added the test) as I started to get back to looking at issues (I was unavailable for a few weeks).

From that, my understanding is that the tests are there to ensure that the V8 microtask queue used to process the "built-in" Promises were being executed at the right time in comparison to setImmediate(), setTimeout() and process.nextTick(). Particularly when there were nested callbacks.

What I found in my exploration of nested setTimeout() calls is that it is possible to add a timeout in a timeout callback such that it fires on the same iteration of the event loop in which it was added. I didn't see a note in the documentation to say whether this should be allowed or not (whereas there is such a note for setImmediate()), but perhaps I missed it?

The above PR assumes that this scenario should not be allowed and if that is correct, then it looks like a good fix for this issue. And in that case I think we should update the doc to describe this contract under https://nodejs.org/api/timers.html#timers_settimeout_callback_delay_arg .

@misterdjules
Copy link

@tunniclm Continued this discussion in the PR that fixes this issue itself, so that it's easier to see the actual changes along the comments: #25763 (comment).

whitlockjc added a commit to whitlockjc/node-v0.x-archive that referenced this issue Jul 27, 2015
Whenever a timer with a specific timeout value creates a new timer with
the same timeout, the newly added timer might be processed immediately
in the same tick of the event loop instead of during the next tick of
the event loop at the earliest.

Fixes nodejs#25607
whitlockjc added a commit to whitlockjc/node-v0.x-archive that referenced this issue Oct 1, 2015
Whenever a timer with a specific timeout value creates a new timer with
the same timeout, the newly added timer might be processed immediately
in the same tick of the event loop instead of during the next tick of
the event loop at the earliest.

Fixes nodejs#25607
whitlockjc added a commit to whitlockjc/node that referenced this issue Jul 15, 2016
Whenever a timer is scheduled within another timer, there are a few
known issues that we are fixing:

* Whenever the timer being scheduled has the same timeout value as the
outer timer, the newly created timer can fire on the same tick of the
event loop instead of during the next tick of the event loop
* Whenever a timer is added in another timer's callback, its underlying
timer handle will be started with a timeout that is actually incorrect

This commit consists of
nodejs/node-v0.x-archive#17203 and
nodejs/node-v0.x-archive#25763.

Fixes: nodejs/node-v0.x-archive#9333
Fixes: nodejs/node-v0.x-archive#15447
Fixes: nodejs/node-v0.x-archive#25607
Fixes: nodejs#5426
PR-URL: nodejs#3063
evanlucas pushed a commit to nodejs/node that referenced this issue Jul 19, 2016
Whenever a timer is scheduled within another timer, there are a few
known issues that we are fixing:

* Whenever the timer being scheduled has the same timeout value as the
outer timer, the newly created timer can fire on the same tick of the
event loop instead of during the next tick of the event loop
* Whenever a timer is added in another timer's callback, its underlying
timer handle will be started with a timeout that is actually incorrect

This commit consists of
nodejs/node-v0.x-archive#17203 and
nodejs/node-v0.x-archive#25763.

Fixes: nodejs/node-v0.x-archive#9333
Fixes: nodejs/node-v0.x-archive#15447
Fixes: nodejs/node-v0.x-archive#25607
Fixes: #5426
PR-URL: #3063
evanlucas pushed a commit to nodejs/node that referenced this issue Jul 20, 2016
Whenever a timer is scheduled within another timer, there are a few
known issues that we are fixing:

* Whenever the timer being scheduled has the same timeout value as the
outer timer, the newly created timer can fire on the same tick of the
event loop instead of during the next tick of the event loop
* Whenever a timer is added in another timer's callback, its underlying
timer handle will be started with a timeout that is actually incorrect

This commit consists of
nodejs/node-v0.x-archive#17203 and
nodejs/node-v0.x-archive#25763.

Fixes: nodejs/node-v0.x-archive#9333
Fixes: nodejs/node-v0.x-archive#15447
Fixes: nodejs/node-v0.x-archive#25607
Fixes: #5426
PR-URL: #3063
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants