Test order is affected by a presence of a nested test #222

ArtemGovorov · 2015-12-10T06:41:39Z

The following code:

var test = require('tape');

test('first', function (t) {

  t.plan(2);

  setTimeout(function () {
    t.ok(1, 'first test');
  }, 200);

  t.test('second', function (t) {
    t.ok(1, 'second test');
    t.end();
  });
});

test('third', function (t) {
  setTimeout(function () {
    t.ok(1, 'third test');
    t.end();
  }, 100);
});

outputs:

ok 1 first test
ok 2 second test
ok 3 third test

and it's expected. But if I change it to its logical equivalent without plan:

var test = require('tape');

test('first', function (t) {

  setTimeout(function () {
    t.ok(1, 'first test');
    t.end();
  }, 200);

  t.test('second', function (t) {
    t.ok(1, 'second test');
    t.end();
  });
});

test('third', function (t) {
  setTimeout(function () {
    t.ok(1, 'third test');
    t.end();
  }, 100);
});

the the test order is broken:

ok 1 second test
ok 2 third test
ok 3 first test

While changing the order of the first and the second test is somewhat expected, ending the first test prematurely, and letting the third test execute before it, looks like a bug. Especially considering that commenting t.test('second'... test out restores the expected order, and make the third test wait until the first one ends.

The text was updated successfully, but these errors were encountered:

davegomez · 2015-12-16T08:56:48Z

Is anyone checking at this? @substack @ljharb

GuidoS · 2015-12-26T01:35:58Z

I think this may be tied to this other open issue #122

Raynos · 2015-12-26T01:38:11Z

This is not an ordering problem; this is a reporting problem.

t.end() will wait for all child tests to finish.

I agree this is confusing semantic difference between t.end() and t.plan().

I recommend adding console.log() to convince yourself of the actual order of execution and to not worry about which order tape reports TAP statements to stdout.

ArtemGovorov · 2015-12-26T01:47:24Z

@Raynos From what I see, it is definitely an ordering issue, not a reporting one.
Adding console.log (with timestamps just in case), produces the same result:

var test = require('tape');

test('first', function (t) {

  setTimeout(function () {
    console.log('first test', +new Date());
    t.ok(1);
    t.end();
  }, 200);

  t.test('second', function (t) {
    console.log('second test', +new Date());
    t.ok(1);
    t.end();
  });
});

test('third', function (t) {
  setTimeout(function () {
    console.log('third test', +new Date());
    t.ok(1);
    t.end();
  }, 100);
});

second test 1451094351565
third test 1451094351676
first test 1451094351756

Raynos · 2015-12-26T01:50:41Z

@ArtemGovorov

Your right; this is simply broken.

Raynos · 2015-12-26T01:53:15Z

I honestly do not know enough about what the semantics of sub tests are supposed to be to fix this...

It should either be:

second test
first test
third test

or

first test
second test
third test

ArtemGovorov · 2015-12-26T01:58:22Z

I think it should be

first test
second test
third test

because sub tests are not much different from the same level tests in a sense that they may need the previous/parent test to be finished. For example, nested tests may use a context that is set up in the parent test and may need the parent test to finish working with the context before a nested test can start using the context.

ljharb · 2015-12-26T07:11:41Z

I would expect second, first, third, based on my expectation of subtest ordering - the setTimeout means that t.end doesn't run on the current tick, so all the subtests of "first test" would run before the next tick starts, and "first test" gets logged.

tape follows JS scoping rules, so the ordering of the tests shouldn't affect anything unless a subtest mutates something another subtest depends on.

ArtemGovorov · 2015-12-26T07:29:17Z

@ljharb Why allowing the second test to run before the first one is finished? Except the nesting level, the second and the third tests are not much different conceptually, so why having different rules for the sub tests?

ArtemGovorov · 2015-12-26T07:38:31Z

In other words, the first test may be doing things in this scope:

test('first', function (t) {
  ...

as well as in this scope:

  setTimeout(function () {
     ...
  }, 200);

Allowing the second test to run before the first one is finished, means potentially interrupting the work of the first test. The isolation of the first test is broken.

ljharb · 2015-12-26T16:03:00Z

Using setTimeout in this fashion seems very strange to me - you're actually delaying creation of the subtest. If you do the setTimeout inside the subtest, I suspect you'll be much less surprised by the test ordering. Can you elaborate on why you're doing that?

Raynos · 2015-12-26T17:41:47Z

The bug here is that sub tests work differently purely on the existence of plan()

ArtemGovorov · 2015-12-27T00:00:06Z

@ljharb So I had a code like this:

var test = require('tape');

test('first', function (t) {

  setTimeout(function () {
    t.ok(1, 'first test');
    t.end();
  }, 200);
});

test('third', function (t) {
  setTimeout(function () {
    t.ok(1, 'third test');
    t.end();
  }, 100);
});

The first test was finishing before the third one. Then I added a subtest:

var test = require('tape');

test('first', function (t) {

  setTimeout(function () {
    t.ok(1, 'first test');
    t.end();
  }, 200);

  t.test('second', function (t) {
    t.ok(1, 'second test');
    t.end();
  });
});

test('third', function (t) {
  setTimeout(function () {
    t.ok(1, 'third test');
    t.end();
  }, 100);
});

Adding the second test introduced 2 issues:

the second test (subtest) is executed before the first test is finished;
the third test is now executed before the first test is finished.

Regarding the first issue - semantically, subtests are separate tests. They're executed after their parent test. That's what happens when the parent test in synchronous or has .plan call like here. Async parent test without the explicit .plan call behaviour is just inconsistent with the rest.

If you do the setTimeout inside the subtest

I know I may do the setTimeout inside the subtest, or may define the subtest inside the setTimeout, or may add .plan just like in the example I gave in the initial issue report. The issue with inconsistent behaviour remains though, as @Raynos said, and I agree, the bug is that sub tests work differently purely on the existence of plan().

Regarding the second issue - no matter how I structure the first test subtests, the third test should never execute before the first one is finished. It is inconsistent with tape's promise to execute tests sequentially.

ljharb · 2015-12-27T00:10:53Z

Gotcha - I always use plan so I haven't ran into it. I agree that the existence of plan shouldn't affect ordering of the tests.

kl0tl · 2016-02-07T17:26:45Z

This seems to be a consequence of supporting nested tests that implicitly end their parent.

See the test for this behavior.

Something that let any asynchronous process to settle instead of nextTick() should work. Any idea ?

sladiri · 2016-02-12T16:59:09Z

Hi, as a beginner and maybe missing some bigger picture:

This does not look like a bug, because that is how the runtime works. ~~I have not used plan and I did not know that it preserves the order somehow~~. If I do a setTimeout(), the order of the three tests in the example without plan() is exactly what I would expect to happen. Any other behaviour caused by tape would confuse me. I like the goal for tape not to introduce some magic like other test-runners do, and I think changing the current behaviour would introduce such magic.
Apparently plan() preservers order explicitly as an option and I am fine with that. That is actually a bonus I didn't even expect to get, although its name does not hint at this behaviour. For asynchronous functions I have been using Promises and then() for example for preserving some order. Now those introduce problems with stack traces but I guess that is a different story.
Actually, with asynchronous functions I have been using blue-tape, where I return a Promise from a test, and t.end() is called for me. This way I can easily avoid the anti-pattern of calling promise.resolve() explicitly (and the call to t.end() in the example with plan() is a analogous anti-pattern I think). A test using asynchronous functions is always wrapped inside a promise, and no errors can "disappear" for example.

ArtemGovorov · 2016-02-12T20:59:52Z

If I do a setTimeout(), the order of the three tests in the example without plan() is exactly what I would expect to happen. Any other behaviour caused by tape would confuse me.

The issue is that the third test starts before the first test is finished (with 3 tests), AND if you remove the second test, then the third test doesn't start the first test is finished.
Given that tape tests are executed serially, I think no matter sync or async, with our without plan, the third test should never start before the first one is finished. Starting a next test while a previous one is still not finished leads to issues with a shared context.

sladiri · 2016-02-12T22:38:46Z

Starting a next test while a previous one is still not finished leads to issues with a shared context.

Your example without plan() does the following in Promise-terms, where the order of execution is not deterministic (which is fine if there is no shared mutable context).

// pseudo code
Promise.all([ test1, test2, test3 ])

In order to execute test3 always after the first two, because it shares some mutable context with the others for example, you have to seperate their "async contexts", and I think that this should not be tape's responsibiltiy:

Promise.all([ test1, test2 ])
  .then(Promise.resolve( test3 ))

To depend on deterministic order for asynchronous execution is an anti pattern as far as I understand.

PS
I have not used plan() yet, I did not like to count my tests in order to have the correct number of subtests at the start. And I have not used sub tests for async functions like in your example either, because in my case blue-tape does not automatically fail a supertest when one of a subtests fails.

ljharb · 2016-02-12T22:46:37Z

(Promises aren't quite the same - a promise represents a future value for work that has already begun, whereas a callback is the work itself)

sladiri · 2016-02-12T22:52:48Z

(Promises aren't quite the same - a promise represents a future value for work that has already begun, whereas a callback is the work itself)

In the example you are not working with the callbacks directly, but passing them to seTimeout() and setTimeout() is some work to be done in the future, isn't it?

ljharb · 2016-02-12T22:54:48Z

that is true. but the test itself is not passed to setInterval, and it's not finished until "end()" is called (or the timeout reached).

sladiri · 2016-02-12T22:57:44Z

Of course, and I believe that the language itself provides solutions for this and that it is not tape's responsibility to solve this problem, that is all.

ArtemGovorov · 2016-02-12T23:08:29Z

@sladiri

Your example without plan() does the following in Promise-terms, where the order of execution is not deterministic (which is fine if there is no shared mutable context).

// pseudo code
Promise.all([ test1, test2, test3 ])

I don't think it's how most of test frameworks I know (including tape) work.

This code works as I expect and executes the first test before the third test:

var test = require('tape');

test('first', function (t) {

  setTimeout(function () {
    t.ok(1, 'first test');
    t.end();
  }, 200);
});

test('third', function (t) {
  t.ok(1, 'third test');
  t.end();
});

No matter how long does the first test take to execute, the third one will be executed after it.
It's when I insert the second subtest in the first test, it starts to show the unexpected behaviour.

kl0tl · 2016-02-13T20:14:21Z

Quick followup after my last post. Listening to beforeExit instead of calling nextTick (kl0tl@099488d) yields the expected order but causes nested-sync-noplan-noend to hang. I’m not really sure why.

I played a bit with Angular zone.js library. It works but it breaks even more tests :’D

It’s interesting to note that this simply doesn’t happen with blue-tape and Async Functions. It seems to be the most elegant solution to this issue. Could it be enough to document this use case and educate users ?

kl0tl · 2016-02-13T21:24:58Z

I fixed the failing test, listening to beforeExit should be enough to work around this issue !

kl0tl · 2016-02-13T23:14:52Z

Oh well, neither beforeExit nor timers unref() method work inside a browser environment. Looks like zones are the only way to preserve this behavior.

ljharb · 2016-02-15T16:13:20Z

Zones are not a standard (merely a stage 0 proposal) and are a non-option. Anything that monkeypatches anything in a reusable module is also a non-option, ever.
Documentation is great - a PR to update the readme would be welcome. It can always be removed when/if it is fixed.
any fix must work in both browsers and node
This is a bug that we all agree needs to be fixed. PRs welcome!

sladiri · 2016-02-15T20:17:30Z

I tried to create example which shows how to test ~~~a Koa.js server~~~ with blue-tape and promises.
#259

kl0tl · 2016-02-16T12:13:50Z

@ljharb Would a fix for Node only and an update to package.json with a browser field exporting the Iibrary without the fix ok? As far as I understand there is no way to fix this in a browser environement currently :( PR incoming !

@ArtemGovorov In the meantime you can try with github:kl0tl/tape#beforeExit instead of tape, it should be fixed there.

ljharb · 2016-02-16T17:38:45Z

@kl0tl I'd prefer the same bugs be present consistently across every environment. One of the primary use cases for tape is testing things in both browsers and node simultaneously, and differences across the two will be problematic.

kl0tl · 2016-02-17T20:30:40Z

@ljharb Yeah, consistency wins you’re right. I’ll document this behavior inside a new #Troubleshooting section of the readme.

What do you think about bringing Promises support into core now they’re standard ? I know there’s blue-tape but Async Functions are very convenient to use for async tests and obliterate this bug.

@ArtemGovorov Sadly this bug is here to stay :(

I hope it doesn’t prevent Wallaby to integrate with tape ! Would it be enough to encourage users facing this issue to avoid this pattern or use Async Functions if possible ?

ljharb · 2016-02-17T20:39:04Z

tape mirrors node core's assert - if assert gets a Promise interface, then so too should tape.

kl0tl · 2016-02-18T13:02:09Z

Promises support would still mirror assert. It’s not about modifying existing assertions but allowing users to return promises from their tests.

ljharb · 2016-02-18T17:26:48Z

Supporting that is tricky because we'd have to either a) assume Promise was available, which forces the user to know they need to shim it, b) allow explicit injection of Promise so they can pass bluebird or native Promises or whatever as they like (like .setPromiseConstructor() or something), or c) add a Promise lib as a dependency.

Solution a is dangerous, b is awkward, and c is heavyweight. If you want to open a new thread to discuss it (not on this one) we can, but I don't think it's likely.

kl0tl · 2016-02-18T18:08:40Z

Discussion on promises moved to #262.

kl0tl · 2016-03-28T14:42:06Z

I finally took time to document the issue!

It seems this bug can’t be fixed reliably across node and browsers :( Would it be conceivable to deprecate the implicit termination of parents tests and remove it in a future release? If not there’s not much more to do regarding this.

medikoo · 2017-04-06T10:57:55Z

If I understand problem correctly there are only two possible fixes:

A. Resign from support of no plan and no end test variants (as tested here) alternatively such behavior could be supported via extra option or alternative function.

B. Tweak setTimeout for time-being of callback being called, so internals have capability to track remaining callbacks, therefore wait for them before moving forward.

In my eyes A is cleaner, still I wonder how widely developers rely on that. If it's a no go, then it might be not solvable. B is really dirty, and one of main reasons that devs rely on Tape is because it doesn't mess with global state.

@ArtemGovorov can you explain why exactly it's a blocker for Wallaby? Aside of this issue, I can imagine some race conditions in asynchronous tests which will still make asserts to not go down always in same order (and it could be seen as acceptable).

nhamer · 2017-10-09T20:03:11Z

The problem isn't that 'first' actually starts terminating after 'third', it's that the entire block in setTimeout runs after 'first' has ended (implicitly, due to subtest without plan as @kl0tl noted)

test('implicitEnd', function(t) {
   setTimeout(function() {
     t.ok(!t.ended); // fails - t has already implicitly ended
     t.end(); // no failure, as there would be with multiple ends, but completely unnecessary
     t.ok(true, 'assert after end'); // succeeds, and shouldn't because t has ended 
   }, 200);
   t.test('subtest', function(t2) {
     t2.ok(true, 'subtest assertion succeeds');
     // t.ended === undefined
     t2.end(); // implicitly ends t
     // t.ended === true
   });
});

this is basically just #264

nhamer · 2017-10-13T21:10:58Z

This behavior was introduced in 35f29a as part of the fix for #33

I suggest that https://github.com/Kl0tl/tape/blob/master/test/nested-sync-noplan-noend.js#L1 should hang, since it doesn't have have a plan or end, which violates the documented behavior "If you forget to t.plan() out how many assertions you are going to run and you don't call t.end() explicitly, your test will hang."

I think the right fix here is to end the test upon subtest completion, but only if t.end() has been called already and conversely defer t.end() if subtests are queued. createHarness needs an option autoEnd for the exported harness to call t.end() after all subtests.

…t includes subtests Fixes tape-testing#222.

Fix premature end of tests (and running sibling tests) when test includes subtests Fixes #222.

[New] Implements TAP TODO directive (#254) [New] add alias 'notDeepEquals' to 'notDeepEqual' function (#411) [Fix] fix premature end of tests (and running sibling tests) when test includes subtests (#403, #222) [Fix] only use one test runner for results, even if multiple streams are created (#404, #361, #105) [Fix] windows: Show failure location even if driver letter is lowercase (#329) [Docs] link to mixed tape (#445) [Docs] Add electron-tap (#240) [Docs] Add tape-promise into 'other' (#210) [Docs] Mention [`flip-tape`](https://github.com/pguth/flip-tape/blob/master/README.md) in the section "other". (#359) [Docs] Add an alternative ES6 tape runner (#328)

Fixes #459. Reopens #222.

ljharb · 2019-02-13T22:52:40Z

Unfortunately the fix to this (#403) had to be reverted, since it caused #459.

@nhamer, it'd be great to try to address this again, but taking into account #459.

- [Fix] Partial revert of #403: fbe4b95 and 367b010 (#459, #222) - [Refactor] consistent spacing - [Deps] update `resolve`

ArtemGovorov mentioned this issue Dec 24, 2015

Support tape wallabyjs/public#2

Open

solidgoldpig mentioned this issue Jan 16, 2016

Migrate tests from Jasmine to Tap solidgoldpig/mondo-bank#15

Open

ljharb added the bug label Feb 12, 2016

kl0tl mentioned this issue Feb 13, 2016

Fix order of synchronous children of async tests #252

Closed

kl0tl mentioned this issue Feb 18, 2016

Promise support #262

Closed

kl0tl mentioned this issue Mar 28, 2016

Document unexpected t.end() behavior #279

Merged

mkermani144 mentioned this issue Apr 14, 2017

Test using async/await does not end #358

Closed

nhamer mentioned this issue Oct 15, 2017

Fix premature end of tests (and running sibling tests) when test includes subtests #403

Merged

ljharb pushed a commit to nhamer/tape that referenced this issue Jan 30, 2019

[Fix] Fix premature end of tests (and running sibling tests) when tes…

a4e804a

…t includes subtests Fixes tape-testing#222.

ljharb pushed a commit to nhamer/tape that referenced this issue Jan 30, 2019

[Fix] Fix premature end of tests (and running sibling tests) when tes…

e9036cb

…t includes subtests Fixes tape-testing#222.

ljharb pushed a commit to nhamer/tape that referenced this issue Jan 30, 2019

[Fix] Fix premature end of tests (and running sibling tests) when tes…

367b010

…t includes subtests Fixes tape-testing#222.

ljharb closed this as completed in #403 Jan 30, 2019

ljharb added a commit that referenced this issue Jan 30, 2019

Merge pull request #403 from nhamer/issue222

5d0d706

Fix premature end of tests (and running sibling tests) when test includes subtests Fixes #222.

ljharb added a commit that referenced this issue Feb 13, 2019

Partial revert of #403: fbe4b95 and 367b010

b74c4fd

Fixes #459. Reopens #222.

ljharb mentioned this issue Feb 13, 2019

Test exited without ending #459

Closed

ljharb reopened this Feb 13, 2019

ljharb added the pull request wanted label Feb 13, 2019

ljharb added a commit that referenced this issue Feb 14, 2019

v4.10.1

4708a3d

- [Fix] Partial revert of #403: fbe4b95 and 367b010 (#459, #222) - [Refactor] consistent spacing - [Deps] update `resolve`

snyk-bot mentioned this issue Aug 15, 2019

[Snyk] Upgrade tape from 4.9.1 to 4.11.0 WealthWizardsEngineering/hpropagate#10

Merged

Test order is affected by a presence of a nested test #222

Test order is affected by a presence of a nested test #222

Comments

ArtemGovorov commented Dec 10, 2015

davegomez commented Dec 16, 2015

GuidoS commented Dec 26, 2015

Raynos commented Dec 26, 2015

ArtemGovorov commented Dec 26, 2015

Raynos commented Dec 26, 2015

Raynos commented Dec 26, 2015

ArtemGovorov commented Dec 26, 2015

ljharb commented Dec 26, 2015

ArtemGovorov commented Dec 26, 2015

ArtemGovorov commented Dec 26, 2015

ljharb commented Dec 26, 2015

Raynos commented Dec 26, 2015

ArtemGovorov commented Dec 27, 2015

ljharb commented Dec 27, 2015

kl0tl commented Feb 7, 2016

sladiri commented Feb 12, 2016

ArtemGovorov commented Feb 12, 2016

sladiri commented Feb 12, 2016

ljharb commented Feb 12, 2016

sladiri commented Feb 12, 2016

ljharb commented Feb 12, 2016

sladiri commented Feb 12, 2016

ArtemGovorov commented Feb 12, 2016

kl0tl commented Feb 13, 2016

kl0tl commented Feb 13, 2016

kl0tl commented Feb 13, 2016

ljharb commented Feb 15, 2016

sladiri commented Feb 15, 2016

kl0tl commented Feb 16, 2016

ljharb commented Feb 16, 2016

kl0tl commented Feb 17, 2016

ljharb commented Feb 17, 2016

kl0tl commented Feb 18, 2016

ljharb commented Feb 18, 2016

kl0tl commented Feb 18, 2016

kl0tl commented Mar 28, 2016

medikoo commented Apr 6, 2017 • edited Loading

nhamer commented Oct 9, 2017 • edited Loading

nhamer commented Oct 13, 2017

ljharb commented Feb 13, 2019

medikoo commented Apr 6, 2017 •

edited

Loading

nhamer commented Oct 9, 2017 •

edited

Loading