test: strengthen test-worker-prof #26608

Merged (1 commit) on Mar 24, 2019

gireeshpunathil
Member

Force main and worker to stay for some deterministic time
Add some more validation check around profile file generation

Refs: #26401

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot added the test label (Issues and PRs related to the tests) Mar 12, 2019
test/parallel/test-worker-prof.js
});
w.postMessage(data);
} else {
tmpdir.refresh();
Member

An absolute micronit here but maybe add a comment above this (right below the else line) saying this is the parent process?

@Trott
Member

Trott commented Mar 14, 2019

Looks good to me, but I'd want @addaleax's confirmed sign-off if at all possible first.

@refack removed their request for review March 14, 2019 21:29
@refack
Contributor

refack commented Mar 14, 2019

I'm not confident in my expertise to properly review this ¯\_(ツ)_/¯

@gireeshpunathil
Member Author

@refack - ack, np; and then I will critically depend on @addaleax's review on this.

In case this explanation helps:
while the root cause of the flakes (profile files missing, insufficient profile ticks) is not fully diagnosed, there is a reasonable working explanation:

  • there was an issue in the worker: it was creating the isolate in the parent thread rather than in the worker thread itself
  • because of this, the profile ticks were accounted against the parent, not the worker.
  • this test case is a regression test associated with the PR that fixed the issue.
  • the test uses a parent and a worker, and drives both to run for roughly the same amount of time.
  • it collects the profile output and measures the tick counts from the profile files.
  • it makes sure that the tick counts are comparable.

In my assessment, the test fails because of an unexpected interplay between the OS scheduler, the thread execution order, and the program logic (a tight loop that runs until the time reaches a threshold). This could make the tight loop exit prematurely, without the thread having really consumed enough CPU, affecting either or both of the threads. (Again, this is only a theory; I have no real proof of it.)
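For illustration, a time-bounded tight loop of the kind described here might look like the sketch below. The helper name spinFor and the 50 ms budget are hypothetical and not taken from the test. The loop exits when the wall clock says so, regardless of how much CPU time the thread actually received, so the iteration count can differ from run to run.

```javascript
'use strict';

// Hypothetical helper: spin until a wall-clock deadline passes.
// The exit condition depends on elapsed time, not on the work done.
function spinFor(ms) {
  const deadline = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < deadline) {
    iterations++;
  }
  return iterations;
}

const iterations = spinFor(50);
console.log(`iterations completed within 50 ms: ${iterations}`);
```

If the thread is descheduled for part of those 50 ms, the deadline still passes and the loop ends with fewer iterations, which is one way a profiler could see too few ticks.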

The new version attempts to run independently of the time scale and in a more deterministic manner, by shuttling messages a predefined number of times to get the right mix of CPU and I/O. Scheduling and other environmental factors should not affect the logic.
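A minimal sketch of the message-shuttling idea, assuming a worker that simply echoes each message back. The function name pingPong, the inline worker source, the 1 KB payload, and the round count are all illustrative assumptions, not the code from the test.

```javascript
'use strict';
const { Worker } = require('worker_threads');

// Hypothetical worker body: echo every message back to the parent.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', (msg) => parentPort.postMessage(msg));
`;

// Exchange a fixed number of round trips (rounds >= 1), then stop the worker.
// The amount of work is set by the message count, not by elapsed time.
function pingPong(rounds) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    let completed = 0;
    worker.on('error', reject);
    worker.on('message', (msg) => {
      completed++;
      if (completed < rounds) {
        worker.postMessage(msg);
      } else {
        worker.terminate().then(() => resolve(completed), reject);
      }
    });
    worker.postMessage('x'.repeat(1024));
  });
}

pingPong(100).then((n) => console.log(`round trips completed: ${n}`));
```

Because both threads do a fixed number of sends and receives, a slow or busy machine makes the run take longer but does not change how much work each side performs.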

thanks!

@gireeshpunathil
Member Author

{ cwd: tmpdir.path });
assert.strictEqual(spawnResult.stderr.toString(), '');
assert.strictEqual(spawnResult.status, 0);
assert.strictEqual(spawnResult.signal, null);
Member

Suggest swapping this line and the one above it so that the signal is checked first: if the signal is not null, that is more useful to know than the null exit code seen in #26401 (comment).
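To show why the order matters, here is a small sketch in which a child is killed by a signal. The child script and the 200 ms timeout are made up for the example; on POSIX systems spawnSync is expected to report a null status together with the signal name, so asserting on the signal first gives the more informative failure.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Hypothetical child that would idle for 10 seconds. The timeout option
// makes spawnSync kill it with the default kill signal (SIGTERM).
const result = spawnSync(
  process.execPath,
  ['-e', 'setTimeout(() => {}, 10000)'],
  { timeout: 200 }
);

console.log('status:', result.status);
console.log('signal:', result.signal);
```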

@gireeshpunathil
Member Author

just pushed a change to that effect: added debug data in all three assertions + reversed the signal and exit status check order.

assert.strictEqual(spawnResult.stderr.toString(), '',
`child exited with an error: ${spawnResult}`);
assert.strictEqual(spawnResult.signal, null,
`child exited with signal: ${spawnResult}`);
Member

${spawnResult} will end up displaying [object Object] unfortunately. You might have to do something like ${util.inspect(spawnResult)} if you want something useful.
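A short sketch of the difference, using a stand-in object rather than a real spawnSync result (the field values are made up):

```javascript
'use strict';
const util = require('util');

// Stand-in for a spawnSync() result; the values are illustrative only.
const spawnResult = { status: 1, signal: null, pid: 12345 };

// Template literals call toString(), which plain objects do not customize.
const viaTemplate = `child exited: ${spawnResult}`;
// util.inspect() renders the object's fields instead.
const viaInspect = `child exited: ${util.inspect(spawnResult)}`;

console.log(viaTemplate); // child exited: [object Object]
console.log(viaInspect);
```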

Member Author

thanks. Is there any way to explode the internal arrays as well? with this change I get the primitive fields, but not the Buffers:

assert.strictEqual(spawnResult.status, 0,
                     `child exited with non-zero status: ${util.inspect(spawnResult, {depth: 3})}`);
{ status: 1,
  signal: null,
  output: [ null, <Buffer >, <Buffer > ],
  pid: 72177,
  stdout: <Buffer >,
  stderr: <Buffer > 
}

Member

How about passing the encoding: 'utf8' option to spawnSync so that stdout and stderr are strings instead of Buffer objects?
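A minimal sketch of that suggestion, assuming a trivial child script that prints one line (the script is made up for the example):

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Hypothetical child script, used only for this example.
const args = ['-e', 'console.log("hello")'];

// Without an encoding, stdout and stderr come back as Buffers.
const asBuffers = spawnSync(process.execPath, args);
// With encoding: 'utf8', they come back as strings, which print readably
// in assertion messages and in util.inspect() output.
const asStrings = spawnSync(process.execPath, args, { encoding: 'utf8' });

console.log(Buffer.isBuffer(asBuffers.stdout));
console.log(typeof asStrings.stdout);
```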

Member Author

@richardlau - addressed your review comments, ptal.

@gireeshpunathil
Member Author

@refack @Trott - do you have any objections on landing this PR? I am asking in light of the crash reported in #26401 (comment). In my opinion it is related to the container environment; however, this test, being pure JS code, cannot cause a crash by itself, so the cause should be in native code in node, v8, or elsewhere. That is all the more reason to land this, so that we can reproduce, investigate, and fix it at the source. Please let me know!

@Trott
Member

Trott commented Mar 21, 2019

do you have any objections on landing this PR?

I don't have any objections.

@Trott
Member

Trott commented Mar 23, 2019

Is there anything holding this up from landing at this point? Or should we land it?

@Trott
Member

Trott commented Mar 23, 2019

I guess it needs a new CI run.

CI: https://ci.nodejs.org/job/node-test-pull-request/21823/

@gireeshpunathil
Member Author

@Trott - I was waiting to see if @refack has a say. At this point I think we should land this once CI is green.

@gireeshpunathil
Member Author

the only failure, parallel/test-async-hooks-http-parser-destroy, is #26610

@gireeshpunathil added the author ready label (PRs that have at least one approval, no pending requests for changes, and a CI started) Mar 23, 2019
@Trott
Member

Trott commented Mar 23, 2019

Force main and worker to stay for some deterministic time
Add some more validation check around profile file generation

Fixes: nodejs#26401
PR-URL: nodejs#26608

Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
@gireeshpunathil removed the author ready label (PRs that have at least one approval, no pending requests for changes, and a CI started) Mar 24, 2019
@gireeshpunathil
Member Author

landed as ed849f8

@gireeshpunathil merged commit ed849f8 into nodejs:master Mar 24, 2019
targos pushed a commit to targos/node that referenced this pull request Mar 27, 2019
targos pushed a commit that referenced this pull request Mar 27, 2019