test: fix test-cluster-disconnect-leak.js for AIX #4674

Closed
imran-iq
Contributor

Lowered the number of spawned workers because EAGAIN errors were being thrown.
@r-52 r-52 added cluster Issues and PRs related to the cluster subsystem. test Issues and PRs related to the tests. labels Jan 13, 2016
@Trott
Member

Trott commented Jan 13, 2016

Would you be able to confirm that the modified test you are proposing here still fails on AIX if the three lines added to cluster.js in f9f1dd9 are removed? This test should fail without that fix, but should succeed if that fix is there.

If you have nvm or some other convenient way to switch between Node versions, you can simply check that the test fires an AssertionError when executed with Node.js 5.4.0 and passes when executed with Node.js 5.4.1. Here's what it looks like on my OS X machine:

$ nvm use 5.4.0
Now using node v5.4.0 (npm v3.3.12)
$ node --expose-internals test/sequential/test-cluster-disconnect-leak.js

assert.js:89
  throw new assert.AssertionError({
  ^
AssertionError: Resource leak detected.
    at removeWorker (cluster.js:317:7)
    at Worker.disconnect (cluster.js:408:5)
    at Worker.<anonymous> (/Users/trott/io.js/test/common.js:395:15)
    at emitNone (events.js:67:13)
    at Worker.emit (events.js:166:7)
    at online (cluster.js:438:12)
    at Worker.onmessage (cluster.js:425:7)
    at ChildProcess.<anonymous> (cluster.js:714:8)
    at emitTwo (events.js:92:20)
    at ChildProcess.emit (events.js:172:7)
$ nvm use 5.4.1
Now using node v5.4.1 (npm v3.3.12)
$ node --expose-internals test/sequential/test-cluster-disconnect-leak.js
$ 

@imran-iq
Contributor Author

Hi, it seems that the issue is unrelated to the fix.
We ended up hitting the maximum number of processes allowed per user on default AIX.

/usr/sbin/lsattr -E -l sys0 | grep maxuproc
maxuproc        128                                  Maximum number of PROCESSES allowed per user      True 

Leaving out my change set and instead raising maxuproc to 150 with

/usr/sbin/chdev -l sys0 -a maxuproc=150

allows the test to pass. So I guess the question is: does the number have to be that high when we have more than 8 cores? Or rather, why is there a need for 16 worker forks per CPU?
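
To make the arithmetic concrete, here is a rough sketch of why a machine with 8 cores would run into the limit. The multiplier of 16 workers per CPU and the maxuproc values 128 and 150 come from this thread; the helper name `wouldExceedProcessLimit` is made up for illustration, and the check is a simplification that ignores any other processes the user already has running.

```javascript
'use strict';
const os = require('os');

// Hypothetical helper (not part of the Node.js test suite): would forking
// `workersPerCpu` workers for each of `cpuCount` CPUs go over the per-user
// process limit?
function wouldExceedProcessLimit(cpuCount, workersPerCpu, maxUserProcesses) {
  // The master process that does the forking also counts against the limit.
  const total = cpuCount * workersPerCpu + 1;
  return total > maxUserProcesses;
}

// 8 cores * 16 workers = 128 workers, plus the master: over maxuproc=128.
console.log(wouldExceedProcessLimit(8, 16, 128)); // true
// With maxuproc raised to 150 the same load fits.
console.log(wouldExceedProcessLimit(8, 16, 150)); // false
// Same check for the current machine, assuming a limit of 128.
console.log(wouldExceedProcessLimit(os.cpus().length, 16, 128));
```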

@santigimeno
Member

@iwuzhere it's just an empirical value: it was chosen so that the test would fail consistently on multiple platforms without the changes in lib/cluster.js. What would be the maximum value that would be acceptable for you? We could try it and see if the test still fails on the rest of the platforms.

@Trott
Member

Trott commented Jan 14, 2016

Alternatively, maybe we could also try switching to an exponential ramp-up in the test so that we're not relying on calibrating the number like this. The magic number may change as hardware and operating systems evolve, so it might be slightly more future-proof to do it that way.
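
As a rough illustration of the ramp-up idea, the sketch below doubles the worker count until the failure reproduces or a cap is reached. Everything here is hypothetical: `rampUp` and `tryWith` are made-up names, and the attempt is a synchronous stand-in, whereas a real attempt would fork workers and be asynchronous. A cap would still be needed to stay under limits such as maxuproc.

```javascript
'use strict';

// Hypothetical sketch: `tryWith(n)` runs one attempt with n workers and
// returns true if the bug was reproduced. The count doubles each round
// until the bug reproduces or the cap is passed.
function rampUp(tryWith, maxWorkers) {
  for (let n = 1; n <= maxWorkers; n *= 2) {
    if (tryWith(n)) return n; // reproduced with n workers
  }
  return null; // never reproduced within the cap
}

// Stand-in for a real attempt: pretend the leak shows up at 20+ workers.
const attempts = [];
const reproducedAt = rampUp((n) => {
  attempts.push(n);
  return n >= 20;
}, 128);

console.log(attempts);     // [ 1, 2, 4, 8, 16, 32 ]
console.log(reproducedAt); // 32
```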

@imran-iq
Contributor Author

This is also happening in test-cluster-disconnect-suicide-race.js. The test fails with EAGAIN on AIX when spawning 128 (64 * 2) workers.

@Trott
Member

Trott commented Jan 18, 2016

I think I've found a faster and more reliable/deterministic way to test this issue. PR coming in a bit...

@Trott
Member

Trott commented Jan 18, 2016

See alternative at #4736

Trott added a commit to Trott/io.js that referenced this pull request Jan 18, 2016
Previously, test-cluster-disconnect-leak had two issues:

* Magic numbers: How many times to spawn a worker was determined through
empirical experimentation. This means that as new platforms and new
CPU/RAM configurations are tested, the magic numbers require more
and more refinement. This brings us to...

* Non-determinism: The test *seems* to fail all the time when the bug
it tests for is present, but it's really a judgment based on sampling.
"Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try
16..."

This revised version of the test takes a different approach. The fix
for the bug that the test was written for means that the `disconnect`
event will fire reliably for a single worker. So we check for that and
the test still fails when the fix is not in the code base and succeeds
when it is.

Advantages of this approach include:

* The test runs much faster.
* The test now works on Windows. The previous version skipped Windows.
* The test should be reliable on any new platform regardless of CPU and
RAM.

Ref: nodejs#4674
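
The single-worker approach described in this commit message could look roughly like the sketch below. This is not the code from #4736: `checkDisconnectFires` is a hypothetical helper, and the demo drives it with a stub worker rather than a real forked process so that it runs anywhere.

```javascript
'use strict';
const EventEmitter = require('events');

// Hypothetical helper: `clusterLike` is anything with a fork() that returns
// an EventEmitter-style worker (the real `cluster` module in the master
// process, or a stub). Calls done(true) if 'disconnect' was seen by the
// time the worker emitted 'exit', and done(false) otherwise.
function checkDisconnectFires(clusterLike, done) {
  const worker = clusterLike.fork();
  let sawDisconnect = false;
  worker.on('disconnect', () => { sawDisconnect = true; });
  worker.on('online', () => worker.disconnect());
  worker.on('exit', () => done(sawDisconnect));
}

// Demo with a stub worker that exits without ever emitting 'disconnect'.
const stubWorker = new EventEmitter();
stubWorker.disconnect = () => stubWorker.emit('exit', 0, null);
checkDisconnectFires({ fork: () => stubWorker }, (ok) => {
  console.log('disconnect fired:', ok); // false for this stub
});
stubWorker.emit('online');
```

With the real module, the master would pass `require('cluster')` as `clusterLike` inside an `if (cluster.isMaster)` guard and assert that the callback receives `true`.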
Trott added a commit to Trott/io.js that referenced this pull request Jan 18, 2016
Previously, test-cluster-disconnect-suicide-race had two issues:

* Magic numbers: How many times to spawn a worker was determined through
empirical experimentation. This means that as new platforms and new
CPU/RAM configurations are tested, the magic numbers require more
and more refinement. This brings us to...

* Non-determinism: The test seems to fail all the time when the bug
it tests for is present, but it's really a judgment based on sampling.
"Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try
16..."

This revised version of the test takes a different approach. The fix
for the bug that the test was written for means that the disconnect
event will fire on a subsequent tick. So we check for that and the test
still fails when the fix is not in the code base and succeeds when it
is.

Advantages of this approach include:

* The test runs much faster.
* The test should be reliable on any new platform regardless of CPU and
RAM.

Ref: nodejs#4674

cc @santigimeno @iwuzhere
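
One way to picture the "subsequent tick" check is the sketch below. It is an illustration under assumptions, not the code from the revised test: `observeTiming` is a hypothetical helper, and the worker is a plain `EventEmitter` stub whose `disconnect()` defers the event with `setImmediate`.

```javascript
'use strict';
const EventEmitter = require('events');

// Hypothetical helper: calls trigger() and records whether `eventName` was
// emitted before trigger() returned (same tick) or only afterwards.
function observeTiming(emitter, eventName, trigger) {
  const state = { firedSync: false, firedLater: false };
  let triggerReturned = false;
  emitter.once(eventName, () => {
    if (triggerReturned) state.firedLater = true;
    else state.firedSync = true;
  });
  trigger();
  triggerReturned = true;
  return state;
}

// Stub standing in for a worker whose 'disconnect' event is deferred.
const worker = new EventEmitter();
worker.disconnect = () => setImmediate(() => worker.emit('disconnect'));

const state = observeTiming(worker, 'disconnect', () => worker.disconnect());
console.log(state.firedSync); // false: nothing fired during the call itself
setImmediate(() => console.log(state.firedLater)); // true once the event lands
```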
@cjihrig
Contributor

cjihrig commented Jan 18, 2016

Closing in favor of #4736. If that doesn't work out, we can reopen this one.

@cjihrig cjihrig closed this Jan 18, 2016
jasnell pushed a commit that referenced this pull request Jan 19, 2016
Previously, test-cluster-disconnect-leak had two issues:

* Magic numbers: How many times to spawn a worker was determined through
empirical experimentation. This means that as new platforms and new
CPU/RAM configurations are tested, the magic numbers require more
and more refinement. This brings us to...

* Non-determinism: The test *seems* to fail all the time when the bug
it tests for is present, but it's really a judgment based on sampling.
"Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try
16..."

This revised version of the test takes a different approach. The fix
for the bug that the test was written for means that the `disconnect`
event will fire reliably for a single worker. So we check for that and
the test still fails when the fix is not in the code base and succeeds
when it is.

Advantages of this approach include:

* The test runs much faster.
* The test now works on Windows. The previous version skipped Windows.
* The test should be reliable on any new platform regardless of CPU and
RAM.

Ref: #4674

PR-URL: #4736
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Johan Bergström <bugs@bergstroem.nu>
evanlucas pushed a commit that referenced this pull request Jan 19, 2016
PR-URL: #4736
Trott added a commit that referenced this pull request Jan 19, 2016
Previously, test-cluster-disconnect-suicide-race had two issues:

* Magic numbers: How many times to spawn a worker was determined through
empirical experimentation. This means that as new platforms and new
CPU/RAM configurations are tested, the magic numbers require more
and more refinement. This brings us to...

* Non-determinism: The test seems to fail all the time when the bug
it tests for is present, but it's really a judgment based on sampling.
"Oh, with 8 workers per CPU, it fails about 80% of the time. Let's try
16..."

This revised version of the test takes a different approach. The fix
for the bug that the test was written for means that the disconnect
event will fire on a subsequent tick. So we check for that and the test
still fails when the fix is not in the code base and succeeds when it
is.

Advantages of this approach include:

* The test runs much faster.
* The test should be reliable on any new platform regardless of CPU and
RAM.

PR-URL: #4739
Ref: #4674
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: James M Snell <jasnell@gmail.com>
evanlucas pushed a commit that referenced this pull request Jan 20, 2016
PR-URL: #4736
rvagg pushed a commit that referenced this pull request Jan 25, 2016
PR-URL: #4739
MylesBorins pushed a commit that referenced this pull request Jan 28, 2016
PR-URL: #4736
MylesBorins pushed a commit that referenced this pull request Jan 28, 2016
PR-URL: #4739
MylesBorins pushed a commit that referenced this pull request Feb 11, 2016
PR-URL: #4736
MylesBorins pushed a commit that referenced this pull request Feb 11, 2016
PR-URL: #4739
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Feb 11, 2016
PR-URL: nodejs#4736
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Feb 11, 2016
PR-URL: nodejs#4739
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Feb 13, 2016
PR-URL: nodejs#4736
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Feb 13, 2016
PR-URL: nodejs#4739
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Feb 15, 2016
PR-URL: nodejs#4736
MylesBorins pushed a commit to MylesBorins/node that referenced this pull request Feb 15, 2016
PR-URL: nodejs#4739
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
PR-URL: nodejs#4736
scovetta pushed a commit to scovetta/node that referenced this pull request Apr 2, 2016
PR-URL: nodejs#4739