Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate flaky test-regress-GH-4027 on Windows #13800

Closed
Trott opened this issue Jun 19, 2017 · 3 comments
Closed

Investigate flaky test-regress-GH-4027 on Windows #13800

Trott opened this issue Jun 19, 2017 · 3 comments
Labels
fs Issues and PRs related to the fs subsystem / file system. test Issues and PRs related to the tests. windows Issues and PRs related to the Windows platform.

Comments

@Trott
Copy link
Member

Trott commented Jun 19, 2017

  • Version: 9.0.0-pre
  • Platform: win2012r2,vs2015
  • Subsystem: test

https://ci.nodejs.org/job/node-test-binary-windows/9310/RUN_SUBSET=1,VS_VERSION=vs2015,label=win2012r2/console

not ok 387 sequential/test-regress-GH-4027
  ---
  duration_ms: 5.421
  severity: fail
  stack: |-
    assert.js:60
      throw new errors.AssertionError({
      ^
    
    AssertionError [ERR_ASSERTION]: 0 === 1
        at StatWatcher.<anonymous> (c:\workspace\node-test-binary-windows\RUN_SUBSET\1\VS_VERSION\vs2015\label\win2012r2\test\sequential\test-regress-GH-4027.js:35:10)
        at StatWatcher.<anonymous> (c:\workspace\node-test-binary-windows\RUN_SUBSET\1\VS_VERSION\vs2015\label\win2012r2\test\common\index.js:517:15)
        at emitTwo (events.js:125:13)
        at StatWatcher.emit (events.js:213:7)
        at StatWatcher._handle.onchange (fs.js:1454:10)

/cc @nodejs/testing @nodejs/platform-windows

Perhaps a race condition triggered by system load, which probably explains why it's in sequential and not parallel in the first place?

@Trott Trott added fs Issues and PRs related to the fs subsystem / file system. test Issues and PRs related to the tests. windows Issues and PRs related to the Windows platform. labels Jun 19, 2017
@refack
Copy link
Contributor

refack commented Jun 19, 2017

Perhaps a race condition triggered by system load, which probably explains why it's in sequential and not parallel in the first place?

Yes this is a timing balancing act, supposed to do

  1. fs.writeFileSync
  2. setTimeout(fs.unlinkSync, 100
  3. fs.watchFile(filename, { interval: 50 }

@Trott
Copy link
Member Author

Trott commented Jun 25, 2017

Another one:

https://ci.nodejs.org/job/node-test-binary-windows/9429/RUN_SUBSET=3,VS_VERSION=vs2015,label=win2012r2/console

not ok 387 sequential/test-regress-GH-4027
  ---
  duration_ms: 6.187
  severity: fail
  stack: |-
    assert.js:60
      throw new errors.AssertionError({
      ^
    
    AssertionError [ERR_ASSERTION]: 0 === 1
        at StatWatcher.<anonymous> (c:\workspace\node-test-binary-windows\RUN_SUBSET\3\VS_VERSION\vs2015\label\win2012r2\test\sequential\test-regress-GH-4027.js:35:10)
        at StatWatcher.<anonymous> (c:\workspace\node-test-binary-windows\RUN_SUBSET\3\VS_VERSION\vs2015\label\win2012r2\test\common\index.js:520:15)
        at emitTwo (events.js:125:13)
        at StatWatcher.emit (events.js:213:7)
        at StatWatcher._handle.onchange (fs.js:1454:10)
  ...

@jaimecbernardo
Copy link
Contributor

Investigated this issue. Managed to replicate it on a VM by capping the CPU. The issue seems to be that the watched file is getting deleted before the internals of watchFile are able to get the first state of the file, so watchFile behaves as if the file doesn't exist and calls the callback with prev.nlink and curr.nlink at 0.

Increasing the delay before unlinking to 300ms and calling setTimeout after watchFile caused it to stop reproducing.

Created a PR: #14010

@Trott Trott closed this as completed in f1c890a Jul 3, 2017
addaleax pushed a commit that referenced this issue Jul 11, 2017
The sequential/test-regress-GH-4027 test is flaky with an increased
system load, failing when the watched file is unlinked before the
first state of the watched file is retrieved.

After increasing the delay before unlinking and calling setTimeout
after watchFile, the flakiness stopped reproducing.

PR-URL: #14010
Fixes: #13800
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
addaleax pushed a commit that referenced this issue Jul 18, 2017
The sequential/test-regress-GH-4027 test is flaky with an increased
system load, failing when the watched file is unlinked before the
first state of the watched file is retrieved.

After increasing the delay before unlinking and calling setTimeout
after watchFile, the flakiness stopped reproducing.

PR-URL: #14010
Fixes: #13800
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Fishrock123 pushed a commit that referenced this issue Jul 19, 2017
The sequential/test-regress-GH-4027 test is flaky with an increased
system load, failing when the watched file is unlinked before the
first state of the watched file is retrieved.

After increasing the delay before unlinking and calling setTimeout
after watchFile, the flakiness stopped reproducing.

PR-URL: #14010
Fixes: #13800
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
MylesBorins pushed a commit that referenced this issue Aug 15, 2017
The sequential/test-regress-GH-4027 test is flaky with an increased
system load, failing when the watched file is unlinked before the
first state of the watched file is retrieved.

After increasing the delay before unlinking and calling setTimeout
after watchFile, the flakiness stopped reproducing.

PR-URL: #14010
Fixes: #13800
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
MylesBorins pushed a commit that referenced this issue Aug 15, 2017
The sequential/test-regress-GH-4027 test is flaky with an increased
system load, failing when the watched file is unlinked before the
first state of the watched file is retrieved.

After increasing the delay before unlinking and calling setTimeout
after watchFile, the flakiness stopped reproducing.

PR-URL: #14010
Fixes: #13800
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
MylesBorins pushed a commit that referenced this issue Aug 16, 2017
The sequential/test-regress-GH-4027 test is flaky with an increased
system load, failing when the watched file is unlinked before the
first state of the watched file is retrieved.

After increasing the delay before unlinking and calling setTimeout
after watchFile, the flakiness stopped reproducing.

PR-URL: #14010
Fixes: #13800
Reviewed-By: Rich Trott <rtrott@gmail.com>
Reviewed-By: Refael Ackermann <refack@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system. test Issues and PRs related to the tests. windows Issues and PRs related to the Windows platform.
Projects
None yet
Development

No branches or pull requests

3 participants