smartos: test-http-res-write-after-end - possible intermittent failure #10592

Closed
mhdawson opened this issue Jan 3, 2017 · 6 comments
Labels
http Issues or PRs related to the http subsystem. smartos Issues and PRs related to the SmartOS platform. test Issues and PRs related to the tests.

Comments

@mhdawson
Member

mhdawson commented Jan 3, 2017

  • Version: head
  • Platform: smartos14-64
  • Subsystem: http

This test timed out in a recent CI run for unrelated changes:
https://ci.nodejs.org/job/node-test-commit-smartos/6112/nodes=smartos14-64/console

not ok 1004 parallel/test-http-res-write-after-end
  ---
  duration_ms: 60.117
  severity: fail
  stack: |-
    timeout
  ...
@mhdawson
Member Author

mhdawson commented Jan 3, 2017

@mscdex mscdex added smartos Issues and PRs related to the SmartOS platform. http Issues or PRs related to the http subsystem. test Issues and PRs related to the tests. labels Jan 3, 2017
@geek
Member

geek commented Jan 3, 2017

I am unable to reproduce this failure on a local SmartOS instance.

There are also lots of node processes that exist before the build starts:

+ pgrep node
864
517
1431
975
1210
1541
158
1763
289
633
401
1657
1092
743
1321

Do you think we should start killing all node procs before we do these runs?
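As a rough illustration of the check suggested above, the sketch below lists leftover `node` processes before a run by shelling out to `pgrep`, as in the log. The helper names `parsePids` and `findStrayProcesses` are hypothetical and not part of the Node.js build tooling; whether killing the processes is safe depends on the CI host.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Parse the newline-separated output of `pgrep` into a list of numeric PIDs.
function parsePids(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => /^\d+$/.test(line))
    .map(Number);
}

// Hypothetical helper: list PIDs of processes named `name`, excluding the
// current process. pgrep exits with status 1 when nothing matches, so an
// empty list is a normal result.
function findStrayProcesses(name) {
  const result = spawnSync('pgrep', [name], { encoding: 'utf8' });
  if (result.error) throw result.error; // e.g. pgrep is not installed
  return parsePids(result.stdout).filter((pid) => pid !== process.pid);
}

module.exports = { parsePids, findStrayProcesses };
```

A pre-build step could call `findStrayProcesses('node')` and, on a dedicated CI host only, pass each PID to `process.kill(pid, 'SIGKILL')`.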

@misterdjules

To paraphrase what I wrote in #11026:

So far, I haven't been able to reproduce the problem described by any of the issues listed above.

To get more information and investigate future spurious failures, I submitted a PR that sends SIGABRT instead of SIGTERM to test processes that time out. This will allow us to inspect the core files generated from these processes with tools such as llnode and mdb_v8, and will potentially help us root cause these issues.

In the meantime, I'll continue trying to reproduce and investigate those issues, and I'll keep you posted.
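The SIGABRT-on-timeout idea described above can be sketched with Node's own `child_process` API. This is only an illustration, not the change from the PR: the function name `runWithAbortOnTimeout` is hypothetical, and whether a core file is actually written depends on the host's core-dump settings.

```javascript
'use strict';
const { spawnSync } = require('child_process');

// Hypothetical helper: run a command with a time limit. On timeout the child
// receives SIGABRT rather than the default SIGTERM, so the OS can write a
// core file (subject to host settings such as `ulimit -c`).
function runWithAbortOnTimeout(command, args, timeoutMs) {
  const result = spawnSync(command, args, {
    timeout: timeoutMs,
    killSignal: 'SIGABRT',
    encoding: 'utf8',
  });
  // spawnSync reports a timeout through result.error with code ETIMEDOUT.
  const timedOut = Boolean(result.error && result.error.code === 'ETIMEDOUT');
  return { timedOut, status: result.status, signal: result.signal };
}

module.exports = { runWithAbortOnTimeout };
```

For example, `runWithAbortOnTimeout(process.execPath, ['test/parallel/test-http-res-write-after-end.js'], 60000)` would abort the test after 60 seconds, leaving any resulting core file for inspection with llnode or mdb_v8.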

@misterdjules

Again, pasting what I wrote in #11026 to make sure anyone coming here is aware of the current progress on this issue.

#11086 was merged and nodejs/build#613 was created to make tests that time out generate a core file that could be inspected to help root cause these failures.

Thus, I'd suggest that we mark this test (and the other flaky ones mentioned in #11026 (comment)) as flaky on SmartOS.

Then we should make sure that when a build is marked "unstable" on SmartOS, we grab the core files that are generated and upload them somewhere they won't be cleaned up, so that they stay available for further investigation. We can use Joyent's Manta for that, as it is part of the resources donated by Joyent to the project.

How does that sound?

@mhdawson
Member Author

mhdawson commented Feb 3, 2017

Sounds like a good plan to me.

@Trott
Member

Trott commented Jul 16, 2017

Haven't seen this in months. Closing. Feel free to re-open or comment if it should stay open.

@Trott Trott closed this as completed Jul 16, 2017