-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigate flaky test-http-server-request-timeout-keepalive #42741
Comments
I will take care of that tomorrow morning. Out of the box I just have to fine tune timers. @nodejs/build Can I get access to that BSD machine? |
Refs: nodejs#41263 PR-URL: nodejs#42846 Fixes: nodejs#42741 Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Luigi Pinca <luigipinca@gmail.com> Reviewed-By: Mohammed Keyvanzadeh <mohammadkeyvanzade94@gmail.com> Reviewed-By: Darshan Sen <raisinten@gmail.com> Reviewed-By: Paolo Insogna <paolo@cowtech.it>
Looks like #42846 reduced the error rate by some amount but it's still flaky. We can probably further increase the base timeout and see if that helps, even if the test will become unbearably slow. |
I have a further change in #42812 which should reduce it even more. Anyway given the success rate is above 99% I think we might think about closing this (since retries in CI are normal) and reopen only if there is consistent failure. |
This issue is tracking that the test is flaky, which it still appears to be so I'm reopening. |
It's interesting that it seems to only happen on macOS and FreeBSD. I might run a few stress tests. |
Patch (assuming the issue is timing-related): diff --git a/test/parallel/test-http-server-headers-timeout-keepalive.js b/test/parallel/test-http-server-headers-timeout-keepalive.js
index 05531087ed83..ba82cb0e0c9c 100644
--- a/test/parallel/test-http-server-headers-timeout-keepalive.js
+++ b/test/parallel/test-http-server-headers-timeout-keepalive.js
@@ -24,7 +24,7 @@ function performRequestWithDelay(client, firstDelay, secondDelay, closeAfter) {
}, firstDelay + secondDelay).unref();
}
-const headersTimeout = common.platformTimeout(2000);
+const headersTimeout = common.platformTimeout(5000);
const server = createServer({
headersTimeout,
requestTimeout: 0,
diff --git a/test/parallel/test-http-server-request-timeout-keepalive.js b/test/parallel/test-http-server-request-timeout-keepalive.js
index 2466e1ee7a95..eb6d64b2f2bb 100644
--- a/test/parallel/test-http-server-request-timeout-keepalive.js
+++ b/test/parallel/test-http-server-request-timeout-keepalive.js
@@ -24,7 +24,7 @@ function performRequestWithDelay(client, firstDelay, secondDelay, closeAfter) {
}, firstDelay + secondDelay).unref();
}
-const requestTimeout = common.platformTimeout(2000);
+const requestTimeout = common.platformTimeout(5000);
const server = createServer({
headersTimeout: 0,
requestTimeout, First run (master and with patch):
Second run (master and with patch):
Unless I'm mistaken, increasing the timeout fixes the issues on FreeBSD but somehow makes it worse on macOS? I started a third run with fresh revisions because I find this hard to believe (master and with patch). Update: Third run (master and with patch):
|
Median success rate (refer to patch and separate tables above):
|
@richardlau Thanks, I think one commit closed this by mistake. |
@tniessen This is very strange. I code on MacOS and never got a failure... |
@richardlau Based on #42741 (comment), would you be okay with further increasing the timeout to fix the test on FreeBSD and with marking it as flaky on macOS? Do GitHub actions respect our list of flaky tests? |
I would be okay increasing the timeout and marking macOS flaky although we'd still keep this open for tracking as we do for other tests marked flaky.
They should do: node/.github/workflows/test-macos.yml Line 34 in 68fb0bf
Lines 526 to 530 in 68fb0bf
|
I've no idea if it's the same underlying cause, but I've just spotted on LinuxONE:
|
I think this might have been an error from me, I have:
So being called two times is legit if things are not closed fast enough, |
Should I mark this test and all tests in #42893 (without raising the timeout)? (just to clarify: they pass on most platforms so I don't think we're ignoring a bug) |
@richardlau I opened #42962 to mark the two known problematic tests as flaky on macOS. Increasing the timeout seems to fix the FreeBSD failures.
The strange part is that the error rate seems to increase when the timeout increases (see #42741 (comment)), so I am not sure if macOS being slow is the issue here. |
@tniessen You are right. I actually spotted a bug while working on #42812, see changes in https://github.com/nodejs/node/pull/42812/files#diff-bbcf7a115fe4680f3a7c076a487c89bd4ce935e7d2a726cfcb3f69331705e25eR260.
Both related PR are now in master. Can you please try again with 2 and/or 5 seconds? |
@tniessen Any update on this? Are tests improving? |
It seems the flake is still there: nodejs/reliability#273
Example
|
I just comprehended what had happened here. Checking with I deeply apologize for this embarassing incident. In the future, I'll keep anything like that on a separate GitHub account. @aduh95 thanks for reopening this. |
I got a failure in #44741 (comment) |
Getting some failures again on osx11 ([test-http-server-request-timeouts-mixed](https://ci.nodejs.org/job/node-test-pull-request/59813/, https://ci.nodejs.org/job/node-test-pull-request/59820/) |
Test
test-http-server-request-timeout-keepalive
Platform
FreeBSD, macOS
Console output
Build links
Additional information
Test was recently updated by #41263.
cc @ShogunPanda
The text was updated successfully, but these errors were encountered: