Investigate flaky parallel/test-tls-sni-option #26910
I've been seeing this locally a bunch |
The culprit seems to be this commit: 42dbaed. Easily reproducible on my machine with:
cc: @sam-github |
@lpinca What kind of machine do you have? I can't repro with your test.py invocation, or by running:
three times in parallel, on Linux x64. That said, this test looks like it depends on the order in which the handshakes complete on the client relative to the server. I am prepping a PR. |
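One generic way to make a test independent of which side finishes first is to wait for both completion events before asserting anything. The following is a minimal sketch of that idea only, not the code from the PR: it uses plain `EventEmitter` objects as stand-ins for the client and server sockets, and the `'done'` event name and the `waitForBoth` helper are hypothetical.

```javascript
'use strict';
const { EventEmitter, once } = require('events');

// Hypothetical helper: resolves once BOTH sides have emitted 'done',
// regardless of which side emits first.
async function waitForBoth(clientSide, serverSide) {
  // events.once() returns a promise that resolves with the array of
  // arguments passed to the event, so take the first argument of each.
  const [[clientResult], [serverResult]] = await Promise.all([
    once(clientSide, 'done'),
    once(serverSide, 'done'),
  ]);
  return { client: clientResult, server: serverResult };
}
```

Because both listeners are attached before either event fires, the result is the same whether the stand-in server or the stand-in client completes first.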
I read the top in more detail. It's on OS X, so I'll try on that. |
Yes, I can't reproduce on Linux, only on macOS. |
The following patch fixes it, but I'm not sure why.
diff --git a/test/parallel/test-tls-sni-option.js b/test/parallel/test-tls-sni-option.js
index 3a6a231b47..6ccf631169 100644
--- a/test/parallel/test-tls-sni-option.js
+++ b/test/parallel/test-tls-sni-option.js
@@ -115,7 +115,7 @@ let clientError;
 const server = tls.createServer(serverOptions, function(c) {
   serverResults.push({ sni: c.servername, authorized: c.authorized });
-  c.end();
+  // c.end();
 });
 server.on('tlsClientError', function(err) {
@@ -137,7 +137,8 @@ function startTest() {
         client.authorizationError &&
         (client.authorizationError === 'ERR_TLS_CERT_ALTNAME_INVALID'));
-      next();
+      client.on('close', next);
+      client.end();
     });
     client.on('error', function(err) { |
PR-URL: nodejs#27225 Refs: nodejs#26910 Refs: nodejs#27219 Refs: nodejs#26938 Refs: nodejs#23089 Reviewed-By: Richard Lau <riclau@uk.ibm.com> Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de> Reviewed-By: Luigi Pinca <luigipinca@gmail.com> Reviewed-By: Yongsheng Zhang <zyszys98@gmail.com>
Without injecting some tracing into the TLS code, I don't know exactly why that is happening, but it's not surprising. TLS 1.3 continues to exchange some packets after the handshake, and the handshake completes with different timings, so it doesn't surprise me that some extra messages arrive after the socket is closed, or that this is racy. With your change, the close goes from the client to the server, then back to the client, which emits close, and then it's done, so the round trip occurs. I fixed bugs like this in other tests, but this one is only a bit racy, so I never noticed it. |
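The close round trip described above can be sketched as follows. This is an illustration under stated assumptions, not the test itself: it uses plain `net` sockets instead of `tls` so that it runs without certificates, and the `closeRoundTrip` helper and the event labels are hypothetical. The server never ends the connection on its own; the client calls `end()` and only finishes once its `'close'` event fires.

```javascript
'use strict';
const net = require('net');

// Hypothetical helper: resolves with the order in which events fired.
function closeRoundTrip() {
  return new Promise((resolve, reject) => {
    const events = [];
    const server = net.createServer((c) => {
      // The server does not call c.end() itself. net sockets default to
      // allowHalfOpen: false, so the server side ends its writable half
      // automatically once it has read the client's FIN.
      c.on('end', () => events.push('server end'));
      c.resume(); // consume incoming data so that 'end' can fire
    });
    server.on('error', reject);
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      const client = net.connect(port, '127.0.0.1', () => {
        events.push('client connect');
        // Finish only after the close has gone client -> server -> client.
        client.on('close', () => {
          events.push('client close');
          server.close(() => resolve(events));
        });
        client.resume(); // consume incoming data so EOF is observed
        client.end();    // graceful shutdown initiated by the client
      });
      client.on('error', reject);
    });
  });
}
```

Calling `closeRoundTrip().then(console.log)` prints the recorded event order. With `tls` sockets the same pattern (`client.on('close', next); client.end();`) is what the patch in this thread uses.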
Yes, but isn't it strange that the race only occurs (or is exacerbated) on macOS? |
Systems are different. For example, localhost TCP on Linux doesn't actually run the full TCP protocol: it knows packet loss is impossible because it's all in the local host's memory. Also, after a I could reintroduce my packet tracing to see exactly what packet is being sent, but that will take a while to PR, and I'm not sure what the API should look like. I could use Wireshark, but since the link is encrypted and we lack a way to dump the master secret, I won't see the actual packet contents. I've been meaning to PR an improvement to that, too. Both are open feature requests. Since I made a number of fixes like this, but to tests that failed more like 1/3 of the time, I'm OK with just doing a graceful TLS end. If we want to wait until TLS debugging features land, that's OK with me, too. |
I'm fine with anything as long as the test is not changed from its original intent. |
The purpose of the test was to check the assertions related to SNI, which aren't affected by how the connection is destroyed. |
I reworked my refactor in #27300 so that it uses the same close sequence as original, and the test appears to be stable. |
On osx1011 (test-macstadium-macos10.11-x64-1):