Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP Server close is taking up to 5 seconds in node 18 #50188

Open
DanielRamosAcosta opened this issue Oct 14, 2023 · 21 comments
Open

HTTP Server close is taking up to 5 seconds in node 18 #50188

DanielRamosAcosta opened this issue Oct 14, 2023 · 21 comments
Labels
http Issues or PRs related to the http subsystem.

Comments

@DanielRamosAcosta
Copy link

DanielRamosAcosta commented Oct 14, 2023

Version

v18.18.2

Platform

Darwin MacBook-Pro-de-Daniel.local 23.0.0 Darwin Kernel Version 23.0.0: Fri Sep 15 14:43:05 PDT 2023; root:xnu-10002.1.13~1/RELEASE_ARM64_T6020 arm64

Subsystem

http

What steps will reproduce the bug?

  1. Create an http server
  2. Start listening
  3. Take at least one request
  4. Try to close the server
  5. It takes exactly 5 seconds

How often does it reproduce? Is there a required condition?

I have only reproduced the bug in node 18, but is deterministic, it happens every time

What is the expected behavior? Why is that the expected behavior?

In node 19 and 20 it's only taking 0.5 seconds. Thats a normal time that I would expect.

What do you see instead?

I see 5 seconds of delay in order to close

Additional information

I've reproduced the error in this repository with a bare minimum with Github Actions: DanielRamosAcosta/nodejs-close-server-bug

Here is the CI report:

This is affecting to a library I'm maintaining similar to supertest. If you see the CI there, Node 18 tests are taking very long due to the servers being closed.

@H4ad
Copy link
Member

H4ad commented Oct 14, 2023

Started on 18.18.1, the 18.18.0 does not have this weird delay.

EDIT: it has the delay also on 18.18.0

@himself65
Copy link
Member

image

Started on 18.18.1, the 18.18.0 does not have this weird delay.

No, I think all 18 version has such issue

@himself65
Copy link
Member

Related PR: #48383

@himself65
Copy link
Member

The hot fix:

import * as http from "node:http";
import { promisify } from "node:util";

const server = http.createServer(async (req, res) => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.write("Hello world");
    res.end();
});
const listenPromisied = promisify(server.listen.bind(server));
const closePromisied = promisify(server.close.bind(server));

await listenPromisied(0, "127.0.0.1");
const address = server.address();
await fetch(`http://${address.address}:${address.port}`);
console.time("server close");
+ server.closeAllConnections();
await closePromisied();
console.timeEnd("server close");

@himself65 himself65 added the http Issues or PRs related to the http subsystem. label Oct 15, 2023
@H4ad
Copy link
Member

H4ad commented Oct 15, 2023

Should we bisect or closeAllConnections is expected to be called?

@himself65
Copy link
Member

The latest behavior is to close all sockets, I think there are some backport issues on 18. Let me see if I can do the backport

@H4ad
Copy link
Member

H4ad commented Oct 15, 2023

The 19.0.0 looks like the first release without this delay.

I will try bisect to see if I can find the commit that fixes the delay and then we can backport.

The #48383 was released on 20.4.0, and this issue started on 18.0.0 (or even earlier, I didn't try rewrite the fetch call)

@H4ad
Copy link
Member

H4ad commented Oct 15, 2023

@himself65 bisect will not help in this case:

The merge base 19064bec341185a8c15fc438cfcf0df8633a179e is bad.
This means the bug has been fixed between 19064bec341185a8c15fc438cfcf0df8633a179e and [cc993fb2760d01457955f5b9ff787d559ed1c34e].

From what I understand, 18.x and 19.x have different git histories, so there's no way to find the bad commit using bisect.
At least, I don't know a way to do it, I'm open to suggestions.

@himself65
Copy link
Member

I tried nodejs 16.x. now I think there might have a regression on nodejs 16 -> 18

import * as http from "node:http";
import { promisify } from "node:util";
import fetch from 'node-fetch'

const server = http.createServer(async (req, res) => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.write("Hello world");
    res.end();
});
const listenPromisied = promisify(server.listen.bind(server));
const closePromisied = promisify(server.close.bind(server));

await listenPromisied(0, "127.0.0.1");
const address = server.address();
await fetch(`http://${address.address}:${address.port}`);
console.time("server close");
await closePromisied();
console.timeEnd("server close");
➜  nodejs git:(main) ✗ node index.mjs
server close: 0.158ms
➜  nodejs git:(main) ✗ node -v
v16.20.2

@himself65 himself65 added the regression Issues related to regressions. label Oct 15, 2023
@H4ad
Copy link
Member

H4ad commented Oct 15, 2023

It's a regression from 17.x -> 18.x

h4ad:node-copy-4/ (main✗) $ node test-close-2.mjs                                                                                                                                                                                                                    
server close: 0.108ms
h4ad:node-copy-4/ (main✗) $ node -v
v17.9.1

@H4ad
Copy link
Member

H4ad commented Oct 15, 2023

Based on https://github.com/nodejs/node/pull/49091/files#diff-d692ac4524379ec6a1201165e8ff8d3267c8130e07014e8221ebf7e6f80c6641R1560-R1567, and this little change:

const server = http.createServer({ keepAliveTimeout: 200 }, async (req, res) => {
h4ad:node-copy-4/ (main✗) $ node test-close-2.mjs
server close: 0.255ms
h4ad:node-copy-4/ (main✗) $ node -v
v18.18.1

@himself65 I think you are right, the #48383 should fix the issue, I thought it landed on 18.18.0 but it did not, sorry for the confusion.

Let me know if you will do the backport, if not, I can do that.

@H4ad
Copy link
Member

H4ad commented Oct 15, 2023

I built 18.18.0 with fix #48383, it solves the delay.

@himself65
Copy link
Member

Based on https://github.com/nodejs/node/pull/49091/files#diff-d692ac4524379ec6a1201165e8ff8d3267c8130e07014e8221ebf7e6f80c6641R1560-R1567, and this little change:

const server = http.createServer({ keepAliveTimeout: 200 }, async (req, res) => {
h4ad:node-copy-4/ (main✗) $ node test-close-2.mjs
server close: 0.255ms
h4ad:node-copy-4/ (main✗) $ node -v
v18.18.1

@himself65 I think you are right, the #48383 should fix the issue, I thought it landed on 18.18.0 but it did not, sorry for the confusion.

Let me know if you will do the backport, if not, I can do that.

I might don't have time on backport you can try that.

I read some CIGTM code, i have no idea on this #48383 (comment)

@DanielRamosAcosta
Copy link
Author

DanielRamosAcosta commented Oct 15, 2023

I think I can bring more insights for the problem. I've tried dierent ways of fetching to the server in different node versions. Basically I'm trying with the native fetch (native), with node-fetch (library) and with an old plain request from the http package (request). Here are the results (you can also find them in Github Actions):

  • node 17.9.1 native: (nothing)
  • node 17.9.1 library: 0.235ms
  • node 17.9.1 request: 0.235ms
  • node 18.0.0 native: 3.996s
  • node 18.0.0 library: 0.318ms
  • node 18.0.0 request: 0.297ms
  • node 18.18.0 native: 4.958s
  • node 18.18.0 library: 0.284ms
  • node 18.18.0 request: 0.329ms
  • node 18.18.1 native: 4.965s
  • node 18.18.1 library: 0.325ms
  • node 18.18.1 request: 0.29ms
  • node 18.18.2 native: 4.989s
  • node 18.18.2 library: 0.316ms
  • node 18.18.2 request: 0.285ms
  • node 19.0.0 native: 0.5ms
  • node 19.0.0 library: 0.457ms
  • node 19.0.0 request: 0.557ms
  • node 20.0.0 native: 0.46ms
  • node 20.0.0 library: 0.558ms
  • node 20.0.0 request: 0.547ms
  • node 20.8.1 native: 0.497ms
  • node 20.8.1 library: 0.454ms
  • node 20.8.1 request: 0.455ms

As you can see, there seems to be a problem with fetch in node 18.x, but solved since 19

@himself65
Copy link
Member

I manually implement a function that won't close the connections. Now It's back to 5s in node.js 19/20

import * as http from "node:http";
import { promisify } from "node:util";
import * as net from "node:net";

const server = http.createServer(async (req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.write("Hello world");
  res.end();
});

const listenPromisied = promisify(server.listen.bind(server));
const closePromisied = promisify(net.Server.prototype.close.bind(server));

await listenPromisied(0, "127.0.0.1");
const address = server.address();
await fetch(`http://${address.address}:${address.port}`);
console.time("server close");
await closePromisied();
console.timeEnd("server close");

@himself65
Copy link
Member

Now I believe this is a kind of bug in undici

@himself65 himself65 added fetch Issues and PRs related to the Fetch API and removed http Issues or PRs related to the http subsystem. regression Issues related to regressions. labels Oct 15, 2023
@himself65
Copy link
Member

I think this is not a regression. From undici 4.4.1 (first version that has fetch), the delay is always there

@himself65
Copy link
Member

Hotfix

res.writeHead(200, { 'Content-Type': 'text/plain', 'Connection': 'close' })

@himself65
Copy link
Member

Thanks to nodejs/undici#2348 (comment)

The real reason is keep-alive is by default in somewhere

@himself65 himself65 added http Issues or PRs related to the http subsystem. and removed fetch Issues and PRs related to the Fetch API labels Oct 16, 2023
@himself65
Copy link
Member

himself65 commented Oct 16, 2023

this.shouldKeepAlive = true;

globalAgent: new Agent({ keepAlive: true, scheduling: 'lifo', timeout: 5000 }),

@benjamingr
Copy link
Member

The code fetches and then does not read or dispose of the response. This is expected behavior I think? You only waited for headers, you might still read the body of the response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem.
Projects
None yet
Development

No branches or pull requests

4 participants