Why node.js server takes longer for each new set of requests #1179

Closed
ORESoftware opened this issue Mar 22, 2018 · 19 comments

@ORESoftware

This regards Node.js versions 9+

I have a simple Node.js Express server running, and I issued this command history:

round 1:

$ node server.js

$ node /Users/alexamil/perf.test.js
passed in these millis: 3405
$ node /Users/alexamil/perf.test.js
passed in these millis: 4010
$ node /Users/alexamil/perf.test.js
passed in these millis: 5290
$ node /Users/alexamil/perf.test.js
passed in these millis: 21170

round 2:

$ node server.js

$ node /Users/alexamil/perf.test.js
passed in these millis: 4407
$ node /Users/alexamil/perf.test.js
passed in these millis: 4865
$ node /Users/alexamil/perf.test.js
passed in these millis: 4874
$ node /Users/alexamil/perf.test.js
passed in these millis: 20808

I am just curious what the explanation is for the server taking longer each time, even after it has finished all requests. For example, I run perf.test.js and it completes all of its requests. Then I manually run the same command again, about 1 or 2 seconds after the previous run finished, yet the total time still increases. There is also a big jump on the fourth run.

The express server is like so:

'use strict';

import {ErrorRequestHandler} from "express";
import * as express from 'express';
import haven from 'domain-haven';

const app = express();

app.use(function (req, res, next) {
  // parse the haven payload from the query string (req.havenData presumably relies
  // on a Request augmentation that isn't shown here)
  req.havenData = JSON.parse(req.query.haven || '{}');
  next();
});

if (process.env.use_haven === 'yes') {
  console.log('using haven');
  app.use(haven());
}


app.use(function(req,res,next){
  res.json({success: true});
});

app.use(function (err, req, res, next) {
  // final error handler: log the error, wait 10 ms, then respond if nothing has been sent yet
  err && console.error(err.message || err);
  if (!res.headersSent) {
    setTimeout(function () {
      if (!res.headersSent) {
        res.json({error: 'hit final error middleware', value: err && err.stack || err || null})
      }
    }, 10);
  }
});

app.listen(6969, function () {
  console.log('app is listening.');
});

The test code is like so:

import http = require('http');
import async = require('async');
import request = require('request');
import {HavenData} from "./simple-server";
import assert = require('assert');
import util = require('util');

// 5000 tasks, each issuing a single GET request against the local server
const tasks = Array.apply(null, Array(5000)).map(function () {
  return function (cb: Function) {
    
    const opts = {
      json: true,
      qs: {haven: null as any}
    };
    
    request.get('http://localhost:6969', opts, function (err, resp, v) {
      
      if (err) {
        return cb(err);
      }
      
      cb(null);
    });
    
  }
});

const now = Date.now();

// run the tasks with at most 15 requests in flight at a time
async.parallelLimit(tasks, 15, function (err) {
  if (err) throw err;
  console.log('passed in these millis:', Date.now() - now);
});
@gireeshpunathil
Member

Reproducible with this small piece of code on macOS. I tested on Linux, AIX, and Windows and did not observe this behavior. @ORESoftware - can you confirm you are on a Mac, to make sure we are attacking the same issue?

const h = require('http');
h.createServer((q, s) => {
  s.end('g')
}).listen(6969, () => {
  const now = Date.now()
  let count = 0
  // issue 5000 GET requests back-to-back, then print the elapsed time
  function foo() {
    if(count++ === 5000) {
      process.stdout.write(`time in millis: ${Date.now() - now}\n`)
      process.exit(0)
    }
    h.get('http://localhost:6969', foo)
  }
  foo()
})

The remarkable difference in the profiles is:

good (total ticks: 7150):

   ticks parent  name
   3698   51.7%  T ___channel_get_opt

bad (total ticks: 10537):

   ticks parent  name
   7193   68.3%  T ___channel_get_opt
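
Output in this shape comes from V8's built-in sampling profiler. A minimal sketch of the usual invocation, assuming the repro above is saved as 1179.js (the exact commands behind the numbers above aren't shown in the thread):

$ node --prof 1179.js                      # writes an isolate-*.log tick log in the working directory
$ node --prof-process isolate-0x*-v8.log   # turns the log into the ticks/percent summary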

#find /usr/include/ | xargs grep "___channel_get_opt" 2> /dev/null
/usr/include//sys/syscall.h:#define SYS___channel_get_opt 513

Looks like this is an opaque Darwin kernel routine; no documentation is available.

None of the TCP system calls seem to carry the burden or reflect the delay - probably because connect and accept are non-blocking, they don't spend visible time while ___channel_get_opt is taking up CPU.
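
To double-check where the kernel time goes, a per-syscall count can be collected with dtruss, macOS's DTrace wrapper - a sketch, assuming SIP permits tracing and the repro is saved as 1179.js:

$ sudo dtruss -c -f node 1179.js 2> syscall-counts.txt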

The latency increases only when the tests are run back-to-back (lots of sockets pile up in netstat -na | wc -l). If you wait for them to subside, the issue is not present - hinting at some optimization at the I/O channel level.
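
One way to watch that pileup between runs is to count sockets lingering in TIME_WAIT (a sketch, assuming the BSD netstat that ships with macOS):

$ netstat -na | grep -c TIME_WAIT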

I don't have an inference or recommendation. /cc @nodejs/performance @bnoordhuis

@bnoordhuis
Member

Performance benchmarking on macOS is a waste of time. If it reproduces on Linux or Windows, I'd be interested to see it.

@gibfahn
Member

gibfahn commented Mar 23, 2018

Performance benchmarking on macOS is a waste of time.

Why (if you'll excuse my curiosity)?

@bnoordhuis
Member

Apple's kernel team spends very little effort on making things fast. That's at least the impression I get from reading the xnu sources.

@shellberg

Unfortunate, but this has also been reported before.

@ORESoftware
Author

ORESoftware commented Mar 23, 2018

Yeah, I am on macOS... I can try it on my Ubuntu machine and see if I get the same thing, but I haven't done that yet. The behavior seems consistent, though. It seemed strange to me that after all requests were done, the server would be sitting at 0 requests, and then a second later it would get a new set of 5000 requests (no more than 15 at a time) and take longer.

@gireeshpunathil
Member

@ORESoftware - were you able to make any observations in Linux?

@ORESoftware
Author

Not yet, but I will try it in a week.

@gireeshpunathil
Member

@ORESoftware - let me know if anything is outstanding on this.

@ORESoftware
Author

Let me try this on my Linux machine right now.

@gireeshpunathil
Member

ping @ORESoftware - were you able to validate in Linux?

@ORESoftware
Author

Oh yes, give me another day; I will test this tomorrow. I still haven't gotten to it, but I will have time tomorrow.

@kingnebby

I don't know if it helps, but I can reproduce the T ___channel_get_opt issue with this.

// spin an empty 0 ms interval for ~100 iterations, then stop
let intervalLimit = 100
let runCount = 0
const intervalId = setInterval(() => {
  if (runCount++ > intervalLimit) {
    clearInterval(intervalId)
    return
  }
}, 0)

Profile Output

[Summary]:
   ticks  total  nonlib   name
      4    2.4%    2.4%  JavaScript
    157   92.4%   95.7%  C++
    ...

[Bottom up (heavy) profile]:
  Note: percentage shows a share of a particular caller in the total
  amount of its parent calls.
  Callers occupying less than 1.0% are not shown.

   ticks parent  name
     89   52.4%  T ___channel_get_opt
  • Node.js 10.5.0
  • Darwin Kernel Version 16.7.0

@kingnebby

kingnebby commented Jun 23, 2018

OK, I'm pretty convinced this is just setInterval(). I painstakingly went through my dependencies and excluded modules/services that contained setInterval(). This really just came down to restify (the smallest part of our app) and some custom functions that were scanning the file system or doing heartbeat kinds of stuff.

Anyhow, in a performance run with about 2000 ticks, the two outputs are as follows:

// With Intervals
    538   20.9%   21.1%  T ___channel_get_opt

// Without Intervals
     58    3.9%  T ___channel_get_opt
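
For anyone who wants to repeat the comparison, a minimal sketch of the A/B setup described above (hypothetical file names, not the commenter's actual app) - profile each variant separately and compare the ___channel_get_opt share:

// with-interval.js - keeps an empty 0 ms interval firing for roughly one second
const id = setInterval(() => {}, 0);
setTimeout(() => clearInterval(id), 1000);

// without-interval.js - idles for the same wall-clock time with a single timer
setTimeout(() => {}, 1000);

Running each file under node --prof and processing the resulting logs should show whether the interval-driven variant accounts for the extra ___channel_get_opt ticks.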

By the way, I ran the same script above on Ubuntu 14 and got a similar picture, but with syscall instead of T ___channel_get_opt.

No idea what a 'solution' would look like, if any exists.

@gireeshpunathil
Member

@kingnebby - thanks for pitching in. However, I am not convinced by your theory around setInterval. Are you able to run your code and see degradation in consecutive runs? If so, by how much? Can you please provide that output?

Mine is here (macOS):

#time node 1179.js

real	0m0.237s
user	0m0.079s
sys	0m0.022s
#time node 1179.js

real	0m0.226s
user	0m0.081s
sys	0m0.025s
#time node 1179.js

real	0m0.229s
user	0m0.075s
sys	0m0.021s
#time node 1179.js

real	0m0.231s
user	0m0.076s
sys	0m0.024s
#

Though this is a little slower than Linux overall, the values are consistent across runs, and there is no cascading effect of the previous run on the next run - which was the originally reported problem.

@kingnebby

kingnebby commented Jun 23, 2018

@gireeshpunathil Indeed - I don't know how it would impact the original issue. I stumbled across ___channel_get_opt in my own performance tests and was just trying to understand what it is and whether or not it affects our respective issues.

If it doesn't help solve the original issue then feel free to ignore it :)

@ORESoftware
Author

Working on finally replicating the problem on Linux... give me 10 minutes.

@ORESoftware
Author

ORESoftware commented Jun 23, 2018

Finally got to this; sorry it took so long.
The problem only seems to occur on macOS, not Ubuntu.

I created a project that can run this test:
https://github.com/ORESoftware/node-perf

You can run:

npm install -g @oresoftware/node.perf

then:

$ node_perf_server
$ node_perf_test
$ node_perf_test
$ node_perf_test
 # ...
$ node_perf_test

etc.

On Linux, each time the test is run, the time required goes down slightly. On macOS, it goes up dramatically over successive runs. Weird.

@andresdiaz29

andresdiaz29 commented Oct 6, 2024

Hi everyone, I am facing the same issue on Windows with Node 20+. When I call an endpoint that runs a long loop a few times, the execution time roughly triples after the third request.
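
A minimal sketch of the scenario being described (hypothetical route and loop bound - the commenter's actual endpoint isn't shown), in case anyone wants to try reproducing it:

const express = require('express');
const app = express();

app.get('/work', (req, res) => {
  // CPU-bound loop standing in for the "long loop" in the report
  const start = Date.now();
  let acc = 0;
  for (let i = 0; i < 1e9; i++) acc += i;
  res.json({ acc, ms: Date.now() - start });
});

app.listen(3000, () => console.log('listening on 3000'));

Hitting /work several times in a row and comparing the reported ms values would show whether later requests really take longer.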

