Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

setTimeout drifts after busy wait #8105

Closed
bjouhier opened this issue Aug 8, 2014 · 21 comments
Closed

setTimeout drifts after busy wait #8105

bjouhier opened this issue Aug 8, 2014 · 21 comments
Assignees
Labels

Comments

@bjouhier
Copy link

bjouhier commented Aug 8, 2014

if you call setTimeout after a busy wait the delay can be much longer than expected.

I created a gist: https://gist.github.com/bjouhier/c7adb5f54caf9fc86d6e

OSX 10.9.4, node 0.10.27

@micnic
Copy link

micnic commented Aug 8, 2014

Win8, node 0.10.29:

busy:   expected 300, got 300
idle:   expected 100, got 114
busy:   expected 300, got 300
idle:   expected 100, got 104
busy:   expected 300, got 300
idle:   expected 100, got 113
busy:   expected 300, got 300
idle:   expected 100, got 104
DONE: 1650

@loyd
Copy link

loyd commented Aug 8, 2014

Linux, node 0.10.30:

busy:   expected 300, got 300
idle:   expected 100, got 1
busy:   expected 300, got 300
idle:   expected 100, got 100
busy:   expected 300, got 300
idle:   expected 100, got 100
busy:   expected 300, got 300
idle:   expected 100, got 100
DONE: 1511

WIth v0.13.0-pre, the same as gist.

@bjouhier
Copy link
Author

bjouhier commented Aug 8, 2014

Upgraded to 0.10.29 on OSX, to compare with Win8. Same result as 0.10.27:

busy:   expected 300, got 300
idle:   expected 100, got 101
busy:   expected 300, got 300
idle:   expected 100, got 402 SURPRISE!!
busy:   expected 300, got 300
idle:   expected 100, got 400 SURPRISE!!
busy:   expected 300, got 300
idle:   expected 100, got 401 SURPRISE!!
DONE: 2518

@almike
Copy link

almike commented Aug 8, 2014

normal behavior if your cpu is busy.

chrome while cpu is busy:

busy:   expected 300, got 300 VM111:8
idle:   expected 100, got 293 SURPRISE!! VM111:16
busy:   expected 300, got 300 VM111:8
idle:   expected 100, got 699 SURPRISE!! VM111:16
busy:   expected 300, got 300 VM111:8
idle:   expected 100, got 700 SURPRISE!! VM111:16
busy:   expected 300, got 300 VM111:8
idle:   expected 100, got 700 SURPRISE!! VM111:16
DONE: 3593 

what is not normal is the first idle being below 100ms, either 0 or 1 on my computer.

ubuntu/10.30:

busy:   expected 300, got 300
idle:   expected 100, got 0
busy:   expected 300, got 300
idle:   expected 100, got 100
busy:   expected 300, got 300
idle:   expected 100, got 100
busy:   expected 300, got 300
idle:   expected 100, got 100
DONE: 1506

@bjouhier
Copy link
Author

bjouhier commented Aug 8, 2014

@almike
Problem is not the computer being busy. CPU is idle during the idleWait call. Yet it takes 400 ms instead of 100 ms.

Also I can control the drift precisely: if I change the busy wait to X ms, the abnormal idleWait calls take X + 100 ms. Results are reproducible with a few ms margin if the computer is idle otherwise.

@almike
Copy link

almike commented Aug 8, 2014

what happens if you run it for a longer time?

function combined(a) {
    busyWait(500, function() {
        idleWait(200,function(){
            a < 50 && combined(++a);
        });
    }); 
}
combined(0);

also not sure if that is normal:

console.log(Date.now());
setTimeout(function(){
    console.log(Date.now());
    setTimeout(function(){
        console.log(Date.now());
    },100);
},100);

gives:

1407491392450
1407491392513
1407491392614

the first timeout is 63ms but it should be equal or above 100

@bjouhier
Copy link
Author

bjouhier commented Aug 8, 2014

Result:

busy:   expected 500, got 500
idle:   expected 200, got 203
busy:   expected 500, got 500
idle:   expected 200, got 702 SURPRISE!!
busy:   expected 500, got 500
idle:   expected 200, got 701 SURPRISE!!
busy:   expected 500, got 500
idle:   expected 200, got 701 SURPRISE!!
busy:   expected 500, got 500
idle:   expected 200, got 701 SURPRISE!!
busy:   expected 500, got 500
...

@bjouhier
Copy link
Author

bjouhier commented Aug 8, 2014

Also, if I use process.nextTick(cb) instead of cb() to chain at the end of busyWait then the problem goes away.

@almike
Copy link

almike commented Aug 8, 2014

ah so your idle time is actually busy time + idle time on your computer.
maybe while it is looping date is not updated https://codereview.chromium.org/173793002

@trevnorris
Copy link

Can you update the test to use process.hrtime() for comparison with
Date.now()?

@bjouhier
Copy link
Author

bjouhier commented Aug 8, 2014

If date was not updated during busy loop idleWait would elapse earlier rather than later. This is the bug that we observe on Linux, in the first iteration. On OSX it looks like the first iteration is properly compensated but that the compensation is applied twice in the following iterations.

The Chromium link points to another problem, related to timezone offsets.

@bjouhier
Copy link
Author

bjouhier commented Aug 8, 2014

@trevnorris
I updated the gist to measure the elapsed times with process.hrtime(). I get the same results.

The problem does not seem to be with Date.now() but rather with setTimeout. If I use a long busyWait I can see that the second and subsequent idleTimeoutcalls take much longer than the first one.

@almike
Copy link

almike commented Aug 8, 2014

but later if it happens in the loop, I thought it could be some caching for optimization... but you are right I looked at that chromium diff the changes do not match the title.

@almike
Copy link

almike commented Aug 8, 2014

ok, tried with v0.10.26. still on ubuntu:

busy:   expected 300, got 300
idle:   expected 100, got 100
busy:   expected 300, got 300
idle:   expected 100, got 400 SURPRISE!!
busy:   expected 300, got 300
idle:   expected 100, got 400 SURPRISE!!
busy:   expected 300, got 300
...

seems ok @ v0.10.30, except for the first timeout which is triggered too fast.

@vkurchatkin
Copy link

My results:

0.10.28:

busy:   expected 300, got 300
idle:   expected 100, got 102
busy:   expected 300, got 300
idle:   expected 100, got 401 SURPRISE!!
busy:   expected 300, got 300
idle:   expected 100, got 401 SURPRISE!!
busy:   expected 300, got 300

0.10.30:

busy:   expected 300, got 300
idle:   expected 100, got 0
busy:   expected 300, got 300
idle:   expected 100, got 101
busy:   expected 300, got 300
idle:   expected 100, got 100

0.11.13

busy:   expected 300, got 300
idle:   expected 100, got 100
busy:   expected 300, got 300
idle:   expected 100, got 403 SURPRISE!!
busy:   expected 300, got 300
idle:   expected 100, got 402 SURPRISE!!
busy:   expected 300, got 300

@trevnorris
Copy link

For reference I'm leaving this modified script. Will revisit later:

"use strict";

function millis(hr) {
  return hr[0] * 1e3 + ((hr[1] / 1e6) >>> 0);
}

function busyWait(ms, busyWaitCb) {
  process._rawDebug('>>> busyWait begin');
  var hr0 = process.hrtime();
  while (ms > millis(process.hrtime(hr0)));
  process._rawDebug('>>> busyWait end');
  busyWaitCb();
}

function idleWait(ms, idleWaitCb) {
  process._rawDebug('>>> idleWait begin');
  setTimeout(idleSetTimeout, ms, idleWaitCb);
  setImmediate(runSetImmediate);
  process._rawDebug('>>> idleWait end');
}

function idleSetTimeout(idleWaitCb) {
  process._rawDebug('>>> idleSetTimeout');
  idleWaitCb();
}

function runSetImmediate() {
  process._rawDebug('>>> runSetImmediate');
}

(function combined(a) {
  busyWait(300, function busyWaitCb() {
    idleWait(100, function idleWaitCb(){
      a < 50 && combined(++a);
    });
  });
}(0));

@loyd
Copy link

loyd commented Aug 13, 2014

@natbro
Copy link

natbro commented Dec 5, 2014

I'm seeing this behavior in another use-case, which smells the same to me because the drifts are exactly the same as the sleep/work happening inside the timeout function. Mine is perhaps a bit simpler to understand than the ping-ponging/nesting of the original bug/gist. mine is: https://gist.github.com/natbro/bf872ac473a1bba04008.

as you can see i'm just chaining setTimeout()'s and putting sleep (or any activity) within the callback. i see drift of exactly the amount of sleep/work i did within the callback, i.e. if i do 100ms of work in the callback and at the callback's tail use setTimeout() to run the callback again 200ms later it actually won't fire until 300ms later. although i thought this was the busy-sleep, it reproduced with other long-running computations, wasn't GC and wasn't other CPU activity. eventually on a whim i cleared the current timeout (the one i was within) before chaining to the next callback and the drift disappeared. I am reading through the node.js sources but haven't found anything suspicious in the timer.js/Timer class yet, but that's where I suspect the problem is given this is happening platform-agnostic.

@trevnorris trevnorris self-assigned this Jan 22, 2015
@misterdjules
Copy link

Fixed by d38e865. Released in v0.10.39 and v0.12.5.

@natbro
Copy link

natbro commented Aug 8, 2015

nice fix - thanks & sorry i didn't spot it.

@misterdjules
Copy link

@natbro No worries, and thank you for investigating and reporting your findings 👍

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

8 participants