Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timers: assorted cleanup + improvements #18486

Closed

Conversation

apapirovski
Copy link
Member

@apapirovski apapirovski commented Jan 31, 2018

There a few changes in this PR, all to timers, and it's best to review each commit independently. It will make way more sense that way.

  1. Instead of using nextTick to process failed lists, just attempt to process them again from C++ if the process is still alive. This also allows the removal of domain specific code in timers. (The current behaviour is not quite ideal as it means that all lists after the failed one will process on an arbitrary nextTick, even if they're — say — not due to fire for another 2 days...)

  2. Pass in Timer.now() as an argument of kOnTimeout instead of always re-entering C++ to get it. Also don't constantly call Timer.now() from ontimeout, even when it isn't needed. Improves performance on our pooled benchmark by roughly 40%.

  3. Switch to use const where appropriate.

  4. Instead of calling stop + start from the refresh method of Timeout, instead just call start as libuv already calls stop if necessary.

  5. When an interval takes exactly as long to run as its timeout setting, it can sometimes block the event loop if exactly 1ms elapses before start() is called. This commit fixes that.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

timers

@apapirovski apapirovski added c++ Issues and PRs that require attention from people who are familiar with C++. domain Issues and PRs related to the domain subsystem. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. semver-major PRs that contain breaking changes and should be released in the next major version. lib / src Issues and PRs related to general changes in the lib or src directory. labels Jan 31, 2018
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. labels Jan 31, 2018
Copy link
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@Fishrock123 Fishrock123 self-requested a review February 1, 2018 18:14
Copy link
Contributor

@Fishrock123 Fishrock123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some smaller things. I noticed that the commit descriptions are less informative than what's in your PR here, think you could update them?

Speaking of that, it's pretty nuts that removing Timer.now() has that kind of perf impact, however your PR says 40% but you commit says 80%.. which is it? :P

Oh and, could you explain 5. a bit more? I think I have a hint of what's going on there but I suspect it'd be pretty difficult to understand in any detail.

message: 'Timeout Error'
}));

let called = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common.mustCall() doesn't work here? It checks for one and only one call by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can potentially cause an infinite loop since it only checks at exit and this tests the fact that calling ref on unref'd handle doesn't cause the C++ loop to continue forever.

This is related to the owner check in C++. Someone could, in the future, think to optimize that by checking whether the handle is ref or unref, instead of checking for presence of an owner, but that would actually be incorrect and would cause an infinite loop on this test.

@@ -9,9 +9,16 @@ const t = setInterval(() => {
cntr++;
if (cntr === 1) {
common.busyLoop(100);
// ensure that the event loop passes before the second interval
setImmediate(() => assert.strictEqual(cntr, 1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these new callbacks also have common.mustCall()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted not to include since setImmediate is guaranteed to run and if it doesn't, others tests will fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm yes but I'm not sure that's guaranteed to be true for all APIs so we do need to be careful about that path of doing things,

@apapirovski
Copy link
Member Author

Some smaller things. I noticed that the commit descriptions are less informative than what's in your PR here, think you could update them?

Yep, no worries. Will update later today.

Speaking of that, it's pretty nuts that removing Timer.now() has that kind of perf impact, however your PR says 40% but you commit says 80%.. which is it? :P

Not sure. I got 80% on my local machine but I'm getting 40% on benchmark CI but also 10% on a couple of other tests which I didn't see locally. From doing other benchmark runs in the past, I suspect that some of our performance gains can be platform specific (for example, frequency of GC would be one consideration).

In terms of why the big jump, the function itself does very little so constantly re-entering C++ accounts for a huge part of it.

Oh and, could you explain 5. a bit more? I think I have a hint of what's going on there but I suspect it'd be pretty difficult to understand in any detail.

There was a PR recently (e647c5d#diff-0a5d4868b2b9b17cf9e2c11f1bd1311e) that fixed the case where an interval executing longer than its timeout value would block the event loop. It solved that issue for most cases except where the trip from rearm to the code here: https://github.com/apapirovski/node/blob/4746614757b8881523211c40dc200fcc2f459084/lib/timers.js#L237-L245 takes exactly 1ms to complete. Then the timeRemaining = 0 and the interval would execute on the same cycle of the event loop.

@Fishrock123
Copy link
Contributor

Not sure. I got 80% on my local machine but I'm getting 40% on benchmark CI but also 10% on a couple of other tests which I didn't see locally. From doing other benchmark runs in the past, I suspect that some of our performance gains can be platform specific (for example, frequency of GC would be one consideration).

Imo put down something like up to 80% improvement or 40%-80% improvement, or 40%+ improvement.

  1. (...)

Ah yes, thanks.

Instead of using nextTick to process failed lists, just attempt to
process them again from C++ if the process is still alive.

This also allows the removal of domain specific code in timers.

The current behaviour is not quite ideal as it means that all lists
after the failed one will process on an arbitrary nextTick, even if
they're — say — not due to fire for another 2 days...
Pass in Timer.now() as an argument of kOnTimeout instead of always
re-entering C++ to get it. Also don't constantly call Timer.now()
from ontimeout, even when it isn't needed. Improves performance
on our pooled benchmark by upwards of 40%.
Instead of calling stop + start from the refresh method
of Timeout, instead just call start as libuv already
calls stop if necessary.
When an interval takes as long or longer to run as its timeout setting
and the roundtrip from rearm() to its deferal takes exactly 1ms, that
interval can then block the event loop. This is an edge case of another
recently fixed bug (which in itself was an edge case).

Refs: nodejs#15072
@apapirovski
Copy link
Member Author

I've updated the commit messages with a bit more info.

@apapirovski
Copy link
Member Author

Landed in 9b8e1c2, a986158, 71c0d03, d7894f3, and 47a984a

@apapirovski apapirovski closed this Feb 4, 2018
@apapirovski apapirovski deleted the patch-timers-improvements branch February 4, 2018 16:05
apapirovski added a commit that referenced this pull request Feb 4, 2018
Instead of using nextTick to process failed lists, just attempt to
process them again from C++ if the process is still alive.

This also allows the removal of domain specific code in timers.

The current behaviour is not quite ideal as it means that all lists
after the failed one will process on an arbitrary nextTick, even if
they're — say — not due to fire for another 2 days...

PR-URL: #18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
apapirovski added a commit that referenced this pull request Feb 4, 2018
Pass in Timer.now() as an argument of kOnTimeout instead of always
re-entering C++ to get it. Also don't constantly call Timer.now()
from ontimeout, even when it isn't needed. Improves performance
on our pooled benchmark by upwards of 40%.

PR-URL: #18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
apapirovski added a commit that referenced this pull request Feb 4, 2018
PR-URL: #18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
apapirovski added a commit that referenced this pull request Feb 4, 2018
Instead of calling stop + start from the refresh method
of Timeout, instead just call start as libuv already
calls stop if necessary.

PR-URL: #18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
apapirovski added a commit that referenced this pull request Feb 4, 2018
When an interval takes as long or longer to run as its timeout setting
and the roundtrip from rearm() to its deferal takes exactly 1ms, that
interval can then block the event loop. This is an edge case of another
recently fixed bug (which in itself was an edge case).

PR-URL: #18486
Refs: #15072
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
@dyatlov
Copy link
Contributor

dyatlov commented Apr 8, 2018

When it's going to be included in Node release?

MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
Instead of using nextTick to process failed lists, just attempt to
process them again from C++ if the process is still alive.

This also allows the removal of domain specific code in timers.

The current behaviour is not quite ideal as it means that all lists
after the failed one will process on an arbitrary nextTick, even if
they're — say — not due to fire for another 2 days...

PR-URL: nodejs#18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
Pass in Timer.now() as an argument of kOnTimeout instead of always
re-entering C++ to get it. Also don't constantly call Timer.now()
from ontimeout, even when it isn't needed. Improves performance
on our pooled benchmark by upwards of 40%.

PR-URL: nodejs#18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
PR-URL: nodejs#18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
Instead of calling stop + start from the refresh method
of Timeout, instead just call start as libuv already
calls stop if necessary.

PR-URL: nodejs#18486
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
When an interval takes as long or longer to run as its timeout setting
and the roundtrip from rearm() to its deferal takes exactly 1ms, that
interval can then block the event loop. This is an edge case of another
recently fixed bug (which in itself was an edge case).

PR-URL: nodejs#18486
Refs: nodejs#15072
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
@ChALkeR ChALkeR mentioned this pull request Aug 9, 2018
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. domain Issues and PRs related to the domain subsystem. lib / src Issues and PRs related to general changes in the lib or src directory. semver-major PRs that contain breaking changes and should be released in the next major version. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants