Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

timers: fix the order of timers under a long loop #46644

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

XadillaX
Copy link
Contributor

@XadillaX XadillaX commented Feb 14, 2023

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout. labels Feb 14, 2023
@mscdex mscdex added the needs-benchmark-ci PR that need a benchmark CI run. label Feb 14, 2023
@Linkgoron
Copy link
Member

Linkgoron commented Feb 14, 2023

This is supposed to fix this issue right? #37226 (comment)

Suppose that this is how the world "looks" like:
Timer number 1 has a timeout of `20` and needs to run at time 25
Timer number 2 has a timeout of `10` and needs to run at time 30
Timer number 3 has a timeout of `20` and needs to run at time 35

The correct order should be 1->2->3. However, my PC is so slow that it executes on time `40`. 
The `20` list is the first in the priority queue (`timerListQueue.peek()`), because it has a lower expiry.
We take it and run all of the timers in the list (`listOnTimeout` method), which are timers `1` and `3`, 
and it only then executes the `10` timers list, i.e. timer `2`, resulting in a 1->3->2 execution.

@XadillaX
Copy link
Contributor Author

This is supposed to fix this issue right? #37226 (comment)

Suppose that this is how the world "looks" like:
Timer number 1 has a timeout of `20` and needs to run at time 25
Timer number 2 has a timeout of `10` and needs to run at time 30
Timer number 3 has a timeout of `20` and needs to run at time 35

The correct order should be 1->2->3. However, my PC is so slow that it executes on time `40`. 
The `20` list is the first in the priority queue (`timerListQueue.peek()`), because it has a lower expiry.
We take it and run all of the timers in the list (`listOnTimeout` method), which are timers `1` and `3`, 
and it only then executes the `10` timers list, i.e. timer `2`, resulting in a 1->3->2 execution.

It seems to be.

@XadillaX XadillaX force-pushed the timeout-with-long-loop branch from ea62bbf to cb2f09d Compare February 15, 2023 02:48
@XadillaX
Copy link
Contributor Author

For my macbook M1, I ran several times. The original logic of benchmark/timers/timers-timeout-pooled.js is about:

timers/timers-timeout-pooled.js n=10000000: 22,148,042.717886783

The logic of current PR is about:

timers/timers-timeout-pooled.js n=10000000: 21,785,169.364714507

The unpooled one is about:

// original
timers/timers-timeout-unpooled.js n=1000000: 19,307,131.535027202

// current PR
timers/timers-timeout-unpooled.js n=1000000: 19,102,850.664916743

Copy link
Member

@apapirovski apapirovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this change addresses the issue holistically because there's nothing stopping any n-th list from having a timer that is supposed to expire next. This was an intentional sacrifice made back when the ring of timers based on duration idea was conceived.

https://github.com/nodejs/node/blob/main/lib/internal/timers.js#LL37C5-L37C5

@Linkgoron
Copy link
Member

Linkgoron commented Feb 15, 2023

I don't believe this change addresses the issue holistically because there's nothing stopping any n-th list from having a timer that is supposed to expire next. This was an intentional sacrifice made back when the ring of timers based on duration idea was conceived.

https://github.com/nodejs/node/blob/main/lib/internal/timers.js#LL37C5-L37C5

That's what I thought at first, but If I understand the code correctly, the heap reorders itself after every specific list, so it shouldn't have the nth list issue because of line 531?

https://github.com/nodejs/node/pull/46644/files#diff-41813f5e6561d74a1162a1a326c914dd05d298c44c90653726d3997bdc1f9df1L531

For example, in the test you go to list 10 and then list 15 and then list 10 again, but in theory you could go anywhere. You could, however, zigzag between all kinds of lists - but I think that it's an edge case.

@apapirovski
Copy link
Member

apapirovski commented Feb 15, 2023

That's what I thought at first, but If I understand the code correctly, the heap reorders itself after every specific list, so it shouldn't have the nth list issue because of line 531?

Yes, but secondary function would need to compare every single list in order to find the one with the next-most expiry.

Edit: Ehh, maybe not. Let me think about this more.

Edit2: Yes, initial assumption was right. Because once you check the next list, what's stopping the next expiry from being further down? So your pool of possible lists to check keeps growing.

@Linkgoron
Copy link
Member

Linkgoron commented Feb 15, 2023

That's what I thought at first, but If I understand the code correctly, the heap reorders itself after every specific list, so it shouldn't have the nth list issue because of line 531?

Yes, but secondary function would need to compare every single list in order to find the one with the next-most expiry.

I'm not sure I'm following. The next expiry must either be the next one in the current list, or the next minimum expiry in the heap (which is heap[2] or heap[3]). (as a wave-hand proof: If I understand how the heap is working, if the "correct" next element were in a different list then that list would actually have a lower expiry than heap[2] and heap[3] - and it wouldn't be deeper than them in the heap).

Do you mean that the extra cost is the fact that we have to fix the heap after every list run (interpolate) - which can be very costly if we have to zig/zag.

Edit2: Yes, initial assumption was right. Because once you check the next list, what's stopping the next expiry from being further down? So your pool of possible lists to check keeps growing.

If I understand correctly, Line 531 is what's supposed to ensure that.

@apapirovski
Copy link
Member

The next expiry must either be the next one in the current list, or the next minimum expiry in the heap (which is heap[2] or heap[3]).

How about after that? :)

@Linkgoron
Copy link
Member

The next expiry must either be the next one in the current list, or the next minimum expiry in the heap (which is heap[2] or heap[3]).

How about after that? :)

If I understand the code correctly, Line 531 is what's supposed to ensure that.

@apapirovski
Copy link
Member

apapirovski commented Feb 15, 2023

Yes, which is terribly inefficient. The proposal here is resorting binary heap after every timer. Then we might as well not use a binary heap.

The whole purpose here is to ensure that the most common use-cases of Node.js are performance efficient. Which means that even if you run into the scenario where you have a high number of lists and high number of expired timers in each, you're not having to sort the full list but rather have more granular increment at which sorting & insertion happens.

@Linkgoron
Copy link
Member

Linkgoron commented Feb 15, 2023

Yes, which is terribly inefficient. The proposal here is resorting binary heap after every timer. Then we might as well not use a binary heap.

I was just advocating for the correctness of the code, not the complexity. Note that that specific line was already there and is already happening, but as you say it might happen much more now (as previously a list ran at most once, and you were limited by the amount of lists, and now a specific list can run many more times).

@XadillaX
Copy link
Contributor Author

XadillaX commented Feb 16, 2023

The original code will update the order after each inserting, removing, or changing value. So just only to compare heap 2 or 3. And the only complexity added is comparing with the secondary one. No more percolateDown or percolateUp.

Edit: Each switching between 2 lists will do a percolateDown, which I think it's a unusual situation (also is the situation I try to solve in this PR).

Edit2: Not every tick will last long to expire several timers in a single list.

@XadillaX XadillaX requested a review from apapirovski February 16, 2023 14:24
@XadillaX XadillaX force-pushed the timeout-with-long-loop branch from cb2f09d to 7f63aa2 Compare February 17, 2023 10:30
@XadillaX
Copy link
Contributor Author

/ping @apapirovski

@theanarkh
Copy link
Contributor

I think the code is fine. But it may need to adjust the binary heap constantly to make sure the timers execute in the correct order. But it seems more important to the user to execute the timer in order ?

@Linkgoron
Copy link
Member

Linkgoron commented Feb 19, 2023

I think that a better benchmark should be thought of that shows how much this impacts the timers, in order to understand how much this affects performance.

@apapirovski
Copy link
Member

Edit: Each switching between 2 lists will do a percolateDown, which I think it's a unusual situation (also is the situation I try to solve in this PR).

Percolate is a worst-case O(log N) operation. You're potentially performing it after every timer in the list. There's simply no way this PR can land given the performance implications for the worst-case scenario.

Node.js already clearly specifies that timer order is NOT guaranteed. Timer order isn't even guaranteed in linux which can have this exact scenario playout given how the timer wheel works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-benchmark-ci PR that need a benchmark CI run. needs-ci PRs that need a full CI run. timers Issues and PRs related to the timers subsystem / setImmediate, setInterval, setTimeout.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

investigate flaky test-timers-promisified
6 participants