-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(timers): Refactor timers to use one async op per timer #12862
Conversation
This change also makes the timers implementation closer to the spec, and sets up the stage to implement `AbortSignal.timeout()` (whatwg/dom#1032). Fixes denoland#8965. Fixes denoland#10974. Fixes denoland#11398.
12a3dfd
to
0b69692
Compare
It seems like the |
@lucacasonato please advise |
I don't think so. That sounds like a hack. The tests are probably not correct if they fail if the timer is actually 0 sec (1 evt loop turn). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreubotella thank you for the PR. It fixes a lot of long outstanding bugs which is great. At the same time it's touching a crucial piece of infrastructure so special care must be taken when reviewing and landing it.
The implementation mostly looks good to me, I have a few question for bits that are not immediately obvious to me.
Before landing I'd like to get more eyes on it.
@bnoordhuis @lucacasonato please review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, Andreu. I can't really intuit how this affects performance. Do we have timer benchmarks?
ext/timers/01_timers.js
Outdated
// 4. If timeout is less than 0, then set timeout to 0. | ||
// 5. If nesting level is greater than 5, and timeout is less than 4, then | ||
// set timeout to 4. | ||
timeout = MathMax(timeout, (timerNestingLevel > 5) ? 4 : 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two comments:
-
5 is a magic constant. Why not 4, 6 or 42? The comment doesn't explain it.
-
I find it harder to read than the equivalent
if
statement (which is shorter to boot):
if (timerNestingLevel > 5 && timeout < 4) timeout = 4;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note step 4 and the ternary operator. The equivalent would be:
if (timeout < 0) timeout = 0;
if (timerNestingLevel > 5 && timeout < 4) timeout = 4;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 is a magic constant. Why not 4, 6 or 42? The comment doesn't explain it.
It's spec-mandated. I suspect due to compat concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (timeout < 0) timeout = 0;
if (timerNestingLevel > 5 && timeout < 4) timeout = 4;
This one is a bit easier to read, so maybe let's go with it?
It's spec-mandated. I suspect due to compat concerns.
Right, can you add a comment or reference to the spec here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, can you add a comment or reference to the spec here?
This comment is the spec text:
https://github.com/andreubotella/deno/blob/timers/ext/timers/01_timers.js#L131-L136
/** | ||
* A doubly linked list of timers. | ||
* @type { { head: ScheduledTimer | null, tail: ScheduledTimer | null } } */ | ||
const scheduledTimers = { head: null, tail: null }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get rid of all the null checks if you turn this into a list head:
// ideally give the list head and timer objects the same object shape
// to keep property reads and writes monomorphic
const scheduledTimers = { head: null, tail: null };
scheduledTimers.head = scheduledTimers;
scheduledTimers.tail = scheduledTimers;
function isEmpty() {
return scheduledTimers.tail === scheduledTimers;
}
function append(timer) {
timer.prev = scheduledTimers.tail;
timer.next = scheduledTimers;
scheduledTimers.tail.next = timer;
scheduledTimers.tail = timer;
}
function remove(timer) {
timer.prev.next = timer.next;
timer.next.prev = timer.prev;
// timer.prev = timer.next = null;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andreubotella do you want to address this bit before landing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have now and I'd definitely want to see it addressed eventually but I'm okay with landing this as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems harder to reason through, especially if we go all the way and make scheduledTimers
monomorphic, since scheduledTimers.prev
would be the tail and scheduledTimers.next
would be the head. And if we keep the property names head
and tail
, we'd still need checks to append and remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's good enough for nginx, libuv, linux and the BSDs, then it's certainly good enough for deno. :-)
(Also node - but I'm reasonably certain I was the one who introduced that pattern there.)
} | ||
// 1. | ||
PromisePrototypeThen( | ||
core.opAsync("op_sleep", millis, cancelRid), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means there's an op call per timer? To keep overhead down JS timers would ideally all hang off a single tokio::time::sleep()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The trade-off here is either:
a) Having at most one pending timer op at a time, but having to call into Rust to start another op whenever a timer expires.
b) Only calling into Rust when starting or canceling timers, but having multiple pending timer ops.
I haven't measured the difference between these, and due to the very nature of timers, measuring performance isn't easy. Also, if we compare with the current timers implementation, there's the additional confounding factor of #10974. So I'm not sure how to benchmark this.
But note that these async ops do very little work, and I expect that the work of checking the time would be done by tokio's global timer, not when polling the Sleep
futures.
And in any case, unrefing timers would be much easier with b) – see #12953.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we should go with this approach, especially in light of ref/unref for timers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node has regular JS timers hanging off a single C++ timer (N->1, common case) and unref'd JS timers backed by individual C++ timers (N->N, uncommon case.)
For Node, that's a worthwhile optimization because programs often have thousands of active timers, usually one or more per TCP connection.
For Deno, I'm not sure. TBH, I don't know how (or if) we deal with read/write timeouts or slowloris style attacks.
At any rate, it's an area worth investigating but it doesn't have to hold up this PR.
cli/tests/unit/timers_test.ts
Outdated
setTimeout(() => array.push(6)); | ||
}, 0); | ||
|
||
await delay(100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks still a little bit flaky from my observation. ref: https://github.com/denoland/deno/runs/4418361252?check_suite_focus=true
How about creating a promise for each setTimeout call and waiting for all of them instead of delay(N)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a genuine bug. A race would be missing [4,5,6] but it's only missing 6. Merits deeper investigation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the spec, a timer mustn't go off before other timers that were both scheduled before it and have a lesser or equal timeout. So the array must not be out of order, and the delay must expire after 3 is pushed, but the delay doesn't necessarily have to go off after any of the 4/5/6 timers do. And that's what I find when I test this – either [1,2,3], [1,2,3,4], [1,2,3,4,5] or [1,2,3,4,5,6], but not [1,2] or anything out of order.
Note that the spec doesn't say that a timeout of 0 must queue the action in the timer task queue immediately – step 5.3 in "run steps after a timeout" allows for an arbitrary additional wait. And it seems like an op_sleep
with a timeout of 0 doesn't necessarily resolve at the next event loop turn. Should we queue the task immediately in that case?
In any case, I'll be taking @kt3k's suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we queue the task immediately in that case?
Come to think about it, no. This would let timers starve other parts of the event loop. Whereas if timers are always backed by ops, the earliest that a task can be pushed to the timers queue is on the next event loop turn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow your line of reasoning.
We have timers A, B and C that expire at time T+0, who arm three more timers D, E, and F that expire at time T+1 (or T+4, doesn't matter), and a delay G that expires at time T+100.
Under normal conditions, they expire in order A to G because 0 < 1 < 100.
Under abnormal conditions, say an event loop stall or an OS scheduler fluke that pauses the program until T+200, then either A-G or A-C, G, D-F are reasonable behavior.
But A-E, G, F? No, never.
A timer's expiry time is set to now() + timeout
rather than event_loop_tick_start_time() + timeout
. Put another way: timers have no shared time base.
Under normal conditions that doesn't matter because computers are fast: now()
doesn't change meaningfully in the time it takes to execute an event loop "tick".
But if there is a stall of duration N between B and C, then there is an equivalent gap between E and F: E expires at time T+1 but F at time T+N.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how tokio's timers are implemented –I have tried to follow the code and can't really make sense of it–, or how they interact with FuturesUnordered
–which is probably more relevant for how they show up here. The spec requires that timers don't resolve before now() + timeout
–which I'm assuming tokio does correctly–, and it also requires that a timer doesn't resolve before any timers that were scheduled before it and that have a lower or equal timeout. The code around scheduledTimers
guarantees that that condition holds, but it doesn't guarantee anything else. And, purely in terms of the spec requirements, A-E, G, F is valid, because D-F are not required to be added to the timer task queue the instant setTimeout()
is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For posterity: Bartek and I talked this through OOB.
IMO, it's a quality-of-implementation issue - expiry order should follow the Principle of Least Surprise, regardless of how relaxed the spec is - but it's pre-existing, not a regression, and therefore doesn't need to hold up this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if the test failure that @kt3k pointed out can be resolved.
} | ||
// 1. | ||
PromisePrototypeThen( | ||
core.opAsync("op_sleep", millis, cancelRid), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node has regular JS timers hanging off a single C++ timer (N->1, common case) and unref'd JS timers backed by individual C++ timers (N->N, uncommon case.)
For Node, that's a worthwhile optimization because programs often have thousands of active timers, usually one or more per TCP connection.
For Deno, I'm not sure. TBH, I don't know how (or if) we deal with read/write timeouts or slowloris style attacks.
At any rate, it's an area worth investigating but it doesn't have to hold up this PR.
/** | ||
* A doubly linked list of timers. | ||
* @type { { head: ScheduledTimer | null, tail: ScheduledTimer | null } } */ | ||
const scheduledTimers = { head: null, tail: null }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to have now and I'd definitely want to see it addressed eventually but I'm okay with landing this as-is.
Thanks for the PR, Andreu! |
This change also makes the timers implementation closer to the spec, and sets up the stage to implement
AbortSignal.timeout()
(whatwg/dom#1032).Fixes #8965.
Fixes #10974.
Fixes #11398.