-
-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(console): add task scheduled times histogram #409
Conversation
There are 2 widgets which display the poll times for a task in the detail view. The poll times percentiles are always displayed and if UTF-8 is available, then a sparkline histogram is also shown to the right. The logic for displaying these two widgets is quite long and is currently interspersed within the `render` function for the task detail view plus helper functions. Additionally, it is not easy to add a second set of widgets showing the time between waking and being polled for a task which is planned for #409. This change factors out that logic into separate widgets. There was already a separate widget `MiniHistogram`. Some of the logic that was previously in the task detail view has been moved here. A new widget `Percentiles` has been added to encapsulate the logic for preparing and displaying the percentiles. A top level `Durations` widget occupies the entire width of the task detail view and control the horizontal layout of the `Percentiles` and `MiniHistogram` widgets. The new widget will also supress the histogram if there isn't at least enough room to display the legend at the bottom
This is kind of a silly naming bikeshed, but I'm not sure if I love the term "woken duration" for this measurement. It's not super clear to me whether this refers to the "duration from wake to poll" (which is what it's used to refer to here), the "duration from wake to yield, including the poll" (arguably, the task is still "woken" while it's being polled), or "the duration from yield to wake" (which is what we refer to as "idle duration" but could be referred to as "duration until woken"). As an alternative naming suggestion, I'd propose "queued duration", since what we are measuring here is the amount of time that a woken task is held in a queue by the runtime before being polled. |
Actually, I just took a look at |
I agree on both counts, scheduled is a better name and if it's already used in tokio-metrics then best to align anyway. I'll rework both these PRs and let you know when it's done. If you haven't looked at the code too much, then I'll force push the update with a new commit message to keep the PR a bit neater. |
I haven't looked super closely at this change yet, so feel free to force push! :) |
Each task displays the sum of the time it has been idle and busy, as well as the total. The idle time includes the time between when a task is woken, and when the runtime actually polls that task. There are cases where a task may be scheduled for a long time after being woken, before it is polled. Especially if many tasks are woken at the same time and don't yield back to the runtime quickly. To add visibility to this, the total time that a task is scheduled before being polled has been added. Additionally, a new task state `Scheduled` has been added. This is displayed in both the tasks table and in the task detail view. In the `console-api`, the `last_wake` time has been moved from `tasks::Stats` to `common::PollStats`, where the total `scheduled_time` for the task has also been added. This is the first of two parts. In the second part (#409), a histogram for scheduled time will be added, the equivalent of the poll time histogram which is already available on the task detail screen. To show a pathological case which may lead to needing to see the scheduled time, a new example has been added to the `console-subscriber` examples.
edit: Not only did I name it wrong, but I had the wrong description in as well. facepalm @hawkw I only realized later that this is the second PR in the set (I named it wrong, so both PRs had the same name). I've now updated the 1st PR (#406) to the new naming (scheduled), please review that one first. In the meantime, I'll update the naming on this one too. I've marked it as a draft in the meantime. |
Display the scheduled time percentiles and sparklines for the histogram of scheduled times. The schduled time is the time between when a task is woken and when it is next polled. The scheduled time, which was already calculated, is now stored in a histogram and sent over the wire in together with the task details. This is used to draw percentiles and sparklines on the task details view, in the same way that is done for the poll times histogram. The refactoring done in #408 has been used to more easily display two sets of durations (percentiles and histogram where possible).
Each task displays the sum of the time it has been idle and busy, as well as the total. The idle time includes the time between when a task is woken, and when the runtime actually polls that task. There are cases where a task may be scheduled for a long time after being woken, before it is polled. Especially if many tasks are woken at the same time and don't yield back to the runtime quickly. To add visibility to this, the total time that a task is scheduled before being polled has been added. Additionally, a new task state `Scheduled` has been added. This is displayed in both the tasks table and in the task detail view. In the `console-api`, the total `scheduled_time` for the task has been added to the `TaskStats` message in `tasks.proto`. This is the first of two parts. In the second part (#409), a histogram for scheduled time will be added, the equivalent of the poll time histogram which is already available on the task detail screen. To show a pathological case which may lead to needing to see the scheduled time, a new example has been added to the `console-subscriber` ## PR Notes This PR does something adjacent to what is described in #50, but not quite. The unicode character used for a `SCHED` task is ⏫. The second PR (#409) will record a scheduled time histogram the same as it recorded for poll times. These two changes should go in together (and certainly shouldn't be released separately). However, this PR is already quite big, so they'll be separated out. The idea is that this PR isn't merged until the next one has also been reviewed and approved. It would be good to get some feedback at this stage though. The task list view with the new column for `Sched` time: <img width="1032" alt="a tasks table view for the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232456977-2921f884-4673-420f-ba4f-3646627d44db.png"> The `Task` block in the task detail view showing the new `Scheduled` time entry. <img width="510" alt="The task block on the task detail view for the rx task in the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232457332-e455e086-9468-42c9-8fda-7965d8d1e6f8.png">
Okay, #406 has merged, and I'll be excited to review this as soon as it's rebased! :) |
Great, thanks! I've also got to adapt the wire format to the changes we made in #406, hopefully sone tomorrow! |
CI lints failure seems unrelated (and looks wacky): https://github.com/tokio-rs/console/actions/runs/4756530428/jobs/8452202802?pr=409 |
@hawkw this is ready for review now. The CI failure is a new clippy lint in 1.69 affecting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks great. I think an accidental protobuf change may have slipped in while rebasing after #406 merged, so we'll need to undo that. Other than that, I had a few small questions, but no other blockers.
console-api/proto/common.proto
Outdated
// The timestamp of the most recent time this object was woken. | ||
// | ||
// If this is `None`, the object has not yet been woken. | ||
optional google.protobuf.Timestamp last_wake = 7; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this needs to be removed, since we changed #406 to add this to TaskStats
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to have ended up with last_wake
and scheduled_time
twice somehow. Sorry.
let percentiles_width = match self.percentiles_width { | ||
// Fixed width | ||
width if width > 0 => width, | ||
// Long enough for the title or for a single line | ||
// like "p99: 544.77µs" (13) (and borders on the sides). | ||
_ => cmp::max(self.percentiles_title.len() as u16, 13_u16) + 2, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the Durations widget currently constructed with a width of 0? Is that used for the resource details view, where we only have one durations widget?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually (and I had to go and check this to be sure), the resource details view doesn't have a Durations widget currently (although it will be much easier to add after the refactoring). The width=0
value here is set as a sensible default if a fixed width isn't set.
The fixed width was introduced in this PR so that the percentiles and sparklines widgets are the same width for both the Durations widgets irrespective of the title length (so that it looks nicer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like we could change the constructor to require that a non-zero width is always provided, and remove this behavior, if we intend to always construct the widget with a fixed width? This might be a bit simpler. Not a blocker though...
I'm going to go ahead and merge this, we can address potential refactors to the way the |
There are 2 widgets which display the poll times for a task in the detail view. The poll times percentiles are always displayed and if UTF-8 is available, then a sparkline histogram is also shown to the right. The logic for displaying these two widgets is quite long and is currently interspersed within the `render` function for the task detail view plus helper functions. Additionally, it is not easy to add a second set of widgets showing the time between waking and being polled for a task which is planned for #409. This change factors out that logic into separate widgets. There was already a separate widget `MiniHistogram`. Some of the logic that was previously in the task detail view has been moved here. A new widget `Percentiles` has been added to encapsulate the logic for preparing and displaying the percentiles. A top level `Durations` widget occupies the entire width of the task detail view and control the horizontal layout of the `Percentiles` and `MiniHistogram` widgets. The new widget will also supress the histogram if there isn't at least enough room to display the legend at the bottom
Each task displays the sum of the time it has been idle and busy, as well as the total. The idle time includes the time between when a task is woken, and when the runtime actually polls that task. There are cases where a task may be scheduled for a long time after being woken, before it is polled. Especially if many tasks are woken at the same time and don't yield back to the runtime quickly. To add visibility to this, the total time that a task is scheduled before being polled has been added. Additionally, a new task state `Scheduled` has been added. This is displayed in both the tasks table and in the task detail view. In the `console-api`, the total `scheduled_time` for the task has been added to the `TaskStats` message in `tasks.proto`. This is the first of two parts. In the second part (#409), a histogram for scheduled time will be added, the equivalent of the poll time histogram which is already available on the task detail screen. To show a pathological case which may lead to needing to see the scheduled time, a new example has been added to the `console-subscriber` ## PR Notes This PR does something adjacent to what is described in #50, but not quite. The unicode character used for a `SCHED` task is ⏫. The second PR (#409) will record a scheduled time histogram the same as it recorded for poll times. These two changes should go in together (and certainly shouldn't be released separately). However, this PR is already quite big, so they'll be separated out. The idea is that this PR isn't merged until the next one has also been reviewed and approved. It would be good to get some feedback at this stage though. The task list view with the new column for `Sched` time: <img width="1032" alt="a tasks table view for the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232456977-2921f884-4673-420f-ba4f-3646627d44db.png"> The `Task` block in the task detail view showing the new `Scheduled` time entry. <img width="510" alt="The task block on the task detail view for the rx task in the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232457332-e455e086-9468-42c9-8fda-7965d8d1e6f8.png">
Display the scheduled time percentiles and sparklines for the histogram of scheduled times. The schduled time is the time between when a task is woken and when it is next polled. The scheduled time, which was already calculated, is now stored in a histogram and sent over the wire in together with the task details. This is used to draw percentiles and sparklines on the task details view, in the same way that is done for the poll times histogram. The refactoring done in #408 has been used to more easily display two sets of durations (percentiles and histogram where possible). ## PR Notes The PR depends on both #406, which adds initial support for recording the scheduled (wake-to-poll) time, and #408, which refactors the percentile and histogram widgets to make them easier to reuse. It shouldn't really be reviewed in depth until those two have been merged as it contains a lot of duplication and will need to be rebased. Here are some examples of the scheduled times durations on the task detail view: <img width="1037" alt="task detail view for the sender task in the long-scheduled example" src="https://user-images.githubusercontent.com/89589/232608774-d8ac48a7-3fe7-4742-a75b-e11bdb23abaa.png"> <img width="1043" alt="task detail view for the burn task in the app example" src="https://user-images.githubusercontent.com/89589/232608864-637f4f52-d4a6-468d-88fc-8fe1d53fdff9.png">
# Changelog All notable changes to this project will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## console-subscriber-v0.2.0 - (2023-09-29) [0b0c1af](https://github.com/tokio-rs/console/commit/0b0c1aff18c3260d3a45a78f6c0d6f4206af1cbb)...[0b0c1af](https://github.com/tokio-rs/console/commit/0b0c1aff18c3260d3a45a78f6c0d6f4206af1cbb) ### <a id = "console-subscriber-v0.2.0-breaking"></a>Breaking Changes - **Update Tonic and Prost dependencies ([#364](#364 ([f9b8e03](https://github.com/tokio-rs/console/commit/f9b8e03bd7ee1d0edb441c94a93a350d5b06ed3b))<br />This commit updates the public dependencies `prost` and `tonic` to semver-incompatible versions (v0.11.0 and v0.8.0, respectively). This is a breaking change for users who are integrating the `console-api` protos with their own `tonic` servers or clients. - **Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464 ([96e62c8](https://github.com/tokio-rs/console/commit/96e62c83ef959569bb062dc8fee98fa2b2461e8d))<br />This is a breaking change for users of `console-api` and `console-subscriber`, as it changes the public `tonic` dependency to a semver-incompatible version. This breaks compatibility with `tonic` 0.9.x and `prost` 0.11.x. ### Added - [**breaking**](#console-subscriber-v0.2.0-breaking) Update Tonic and Prost dependencies ([#364](#364)) ([f9b8e03](f9b8e03)) - Add support for Unix domain sockets ([#388](#388)) ([a944dbc](a944dbc), closes [#296](#296)) - Add scheduled time per task ([#406](#406)) ([f280df9](f280df9)) - Add task scheduled times histogram ([#409](#409)) ([d92a399](d92a399)) - Update `tonic` to 0.9 ([#420](#420)) ([48af1ee](48af1ee)) - Update MSRV to Rust 1.60.0 ([b18ee47](b18ee47)) - Expose server parts ([#451](#451)) ([e51ac5a](e51ac5a)) - Add cfg `console_without_tokio_unstable` ([#446](#446)) ([7ed6673](7ed6673)) - Add warning for tasks that never yield ([#439](#439)) ([d05fa9e](d05fa9e)) - [**breaking**](#console-subscriber-v0.2.0-breaking) Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464)) ([96e62c8](96e62c8)) ### Documented - Fix unclosed code block ([#463](#463)) ([362bdbe](362bdbe)) - Update MSRV version docs to 1.64 ([#467](#467)) ([94a5a51](94a5a51)) ### Fixed - Fix build on tokio 1.21.0 ([#374](#374)) ([c34ac2d](c34ac2d)) - Fix off-by-one indexing for `callsites` ([#391](#391)) ([43891ab](43891ab)) - Bump minimum Tokio version ([#397](#397)) ([bbb8f25](bbb8f25), fixes [#386](#386)) - Fix self wakes count ([#430](#430)) ([d308935](d308935)) - Remove clock skew warning in `start_poll` ([#434](#434)) ([4a88b28](4a88b28)) - Do not report excessive polling ([#378](#378)) ([#440](#440)) ([8b483bf](8b483bf), closes [#378](#378)) - Correct retain logic ([#447](#447)) ([36ffc51](36ffc51)) Signed-off-by: Eliza Weisman <eliza@buoyant.io>
# Changelog All notable changes to this project will be documented in this file. This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## tokio-console-v0.1.10 - (2023-09-29) [7d009f8](https://github.com/tokio-rs/console/commit/7d009f87120ce0c89f5f9c5311f05b6756ca770f)...[7d009f8](https://github.com/tokio-rs/console/commit/7d009f87120ce0c89f5f9c5311f05b6756ca770f) ### <a id = "tokio-console-v0.1.10-breaking"></a>Breaking Changes - **Update Tonic and Prost dependencies ([#364](#364 ([f9b8e03](https://github.com/tokio-rs/console/commit/f9b8e03bd7ee1d0edb441c94a93a350d5b06ed3b))<br />This commit updates the public dependencies `prost` and `tonic` to semver-incompatible versions (v0.11.0 and v0.8.0, respectively). This is a breaking change for users who are integrating the `console-api` protos with their own `tonic` servers or clients. - **Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464 ([96e62c8](https://github.com/tokio-rs/console/commit/96e62c83ef959569bb062dc8fee98fa2b2461e8d))<br />This is a breaking change for users of `console-api` and `console-subscriber`, as it changes the public `tonic` dependency to a semver-incompatible version. This breaks compatibility with `tonic` 0.9.x and `prost` 0.11.x. ### Added - [**breaking**](#tokio-console-v0.1.10-breaking) Update Tonic and Prost dependencies ([#364](#364)) ([f9b8e03](f9b8e03)) - Only suggest opening issues for panics ([#365](#365)) ([da2a89c](da2a89c)) - Init error handling before subcmds ([#365](#365)) ([ec66eda](ec66eda)) - Filter out boring frames in backtraces ([#365](#365)) ([95a5e54](95a5e54)) - Include config options in autogenerated issues ([#365](#365)) ([3244a1f](3244a1f)) - Reduce decimal digits in UI ([#402](#402)) ([c13085e](c13085e)) - Use tokio task ids in task views ([#403](#403)) ([f5b06d2](f5b06d2)) - Add support for Unix domain sockets ([#388](#388)) ([a944dbc](a944dbc), closes [#296](#296)) - Add scheduled time per task ([#406](#406)) ([f280df9](f280df9)) - Add task scheduled times histogram ([#409](#409)) ([d92a399](d92a399)) - Update `tonic` to 0.9 ([#420](#420)) ([48af1ee](48af1ee)) - Update MSRV to Rust 1.60.0 ([b18ee47](b18ee47)) - Migrate to `ratatui` and update `crossterm` ([#425](#425)) ([b209dd6](b209dd6)) - Help view modal ([#432](#432)) ([359a4e7](359a4e7)) - Add way to inspect details of task from resource view ([#449](#449)) ([132ed4e](132ed4e), closes [#448](#448)) - Add warning for tasks that never yield ([#439](#439)) ([d05fa9e](d05fa9e)) - [**breaking**](#tokio-console-v0.1.10-breaking) Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464)) ([96e62c8](96e62c8)) ### Documented - Update screenshots in README ([#419](#419)) ([e9bcd67](e9bcd67)) - Revert "update screenshots in README ([#419](#419))" ([993a3d9](993a3d9)) - Update screenshots in README ([#421](#421)) ([8a27f96](8a27f96)) - Add column descriptions for all tables ([#431](#431)) ([e3cf82b](e3cf82b)) - Update MSRV version docs to 1.64 ([#467](#467)) ([94a5a51](94a5a51)) ### Fixed - Fix ascii-only flipped input ([#377](#377)) ([652ac34](652ac34)) - Declare `tokio-console` bin as `default-run` ([#379](#379)) ([9ce60ec](9ce60ec)) - Make `retain_for` default to 6s if not specfied ([#383](#383)) ([0a6012b](0a6012b), fixes [#382](#382)) - Enable view-switching keystrokes on details views ([#387](#387)) ([f417d7a](f417d7a)) - Fix `ViewOptions` default lang' ([#394](#394)) ([a1cf1b8](a1cf1b8), fixes [#393](#393)) - Remove `tracing-subscriber` 0.2 from dependencies ([#404](#404)) ([768534a](768534a)) - Fix calculation of busy time during poll ([#405](#405)) ([e2c536a](e2c536a)) - Remove histogram minimum count ([#424](#424)) ([02cf8a6](02cf8a6)) - Remove trailing space from task/resource location ([#443](#443)) ([90e5918](90e5918)) - Make long locations readable ([#441](#441)) ([9428d7f](9428d7f), closes [#411](#411)) - Fix task detail view Id to display remote tokio::task::Id ([#455](#455)) ([70c3952](70c3952)) Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Display the scheduled time percentiles and sparklines for the histogram
of scheduled times. The schduled time is the time between when a task is
woken and when it is next polled.
The scheduled time, which was already calculated, is now stored in a
histogram and sent over the wire in together with the task details.
This is used to draw percentiles and sparklines on the task details
view, in the same way that is done for the poll times histogram.
The refactoring done in #408 has been used to more easily display two
sets of durations (percentiles and histogram where possible).
PR Notes
The PR depends on both #406, which adds initial support for recording the
scheduled (wake-to-poll) time, and #408, which refactors the percentile and
histogram widgets to make them easier to reuse.
It shouldn't really be reviewed in depth until those two have been merged as
it contains a lot of duplication and will need to be rebased.
Here are some examples of the scheduled times durations on the task detail view: