add notice of excessive scheduling time to time macro etc. #55103

Open · wants to merge 6 commits into master from ib/time_schedule_efficiency
Conversation

IanButterworth (Member) commented Jul 11, 2024

Replaces #55094

Balancing the number of tasks against task duration is a key optimization concern, so this adds a warning, when the two are notably imbalanced, to the most common place a user looks to find out why their function is taking so long: `@time`.

If more than 10% of wall time is spent (on average across all worker threads) specifically on scheduling tasks, then `@time` reports it.
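The reporting condition can be sketched as follows (a minimal illustration with hypothetical names, not the PR's actual implementation, which lives inside `@time`'s timing machinery):

```julia
# Hypothetical sketch of the rule described above: report when the
# per-worker-thread average of time spent scheduling exceeds 10% of wall time.
function excessive_scheduling(sched_ns, elapsed_ns, nthreads; threshold = 0.10)
    avg_sched_ns = sched_ns / nthreads       # average across worker threads
    return avg_sched_ns / elapsed_ns > threshold
end
```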

julia> function foo(n; spawn=true)
           for i in 1:n
               if spawn
                   fetch(Threads.@spawn global j = i)
               else
                   global j = i
               end
           end
       end
foo (generic function with 4 methods)

julia> @time foo(1000)
  0.018877 seconds (10.71 k allocations: 396.266 KiB, 99.81% scheduling time, 11.34% compilation time)

julia> @time foo(1000)
  0.017071 seconds (9.97 k allocations: 358.875 KiB, 99.71% scheduling time)

julia> @time foo(1000; spawn=false)
  0.000004 seconds (489 allocations: 7.641 KiB)

julia> @time foo(1000; spawn=false)
  0.000004 seconds (489 allocations: 7.641 KiB)
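When the report does fire, the usual remedy is to make each task do more work. A sketch of that trade-off (illustrative code, not part of this PR):

```julia
# One task per element: for tiny work items, scheduling can dominate,
# which is exactly the imbalance the new @time report flags.
function sum_squares_per_item(xs)
    tasks = [Threads.@spawn xs[i]^2 for i in eachindex(xs)]
    return sum(fetch, tasks)
end

# A few chunky tasks: each task amortizes its scheduling cost over many items.
function sum_squares_chunked(xs; nchunks = Threads.nthreads())
    chunks = Iterators.partition(xs, cld(length(xs), nchunks))
    tasks = [Threads.@spawn sum(x -> x^2, c) for c in chunks]
    return sum(fetch, tasks)
end
```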

A real-world example that doesn't trigger the scheduling report:

julia> @time Pkg.update()
    Updating registry at `~/.julia/registries/General.toml`
   Installed JpegTurbo_jll ── v3.0.4+0
   Installed LZO_jll ──────── v2.10.2+1
   Installed Zstd_jll ─────── v1.5.6+1
   Installed Glib_jll ─────── v2.80.5+0
   Installed GtkObservables ─ v2.1.3
   Installed Revise ───────── v3.6.0
  Installing artifacts  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4/4
    Updating `~/.julia/environments/v1.12/Project.toml`
  [295af30f] ↑ Revise v3.5.18 ⇒ v3.6.0
    Updating `~/.julia/environments/v1.12/Manifest.toml`
  [8710efd8] ↑ GtkObservables v2.1.2 ⇒ v2.1.3
  [295af30f] ↑ Revise v3.5.18 ⇒ v3.6.0
  [7746bdde] ↑ Glib_jll v2.80.2+0 ⇒ v2.80.5+0
  [aacddb02] ↑ JpegTurbo_jll v3.0.3+0 ⇒ v3.0.4+0
  [dd4b983a] ↑ LZO_jll v2.10.2+0 ⇒ v2.10.2+1
  [3161d3a3] ↑ Zstd_jll v1.5.6+0 ⇒ v1.5.6+1
Precompiling packages finished.
  97 dependencies successfully precompiled in 37 seconds. 34 already precompiled.
 44.095062 seconds (18.75 M allocations: 1.120 GiB, 0.56% gc time, 71 lock conflicts, 8.85% compilation time: 18% of which was recompilation)


@IanButterworth IanButterworth force-pushed the ib/time_schedule_efficiency branch from 7fe2b02 to 13c2d17 Compare July 11, 2024 13:35

@IanButterworth IanButterworth force-pushed the ib/time_schedule_efficiency branch 3 times, most recently from 2f2121f to d371291 Compare October 3, 2024 02:46

@IanButterworth IanButterworth force-pushed the ib/time_schedule_efficiency branch 3 times, most recently from 320a2f8 to 86a6aa1 Compare October 6, 2024 02:32
IanButterworth (Member, Author):

Updated the examples in the OP. With the new, more precise timing, a complex `Pkg.update` doesn't trigger the report.

@IanButterworth IanButterworth marked this pull request as ready for review October 6, 2024 02:38

@IanButterworth IanButterworth force-pushed the ib/time_schedule_efficiency branch 3 times, most recently from a2e76bf to 46a7504 Compare October 7, 2024 13:15
IanButterworth (Member, Author) commented Oct 12, 2024

It'd be good to get this in so it can be tried on real-world code, to see whether it reports meaningfully and to hone the threshold if necessary.

First, though, the timing-measurement logic should be reviewed by someone familiar with the scheduler.

@IanButterworth IanButterworth force-pushed the ib/time_schedule_efficiency branch from 46a7504 to f7d3e8b Compare October 14, 2024 17:42
@IanButterworth IanButterworth force-pushed the ib/time_schedule_efficiency branch from 8ea0531 to 81522a8 Compare October 15, 2024 01:04
kpamnany added a commit that referenced this pull request Dec 6, 2024
Close #47351 (builds on top of
#48416)

Adds two per-task metrics:
- running time = amount of time the task was actually running (according
to our scheduler). Note: currently inclusive of GC time, but would be
good to be able to separate that out (in a future PR)
- wall time = amount of time between the scheduler becoming aware of
this task and the task entering a terminal state (i.e. done or failed).

We record running time in `wait()`, where the scheduler stops running the task, as well as in `yield(t)`, `yieldto(t)`, and `throwto(t)`, which bypass the scheduler. The other places where a task stops running (for `Channel`, `ReentrantLock`, `Event`, `Timer`, and `Semaphore`) are all implemented in terms of `wait(Condition)`, which in turn calls `wait()`. `LibuvStream` similarly calls `wait()`.

This should capture everything (albeit slightly over-counting task CPU time by including any enqueuing work done before we hit `wait()`).

The various metrics counters could be a separate inlined struct if we think that's a useful abstraction, but for now I've just put them directly in `jl_task_t`. They are all atomic, except for the `metrics_enabled` flag itself (which we now have to check on task start/switch/done even if metrics are not enabled), which is set on task construction and marked `const` on the Julia side.

In future PRs we could add more per-task metrics, e.g. compilation time,
GC time, allocations, potentially a wait-time breakdown (time waiting on
locks, channels, in the scheduler run queue, etc.), potentially the
number of yields.

Perhaps in future there could be ways to enable this on a per-thread and
per-task basis. And potentially in future these same timings could be
used by `@time` (e.g. writing this same timing data to a ScopedValue
like in #55103 but only for tasks
lexically scoped to inside the `@time` block).

Timings are off by default but can be turned on globally via starting
Julia with `--task-metrics=yes` or calling
`Base.Experimental.task_metrics(true)`. Metrics are collected for all
tasks created when metrics are enabled. In other words,
enabling/disabling timings via `Base.Experimental.task_metrics` does not
affect existing `Task`s, only new `Task`s.

The other new APIs are `Base.Experimental.task_running_time_ns(::Task)`
and `Base.Experimental.task_wall_time_ns(::Task)` for retrieving the new
metrics. These are safe to call on any task (including the current task,
or a task running on another thread). All these are in
`Base.Experimental` to give us room to change up the APIs as we add more
metrics in future PRs (without worrying about release timelines).
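A usage sketch of the API described above (the function names are the ones from this commit message; the workload is made up):

```julia
# Enable metrics for tasks created from here on (existing tasks unaffected).
Base.Experimental.task_metrics(true)

t = Threads.@spawn sum(abs2, rand(10^6))
wait(t)

running = Base.Experimental.task_running_time_ns(t)  # time actually running
wall = Base.Experimental.task_wall_time_ns(t)        # scheduled -> terminal
println("task ran for $running ns of $wall ns wall time")
```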

cc @NHDaly @kpamnany @d-netto

---------

Co-authored-by: Pete Vilter <pete.vilter@gmail.com>
Co-authored-by: K Pamnany <kpamnany@users.noreply.github.com>
Co-authored-by: Nathan Daly <nathan.daly@relational.ai>
Co-authored-by: Valentin Churavy <vchuravy@users.noreply.github.com>
kpamnany added a commit to RelationalAI/julia that referenced this pull request Dec 7, 2024
kpamnany added a commit to RelationalAI/julia that referenced this pull request Dec 9, 2024
* Add per-task metrics (JuliaLang#56320)

* Address review comments