Cache System Tracing Spans #9390

hymm · 2023-08-09T00:11:03Z

Objective

Reduce the overhead of tracing by caching the system spans.

Yellow is this PR. Red is main.

![image](https://github.com/bevyengine/bevy/assets/2180432/fe9bb7c2-ae9a-4522-80a9-75a943a562b6)

james7132

Huzzah for less overhead! The impact is bigger than I expected. LGTM, just a few nits on implementation.

crates/bevy_ecs/src/schedule/executor/multi_threaded.rs

james7132 · 2023-08-09T00:36:30Z

Another thought: should we do this for the command application spans too?

hymm · 2023-08-09T00:42:50Z

> Another thought: should we do this for the command application spans too?

I'll try it.

hymm · 2023-08-09T00:44:34Z

This is actually getting close to the performance we see with the spans deleted entirely, so it might fix #4892.

superdump · 2023-08-09T07:19:01Z

Oh nice!

james7132 · 2023-08-09T17:28:19Z

Another, potentially harder, area to check if we can cache spans might be in QueryParIter and friends, but we can leave that to a follow-up PR if need be.

hymm · 2023-08-10T00:13:06Z

Caching the commands span helped some more.

> Another, potentially harder, area to check if we can cache spans might be in QueryParIter and friends, but we can leave that to a follow-up PR if need be.

Probably better to push that off to a follow-up. I tried something quick, reusing the same span for all the tasks, but it ended up running slower. There's probably some contention happening somewhere, so it might need a separate span for each task.
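The two layouts discussed above (one span shared by every task versus one span per task) can be sketched with a std-only model. This does not use the tracing crate: `Span` here is a hypothetical stand-in whose `enter` bumps an atomic counter, and the assumption that the slowdown comes from shared per-span state is only the guess made in the comment, not something measured here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for a span: entering it touches shared state.
#[derive(Default)]
struct Span {
    enters: AtomicUsize,
}

impl Span {
    fn enter(&self) {
        self.enters.fetch_add(1, Ordering::Relaxed);
    }
}

/// Every task enters the same span, so all threads write to one counter.
fn run_shared(tasks: usize, batches: usize) -> usize {
    let span = Span::default();
    thread::scope(|scope| {
        for _ in 0..tasks {
            scope.spawn(|| {
                for _ in 0..batches {
                    span.enter();
                }
            });
        }
    });
    span.enters.load(Ordering::Relaxed)
}

/// Each task gets its own span, created up front, so no counter is shared.
/// (Adjacent entries of the Vec may still share a cache line.)
fn run_per_task(tasks: usize, batches: usize) -> usize {
    let spans: Vec<Span> = (0..tasks).map(|_| Span::default()).collect();
    thread::scope(|scope| {
        for span in &spans {
            scope.spawn(move || {
                for _ in 0..batches {
                    span.enter();
                }
            });
        }
    });
    spans.iter().map(|s| s.enters.load(Ordering::Relaxed)).sum()
}

fn main() {
    println!("shared span enters:   {}", run_shared(4, 1000));
    println!("per-task span enters: {}", run_per_task(4, 1000));
}
```

Both variants record the same number of enters; the sketch only shows the difference in what the tasks share, not the performance difference reported above.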

github-actions · 2023-08-15T20:17:49Z

Example alien_cake_addict failed to run, please try running it locally and check the result.

hymm · 2023-08-15T20:34:53Z

Did something that might be slightly more controversial: moved the system spans out of the executor and into run_unsafe on System. It felt like they belonged there more, since the span can be stored in SystemMeta. Perf didn't change.
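A minimal std-only sketch of that arrangement, assuming a simplified shape: the span lives in the system's metadata and is entered inside the system's own run method rather than by the executor. `Span`, `Entered`, `SystemMeta`, and `FunctionSystem` below are hypothetical stand-ins written for illustration (they are not Bevy's or tracing's types), and `run` stands in for run_unsafe.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log so the example can show when the span is entered and exited.
type Log = Rc<RefCell<Vec<String>>>;

/// Hypothetical span: created once, entered many times.
struct Span {
    name: String,
    log: Log,
}

/// RAII guard: the span is exited when the guard is dropped.
struct Entered<'a> {
    span: &'a Span,
}

impl Span {
    fn enter(&self) -> Entered<'_> {
        self.log.borrow_mut().push(format!("enter {}", self.name));
        Entered { span: self }
    }
}

impl Drop for Entered<'_> {
    fn drop(&mut self) {
        self.span.log.borrow_mut().push(format!("exit {}", self.span.name));
    }
}

/// Hypothetical metadata struct that owns the cached span.
struct SystemMeta {
    span: Span,
}

struct FunctionSystem<F: FnMut()> {
    meta: SystemMeta,
    func: F,
}

impl<F: FnMut()> FunctionSystem<F> {
    fn new(name: &str, log: Log, func: F) -> Self {
        // The span is built once here, not on every run.
        let span = Span { name: name.to_owned(), log };
        Self { meta: SystemMeta { span }, func }
    }

    /// Stand-in for run_unsafe: the system enters its own span.
    fn run(&mut self) {
        let _guard = self.meta.span.enter();
        (self.func)();
    }
}

fn demo() -> Vec<String> {
    let log: Log = Rc::default();
    let body_log = log.clone();
    let mut system = FunctionSystem::new("movement", log.clone(), move || {
        body_log.borrow_mut().push("body".to_owned());
    });
    system.run();
    system.run();
    let events = log.borrow().clone();
    events
}

fn main() {
    for event in demo() {
        println!("{event}");
    }
}
```

With this layout the executor needs no knowledge of spans at all; any code path that runs the system gets the span for free.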

github-actions · 2023-08-15T20:58:46Z

Example alien_cake_addict failed to run, please try running it locally and check the result.

aevyrie

This is a really nice improvement.

mockersf · 2023-08-24T18:44:02Z

I worry there's a trap we're missing here, seems almost too good to be true!

Did you check with the tracing devs that there's no issue? That could even be mentioned in their docs.

hymm · 2023-08-24T22:49:47Z

From this comment, #4892 (comment), I assume this avoids hitting the registry, which they say is the slow part.
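A std-only model of that reasoning, assuming the cost being avoided is a registration step that happens whenever a span is created. `Registry`, `Span`, and `CachedSystem` are hypothetical stand-ins for illustration and are not the tracing crate's API; the model only counts how often the registry is hit.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a subscriber's span registry.
/// Creating a span takes a lock and allocates, which models the slow path.
#[derive(Default)]
struct Registry {
    registered: Mutex<Vec<String>>,
}

impl Registry {
    fn new_span(&self, name: &str) -> Span {
        let mut registered = self.registered.lock().unwrap();
        registered.push(name.to_owned());
        Span { id: registered.len() - 1 }
    }

    fn registrations(&self) -> usize {
        self.registered.lock().unwrap().len()
    }
}

/// Cheap handle to an already registered span.
#[derive(Clone, Copy)]
struct Span {
    id: usize,
}

/// Uncached: every run creates (and registers) a fresh span.
fn run_uncached(registry: &Registry, name: &str) -> usize {
    registry.new_span(name).id
}

/// Cached: the span is created once and reused on every run.
struct CachedSystem {
    span: Span,
}

impl CachedSystem {
    fn new(registry: &Registry, name: &str) -> Self {
        Self { span: registry.new_span(name) }
    }

    fn run(&self) -> usize {
        self.span.id
    }
}

fn registrations_uncached(frames: usize) -> usize {
    let registry = Registry::default();
    for _ in 0..frames {
        run_uncached(&registry, "movement");
    }
    registry.registrations()
}

fn registrations_cached(frames: usize) -> usize {
    let registry = Registry::default();
    let system = CachedSystem::new(&registry, "movement");
    for _ in 0..frames {
        system.run();
    }
    registry.registrations()
}

fn main() {
    println!("uncached registry hits: {}", registrations_uncached(1000));
    println!("cached registry hits:   {}", registrations_cached(1000));
}
```

In the model, the uncached path hits the registry once per frame, while the cached path hits it once in total.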

alice-i-cecile · 2023-09-13T19:10:05Z

Really simple and nice!

SkiFire13 · 2023-09-13T19:13:28Z

crates/bevy_ecs/src/schedule/executor/multi_threaded.rs

+        let task = task.instrument(
+            self.system_task_metadata[system_index]
+                .system_task_span
+                .clone(),
+        );


This clone makes me wonder: do we really need to instrument the Future as well? Normally this is done because Futures can suspend execution at .await points, which would mess up spans, but here we never await, and moreover we already measure run_unsafe internally, which should account for most of the execution time.

james7132 · 2023-09-13

The instrumentation here does help measure the additional scheduler overhead, which is thankfully very low right now. However, I do see how this adds even more profiling overhead and might not be all that useful to the typical user. Not sure how best to approach toggling this on or off, though.

SkiFire13 · 2023-09-13

My worry is that the time needed for the instrumentation itself may be of the same order of magnitude as the time it is measuring (beyond what is already measured inside run_unsafe). That is, it measures how long it takes to run the catch_unwind, do a non-blocking send on a channel, and check whether the catch_unwind returned an error, all of which are pretty fast.
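To make the three steps named above concrete, here is a std-only sketch of the work that surrounds the system body inside a task. It is an assumption-laden model, not the executor's code: `run_system_task` is a hypothetical function, and `std::sync::mpsc` stands in for whatever channel the executor actually uses.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::mpsc;

/// Hypothetical model of one system task. Returns `true` if the system panicked.
fn run_system_task(
    system_index: usize,
    system: impl FnOnce(),
    sender: &mpsc::Sender<usize>,
) -> bool {
    // 1. Run the system body, catching any panic instead of unwinding the task.
    let result = panic::catch_unwind(AssertUnwindSafe(system));

    // 2. Tell the executor this system is done. Sending on an unbounded
    //    std channel never blocks.
    sender
        .send(system_index)
        .expect("the executor dropped its receiver");

    // 3. Check whether the body panicked.
    result.is_err()
}

fn main() {
    let (sender, receiver) = mpsc::channel();

    let panicked = run_system_task(0, || println!("system 0 ran"), &sender);
    println!("system 0 panicked: {panicked}");

    let panicked = run_system_task(1, || panic!("system 1 failed"), &sender);
    println!("system 1 panicked: {panicked}");

    let finished: Vec<usize> = receiver.try_iter().collect();
    println!("finished systems: {finished:?}");
}
```

Everything outside the `system` closure is what an outer task-level span would measure in addition to the span inside the system itself, which is why the outer measurement is expected to be small.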

djeedai

Minor doc could be improved.

djeedai · 2023-09-13T19:30:46Z

crates/bevy_ecs/src/schedule/executor/multi_threaded.rs

@@ -62,6 +62,9 @@ struct SystemTaskMetadata {
    is_send: bool,
    /// Is `true` if the system is exclusive.
    is_exclusive: bool,
+    /// Cached tracing span for system task


Suggested change

-    /// Cached tracing span for system task
+    /// Tracing span for system task, cached for performance.

From the follow-up PR, Cache parallel iteration spans #9950:

# Objective

We cached system spans in #9390, but another common span seen in most Bevy apps when enabling tracing are Query::par_iter(_mut) related spans.

## Solution

Cache them in QueryState. The one downside to this is that we pay for the memory for every Query(State) instantiated, not just those that are used for parallel iteration, but this shouldn't be a significant cost unless the app is creating hundreds of thousands of Query(State)s regularly.

## Metrics

Tested against `cargo run --profile stress-test --features trace_tracy --example many_cubes`. Yellow is this PR, red is main.

`sync_simple_transforms`:

![image](https://github.com/bevyengine/bevy/assets/3137680/d60f6d69-5586-4424-9d78-aac78992aacd)

`check_visibility`:

![image](https://github.com/bevyengine/bevy/assets/3137680/096a58d2-a330-4a32-b806-09cd524e6e15)

Full frame:

![image](https://github.com/bevyengine/bevy/assets/3137680/3b088cf8-9487-4bc7-a308-026e172d6672)


hymm requested a review from james7132 August 9, 2023 00:11

alice-i-cecile added the C-Performance (A change motivated by improving speed, memory usage or compile times) and A-Diagnostics (Logging, crash handling, error reporting and performance analysis) labels Aug 9, 2023

james7132 approved these changes Aug 9, 2023


hymm force-pushed the cache-system-spans branch from 140b8cc to 3fec176 Compare August 9, 2023 22:24

hymm added 6 commits August 15, 2023 12:48

cache system spans 32ac77d

move spans to task meta data struct cabaece

cache spans for commands 987b987

cargo fmt c6fdcb9

clippy 2859b4d

use cached spans for exclusive systems 5abf79e

hymm force-pushed the cache-system-spans branch from 95b76aa to 4e5ac92 Compare August 15, 2023 20:07

move span into system run 79484d8

hymm force-pushed the cache-system-spans branch from 4e5ac92 to 79484d8 Compare August 15, 2023 20:46

aevyrie approved these changes Aug 24, 2023


james7132 added the S-Ready-For-Final-Review (This PR has been approved by the community. It's ready for a maintainer to consider merging it) label Sep 13, 2023

alice-i-cecile approved these changes Sep 13, 2023


alice-i-cecile added this pull request to the merge queue Sep 13, 2023

alice-i-cecile mentioned this pull request Sep 13, 2023

Tracing's overhead may be too high for profiling CPU-bound operations #4892

Closed

SkiFire13 reviewed Sep 13, 2023


Merged via the queue into bevyengine:main with commit 324c057 Sep 13, 2023

djeedai approved these changes Sep 13, 2023


james7132 mentioned this pull request Sep 28, 2023

Cache parallel iteration spans #9950

Merged

hymm deleted the cache-system-spans branch October 5, 2023 16:12

cart mentioned this pull request Oct 13, 2023

News: Release 0.12 bevyengine/bevy-website#754

Merged

43 tasks

james7132 mentioned this pull request Jan 1, 2024

Fix tracing spans for system #11169

Closed

DJMcNab mentioned this pull request Oct 21, 2024

Don't use full tracing spans in full-tree passes by default linebender/xilem#687

Merged
