-
Notifications
You must be signed in to change notification settings - Fork 721
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
inline
several methods to make tracing::event!
smaller
#2555
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's interesting that inlining these functions actually improves binary size — they must all compile to fewer instructions than a function call, in release mode.
Out of curiosity, would you mind running the tracing
crate's benchmarks before and after this change? It would be interesting to see if there's a meaningful performance delta in those micro benchmarks.
Yes, there are some perf win:
I retried benchmark several times and the improvements seem to be fairly stable. raw log: https://gist.github.com/ldm0/6573935f4979d2645fbcf5bde7361386 |
Thanks, that looks great! I'll merge this once CI passes. :) |
## Motivation Make `tracing::event!` codegen smaller ## Solution Add `inline` to several functions called by `tracing::event!`. Simple example: https://github.com/ldm0/tracing_test After inlining, executable size drops from 746kb to 697kb (`cargo build --release + strip`), saves 50 bytes per `event!`. Test environment: ``` toolchain: nightly-aarch64-apple-darwin rustc-version: rustc 1.70.0-nightly (88fb1b922 2023-04-10) ``` There are also performance improvements in the benchmarks: ``` event/scoped [-40.689% -40.475% -40.228%] event/scoped_recording [-14.972% -14.685% -14.410%] event/global [-48.412% -48.217% -48.010%] span_fields/scoped [-25.317% -24.876% -24.494%] span_fields/global [-39.695% -39.488% -39.242%] span_repeated/global [-27.514% -26.633% -25.298%] static/baseline_single_threaded [-32.275% -32.032% -31.808%] static/single_threaded [-29.628% -29.376% -29.156%] static/enabled_one [-29.777% -29.544% -29.305%] static/enabled_many [-30.901% -30.504% -30.140%] dynamic/baseline_single_threaded [-20.157% -19.880% -19.603%] ``` I retried benchmark several times and the improvements seem to be fairly stable. raw log: https://gist.github.com/ldm0/6573935f4979d2645fbcf5bde7361386
## Motivation Make `tracing::event!` codegen smaller ## Solution Add `inline` to several functions called by `tracing::event!`. Simple example: https://github.com/ldm0/tracing_test After inlining, executable size drops from 746kb to 697kb (`cargo build --release + strip`), saves 50 bytes per `event!`. Test environment: ``` toolchain: nightly-aarch64-apple-darwin rustc-version: rustc 1.70.0-nightly (88fb1b922 2023-04-10) ``` There are also performance improvements in the benchmarks: ``` event/scoped [-40.689% -40.475% -40.228%] event/scoped_recording [-14.972% -14.685% -14.410%] event/global [-48.412% -48.217% -48.010%] span_fields/scoped [-25.317% -24.876% -24.494%] span_fields/global [-39.695% -39.488% -39.242%] span_repeated/global [-27.514% -26.633% -25.298%] static/baseline_single_threaded [-32.275% -32.032% -31.808%] static/single_threaded [-29.628% -29.376% -29.156%] static/enabled_one [-29.777% -29.544% -29.305%] static/enabled_many [-30.901% -30.504% -30.140%] dynamic/baseline_single_threaded [-20.157% -19.880% -19.603%] ``` I retried benchmark several times and the improvements seem to be fairly stable. raw log: https://gist.github.com/ldm0/6573935f4979d2645fbcf5bde7361386
# 0.1.38 (April 25th, 2023) This `tracing` release changes the `Drop` implementation for `Instrumented` `Future`s so that the attached `Span` is entered when dropping the `Future`. This means that events emitted by the `Future`'s `Drop` implementation will now be recorded within its `Span`. It also adds `#[inline]` hints to methods called in the `event!` macro's expansion, for an improvement in both binary size and performance. Additionally, this release updates the `tracing-attributes` dependency to [v0.1.24][attrs-0.1.24], which updates the [`syn`] dependency to v2.x.x. `tracing-attributes` v0.1.24 also includes improvements to the `#[instrument]` macro; see [the `tracing-attributes` 0.1.24 release notes][attrs-0.1.24] for details. ### Added - `Instrumented` futures will now enter the attached `Span` in their `Drop` implementation, allowing events emitted when dropping the future to occur within the span (#2562) - `#[inline]` attributes for methods called by the `event!` macros, making generated code smaller (#2555) - **attributes**: `level` argument to `#[instrument(err)]` and `#[instrument(ret)]` to override the level of the generated return value event (#2335) - **attributes**: Improved compiler error message when `#[instrument]` is added to a `const fn` (#2418) ### Changed - `tracing-attributes`: updated to [0.1.24][attrs-0.1.24] - Removed unneeded `cfg-if` dependency (#2553) - **attributes**: Updated [`syn`] dependency to 2.0 (#2516) ### Fixed - **attributes**: Fix `clippy::unreachable` warnings in `#[instrument]`-generated code (#2356) - **attributes**: Removed unused "visit" feature flag from `syn` dependency (#2530) ### Documented - **attributes**: Documented default level for `#[instrument(err)]` (#2433) - **attributes**: Improved documentation for levels in `#[instrument]` (#2350) Thanks to @nitnelave, @jsgf, @Abhicodes-crypto, @LukeMathWalker, @andrewpollack, @quad, @klensy, @davidpdrsn, @dbidwell94, @ldm0, @NobodyXu, @ilsv, and @daxpedda for contributing to this release! [`syn`]: https://crates.io/crates/syn [attrs-0.1.24]: https://github.com/tokio-rs/tracing/releases/tag/tracing-attributes-0.1.24
# 0.1.38 (April 25th, 2023) This `tracing` release changes the `Drop` implementation for `Instrumented` `Future`s so that the attached `Span` is entered when dropping the `Future`. This means that events emitted by the `Future`'s `Drop` implementation will now be recorded within its `Span`. It also adds `#[inline]` hints to methods called in the `event!` macro's expansion, for an improvement in both binary size and performance. Additionally, this release updates the `tracing-attributes` dependency to [v0.1.24][attrs-0.1.24], which updates the [`syn`] dependency to v2.x.x. `tracing-attributes` v0.1.24 also includes improvements to the `#[instrument]` macro; see [the `tracing-attributes` 0.1.24 release notes][attrs-0.1.24] for details. ### Added - `Instrumented` futures will now enter the attached `Span` in their `Drop` implementation, allowing events emitted when dropping the future to occur within the span (#2562) - `#[inline]` attributes for methods called by the `event!` macros, making generated code smaller (#2555) - **attributes**: `level` argument to `#[instrument(err)]` and `#[instrument(ret)]` to override the level of the generated return value event (#2335) - **attributes**: Improved compiler error message when `#[instrument]` is added to a `const fn` (#2418) ### Changed - `tracing-attributes`: updated to [0.1.24][attrs-0.1.24] - Removed unneeded `cfg-if` dependency (#2553) - **attributes**: Updated [`syn`] dependency to 2.0 (#2516) ### Fixed - **attributes**: Fix `clippy::unreachable` warnings in `#[instrument]`-generated code (#2356) - **attributes**: Removed unused "visit" feature flag from `syn` dependency (#2530) ### Documented - **attributes**: Documented default level for `#[instrument(err)]` (#2433) - **attributes**: Improved documentation for levels in `#[instrument]` (#2350) Thanks to @nitnelave, @jsgf, @Abhicodes-crypto, @LukeMathWalker, @andrewpollack, @quad, @klensy, @davidpdrsn, @dbidwell94, @ldm0, @NobodyXu, @ilsv, and @daxpedda for contributing to this release! [`syn`]: https://crates.io/crates/syn [attrs-0.1.24]: https://github.com/tokio-rs/tracing/releases/tag/tracing-attributes-0.1.24
# 0.1.31 (May 11, 2023) This release of `tracing-core` fixes a bug that caused threads which call `dispatcher::get_default` _before_ a global default subscriber is set to never see the global default once it is set. In addition, it includes improvements for instrumentation performance in some cases, especially when using a global default dispatcher. ### Fixed - Fixed incorrect thread-local caching of `Dispatch::none` if `dispatcher::get_default` is called before `dispatcher::set_global_default` (#2593) ### Changed - Cloning a `Dispatch` that points at a global default subscriber no longer requires an `Arc` reference count increment, improving performance substantially (#2593) - `dispatcher::get_default` no longer attempts to access a thread local if the scoped dispatcher is not in use, improving performance when the default dispatcher is global (#2593) - Added `#[inline]` annotations called by the `event!` and `span!` macros to reduce the size of macro-generated code and improve recording performance (#2555) Thanks to new contributor @ldm0 for contributing to this release!
# 0.1.31 (May 11, 2023) This release of `tracing-core` fixes a bug that caused threads which call `dispatcher::get_default` _before_ a global default subscriber is set to never see the global default once it is set. In addition, it includes improvements for instrumentation performance in some cases, especially when using a global default dispatcher. ### Fixed - Fixed incorrect thread-local caching of `Dispatch::none` if `dispatcher::get_default` is called before `dispatcher::set_global_default` (#2593) ### Changed - Cloning a `Dispatch` that points at a global default subscriber no longer requires an `Arc` reference count increment, improving performance substantially (#2593) - `dispatcher::get_default` no longer attempts to access a thread local if the scoped dispatcher is not in use, improving performance when the default dispatcher is global (#2593) - Added `#[inline]` annotations called by the `event!` and `span!` macros to reduce the size of macro-generated code and improve recording performance (#2555) Thanks to new contributor @ldm0 for contributing to this release!
# 0.1.38 (April 25th, 2023) This `tracing` release changes the `Drop` implementation for `Instrumented` `Future`s so that the attached `Span` is entered when dropping the `Future`. This means that events emitted by the `Future`'s `Drop` implementation will now be recorded within its `Span`. It also adds `#[inline]` hints to methods called in the `event!` macro's expansion, for an improvement in both binary size and performance. Additionally, this release updates the `tracing-attributes` dependency to [v0.1.24][attrs-0.1.24], which updates the [`syn`] dependency to v2.x.x. `tracing-attributes` v0.1.24 also includes improvements to the `#[instrument]` macro; see [the `tracing-attributes` 0.1.24 release notes][attrs-0.1.24] for details. ### Added - `Instrumented` futures will now enter the attached `Span` in their `Drop` implementation, allowing events emitted when dropping the future to occur within the span (tokio-rs#2562) - `#[inline]` attributes for methods called by the `event!` macros, making generated code smaller (tokio-rs#2555) - **attributes**: `level` argument to `#[instrument(err)]` and `#[instrument(ret)]` to override the level of the generated return value event (tokio-rs#2335) - **attributes**: Improved compiler error message when `#[instrument]` is added to a `const fn` (tokio-rs#2418) ### Changed - `tracing-attributes`: updated to [0.1.24][attrs-0.1.24] - Removed unneeded `cfg-if` dependency (tokio-rs#2553) - **attributes**: Updated [`syn`] dependency to 2.0 (tokio-rs#2516) ### Fixed - **attributes**: Fix `clippy::unreachable` warnings in `#[instrument]`-generated code (tokio-rs#2356) - **attributes**: Removed unused "visit" feature flag from `syn` dependency (tokio-rs#2530) ### Documented - **attributes**: Documented default level for `#[instrument(err)]` (tokio-rs#2433) - **attributes**: Improved documentation for levels in `#[instrument]` (tokio-rs#2350) Thanks to @nitnelave, @jsgf, @Abhicodes-crypto, @LukeMathWalker, @andrewpollack, @quad, @klensy, @davidpdrsn, @dbidwell94, @ldm0, @NobodyXu, @ilsv, and @daxpedda for contributing to this release! [`syn`]: https://crates.io/crates/syn [attrs-0.1.24]: https://github.com/tokio-rs/tracing/releases/tag/tracing-attributes-0.1.24
# 0.1.31 (May 11, 2023) This release of `tracing-core` fixes a bug that caused threads which call `dispatcher::get_default` _before_ a global default subscriber is set to never see the global default once it is set. In addition, it includes improvements for instrumentation performance in some cases, especially when using a global default dispatcher. ### Fixed - Fixed incorrect thread-local caching of `Dispatch::none` if `dispatcher::get_default` is called before `dispatcher::set_global_default` (tokio-rs#2593) ### Changed - Cloning a `Dispatch` that points at a global default subscriber no longer requires an `Arc` reference count increment, improving performance substantially (tokio-rs#2593) - `dispatcher::get_default` no longer attempts to access a thread local if the scoped dispatcher is not in use, improving performance when the default dispatcher is global (tokio-rs#2593) - Added `#[inline]` annotations called by the `event!` and `span!` macros to reduce the size of macro-generated code and improve recording performance (tokio-rs#2555) Thanks to new contributor @ldm0 for contributing to this release!
Motivation
Make
tracing::event!
codegen smallerSolution
Add
inline
to several functions called bytracing::event!
.Simple example: https://github.com/ldm0/tracing_test
After inlining, executable size drops from 746kb to 697kb(
cargo build --release + strip
), saves 50 bytes perevent!
.Test environment: