Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rebuild_interest_cache on Dispatch's inner Subscriber Drop #909

Closed
wants to merge 2 commits into from

Conversation

jeromegn
Copy link
Contributor

@jeromegn jeromegn commented Aug 8, 2020

Motivation

Filters can leak (#902) when switching subscriber using WithSubscriber.

Solution

Rebuild the Interest cache when dropping the last Subscriber Arc clone from the Dispatcher by wrapping it in a inner struct.

@jeromegn
Copy link
Contributor Author

jeromegn commented Aug 8, 2020

Benchmarks show a regression and do not successfully complete:

$ cargo bench --all
   Compiling tracing-core v0.1.13 (/home/jerome/src/github.com/superfly/tracing/tracing-core)
   Compiling tracing v0.1.18 (/home/jerome/src/github.com/superfly/tracing/tracing)
   Compiling tracing-log v0.1.1 (/home/jerome/src/github.com/superfly/tracing/tracing-log)
   Compiling tracing-serde v0.1.1 (/home/jerome/src/github.com/superfly/tracing/tracing-serde)
   Compiling tracing-futures v0.2.6 (/home/jerome/src/github.com/superfly/tracing/tracing-futures)
   Compiling tracing-subscriber v0.2.10 (/home/jerome/src/github.com/superfly/tracing/tracing-subscriber)
   Compiling tracing-attributes v0.1.9 (/home/jerome/src/github.com/superfly/tracing/tracing-attributes)
   Compiling tracing-tower v0.1.0 (/home/jerome/src/github.com/superfly/tracing/tracing-tower)
   Compiling tracing-flame v0.1.0 (/home/jerome/src/github.com/superfly/tracing/tracing-flame)
   Compiling tracing-error v0.1.2 (/home/jerome/src/github.com/superfly/tracing/tracing-error)
   Compiling tracing-journald v0.1.0 (/home/jerome/src/github.com/superfly/tracing/tracing-journald)
   Compiling tracing-opentelemetry v0.6.0 (/home/jerome/src/github.com/superfly/tracing/tracing-opentelemetry)
   Compiling tracing-appender v0.1.1 (/home/jerome/src/github.com/superfly/tracing/tracing-appender)
    Finished bench [optimized] target(s) in 33.14s
     Running target/release/deps/tracing-b5a851a2c4fb79c3

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/release/deps/no_subscriber-a870917cfc751166
Gnuplot not found, using plotters backend
no_subscriber/span      time:   [2.6768 ns 2.6860 ns 2.6984 ns]
                        change: [-9.4320% -8.1213% -6.9933%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
no_subscriber/event     time:   [262.29 ps 265.22 ps 268.36 ps]
                        change: [+0.5575% +2.7164% +4.8982%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe
no_subscriber/relaxed_load
                        time:   [761.07 ps 764.08 ps 767.47 ps]
                        change: [-4.0225% -2.7197% -1.3122%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe
no_subscriber/acquire_load
                        time:   [704.68 ps 706.20 ps 708.02 ps]
                        change: [-6.7596% -2.2168% +6.2535%] (p = 0.72 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
no_subscriber/log       time:   [250.14 ps 250.83 ps 251.61 ps]
                        change: [-4.9319% -4.3776% -3.8493%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 37 outliers among 100 measurements (37.00%)
  13 (13.00%) low severe
  3 (3.00%) low mild
  1 (1.00%) high mild
  20 (20.00%) high severe

no_subscriber_field/span
                        time:   [2.9629 ns 2.9732 ns 2.9851 ns]
                        change: [+5.2735% +5.8329% +6.3371%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe
no_subscriber_field/event
                        time:   [346.48 ps 355.94 ps 365.84 ps]
                        change: [+26.366% +30.357% +34.796%] (p = 0.00 < 0.05)
                        Performance has regressed.
no_subscriber_field/log time:   [556.73 ps 558.75 ps 561.48 ps]
                        change: [-0.7191% -0.3158% +0.0947%] (p = 0.15 > 0.05)
                        No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) low mild
  5 (5.00%) high mild
  7 (7.00%) high severe

     Running target/release/deps/subscriber-6dd12b6810d45fe6
Gnuplot not found, using plotters backend
span_no_fields          time:   [34.297 ns 34.633 ns 35.096 ns]
                        change: [+3.5840% +5.1468% +6.9627%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 16 outliers among 100 measurements (16.00%)
  10 (10.00%) high mild
  6 (6.00%) high severe

enter_span              time:   [4.5445 ns 4.5814 ns 4.6194 ns]
                        change: [+1.2390% +2.1834% +3.0529%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

span_repeatedly         time:   [4.3345 us 4.3449 us 4.3574 us]
                        change: [+3.8190% +4.3302% +4.7780%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

span_with_fields        time:   [58.121 ns 58.325 ns 58.603 ns]
                        change: [-1.2199% -0.5907% +0.0064%] (p = 0.07 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high severe

span_with_fields_record time:   [366.73 ns 367.60 ns 368.64 ns]
                        change: [+1.4753% +2.0601% +2.6241%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  11 (11.00%) high severe

dispatch/no_dispatch_get_ref
                        time:   [7.5588 ns 7.5812 ns 7.6068 ns]
                        change: [-1.6785% -1.2510% -0.8497%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
dispatch/no_dispatch_get_clone
                        time:   [17.265 ns 17.322 ns 17.401 ns]
                        change: [+1.5332% +1.8358% +2.1704%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
dispatch/get_ref        time:   [5.7847 ns 5.8087 ns 5.8445 ns]
                        change: [-0.2567% +0.2453% +0.7602%] (p = 0.35 > 0.05)
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
dispatch/get_clone      time:   [15.514 ns 15.541 ns 15.570 ns]
                        change: [-0.1183% +0.2165% +0.6190%] (p = 0.24 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe

     Running target/release/deps/tracing_appender-000a958cc227d4f5

running 18 tests
test non_blocking::test::backpressure_exerted ... ignored
test non_blocking::test::logs_dropped_if_lossy ... ignored
test non_blocking::test::multi_threaded_writes ... ignored
test rolling::test::test_join_date_never ... ignored
test rolling::test::test_next_date_daily ... ignored
test rolling::test::test_next_date_hourly ... ignored
test rolling::test::test_next_date_minutely ... ignored
test rolling::test::test_next_date_never ... ignored
test rolling::test::test_rotation_path_daily ... ignored
test rolling::test::test_rotation_path_hourly ... ignored
test rolling::test::test_rotation_path_minutely ... ignored
test rolling::test::test_round_date_daily ... ignored
test rolling::test::test_round_date_hourly ... ignored
test rolling::test::test_round_date_minutely ... ignored
test rolling::test::write_daily_log ... ignored
test rolling::test::write_hourly_log ... ignored
test rolling::test::write_minutely_log ... ignored
test rolling::test::write_never_log ... ignored

test result: ok. 0 passed; 0 failed; 18 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_attributes-50e7c15b01f05443

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_core-37d13fd94e104bd3

running 17 tests
test dispatcher::test::default_dispatch ... ignored
test dispatcher::test::default_no_subscriber ... ignored
test dispatcher::test::dispatch_downcasts ... ignored
test dispatcher::test::dispatch_is ... ignored
test dispatcher::test::events_dont_infinite_loop ... ignored
test dispatcher::test::spans_dont_infinite_loop ... ignored
test field::test::empty_fields_are_skipped ... ignored
test field::test::empty_value_set_is_empty ... ignored
test field::test::fields_from_other_callsets_are_skipped ... ignored
test field::test::record_debug_fn ... ignored
test field::test::sparse_value_sets_are_not_empty ... ignored
test field::test::value_set_with_no_values_is_empty ... ignored
test field::test::value_sets_with_fields_from_other_callsites_are_empty ... ignored
test metadata::tests::filter_level_conversion ... ignored
test metadata::tests::level_filter_is_usize_sized ... ignored
test metadata::tests::level_filter_reprs ... ignored
test metadata::tests::level_from_str ... ignored

test result: ok. 0 passed; 0 failed; 17 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_error-f39fbdfe4739fef4

running 3 tests
test backtrace::tests::capture_empty ... ignored
test backtrace::tests::capture_supported ... ignored
test backtrace::tests::capture_unsupported ... ignored

test result: ok. 0 passed; 0 failed; 3 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_flame-389f80994dcfbe8c

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_futures-ad3c154e18eef650

running 4 tests
test tests::futures_01_tests::future_enter_exit_is_reasonable ... ignored
test tests::futures_01_tests::future_error_ends_span ... ignored
test tests::futures_01_tests::span_follows_future_onto_threadpool ... ignored
test tests::futures_01_tests::stream_enter_exit_is_reasonable ... ignored

test result: ok. 0 passed; 0 failed; 4 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_journald-a93959fcb5e56904

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_log-f28714cec1292fb6

running 5 tests
test test::debug_callsite_is_correct ... ignored
test test::error_callsite_is_correct ... ignored
test test::info_callsite_is_correct ... ignored
test test::trace_callsite_is_correct ... ignored
test test::warn_callsite_is_correct ... ignored

test result: ok. 0 passed; 0 failed; 5 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_macros-ee96079f8ae1abeb

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_opentelemetry-2d09d3a33e48b6e9

running 2 tests
test layer::tests::dynamic_span_names ... ignored
test layer::tests::span_kind ... ignored

test result: ok. 0 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_serde-f0d69309b94a2be6

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/release/deps/tracing_subscriber-aa4479b15287ce6b

running 66 tests
test field::delimited::test::delimited_new_visitor ... ignored
test field::delimited::test::delimited_visitor ... ignored
test filter::env::directive::test::directive_ordering_by_field_num ... ignored
test filter::env::directive::test::directive_ordering_by_span ... ignored
test filter::env::directive::test::directive_ordering_by_target_len ... ignored
test filter::env::directive::test::directive_ordering_uses_lexicographic_when_equal ... ignored
test filter::env::directive::test::parse_directives_empty_level ... ignored
test filter::env::directive::test::parse_directives_global ... ignored
test filter::env::directive::test::parse_directives_invalid_crate ... ignored
test filter::env::directive::test::parse_directives_invalid_level ... ignored
test filter::env::directive::test::parse_directives_ralith ... ignored
test filter::env::directive::test::parse_directives_string_level ... ignored
test filter::env::directive::test::parse_directives_valid ... ignored
test filter::env::directive::test::parse_directives_valid_with_spans ... ignored
test filter::env::directive::test::parse_level_directives ... ignored
test filter::env::directive::test::parse_numeric_level_directives ... ignored
test filter::env::directive::test::parse_uppercase_level_directives ... ignored
test filter::env::tests::callsite_enabled_includes_span_directive ... ignored
test filter::env::tests::callsite_enabled_includes_span_directive_field ... ignored
test filter::env::tests::callsite_enabled_includes_span_directive_multiple_fields ... ignored
test filter::env::tests::callsite_enabled_no_span_directive ... ignored
test filter::env::tests::callsite_off ... ignored
test filter::env::tests::roundtrip ... ignored
test fmt::fmt_layer::test::fmt_layer_downcasts ... ignored
test fmt::fmt_layer::test::fmt_layer_downcasts_to_parts ... ignored
test fmt::fmt_layer::test::impls ... ignored
test fmt::fmt_layer::test::is_lookup_span ... ignored
test fmt::fmt_layer::test::synthesize_span_active ... ignored
test fmt::fmt_layer::test::synthesize_span_close ... ignored
test fmt::fmt_layer::test::synthesize_span_close_no_timing ... ignored
test fmt::fmt_layer::test::synthesize_span_full ... ignored
test fmt::fmt_layer::test::synthesize_span_none ... ignored
test fmt::format::json::test::json ... ignored
test fmt::format::json::test::json_disabled_current_span_event ... ignored
test fmt::format::json::test::json_disabled_span_list_event ... ignored
test fmt::format::json::test::json_flattened_event ... ignored
test fmt::format::json::test::json_nested_span ... ignored
test fmt::format::json::test::json_no_span ... ignored
test fmt::format::json::test::record_works ... ignored
test fmt::format::test::format_nanos ... ignored
test fmt::format::test::overridden_parents ... ignored
test fmt::format::test::overridden_parents_in_scope ... ignored
test fmt::format::test::with_ansi_false ... ignored
test fmt::format::test::with_ansi_true ... ignored
test fmt::format::test::without_level ... ignored
test fmt::test::impls ... ignored
test fmt::test::is_lookup_span ... ignored
test fmt::test::subscriber_downcasts ... ignored
test fmt::test::subscriber_downcasts_to_parts ... ignored
test fmt::writer::test::custom_writer_closure ... ignored
test fmt::writer::test::custom_writer_struct ... ignored
test layer::tests::downcasts_to_layer ... ignored
test layer::tests::downcasts_to_subscriber ... ignored
test layer::tests::layer_is_subscriber ... ignored
test layer::tests::three_layers_are_subscriber ... ignored
test layer::tests::two_layers_are_subscriber ... ignored
test registry::extensions::test_extensions ... ignored
test registry::sharded::tests::child_closes_grandparent ... ignored
test registry::sharded::tests::child_closes_parent ... ignored
test registry::sharded::tests::multiple_layers_can_access_closed_span ... ignored
test registry::sharded::tests::single_layer_can_access_closed_span ... ignored
test registry::sharded::tests::span_enter_guards_are_dropped_out_of_order ... ignored
test registry::sharded::tests::spans_are_only_closed_when_the_last_ref_drops ... ignored
test registry::sharded::tests::spans_are_removed_from_registry ... ignored
test registry::stack::tests::pop_first_span ... ignored
test registry::stack::tests::pop_last_span ... ignored

test result: ok. 0 passed; 0 failed; 66 ignored; 0 measured; 0 filtered out

     Running target/release/deps/filter-8df557eb6cc4158e
Gnuplot not found, using plotters backend
static/baseline_single_threaded
                        time:   [69.807 ns 70.009 ns 70.276 ns]
                        change: [+4.5512% +5.0735% +5.6265%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 30 outliers among 100 measurements (30.00%)
  8 (8.00%) low severe
  5 (5.00%) low mild
  5 (5.00%) high mild
  12 (12.00%) high severe
static/single_threaded  time:   [35.281 ns 35.396 ns 35.545 ns]
                        change: [-1.6130% -0.4449% +0.7568%] (p = 0.47 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
static/enabled_one      time:   [17.419 ns 17.567 ns 17.764 ns]
                        change: [+2.6294% +3.4615% +4.2441%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
static/enabled_many     time:   [17.412 ns 17.466 ns 17.534 ns]
                        change: [+5.8135% +6.3704% +6.9221%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  9 (9.00%) high mild
  2 (2.00%) high severe
static/disabled_level_one
                        time:   [1.7224 ns 1.7280 ns 1.7345 ns]
                        change: [+20.705% +21.687% +22.565%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  8 (8.00%) high mild
  4 (4.00%) high severe
static/disabled_level_many
                        time:   [858.08 ps 860.86 ps 864.86 ps]
                        change: [+2.7059% +3.3586% +4.0819%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 27 outliers among 100 measurements (27.00%)
  6 (6.00%) low severe
  3 (3.00%) low mild
  4 (4.00%) high mild
  14 (14.00%) high severe
static/disabled_one     time:   [1.4845 ns 1.4885 ns 1.4930 ns]
                        change: [-12.222% -11.701% -11.233%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
static/disabled_many    time:   [1.7878 ns 1.8089 ns 1.8325 ns]
                        change: [+26.250% +27.478% +28.817%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
Benchmarking static/baseline_multithreaded: Warming up for 3.0000 sthread '<unnamed>' panicked at 'failed to set up alternative stack guard page', src/libstd/sys/unix/stack_overflow.rs:155:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5
error: process didn't exit successfully: `/home/jerome/src/github.com/superfly/tracing/target/release/deps/filter-8df557eb6cc4158e --bench` (signal: 6, SIGABRT: process abort signal)

tracing-core/src/dispatcher.rs Outdated Show resolved Hide resolved
@jeromegn
Copy link
Contributor Author

jeromegn commented Aug 9, 2020

I ran lldb on the stalled tests and got a mutex deadlock: https://gist.github.com/jeromegn/7dff533c92f1aae4a71b667ea9f4010a

Doing Registrar#upgrade triggers a Drop on the Inner which tries to re-acquire a lock with rebuild_interest_cache.

@jeromegn
Copy link
Contributor Author

This is likely not the right approach

@jeromegn jeromegn closed this Aug 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants