refactor to add shim for per-tenant context resolving/interning #232

lukesteensen · 2024-09-03T22:55:49Z

The rebase on #217 was a little hairy, so there may be some bits left in a weird state.

lib/saluki-components/src/sources/dogstatsd/mod.rs

tobz

Overall, this is looking reasonable to me!

I think we just need to clean up the literal TODO comments and whatnot; otherwise the code/naming/etc. is generally fine as-is.

lib/saluki-io/src/deser/codec/dogstatsd/mod.rs

tobz · 2024-09-04T19:24:35Z

@@ -623,7 +606,7 @@ fn limit_str_to_len(s: &str, limit: usize) -> &str {
 /// `TagSplitter` can be cloned to create a new iterator with its own iteration state. The same underlying input byte
 /// slice is retained.
 #[derive(Clone)]
-struct TagSplitter<'a> {
+pub struct TagSplitter<'a> {


It's fine for now; this is more a note to myself that I don't love this being exposed, and it'd be nice to find a clean way to avoid exposing it just to support resolving the context.

It's just a weird/confusing enough `IntoIterator` impl (and, in that same vein, weird in how it's used) that my brain prefers it being hidden entirely.

lukesteensen · 2024-09-10

I agree it is a bit odd. I initially thought about returning something more like an array of tags, but this representation is nicely compact and sidesteps having to know ahead of time how many tags there might be in order to avoid heap allocation.

tobz · 2024-09-10

Yeah, I have no problem with the implementation itself; I just wish there were a somewhat cleaner version of it, given that it needs to be public now.
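To illustrate the trade-off discussed in this thread, here is a minimal sketch of a cloneable, borrowing tag iterator. It is not the `TagSplitter` from this PR: the name `BorrowedTags`, the comma-separated input, and the choice to skip empty segments are all assumptions made for illustration, and it implements `Iterator` directly rather than `IntoIterator`. It only shows why such a type can stay compact and avoid heap allocation without knowing the tag count up front.

```rust
/// Hypothetical stand-in for a borrowed tag iterator (NOT the PR's `TagSplitter`).
/// The only state is a cursor into the original input, so `Clone` is a cheap copy.
#[derive(Clone)]
pub struct BorrowedTags<'a> {
    remaining: &'a str,
}

impl<'a> BorrowedTags<'a> {
    /// Wraps a raw, comma-separated tag string (assumed format, e.g. "env:prod,service:web").
    pub fn new(raw: &'a str) -> Self {
        Self { remaining: raw }
    }
}

impl<'a> Iterator for BorrowedTags<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        // Loop so that empty segments (from ",," or a trailing comma) are skipped.
        while !self.remaining.is_empty() {
            let remaining = self.remaining;
            let (tag, rest) = match remaining.split_once(',') {
                Some((tag, rest)) => (tag, rest),
                None => (remaining, ""),
            };
            self.remaining = rest;
            if !tag.is_empty() {
                return Some(tag);
            }
        }
        None
    }
}

/// Example consumer: counts tags on a clone, then walks the original from the start.
pub fn count_then_first(tags: &BorrowedTags<'_>) -> (usize, Option<String>) {
    let count = tags.clone().count();
    let first = tags.clone().next().map(str::to_owned);
    (count, first)
}
```

Because each clone carries its own cursor over the same underlying input, a caller can iterate the tags more than once (for example, once to hash and once to intern) without collecting them into a `Vec` first.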

Signed-off-by: Luke Steensen <luke.steensen@gmail.com>

pr-commenter · 2024-09-10T19:42:28Z

Regression Detector (Saluki)

Regression Detector Results

Run ID: fa923b72-dae4-4808-9a80-794b09d0110d

Baseline: 68c99c1
Comparison: c6170f3

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

Significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
❌	dsd_uds_500mb_3k_contexts	ingress throughput	-13.64	[-13.72, -13.55]	1
✅	dsd_uds_100mb_3k_contexts_distributions_only	memory utilization	-15.39	[-15.57, -15.21]	1

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials
➖	dsd_uds_1mb_50k_contexts_memlimit	ingress throughput	+1.89	[-1.43, +5.21]	1
➖	dsd_uds_1mb_3k_contexts	ingress throughput	+0.02	[-0.01, +0.04]	1
➖	dsd_uds_512kb_3k_contexts	ingress throughput	+0.01	[-0.01, +0.03]	1
➖	dsd_uds_10mb_3k_contexts	ingress throughput	+0.01	[-0.04, +0.05]	1
➖	dsd_uds_100mb_3k_contexts	ingress throughput	+0.01	[-0.00, +0.02]	1
➖	dsd_uds_50mb_10k_contexts_no_inlining_no_allocs	ingress throughput	+0.00	[-0.04, +0.04]	1
➖	dsd_uds_50mb_10k_contexts_no_inlining	ingress throughput	+0.00	[-0.00, +0.00]	1
➖	dsd_uds_1mb_50k_contexts	ingress throughput	-0.00	[-0.00, +0.00]	1
➖	dsd_uds_100mb_250k_contexts	ingress throughput	-0.10	[-0.38, +0.17]	1
❌	dsd_uds_500mb_3k_contexts	ingress throughput	-13.64	[-13.72, -13.55]	1
✅	dsd_uds_100mb_3k_contexts_distributions_only	memory utilization	-15.39	[-15.57, -15.21]	1

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
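The three criteria above can be sketched as a small predicate. This is a hedged illustration only, not the Regression Detector's actual code: the struct, field names, and function name are hypothetical, and the `erratic` flag is assumed to be `false` for the rows shown in this report since the report does not list it.

```rust
/// Hypothetical summary of one experiment row; names are illustrative only.
pub struct ExperimentResult {
    /// Estimated "Δ mean %" (comparison minus baseline).
    pub delta_mean_pct: f64,
    /// Lower bound of the 90% confidence interval for "Δ mean %".
    pub ci_low: f64,
    /// Upper bound of the 90% confidence interval for "Δ mean %".
    pub ci_high: f64,
    /// Whether the experiment's configuration marks it "erratic".
    pub erratic: bool,
}

/// Effect size tolerance quoted in the report: |Δ mean %| >= 5.00%.
pub const EFFECT_SIZE_TOLERANCE_PCT: f64 = 5.0;

/// Returns true when all three criteria hold, i.e. the change is worth a closer look.
pub fn is_significant_change(r: &ExperimentResult) -> bool {
    // 1. The estimated change is large enough.
    let big_enough = r.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE_PCT;
    // 2. The confidence interval lies entirely on one side of zero.
    let ci_excludes_zero = r.ci_low > 0.0 || r.ci_high < 0.0;
    // 3. The experiment is not marked erratic.
    big_enough && ci_excludes_zero && !r.erratic
}
```

Applied to rows from the reports in this thread: `dsd_uds_500mb_3k_contexts` (-13.64, CI [-13.72, -13.55]) satisfies all three criteria, while `dsd_uds_1mb_50k_contexts_memlimit` (+1.89, CI [-1.43, +5.21]) fails both the size and the interval checks. The DogStatsD row for `dsd_uds_100mb_3k_contexts_distributions_only` (+0.57, CI [+0.39, +0.74]) has an interval that excludes zero but is still not flagged, because the change is below the 5% tolerance.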

pr-commenter · 2024-09-10T19:50:54Z

Regression Detector (DogStatsD)

Regression Detector Results

Run ID: 2c8bb2d4-84a7-4832-8db9-2fb0bd7520fb

Baseline: 7.55.2
Comparison: 7.55.3

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials
➖	dsd_uds_100mb_3k_contexts_distributions_only	memory utilization	+0.57	[+0.39, +0.74]	1
➖	dsd_uds_512kb_3k_contexts	ingress throughput	+0.00	[-0.04, +0.04]	1
➖	dsd_uds_1mb_50k_contexts_memlimit	ingress throughput	-0.00	[-0.00, +0.00]	1
➖	dsd_uds_500mb_3k_contexts	ingress throughput	-0.00	[-0.01, +0.01]	1
➖	dsd_uds_100mb_250k_contexts	ingress throughput	-0.00	[-0.05, +0.04]	1
➖	dsd_uds_100mb_3k_contexts	ingress throughput	-0.01	[-0.04, +0.02]	1
➖	dsd_uds_10mb_3k_contexts	ingress throughput	-0.01	[-0.04, +0.01]	1
➖	dsd_uds_1mb_3k_contexts	ingress throughput	-0.02	[-0.06, +0.03]	1
➖	dsd_uds_1mb_50k_contexts	ingress throughput	-0.02	[-0.04, +0.01]	1

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".

pr-commenter · 2024-09-10T19:51:34Z

Regression Detector Links

Experiment Result Links

experiment	link(s)
dsd_uds_100mb_250k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_100mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_100mb_3k_contexts_distributions_only	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_10mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_50k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_50k_contexts_memlimit	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_500mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_512kb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_50mb_10k_contexts_no_inlining (ADP only)	[Profiling (ADP)] [SMP Dashboard]
dsd_uds_50mb_10k_contexts_no_inlining_no_allocs (ADP only)	[Profiling (ADP)] [SMP Dashboard]

Signed-off-by: Luke Steensen <luke.steensen@gmail.com>

lukesteensen added 2 commits September 3, 2024 17:37

refactor and add shim for multitenant resolvers

ce2a7e5

rearrange and fix tests

e6e46fa

lukesteensen requested review from a team as code owners September 3, 2024 22:55

github-actions bot added labels on Sep 3, 2024: area/io (General I/O and networking), area/components (Sources, transforms, and destinations), source/dogstatsd (DogStatsD source)

nightly fmt

ab19cba

tobz reviewed Sep 4, 2024

lib/saluki-components/src/sources/dogstatsd/mod.rs (outdated, resolved)

tobz reviewed Sep 4, 2024

lukesteensen added 2 commits September 10, 2024 13:59

drop origin detection conditional

dff8106

Signed-off-by: Luke Steensen <luke.steensen@gmail.com>

Merge branch 'main' into multitenant-refactor-pt2

e851505

Signed-off-by: Luke Steensen <luke.steensen@gmail.com>

lukesteensen and others added 3 commits September 11, 2024 14:20

Merge branch 'main' into multitenant-refactor-pt2

9eb91bb

get rid of some unnecessary clones and reserves

26e9735

Signed-off-by: Luke Steensen <luke.steensen@gmail.com>

mutably share the resolver

c6170f3
