[APR-205] chore: allow for contexts to be expired from `ContextResolver` #225

tobz · 2024-08-30T20:18:06Z

Context

Work in progress.

pr-commenter · 2024-08-30T20:34:58Z

Regression Detector (DogStatsD)

Regression Detector Results

Run ID: a69dcce6-3f75-48ab-b27e-ccace597bd7e

Baseline: 7.55.2
Comparison: 7.55.3

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

No significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

There were no significant changes in experiment optimization goals at this confidence level and effect size tolerance.

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials
➖	dsd_uds_100mb_3k_contexts_distributions_only	memory utilization	+1.55	[+1.38, +1.72]	1
➖	dsd_uds_512kb_3k_contexts	ingress throughput	+0.02	[-0.01, +0.04]	1
➖	dsd_uds_500mb_3k_contexts	ingress throughput	+0.00	[-0.01, +0.01]	1
➖	dsd_uds_1mb_3k_contexts	ingress throughput	+0.00	[-0.00, +0.00]	1
➖	dsd_uds_1mb_50k_contexts	ingress throughput	-0.00	[-0.03, +0.03]	1
➖	dsd_uds_1mb_50k_contexts_memlimit	ingress throughput	-0.00	[-0.00, +0.00]	1
➖	dsd_uds_100mb_250k_contexts	ingress throughput	-0.01	[-0.05, +0.03]	1
➖	dsd_uds_100mb_3k_contexts	ingress throughput	-0.02	[-0.04, +0.01]	1
➖	dsd_uds_10mb_3k_contexts	ingress throughput	-0.04	[-0.06, -0.01]	1

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".

pr-commenter · 2024-08-30T20:52:43Z

Regression Detector (Saluki)

Regression Detector Results

Run ID: 7632c9ca-5b9a-45bf-9f09-6d9cc0e9fbc5

Baseline: c1acd46
Comparison: d2701a0

Performance changes are noted in the perf column of each table:

✅ = significantly better comparison variant performance
❌ = significantly worse comparison variant performance
➖ = no significant change in performance

Significant changes in experiment optimization goals

Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%

perf	experiment	goal	Δ mean %	Δ mean % CI	trials	links
❌	dsd_uds_100mb_3k_contexts_distributions_only	memory utilization	+5.57	[+5.33, +5.81]	1
❌	dsd_uds_100mb_250k_contexts	ingress throughput	-5.79	[-6.30, -5.28]	1

Fine details of change detection per experiment

perf	experiment	goal	Δ mean %	Δ mean % CI	trials
❌	dsd_uds_100mb_3k_contexts_distributions_only	memory utilization	+5.57	[+5.33, +5.81]	1
➖	dsd_uds_1mb_50k_contexts_memlimit	ingress throughput	+4.62	[+1.38, +7.86]	1
➖	dsd_uds_1mb_3k_contexts	ingress throughput	+0.03	[+0.00, +0.06]	1
➖	dsd_uds_50mb_10k_contexts_no_inlining_no_allocs	ingress throughput	+0.00	[-0.02, +0.03]	1
➖	dsd_uds_100mb_3k_contexts	ingress throughput	+0.00	[-0.01, +0.01]	1
➖	dsd_uds_512kb_3k_contexts	ingress throughput	-0.00	[-0.03, +0.03]	1
➖	dsd_uds_1mb_50k_contexts	ingress throughput	-0.00	[-0.00, +0.00]	1
➖	dsd_uds_50mb_10k_contexts_no_inlining	ingress throughput	-0.00	[-0.05, +0.04]	1
➖	dsd_uds_10mb_3k_contexts	ingress throughput	-0.02	[-0.04, +0.01]	1
➖	dsd_uds_500mb_3k_contexts	ingress throughput	-1.26	[-1.33, -1.19]	1
❌	dsd_uds_100mb_250k_contexts	ingress throughput	-5.79	[-6.30, -5.28]	1

Explanation

A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".

For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:

Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
Its configuration does not mark it "erratic".
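
As a rough illustration of the three criteria above, the decision can be sketched as a single predicate. The struct, field names, and the strict treatment of the interval bounds are assumptions made for this example, not the detector's actual implementation.

```rust
/// Hypothetical summary of one experiment row from the tables above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ExperimentResult {
    /// Estimated change in the mean, in percent ("Δ mean %").
    pub delta_mean_pct: f64,
    /// Lower and upper bounds of the 90% confidence interval ("Δ mean % CI").
    pub ci_low: f64,
    pub ci_high: f64,
    /// Whether the experiment's configuration marks it "erratic" (assumed flag).
    pub erratic: bool,
}

/// Effect size tolerance quoted in the report: |Δ mean %| ≥ 5.00%.
pub const EFFECT_SIZE_TOLERANCE_PCT: f64 = 5.0;

/// Returns true only when all three criteria hold.
pub fn is_flagged(result: &ExperimentResult) -> bool {
    let big_enough = result.delta_mean_pct.abs() >= EFFECT_SIZE_TOLERANCE_PCT;
    // The interval excludes zero when it lies entirely on one side of it.
    let ci_excludes_zero = result.ci_low > 0.0 || result.ci_high < 0.0;
    big_enough && ci_excludes_zero && !result.erratic
}
```

Applied to the Saluki table, this flags the +5.57 and -5.79 rows but not the +4.62 row (below tolerance) or the -1.26 row (interval excludes zero, but the effect is too small).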

pr-commenter · 2024-08-30T20:53:15Z

Regression Detector Links

Experiment Result Links

experiment	link(s)
dsd_uds_100mb_250k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_100mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_100mb_3k_contexts_distributions_only	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_10mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_50k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_1mb_50k_contexts_memlimit	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_500mb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_512kb_3k_contexts	[Profiling (ADP)] [Profiling (DSD)] [SMP Dashboard]
dsd_uds_50mb_10k_contexts_no_inlining (ADP only)	[Profiling (ADP)] [SMP Dashboard]
dsd_uds_50mb_10k_contexts_no_inlining_no_allocs (ADP only)	[Profiling (ADP)] [SMP Dashboard]

tobz · 2024-09-09T19:44:57Z

Just to jot down some notes here..

The two biggest problems are that what we really want to be able to do is:

avoid having to allocate in order to signal that a context is no longer used
leave the metric around for a little while (like a cache, with a TTL) so that we aren't just immediately blowing away resolved contexts

We can solve the first problem with Arc<T>-like semantics, just tracking when no outstanding reference to a context exists (besides our reference in the resolver) and then triggering the removal of that context... but that means that while we're very precise about expiration, we actually expire too fast, which means we spend gobs of time re-interning because we have to search through the interner.
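
To make the Arc<T>-like idea concrete, here is a minimal sketch using `std::sync::Arc` and its strong count. `CountingResolver` and `Context` are hypothetical stand-ins, not the real `ContextResolver` types, and the sketch assumes exclusive access to the map while it checks counts.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a resolved context.
#[derive(Debug)]
pub struct Context {
    pub name: String,
}

/// Hypothetical resolver that keeps one reference to every context it hands out.
#[derive(Default)]
pub struct CountingResolver {
    contexts: HashMap<String, Arc<Context>>,
}

impl CountingResolver {
    /// Returns the shared context for `name`, creating it on first use.
    pub fn resolve(&mut self, name: &str) -> Arc<Context> {
        let entry = self
            .contexts
            .entry(name.to_string())
            .or_insert_with(|| Arc::new(Context { name: name.to_string() }));
        Arc::clone(entry)
    }

    /// Drops every context whose only remaining reference is the resolver's own.
    /// Returns how many contexts were removed.
    pub fn expire_unused(&mut self) -> usize {
        let before = self.contexts.len();
        // A strong count of 1 means nobody outside the resolver holds the context.
        self.contexts.retain(|_, ctx| Arc::strong_count(ctx) > 1);
        before - self.contexts.len()
    }

    pub fn len(&self) -> usize {
        self.contexts.len()
    }
}
```

This shows the "expire too fast" behaviour directly: a context is removed on the first pass after its last outside reference is dropped, so a caller that resolves the same name a moment later pays the full creation cost again.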

If we made the interner O(1)-esque, then this might not be a problem... but doing so would also mean that it would be far less bounded than it currently is.

Likewise, we can trivially solve the second problem by just incrementally iterating over the resolved contexts, with sleeps in between, which isn't so much a true TTL as it is simply an inherent delay between a context becoming unused and being cleaned up. This, however, means that we either need to use a scheme that allows crawling the list in chunks (which will need locking) or crawl it in full, every time, which naturally gets more and more expensive as the number of resolved contexts goes up... and still isn't a true TTL.
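
A sketch of the chunked variant, assuming an ordered map so that a cursor can resume where the previous pass stopped. `ChunkedResolver` and its methods are hypothetical; `&mut self` stands in for whatever locking a shared resolver would need, and the caller is expected to sleep between calls.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;
use std::sync::Arc;

/// Hypothetical resolver backed by an ordered map so sweeps can resume by key.
pub struct ChunkedResolver {
    contexts: BTreeMap<String, Arc<str>>,
}

impl ChunkedResolver {
    pub fn new() -> Self {
        Self { contexts: BTreeMap::new() }
    }

    pub fn resolve(&mut self, name: &str) -> Arc<str> {
        let entry = self
            .contexts
            .entry(name.to_string())
            .or_insert_with(|| Arc::from(name));
        Arc::clone(entry)
    }

    pub fn len(&self) -> usize {
        self.contexts.len()
    }

    /// Visits at most `chunk` entries strictly after `cursor` and removes the
    /// ones that only the resolver still references. Returns the cursor for
    /// the next call, or `None` once the end of the map has been reached.
    pub fn sweep_chunk(&mut self, cursor: Option<&str>, chunk: usize) -> Option<String> {
        let lower: Bound<&str> = match cursor {
            Some(key) => Bound::Excluded(key),
            None => Bound::Unbounded,
        };
        let upper: Bound<&str> = Bound::Unbounded;

        // Record each visited key and whether it is unused, then mutate afterwards.
        let visited: Vec<(String, bool)> = self
            .contexts
            .range::<str, _>((lower, upper))
            .take(chunk)
            .map(|(key, ctx)| (key.clone(), Arc::strong_count(ctx) == 1))
            .collect();

        let reached_end = visited.len() < chunk;
        let next = visited.last().map(|(key, _)| key.clone());
        for (key, unused) in visited {
            if unused {
                self.contexts.remove(&key);
            }
        }
        if reached_end { None } else { next }
    }
}
```

Each call does bounded work, but as noted above the time between a context becoming unused and being removed depends on where the cursor happens to be, so this is a delay rather than a TTL.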

I was trying to noodle around the idea of how to make the "signal that this context is now unused" bit super cheap, which would allow us to register it somewhere that could then try to do more of a true "has it been unused for more than X seconds?" check... but so far I haven't come up with something sufficiently simple and performant.
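
For reference, the "has it been unused for more than X seconds?" check on its own could look roughly like the sketch below. It does not solve the cheap-signal problem: it still discovers unused contexts by scanning, and it measures time from the first sweep that notices a context is unused, not from the moment the last reference was dropped. `TtlResolver`, `Entry`, and the explicit `now` parameter are all assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical map entry: the context plus when a sweep first saw it unused.
struct Entry {
    context: Arc<str>,
    unused_since: Option<Instant>,
}

/// Hypothetical resolver that keeps unused contexts around for `ttl`.
pub struct TtlResolver {
    ttl: Duration,
    entries: HashMap<String, Entry>,
}

impl TtlResolver {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the shared context for `name` and clears any pending expiry.
    pub fn resolve(&mut self, name: &str) -> Arc<str> {
        let entry = self.entries.entry(name.to_string()).or_insert_with(|| Entry {
            context: Arc::from(name),
            unused_since: None,
        });
        entry.unused_since = None;
        Arc::clone(&entry.context)
    }

    /// Removes contexts that have been observed unused for at least `ttl`.
    /// `now` is passed in so the behaviour can be exercised deterministically.
    pub fn sweep(&mut self, now: Instant) -> usize {
        let ttl = self.ttl;
        let before = self.entries.len();
        self.entries.retain(|_, entry| {
            if Arc::strong_count(&entry.context) > 1 {
                // Still referenced elsewhere: reset the clock and keep it.
                entry.unused_since = None;
                return true;
            }
            // First sighting as unused starts the clock at `now`.
            let since = *entry.unused_since.get_or_insert(now);
            now.saturating_duration_since(since) < ttl
        });
        before - self.entries.len()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

A context that is resolved again before the TTL elapses keeps its entry, which is the cache-like behaviour described above; the cost is a full scan per sweep.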

github-actions bot added area/core Core functionality, event model, etc. area/components Sources, transforms, and destinations. source/dogstatsd DogStatsD source. labels Aug 30, 2024

tobz force-pushed the tobz/context-resolver-expiration branch from 57319e9 to a03d559 Compare September 5, 2024 18:28

github-actions bot removed area/components Sources, transforms, and destinations. source/dogstatsd DogStatsD source. labels Sep 9, 2024

tobz force-pushed the tobz/context-resolver-expiration branch from 9a479dc to 39e4229 Compare September 10, 2024 16:02

github-actions bot added area/components Sources, transforms, and destinations. source/dogstatsd DogStatsD source. labels Sep 11, 2024

tobz added 7 commits September 11, 2024 19:22

wip

6741cfd

switch to papaya, but we expire too well... gotta go back to delayed …

94a5da4

…background reclamation

Revert "switch to papaya, but we expire too well... gotta go back to …

2eabf02

…delayed background reclamation" This reverts commit 39e4229.

Wip

9a9ab8c

wip

0990dfe

try this on for size

752081c

Just be really sure we're using a debug image

59a5033

tobz force-pushed the tobz/context-resolver-expiration branch from acf0ecc to 59a5033 Compare September 11, 2024 19:38

github-actions bot added the area/ci CI/CD, automated testing, etc. label Sep 11, 2024

tobz added 2 commits September 11, 2024 19:42

really make sure we aren't stripping the builds, either

6b1fe48

seems to be functional, let's see what SMP has to say about it

1579b3b

github-actions bot removed area/components Sources, transforms, and destinations. source/dogstatsd DogStatsD source. labels Sep 12, 2024

tobz added 2 commits September 12, 2024 17:56

simpler approach to collecting resolver metrics

f039750

ok, try _this_

cb380f3

github-actions bot added area/components Sources, transforms, and destinations. transform/aggregate Aggregate transform. labels Sep 13, 2024

tobz added 7 commits September 13, 2024 19:43

now try with a way bigger update op queue capacity

dcabad2

and now this simpler approach

6fdb56b

what about with jemalloc

767b288

let's see what this does

68ad669

anotha one

dcb6ec7

Merge branch 'main' into tobz/context-resolver-expiration

539f1c9

no mo jemalloc

d2701a0
