Add Support For Multiplexing `System.Diagnostics.Metrics` for `Dotnet Monitor` and `Dotnet Counters` #86504
…ic testing in place now. Needs refinement and more testing to guarantee old behavior is still preserved.
… shutting down the shared session
… end, we finally disable, and then a new session can start when an enable is received. Also added in test for this.
…ead/Multiplexing_2
@@ -45,6 +45,8 @@ internal sealed class MetricsEventSource : EventSource
{
public static readonly MetricsEventSource Log = new();

private const string SharedSessionId = "SHARED";
As mentioned in the diagnostics repo, this uses the previously discussed "SHARED" value to signify these special sessions.
string commandSessionId = GetSessionId(command);

if ((command.Command == EventCommand.Update
A lot of the actual code here needs to be cleaned up so don't bother too much with that, but here's the main idea:
- If we don't currently have a shared session, proceed as we would before.
- If we have an active shared session, see if we can fuse with it (by checking that the args are consistent). If not, we give a `MultipleSessionsConfiguredIncorrectlyError`. If so, we call `Update` on the `_aggregationManager`.
- We do ref counting for enables/disables: once all SHARED sessions have been disabled, this frees the `_aggregationManager`, allowing a new session to start.
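As a rough illustration of the "fuse if the args are consistent" idea above, here is a minimal sketch. It is not the PR's code: `SessionArgs`, `EnableResult`, and `SharedSessionTracker` are hypothetical names, and the three compared settings are the ones the PR description says must match (`IntervalSeconds`, `MaxTimeSeries`, `MaxHistograms`). How "all shared sessions are disabled" gets detected is deliberately left to the caller.

```csharp
#nullable enable
using System;

// Hypothetical types for illustration only; none of these exist in the runtime.
public sealed record SessionArgs(double IntervalSeconds, int MaxTimeSeries, int MaxHistograms);

public enum EnableResult { Started, Fused, ConfiguredIncorrectly }

public sealed class SharedSessionTracker
{
    // Non-null while a shared session is running.
    private SessionArgs? _active;

    public EnableResult OnEnable(SessionArgs requested)
    {
        if (_active is null)
        {
            // No shared session yet: behave as before and start one.
            _active = requested;
            return EnableResult.Started;
        }

        // Records compare by value, so this checks all three settings at once.
        return _active == requested
            ? EnableResult.Fused
            : EnableResult.ConfiguredIncorrectly;
    }

    // Called by the owner once it has decided that no shared sessions remain.
    public void OnAllDisabled() => _active = null;
}
```

A second client asking for the same settings gets `Fused`; one asking for a different interval gets `ConfiguredIncorrectly` and the running session is left untouched.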
Due to EventSource limitations I think you will struggle to do refcounting. I believe the only thing you can know for sure is that when IsEnabled() returns false then there are no sessions active. I think @davmason remembers these details better than I do, though.
(I ignored looking at the code in more detail, as you suggested)
@@ -16,7 +16,7 @@ namespace System.Diagnostics.Metrics.Tests
public class MetricEventSourceTests
{
ITestOutputHelper _output;
- const double IntervalSecs = 10;
+ const double IntervalSecs = 4;
I would recommend mostly ignoring the tests file for this draft - I have added a lot of tests around this (all of which pass), but there's a lot of cleanup that needs to happen here since I was adding new tests any time I found a bug (which means there's probably a lot of redundancy).
The tests are now in a more cleaned-up state for anyone who wants to take a look (I still need to change this variable back and uncomment the OuterLoop attribute).
private void IncrementRefCount(string uniqueIdentifier, EventCommandEventArgs command)
{
// Could be unsafe if UniqueIdentifier protocol isn't followed
if (command.Arguments!.TryGetValue("UniqueIdentifier", out string? uniqueIdentifierArg))
Should we just reject shared sessions that do not have UniqueIdentifiers?
I'm not sure that's possible... The way shared sessions work, every consumer (e.g. dotnet-monitor) with a SHARED session id will listen to the events, unless the UniqueIdentifier (now renamed to ClientId) has a `MultipleSessionsConfiguredIncorrectlyError`. Without a ClientId, we have no way to communicate to that particular consumer that they shouldn't be allowed to listen to the SHARED session. In theory you could shut down the whole SHARED session, but that doesn't necessarily feel like a great solution either.
if (command.Command == EventCommand.Update
|| command.Command == EventCommand.Enable)
{
IncrementRefCount(commandSessionId, command);
@davmason - EventSource sends Enable events even when sessions are being disabled, right, except for the last one? That makes it impossible to do this kind of ref-counting, which is why we had to resort to testing IsEnabled() instead for EventCounters, as I recall.
Is this behavior documented anywhere? I always struggle to remember it; nobody guesses it because it's unintuitive, and the pages where I'd expect to find it don't mention it:
https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.eventcommand?view=net-7.0
https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.tracing.eventsource.oneventcommand?view=net-7.0#system-diagnostics-tracing-eventsource-oneventcommand(system-diagnostics-tracing-eventcommandeventargs)
I'm leaving this as unresolved for now (I'm not sure if @davmason had a chance to take a look), but assuming the `ClientId` protocol is followed by all consumers, is there a reason why this can't/shouldn't work? In my own testing the ref counting seems to work properly (I tested with DM metrics, DM live metrics, and Dotnet Counters simultaneously), but I may be lacking some depth in understanding around this issue.
Sorry, I didn't see Noah's comment. The underlying providers (ETW and EventPipe) send an enable command for every disabled session except the one that turns the provider off completely, but EventPipe will translate those to Enables so you don't have to worry about it.
The problem with ref counting is that we can't guarantee a matching number of Enables and Disables with EventListeners. An EventListener can send as many Enables as it wants and will send at most one Disable.
For EventCounters we just check on every Disable command whether the EventSource is still active, and shut everything down if it is not.
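To make the difference between the two strategies concrete, here is a small simulation, offered as a sketch rather than the runtime's implementation. `AggregationLifetime` is a hypothetical helper, and the `isAnySessionActive` delegate stands in for a call to `EventSource.IsEnabled()`; in a real event source this logic would sit in the `OnEventCommand` override. `NaiveRefCount` is kept only to show how a counter drifts when one listener sends several Enables and a single Disable.

```csharp
using System;

// Hypothetical helper for illustration; not part of System.Diagnostics.
public sealed class AggregationLifetime
{
    // Stand-in for EventSource.IsEnabled(): true while any session is still active.
    private readonly Func<bool> _isAnySessionActive;

    public AggregationLifetime(Func<bool> isAnySessionActive)
    {
        _isAnySessionActive = isAnySessionActive;
    }

    // Whether the shared aggregation state is currently alive.
    public bool Running { get; private set; }

    // What a simple enable/disable counter would report (shown only for comparison).
    public int NaiveRefCount { get; private set; }

    public void OnEnable()
    {
        NaiveRefCount++;
        Running = true; // starting is idempotent, so repeated Enables are harmless
    }

    public void OnDisable()
    {
        NaiveRefCount--;

        // Ignore the counter; shut down only when nothing is listening anymore.
        if (!_isAnySessionActive())
        {
            Running = false;
        }
    }
}
```

With one listener that sends three Enables followed by one Disable, `Running` ends up `false` while `NaiveRefCount` is still 2, which is the mismatch described in the comment above.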
… as basic test for this). Also includes some cleanup and refactoring.
…ead/Multiplexing_2_backup
@@ -21,17 +21,17 @@ internal sealed class AggregationManager
// these fields are modified after construction and accessed on multiple threads, use lock(this) to ensure the data
// is synchronized
private readonly List<Predicate<Instrument>> _instrumentConfigFuncs = new();
- private TimeSpan _collectionPeriod;
+ public TimeSpan CollectionPeriod { get; private set; }
Nit: consider scoping the lock to this property, rather than 'this' + _aggregationManager
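One possible reading of this nit, sketched under assumptions: give the property its own private lock object so that callers do not need to lock on the manager instance. `CollectionSettings` and `_periodLock` are hypothetical names used only to show the pattern; this is not the code that was merged.

```csharp
using System;

// Hypothetical stand-in for the class under review; only the locking pattern matters.
public sealed class CollectionSettings
{
    // Dedicated lock object: guards this one field and nothing else.
    private readonly object _periodLock = new object();
    private TimeSpan _collectionPeriod;

    public TimeSpan CollectionPeriod
    {
        get
        {
            lock (_periodLock)
            {
                return _collectionPeriod;
            }
        }
        private set
        {
            lock (_periodLock)
            {
                _collectionPeriod = value;
            }
        }
    }

    // Public entry point, since the setter itself is private.
    public void SetCollectionPeriod(TimeSpan period)
    {
        CollectionPeriod = period;
    }
}
```

The trade-off is that a narrow lock protects only reads and writes of the period itself; any invariant that spans several fields would still need the broader lock.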
…ead/Multiplexing_2_backup
…ead/Multiplexing_2_backup
…dded some comments.
@@ -217,6 +225,16 @@ public void UpDownCounterRateValuePublished(string sessionId, string meterName,
WriteEvent(16, sessionId, meterName, meterVersion ?? "", instrumentName, unit ?? "", tags, rate, value);
}

[Event(17, Keywords = Keywords.TimeSeriesValues)]
#if !NET8_0_OR_GREATER
I'm lacking some context here since this change happened midway through my change, but it appears that builds were failing due to this not being here - let me know if this isn't the correct resolution
This looks good. We made a change in EventSource to get rid of most of these suppression attributes, so after that change it is only required on targets below .NET 8.
@@ -25,7 +25,638 @@ public MetricEventSourceTests(ITestOutputHelper output)
}

[ConditionalFact(typeof(PlatformDetection), nameof(PlatformDetection.IsNotBrowser))]
- [OuterLoop("Slow and has lots of console spew")]
+ //[OuterLoop("Slow and has lots of console spew")]
For convenience I've been leaving these commented out as we iterate (and using a reduced IntervalSecs). I'll revert all of this before merging.
The shared session approach looks good to me. I didn't review it line by line. If you need me to look at a specific section of code let me know.
… Monitor` and `Dotnet Counters` (#3889) This PR corresponds to dotnet/runtime#86504 from the runtime repo to support multiplexing for `System.Diagnostics.Metrics`. This change converts `dotnet monitor` and `dotnet counters` to use a shared session, assuming the `MaxHistograms`, `MaxTimeSeries`, and `IntervalSeconds` are all the same.
This PR corresponds to dotnet/diagnostics#3889 from the diagnostics repo to support multiplexing for `System.Diagnostics.Metrics`. This change converts `dotnet monitor` and `dotnet counters` to use a shared session, assuming the `MaxHistograms`, `MaxTimeSeries`, and `IntervalSeconds` are all the same. The code is in a functional state for anyone who wants to interact with it locally; however, there may still be some bugs (I haven't done extensive testing yet). I'll call attention to specific areas where I'm looking for feedback via comments.
@wiktork @davmason @noahfalk