Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dotnet-trace] Add option to stop tracing on trigger #3125

Closed
steveisok opened this issue Jun 15, 2022 · 5 comments · Fixed by #4363
Closed

[dotnet-trace] Add option to stop tracing on trigger #3125

steveisok opened this issue Jun 15, 2022 · 5 comments · Fixed by #4363
Labels
dotnet-trace enhancement New feature or request
Milestone

Comments

@steveisok
Copy link
Member

steveisok commented Jun 15, 2022

In order to better support tracing in mobile scenarios, we would like the ability to stop a trace based on a trigger. For example, we are using dotnet-trace to profile an Android application at startup. It would be helpful if we could stop a trace based on a common method name.

An example:

dotnet-trace /trigger:MethodName=<method-name:>
@steveisok
Copy link
Member Author

@tommcdon Can you add the enhancement label? I don't appear to be able to.

@tommcdon tommcdon added enhancement New feature or request dotnet-trace labels Jun 15, 2022
@tommcdon tommcdon added this to the 7.0.0 milestone Jun 15, 2022
@tommcdon
Copy link
Member

@tommcdon Can you add the enhancement label? I don't appear to be able to.

Added

@tommcdon tommcdon modified the milestones: 7.0.0, 8.0.0 Sep 12, 2022
@samsp-msft
Copy link
Member

Q: Should there be an API in System.Diagnostics for the app to be able to control tracing. GoLang has you call profiling api's explicitly as part of the app code to start and stop tracing.

We could have an API to:

  • Start tracing if a tracer is attached, and block if none is attached
  • Start tracing and not block
  • Stop tracing
  • Write a sign-post event into the trace. When debugging an issue, or trying to understand performance, it can be useful to be able to write simple events that can then act as signposts when doing analysis, without having to go to the effort of declaring a custom event, and making sure its collected. It would be the tracing equivalent of System.DIagnostics.Debug.WriteLine.

@steveisok
Copy link
Member Author

/cc @lateralusX

@lateralusX
Copy link
Member

First step will be to have similar support as added to dotnet-monitor around stop trigger. Original idea when implementing that was to share/port that code into dotnet-trace as well so we end up with similar functionality. Alternative is to do something along perfviews abilities to start/stop trace based on custom event triggers, but since we already implemented something similar in dotnet-monitor, I believe it make more sense to align to that implementation (or having multiple options).

One challenge with this implementation is to keep event streaming running with least amount of impact (since a degrade in read performance could lead to dropped events), side effect of this is that once the trigger has been detected and session have been stopped, .nettrace file will still need to include rundown events, so there will always be an unknow number of events past stop trigger that might end up in nettrace file. This could be an issue if the nettrace file will be used together with tools like dotnet-pgo, since that might lead to more methods getting AOT:ed than anticipated. In order to mitigate that scenario + adding support for a bunch of new scenarios, dotnet/runtime#89853 was implemented giving the user ability to use the same method as used as stop trigger as the end filter passed to dotnet-pgo. The new filter capabilities in dotnet-pgo together with ability to merge mibc file and ability in Mono cross compilers to accept multiple mibc files opens up ability to make detailed control on what methods that will an up AOT:ed in the profiled AOT use case.

If we are going to utilize the same capabilities as dotnet-monitor in dotnet-traces stop trigger implementation, the following PR reference the implementation of stop trigger in dotnet-monitor, dotnet/dotnet-monitor#2557.

hoyosjs pushed a commit that referenced this issue Nov 2, 2023
Fixes #3125

This PR provides users another method to stop a dotnet-trace via a
stopping event similar to that of dotnet-monitor, originally implemented
in dotnet/dotnet-monitor#2557.

Three arguments are added to the `dotnet-trace collect` command to
specify a stopping event:
| Argument | Description |
|----------|----------|
|`--stopping-event-provider-name` | A string, parsed as-is, that will
stop the trace upon hitting an event with the matching provider name.
For a more specific stopping event, additionally provide
`--stopping-event-event-name` and/or `--stopping-event-payload-filter`.
|
| `--stopping-event-event-name` | A string, parsed as-is, that will stop
the trace upon hitting an event with the matching event name. Requires
`--stopping-event-provider-name` to be set. For a more specific stopping
event, additionally provide `--stopping-event-payload-filter`. |
| `--stopping-event-payload-filter` | A string, parsed as
[payload_field_name]:[payload_field_value] pairs separated by commas,
that will stop the trace upon hitting an event with a matching payload.
Requires `--stopping-event-provider-name` and
`--stopping-event-event-name` to be set. |

Note: Though technically `--stopping-event-payload-filter` can be set
without needing a `--stopping-event-event-name`, this may lead to
mismatched payloads should another `TraceEvent` under the same provider
not have that particular payload field name. Until there is a good
reason to stop a trace given a payload filter regardless of the event
name, we require `--stopping-event-event-name` to be set whenever
`--stopping-event-payload-filter` is provided.

To stop a trace at a particular event, dotnet-monitor's
[approach](https://github.com/dotnet/dotnet-monitor/blob/0820b6911f3ac47b6b5ec867ac906699e5c15787/src/Tools/dotnet-monitor/Trace/TraceUntilEventOperation.cs#L47)
using an
[EventMonitor](https://github.com/dotnet/diagnostics/blob/main/src/Microsoft.Diagnostics.Monitoring.EventPipe/EventMonitor.cs)
is adopted. Upon hitting a TraceEvent with a matching ProviderName,
EventName (if specified), and PayloadFilter (if specified), we trigger
dotnet-trace's fallback logic to stop the EventPipeSession before the
EventStream ends.

Note: As the EventStream is being parsed asynchronously, there will be
some events that pass through between the time a trace event matching
the specified stopping event arguments is parsed and the
EventPipeSession is stopped.

In addition, this PR modifies `EventMonitor` to use the
`ClrTraceEventParser` to parse `TraceEvent` objects under the
`Microsoft-Windows-DotNETRuntime` provider, and the
`DynamicTraceEventParser` otherwise. The `ClrTraceEventParser` is
generated to understand the ETW event manifest for
`Microsoft-Windows-DotNETRuntime` events. The previous implementation
defaulting to `DynamicTraceEventParser` would not work on non-Windows
platforms such as OSX which could not parse the payload to populate
`PayloadNames` and `PayloadValue(i)` because there was no manifest
available. On the other hand, Windows is able to locate manifest data
for known events through the OS.

-------------------

## Testing

With an Android App
``` C#
    private static void PrintA()
    {
        Console.WriteLine("A");
        Thread.Sleep(1000);
    }
    
    ...
    
    private static void PrintK()
    {
        Console.WriteLine("K");
        Thread.Sleep(1000);
    }

    public static void Main(string[] args)
    {
        Console.WriteLine("Hello, Android!"); // logcat
        PrintA();
        PrintB();
        PrintC();
        PrintD();
        PrintE();
        PrintF();
        PrintG();
        PrintH();
        PrintI();
        PrintJ();
        PrintK();

        while (true)
        {
            Thread.Sleep(100);
        }
    }
```

Running dotnet-dsrouter to connect the diagnostic tooling with the
android device
`./artifacts/bin/dotnet-dsrouter/Debug/net6.0/dotnet-dsrouter android -v
debug`

Tracing with a stopping event args provided
`./artifacts/bin/dotnet-trace/Debug/net6.0/dotnet-trace collect -p 16683
--providers Microsoft-Windows-DotNETRuntime:0x1F000080018:5
--stopping-event-provider-name Microsoft-Windows-DotNETRuntime
--stopping-event-event-name Method/JittingStarted
--stopping-event-payload-filter MethodName:PrintA`


[dotnet-dsrouter_20231024_165648.nettrace.zip](https://github.com/dotnet/diagnostics/files/13147788/dotnet-dsrouter_20231024_165648.nettrace.zip)

There are no `Method/JittingStarted` events with MethodName `PrintC`
through `PrintK` in the `.nettrace`, showing that the trace was indeed
stopped after seeing the `PrintA` event. The events after `PrintA` are
an artifact of the second note above. Below is the JITStats of the
.nettrace opened in perfview, showing that the last method was `PrintB`.

<img width="1128" alt="JITStatsPrintA"
src="https://github.com/dotnet/diagnostics/assets/16830051/1742baf4-b528-43c3-aef3-b00a576f2fb8">
@ghost ghost locked as resolved and limited conversation to collaborators Dec 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
dotnet-trace enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants