Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes unhandled exception on trace if process cannot start #3409

Merged
merged 7 commits into from
Oct 5, 2022

Conversation

dramos020
Copy link
Contributor

Fixes #3381. This was occuring because the tool was still trying to call Process.HasExited even though an exception was thrown when trying to start the process. Fixed by maintaining a boolean that indicates that a process was successfully started and only tries to kill the process at the end if it was successfully started.

Here is the solution's output when an invalid file is given:

PS C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1> ./dotnet-trace collect -- garbage
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile

Launching: garbage
Cannot start target process: garbage
System.ComponentModel.Win32Exception (2): An error occurred trying to start process 'garbage' with working directory 'C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1'. The system cannot find the file specified.
   at System.Diagnostics.Process.StartWithCreateProcess(ProcessStartInfo startInfo)
   at System.Diagnostics.Process.Start()
   at Microsoft.Internal.Common.Utils.ProcessLauncher.Start(String diagnosticTransportName, CancellationToken ct, Boolean showChildIO, Boolean printLaunchCommand) in C:\code\diagnostics\src\Tools\Common\ReversedServerHelpers\ReversedServerHelpers.cs:line 99
[ERROR] System.InvalidOperationException: Failed to start 'garbage '.
   at Microsoft.Internal.Common.Utils.DiagnosticsClientBuilder.Build(CancellationToken ct, Int32 processId, String diagnosticPort, Boolean showChildIO, Boolean printLaunchCommand) in C:\code\diagnostics\src\Tools\Common\ReversedServerHelpers\ReversedServerHelpers.cs:line 219
   at Microsoft.Diagnostics.Tools.Trace.CollectCommandHandler.Collect(CancellationToken ct, IConsole console, Int32 processId, FileInfo output, UInt32 buffersize, String providers, String profile, TraceFileFormat format, TimeSpan duration, String clrevents, String clreventlevel, String name, String diagnosticPort, Boolean showchildio, Boolean resumeRuntime) in C:\code\diagnostics\src\Tools\dotnet-trace\CommandLine\Commands\CollectCommand.cs:line 169

Here is the solution's output when working with a valid .exe file:

PS C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1> ./dotnet-trace collect -- C:\code\RuntimeTesting\bin\Debug\net6.0\RuntimeTesting.exe
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile

Launching: C:\code\RuntimeTesting\bin\Debug\net6.0\RuntimeTesting.exe
Process        : C:\code\RuntimeTesting\bin\Debug\net6.0\RuntimeTesting.exe
Output File    : C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1\RuntimeTesting.exe_20220923_155320.nettrace

[00:00:00:00]   Recording trace 397.686  (KB)
Press <Enter> or <Ctrl+C> to exit...

Trace completed.
Process exited with code '0'.

@dramos020 dramos020 requested a review from a team as a code owner September 23, 2022 23:03
@dramos020 dramos020 removed the request for review from davmason September 23, 2022 23:31
Co-authored-by: Juan Hoyos <juan.hoyos@microsoft.com>
@hoyosjs
Copy link
Member

hoyosjs commented Sep 23, 2022

You might want to consider a step further. Is printing the exception to console in

really useful to users?

@dramos020 dramos020 removed the request for review from noahfalk September 26, 2022 17:40
Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad you were able to find the issue but I recommend we change how the fix is implemented.

@dramos020
Copy link
Contributor Author

dramos020 commented Sep 27, 2022

Took the suggestion from Noah to move processStarted bool to be managed inside the ProcessLauncher in order to minimize coupling. Also changed the printing of exceptions when the process does not start properly to only print the message instead. This makes it clearer that the exception was caught and only displays information useful to most users. Failures now look like this:

PS C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1> ./dotnet-trace collect -- garbage
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile

Launching: garbage
Cannot start target process: garbage
An error occurred trying to start process 'garbage' with working directory 'C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1'. The system cannot find the file specified.
[ERROR] Failed to start 'garbage '.

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the one other change I commented inline

Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>
}
catch (Exception e)
{
Console.WriteLine($"Cannot start target process: {_childProc.StartInfo.FileName} {_childProc.StartInfo.Arguments}");
Console.WriteLine(e.ToString());
Console.WriteLine(e.Message);
Copy link
Member

@noahfalk noahfalk Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than letting the code handle the error here we can refactor this slightly to get more useful error handling behavior.

  1. Move the CommandLineErrorException to the Tools/Common code and remove the "Counters" from
    the namespace. This type can be generally useful and has nothing specific to counters.
    https://github.dev/dotnet/diagnostics/blob/main/src/Tools/dotnet-counters/CommandLineErrorException.cs

  2. Instead of having ProcessLauncher.Start() print to the console on error, let it throw CommandLineErrorException instead and put the text it would have written to the console in the Exception message.

  3. Then add a catch clause for the CommandLineErrorException that will print out that message in DiagnosticsServerRouterRunner.runRouter(...) and CollectCommandHandler.Collect(...). For example in Collect(...)

// these are errors we anticipated and we know the error message will make sense to
// print to the console
catch(CommandLineErrorException e)
{
    Console.Error.WriteLine("[ERROR] " + e.Message);
    collectionStopped = true;
    ret = ReturnCode.TracingError;
}
// these errors could be anything and the messages may not make sense without context
// add the stack trace to help diagnose even though its ugly. Ideally no errors would
// get here.
catch(Exception e)
{
    Console.Error.WriteLine(e.ToString());
    collectionStopped = true;
    ret = ReturnCode.TracingError;
}

Give that a shot and see if it produces a better looking result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this for CollectCommandHandler.Collect() and the output would look like this:

No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile

Launching: garbage
[ERROR] Cannot start target process: garbage

I'm still not sure though why would add a catch clause for a CommandLineErrorException in DiagnosticsServerRouterRunner.runRouter(). I don't think Process.Start() ever gets called here. DiagnosticsServerRouterRunner.Start() and Router.Start() get called, both of which don't throw a CommandLineException.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that runRouter() calls OnRouterStarted() (here) and OnRouterStarted calls ProcessLauncher.Start() (here)

If I got that wrong my underlying intent is if ds-router calls ProcessLauncher.Start() anywhere then we should figure out what is the callstack where that call occurs and make sure the CommandLineErrorException will be caught.

Copy link
Member

@noahfalk noahfalk Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add this to the CommandLineErrorException message? "An error occurred trying to start process 'garbage' with working directory 'C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1'. The system cannot find the file specified. I assume this part of the text is coming from the Exception.Message property on the exception that is thrown by Process.Start()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I hadn't checked OnRouterStarted() and I see the Process.Start() call now. I'll add the catch clause to runRouter().

Also, the message property from the exception thrown by Process.Start() is actually only:

The system cannot find the file specified.

Don't know if we still want to add it to the CommandLineErrorException message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do you think we really need a separate catch statement for CommandLineErrorExceptions for runRouter()? The catch statement that is currently in place only sends the exception message for all exceptions. Unless we want to change it to log only the message for CommandLineErrorExceptions and the entire ToString() for all other exceptions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davmason - I've gotten tied up with some 1st party conf stuff and feel bad having the discussion delayed on me. I'm happy for you and Daniel to make the call together. Worst case we can always can always tweak the behavior later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dramos020 I think we should do what Noah suggested a few comments back and take the exception message from Process.Start and wrap it in a little extra text:

throw new CommandLineErrorException($"An error occurred trying to start process '{processName}' with working directory '{workingDir}'. {orginalException.Message}.");

Then we don't have to have to add a new handler for runRouter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message now looks like this:

PS C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1> ./dotnet-trace collect -- garbage
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile

Launching: garbage
[ERROR] An error occured trying to start process garbage with working directory C:\code\diagnostics\artifacts\bin\dotnet-trace\Debug\netcoreapp3.1. The system cannot find the file specified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! 👍

Co-authored-by: Juan Hoyos <juan.hoyos@microsoft.com>
@hoyosjs hoyosjs merged commit 597cc8b into dotnet:main Oct 5, 2022
mikem8361 added a commit that referenced this pull request Oct 6, 2022
* Update dependencies from https://github.com/dotnet/runtime build 20220923.9 (#3411)

[main] Update dependencies from dotnet/runtime

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20220923.1 (#3412)

[main] Update dependencies from dotnet/source-build-reference-packages

* Update dependencies from https://github.com/dotnet/aspnetcore build 20220923.18 (#3410)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/aspnetcore build 20220924.10 (#3413)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/runtime build 20220923.12 (#3414)

[main] Update dependencies from dotnet/runtime

* Update dependencies from https://github.com/dotnet/arcade build 20220923.1 (#3415)

[main] Update dependencies from dotnet/arcade

* Update dependencies from https://github.com/dotnet/aspnetcore build 20220926.26 (#3417)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/runtime build 20220926.12 (#3418)

[main] Update dependencies from dotnet/runtime

* Update dependencies from https://github.com/dotnet/runtime build 20220927.8 (#3420)

[main] Update dependencies from dotnet/runtime

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20220927.2 (#3421)

[main] Update dependencies from dotnet/source-build-reference-packages

* [main] Update dependencies from dotnet/aspnetcore (#3419)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/runtime build 20220928.9 (#3424)

[main] Update dependencies from dotnet/runtime

* Update dependencies from https://github.com/dotnet/aspnetcore build 20220929.15 (#3426)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/runtime build 20220930.10 (#3428)

[main] Update dependencies from dotnet/runtime

* Update dependencies from https://github.com/dotnet/source-build-reference-packages build 20220930.1 (#3429)

[main] Update dependencies from dotnet/source-build-reference-packages

* Update dependencies from https://github.com/dotnet/aspnetcore build 20221001.1 (#3427)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/aspnetcore build 20221001.6 (#3431)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/arcade build 20220930.2 (#3432)

[main] Update dependencies from dotnet/arcade

* Rename Microsoft.Diagnostics.DbgShim.linux-musl-arm package (#3433)

* FreeBSD fixes (#3430)

* fix missing include

* fix VisitModule under FreeBSD

I guess it got lost/reverted in the move

* Update dependencies from https://github.com/dotnet/aspnetcore build 20221003.15 (#3435)

[main] Update dependencies from dotnet/aspnetcore

* [main] Update dependencies from dotnet/installer (#3370)

* Update dependencies from https://github.com/dotnet/installer build 20220909.2

Microsoft.Dotnet.Sdk.Internal
 From Version 7.0.100-rc.2.22454.1 -> To Version 7.0.100-rc.2.22459.2

* Update single-file version for new SDK

* Update dependencies from https://github.com/dotnet/installer build 20220918.5

Microsoft.Dotnet.Sdk.Internal
 From Version 7.0.100-rc.2.22454.1 -> To Version 7.0.100-rtm.22468.5

* Update dependencies from https://github.com/dotnet/installer build 20220923.5

Microsoft.Dotnet.Sdk.Internal
 From Version 7.0.100-rc.2.22454.1 -> To Version 7.0.100-rtm.22473.5

* Update dependencies from https://github.com/dotnet/installer build 20220930.6

Microsoft.Dotnet.Sdk.Internal
 From Version 7.0.100-rc.2.22454.1 -> To Version 7.0.100-rtm.22480.6

* Add all configs necessary to override SDK behavior

* Don't rev 3.1's REF pack

Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Mike McLaughlin <mikem@microsoft.com>
Co-authored-by: Juan Sebastian Hoyos Ayala <juan.hoyos@microsoft.com>

* Unpin monitoring assembly versions (#3436)

* Update dependencies from https://github.com/dotnet/aspnetcore build 20221004.29 (#3437)

[main] Update dependencies from dotnet/aspnetcore

* Update dependencies from https://github.com/dotnet/runtime build 20221004.4 (#3438)

[main] Update dependencies from dotnet/runtime

* Fixes unhandled exception on trace if process cannot start (#3409)

* Checks if a process was successfully started

* Update src/Tools/dotnet-trace/CommandLine/Commands/CollectCommand.cs

Co-authored-by: Juan Hoyos <juan.hoyos@microsoft.com>

* Moved the bool to ProcessLauncher and changed printing of exceptions

* Update CollectCommand.cs to print the exception ToString

Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>

* Added CommandLineException for when process cannot start

* Update exception message to have quotes

Co-authored-by: Juan Hoyos <juan.hoyos@microsoft.com>

Co-authored-by: Juan Hoyos <juan.hoyos@microsoft.com>
Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>

* Test unpinning compiler + trace event revert. (#3439)

Co-authored-by: dotnet-maestro[bot] <42748379+dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Thefrank <1910378+Thefrank@users.noreply.github.com>
Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com>
Co-authored-by: Juan Sebastian Hoyos Ayala <juan.hoyos@microsoft.com>
Co-authored-by: Justin Anderson <jander-msft@users.noreply.github.com>
Co-authored-by: Daniel Ramos <111319334+dramos020@users.noreply.github.com>
Co-authored-by: Noah Falk <noahfalk@users.noreply.github.com>
Co-authored-by: Juan Hoyos <juan.s.hoyos@outlook.com>
@github-actions github-actions bot locked and limited conversation to collaborators Jan 11, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dotnet-trace dumps stack if process cannot start
4 participants