-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ELT profiler tests #44285
Fix ELT profiler tests #44285
Conversation
Tagging subscribers to this area: @tommcdon |
What changed? You mention a timer firing; it wasn't before? I'm asking because we saw some other diagnostic failures start frequently around the same time, e.g. #44037, and I'm wondering if there might be a link. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LogTestFailure($"Profilee returned exit code {process.ExitCode} instead of expected exit code 100."); | ||
} | ||
|
||
return 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you said in an earlier PR that the exit code didn't matter for profiler tests. It still would be nice to have a comment to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember exactly what I said, but I think I was trying to say that the exit code doesn't matter for that particular test. We do check the exit code, it's just that a lot of profiler tests don't use it to indicate failure. Some of them do, however, and crashes will return a non 100 exit code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I just wanted to make it clear to someone other than you that return 100;
can happen even though the test failed. I just feels confusing to me. If we hadn't talked about it, I would have asked again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I added a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that logging a failure message and then returning 100 feels counter intuitive. If the exit code matters I was expecting there would be some conditional logic or if it doesn't matter then perhaps we should return 0?
EDIT: nm, I see that LogTestFailure() also throws an exception so 100 is only what happens in the success case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misunderstood what Steve was saying. As Noah points out in the edit above LogTestFailure intentionally throws an unhandled exception to fail the test. I can see how that is unclear to others though, since it just says log in the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Precisely unclear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just changed it from LogTestFailure to FailFastWithMessage so hopefully its more clear just from the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
I shouldn't have worded it so strongly. I was inferring something changed due to it only recently starting to fail, but have no proof. These tests register enter/leave hooks through the profiler APIs, and recently they started failing where after the profiler shut down and cleaned up resources some background thread would still be calling managed code and the test would AV trying to use pCorProfilerInfo, which is now null. When I debugged, it consistently hit in a managed timer callback but I never actually verified what code path was hitting it. For the purposes of this test the exact code that causes the failure is irrelevant, it has always been legal and possible for other threads to continue running during shutdown. This is one of those "I'm surprised this worked until now" moments. I will take a look to see if there is an obvious change causing this, but I would be pretty surprised if this was related to the other failing tests you linked. |
It fails before it gets to anything that actually tells me what timer is firing:
But, although it mainly fails in the timer path it also randomly fails in other codepaths too. I don't know what changed but it's very unlikely that this is anything but a test bug. |
It occurs to me that the reason we're seeing managed code executing at times when it wasn't before is the portable thread pool work that @kouvel enabled recently. Earlier I said I didn't know what was causing the timer, but I think the new thread pool implementation itself is causing it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Tiered compilation's background work currently uses the thread pool, a timer is used to get a thread pool thread and some new managed code is triggered. |
author Stephen Toub <stoub@microsoft.com> 1604601164 -0500 committer Tammy Qiu <tammy.qiu@yahoo.com> 1604960878 -0500 Add stream conformance tests for TranscodingStream (dotnet#44248) * Add stream conformance tests for TranscodingStream * Special-case 0-length input buffers to TranscodingStream.Write{Async} The base implementation of Encoder.Convert doesn't like empty inputs. Regardless, if the input is empty, we can avoid a whole bunch of unnecessary work. JIT: minor inliner refactoring (dotnet#44215) Extract out the budget check logic so it can vary by inlining policy. Use this to exempt the FullPolicy from budget checking. Fix inline xml to dump the proper (full name) hash for inlinees. Update range dumper to dump ranges in hex. Remove unused QCall for WinRTSupported (dotnet#44278) ConcurrentQueueSegment allows spinning threads to sleep. (dotnet#44265) * Allow threads to sleep when ConcurrentQueue has many enqueuers/dequeuers. * Update src/libraries/System.Private.CoreLib/src/System/Collections/Concurrent/ConcurrentQueueSegment.cs Co-authored-by: Stephen Toub <stoub@microsoft.com> * Apply suggestions from code review Co-authored-by: Stephen Toub <stoub@microsoft.com> Co-authored-by: AMD DAYTONA EPYC <amd@amd-DAYTONA-X0.com> Co-authored-by: Stephen Toub <stoub@microsoft.com> File.Exists() is not null when true (dotnet#44310) * File.Exists() is not null when true * Fix compile * Fix compile 2 [master][watchOS] Add simwatch64 support (dotnet#44303) Xcode 12.2 removed 32 bits support for watchOS simulators, this PR helps to fix xamarin/xamarin-macios#9949, we have tested the new binaries and they are working as expected ![unknown](https://user-images.githubusercontent.com/204671/98253709-64413200-1f49-11eb-9774-8c5aa416fc57.png) Co-authored-by: dalexsoto <dalexsoto@users.noreply.github.com> Implementing support to Debugger::Break. (dotnet#44305) Set fgOptimizedFinally flag correctly (dotnet#44268) - Initialize to 0 at compiler startup - Set flag when finally cloning optimization kicks in Fixes non-deterministic generation of nop opcodes into ARM32 code Forbid `- byref cnst` -> `+ (byref -cnst)` transformation. (dotnet#44266) * Add a repro test. * Forbid the transformation for byrefs. * Update src/coreclr/src/jit/morph.cpp Co-authored-by: Andy Ayers <andya@microsoft.com> * Update src/coreclr/src/jit/morph.cpp * Fix the test return value. WriteLine is just to make sure we don't delete the value. * improve the test. avoid a possible overflow and don't waste time on printing. Co-authored-by: Andy Ayers <andya@microsoft.com> Pick libmonosgen-2.0.so from cmake install directory instead of .libs (dotnet#44291) This aligns Linux with what we already do for all the other platforms. Update SharedPerformanceCounter assert (dotnet#44333) Remove silly ToString in GetCLRInstanceString (dotnet#44335) Use targetPlatformMoniker for net5.0 and newer tfms (dotnet#43965) * Use targetPlatformMoniker for net5.0 and newer tfms * disabling analyzer, update version to 0.0, and use new format. * update the targetFramework.sdk * removing supportedOS assembly level attribute * fix linker errors and addressing feedback * making _TargetFrameworkWithoutPlatform as private [sgen] Add Ward annotations to sgen_get_total_allocated_bytes (dotnet#43833) Attempt to fix https://jenkins.mono-project.com/job/test-mono-mainline-staticanalysis/ Co-authored-by: lambdageek <lambdageek@users.noreply.github.com> [tests] Re-enable tests fixed by dotnet#44081 (dotnet#44212) Fixes mono/mono#15030 and fixes mono/mono#15031 and fixes mono/mono#15032 Add an implicit argument coercion check. (dotnet#43386) * Add `impCheckImplicitArgumentCoercion`. * Fix tests with type mismatch. * Try to fix VM signature. * Allow to pass byref as native int. * another fix. * Fix another IL test. [mono] Change CMakelists.txt "python" -> Python3_EXECUTABLE (dotnet#44340) Debian doesn't install a "python" binary for python3. Tweak StreamConformanceTests for cancellation (dotnet#44342) - Avoid unnecessary timers - Separate tests for precancellation, ReadAsync(byte[], ...) cancellation, and ReadAsync(Memory, ...) cancellation Use Dictionary for underlying cache of ResourceSet (dotnet#44104) Simplify catch-rethrow logic in NetworkStream (dotnet#44246) A follow-up on dotnet#40772 (comment), simplifies and harmonizes the way we wrap exceptions into IOException. Having one catch block working with System.Exception seems to be enough here, no need for specific handling of SocketException. Simple GT_NEG optimization for dotnet#13837 (dotnet#43921) * Simple arithmetic optimization with GT_NEG * Skip GT_NEG optimization when an operand is constant. Revert bitwise rotation pattern * Fixed Value Numbering assert * Cleaned up code and comments for simple GT_NEG optimization * Formatting Co-authored-by: Julie Lee <jeonlee@microsoft.com> [master] Update dependencies from mono/linker (dotnet#44322) * Update dependencies from https://github.com/mono/linker build 20201105.1 Microsoft.NET.ILLink.Tasks From Version 6.0.0-alpha.1.20527.2 -> To Version 6.0.0-alpha.1.20555.1 * Update dependencies from https://github.com/mono/linker build 20201105.2 Microsoft.NET.ILLink.Tasks From Version 6.0.0-alpha.1.20527.2 -> To Version 6.0.0-alpha.1.20555.2 * Disable new optimization for libraries mode (it cannot work in this mode) Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com> Co-authored-by: Marek Safar <marek.safar@gmail.com> Tighten argument validation in StreamConformanceTests (dotnet#44326) Add threshold on number of files / partition in SPMI collection (dotnet#44180) * Add check for files count * Fix the OS check * decrese file limit to 1500: * misc fix * Do not upload to azure if mch files are zero size Fix ELT profiler tests (dotnet#44285) [master] Update dependencies from dotnet/arcade dotnet/llvm-project dotnet/icu (dotnet#44336) [master] Update dependencies from dotnet/arcade dotnet/llvm-project dotnet/icu - Merge branch 'master' into darc-master-2211df94-2a02-4c3c-abe1-e3534e896267 Fix Send_TimeoutResponseContent_Throws (dotnet#44356) If the client times out too quickly, the server may never have a connection to accept and will hang forever. Match CoreCLR behaviour on thread start failure (dotnet#44124) Co-authored-by: Aleksey Kliger (λgeek) <akliger@gmail.com> Add slash in Windows SoD tool build (dotnet#44359) * Add slash in Windows SoD tool build * Update SoD search path to match output dir * Fixup dotnet version * Remove merge commit headers * Disable PRs Co-authored-by: Drew Scoggins <andrew.g.scoggins@gmail> Reflect test path changes in .gitattributes; remove nonexistent files (dotnet#44371) Bootstrapping a test for R2RDump (dotnet#42150) Improve performance of Enum's generic IsDefined / GetName / GetNames (dotnet#44355) Eliminates the boxing in IsDefined/GetName/GetValues, and in GetNames avoids having to go through RuntimeType's GetEnumNames override. clarify http version test (dotnet#44379) Co-authored-by: Geoffrey Kizer <geoffrek@windows.microsoft.com> Update dependencies from https://github.com/mono/linker build 20201106.1 (dotnet#44367) Microsoft.NET.ILLink.Tasks From Version 6.0.0-alpha.1.20555.2 -> To Version 6.0.0-alpha.1.20556.1 Co-authored-by: dotnet-maestro[bot] <dotnet-maestro[bot]@users.noreply.github.com> Disable RunThreadLocalTest8_Values on Mono (dotnet#44357) * Disable RunThreadLocalTest8_Values on Mono It's failing on SLES * fix typo LongProcessNamesAreSupported: make test work on distros where sleep is a symlink/script (dotnet#44299) * LongProcessNamesAreSupported: make test work on distros where sleep is a symlink/script * PR feedback Co-authored-by: Stephen Toub <stoub@microsoft.com> * fix compilation Co-authored-by: Stephen Toub <stoub@microsoft.com> add missing constructor overloads (dotnet#44380) Co-authored-by: Geoffrey Kizer <geoffrek@windows.microsoft.com> change using in ConnectCallback_UseUnixDomainSocket_Success (dotnet#44366) Clean up the samples (dotnet#44293) Update dotnet/roslyn issue link Delete stale comment about dotnet/roslyn#30797 Fix/remove TODO-NULLABLEs (dotnet#44300) * Fix/remove TODO-NULLABLEs * remove redundant ! * apply Jozkee's feedback * address feedback Update glossary (dotnet#44274) Co-authored-by: Juan Hoyos <juan.hoyos@microsoft.com> Co-authored-by: Stephen Toub <stoub@microsoft.com> Co-authored-by: Günther Foidl <gue@korporal.at> Add files need for wasm executable relinking/aot to the wasm runtime pack. (dotnet#43785) Co-authored-by: Alexander Köplinger <alex.koeplinger@outlook.com> Move some more UnmanagedCallersOnly tests to IL now that they're invalid C# (dotnet#43366) Fix C++ build for mono/metadata/threads.c (dotnet#44413) `throw` is a reserved keyword in C++. Disable a failing test. (dotnet#44404) Change async void System.Text.Json test to be async Task (dotnet#44418) Improve crossgen2 comparison jobs (dotnet#44119) - Fix compilation on unix platforms - Wrap use of wildcard in quotes - Print better display name into log - Fix X86 constant comparison handling - Add ability to compile specific overload via single method switches Remove some unnecessary GetTypeInfo usage (dotnet#44414) Fix MarshalTypedArrayByte and re-enable it. Re-enable TestFunctionApply
Fixes #44055
Recently profiler/elt/slowpatheltenter/slowpatheltenter started failing. On investigation it was because a timer firing was triggering managed code to run as the runtime was shutting down, and we would occasionally try to use pCorProfilerInfo after it was set to NULL.
This PR add some synchronization so we bail on the ELT hooks once Shutdown is called. It also changes the output when a test fails, the previous wording was confusing because it implied PROFILER_TEST_PASSES in the output was the only thing necessary for a test to pass when we check the exit code of the process as well.