Convert JIT\Directed tests to merged test groups #83256

markples · 2023-03-10T11:39:03Z

(based on #81969)

ghost · 2023-03-10T11:39:37Z

Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
See info in area-owners.md if you want to be subscribed.

Issue Details

(based on #81969)

Author:	markples
Assignees:	markples
Labels:	`area-System.Reflection.Metadata`
Milestone:	-

markples · 2023-03-10T11:40:16Z

/azp run runtime-coreclr outerloop

azure-pipelines · 2023-03-10T11:40:32Z

Azure Pipelines successfully started running 1 pipeline(s).

BrianBohe · 2023-03-10T17:10:26Z

LGTM! Let's see if that typo was the only problem.

BrianBohe · 2023-03-10T18:52:43Z

/azp run runtime-coreclr outerloop

azure-pipelines · 2023-03-10T18:53:04Z

Azure Pipelines successfully started running 1 pipeline(s).

markples · 2023-03-10T20:20:16Z

/azp run runtime-coreclr outerloop

azure-pipelines · 2023-03-10T20:20:32Z

Azure Pipelines successfully started running 1 pipeline(s).

BrianBohe

Does it make sense to split HardwareIntrinsics_ro? Would that avoid these timeouts? I don't see issues related to this PR.

BrianBohe · 2023-03-11T00:03:10Z

> Does it make sense to split HardwareIntrinsics_ro? Would that avoid these timeouts? I don't see issues related to this PR.

What do you think about this, @trylek?

trylek · 2023-03-11T00:10:42Z

What is the number of tests in the group and its duration in checked mode (without any stress modes)?

markples · 2023-03-11T01:02:36Z

It looks like it has 5834 tests. I see a successful x64 run that took about 5 minutes. Even the timeouts are only showing 5-10 minutes in the test group. This includes timestamps that appear to be after the test script is terminated, so I don't think it's getting stuck. (These tests are generated, so there are a lot of them. @davidwrighton added various things like the test striping mechanism just to handle them at all, so maybe there's more to do there.)

There's a comment:

<!-- For Vector512, we only have a very small pool of machines with acceleration support, so they are always outerloop -->

Could this be a case where queue time is counting against a timeout value?

markples · 2023-03-11T01:05:24Z

/azp run runtime-coreclr outerloop

azure-pipelines · 2023-03-11T01:05:46Z

Azure Pipelines successfully started running 1 pipeline(s).

trylek · 2023-03-11T01:16:10Z

Hmm, both of these values seem far beyond what I'd consider reasonable. Please remember that in GC / JIT stress modes the tests are typically two to three orders of magnitude slower, and Helix items taking 5000 minutes are less fun than they seem. Under JIT/Methodical, I tried to keep the group size around 400 tests. We can certainly be more aggressive for tests that are really tiny, but running the entire Pri1 test set as a single Helix item is obviously not desirable for multiple reasons.
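
As a rough illustration of the scaling described above, the arithmetic can be sketched as follows. The 5-minute baseline is the successful x64 run reported earlier in this thread; the 100x and 1000x factors are simply the endpoints of "two to three orders of magnitude", not measured values, and the function name is hypothetical.

```python
# Rough estimate of Helix item duration under GC / JIT stress modes.
# Assumption: stress slows a run by a flat factor of 10**orders.
BASELINE_MINUTES = 5  # successful x64 run reported earlier in this thread


def stressed_minutes(baseline_minutes, orders_of_magnitude):
    """Scale a baseline duration by 10**orders_of_magnitude."""
    return baseline_minutes * 10 ** orders_of_magnitude


for orders in (2, 3):
    minutes = stressed_minutes(BASELINE_MINUTES, orders)
    print(f"10^{orders} slowdown: {minutes} minutes (~{minutes / 60:.1f} hours)")
```

Under these assumptions the upper end reproduces the 5000-minute figure mentioned above.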

davidwrighton · 2023-03-11T01:48:11Z

@trylek, the tests are set up so that under striping, we run only about 600 or so tests, which results in them finishing in a vaguely reasonable timeframe under stress. If we're seeing intermittent failures here, then yep, we've got a real product bug to look into, and what I saw here looked like an intermittent failure that should be investigated.
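
A minimal sketch of what striping at this size could look like, using the 5834-test count reported earlier and a target of about 600 tests per stripe. The round-robin assignment and the helper names are assumptions for illustration only; the actual striping mechanism in the test infrastructure may partition tests differently.

```python
import math

TOTAL_TESTS = 5834       # test count reported for HardwareIntrinsics_ro in this thread
TARGET_PER_STRIPE = 600  # "about 600 or so" tests per stripe


def stripe_count(total, target_per_stripe):
    """Smallest number of stripes that keeps every stripe at or below the target."""
    return math.ceil(total / target_per_stripe)


def stripe_sizes(total, stripes):
    """Sizes produced by a hypothetical round-robin split (test i -> stripe i % stripes)."""
    base, extra = divmod(total, stripes)
    return [base + 1 if s < extra else base for s in range(stripes)]


stripes = stripe_count(TOTAL_TESTS, TARGET_PER_STRIPE)
sizes = stripe_sizes(TOTAL_TESTS, stripes)
print(f"{stripes} stripes, {min(sizes)}-{max(sizes)} tests each")
```

With these inputs each stripe stays below the 600-test target while covering every test exactly once.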

markples · 2023-03-17T23:06:54Z

Thanks @davidwrighton for pointing out the details on striping - I now see the msbuild variable is called NumberOfStripesToUseInStress so my reporting of a normal run was inaccurate.

I am curious if you can share your reasoning for suspecting a product issue here. Looking at one of the failures (AzDO build https://dev.azure.com/dnceng-public/public/_build/results?buildId=200759&view=results), I see

top of log:

BEGIN EXECUTION
"C:\h\w\9D430947\p\watchdog.exe" 300 "C:\h\w\9D430947\p\corerun.exe" -p "System.Reflection.Metadata.MetadataUpdater.IsSupported=false"  HardwareIntrinsics_ro.dll 
22:21:46.449 Running test: global::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.Abs_Vector128_Double()
Supported ISAs:

bottom of log:

22:26:48.580 Passed test: global::JIT.HardwareIntrinsics.General._Vector128_1.Program.op_AdditionByte()
22:26:48.581 Running test: global::JIT.HardwareIntrinsics.General._Vector128_1.Program.op_AdditionDouble()
Beginning scenario: RunBasicScenario_UnsafeRead
SUCCESS: The process "corerun.exe" with PID 5812 has been terminated.
2023-03-10T22:26:49.815Z	INFO   	run.py	run(48)	main	Beginning reading of test results.

Start @ 22:21:46.449
Start failing test @ 22:26:48.581
Results message @ 22:26:49.815

The failing test is only running for just over a second before the overall 5 minute timeout hits.

Recent changes have greatly increased this timeout value, so if you are seeing a product issue, it may now be hidden.
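
The timing argument above can be checked directly from the three timestamps in the log excerpt. This is only a sketch: the variable and function names are hypothetical, and the 300-second value is taken from the watchdog.exe argument in the log.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"


def seconds_between(start, end):
    """Elapsed seconds between two same-day HH:MM:SS.mmm timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Timestamps copied from the log excerpt above.
first_test_start = "22:21:46.449"
failing_test_start = "22:26:48.581"
results_message = "22:26:49.815"
WATCHDOG_SECONDS = 300  # first argument to watchdog.exe in the log

elapsed_before_failing_test = seconds_between(first_test_start, failing_test_start)
failing_test_window = seconds_between(failing_test_start, results_message)

print(f"elapsed before failing test: {elapsed_before_failing_test:.3f} s")
print(f"failing test window:         {failing_test_window:.3f} s")
print(f"watchdog budget:             {WATCHDOG_SECONDS} s")
```

The second figure is an upper bound on how long the failing test ran, since the results message is logged after the process has already been terminated.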

markples · 2023-03-17T23:08:35Z

The failures we've seen so far have been seen elsewhere, so I will resolve the conflicts and update this for merging. @trylek, Brian and I have both made changes/reviewed this, so please let me know if there's anything that you'd like to look at more closely. Thanks!

trylek

Looks great to me, thanks Mark!

…analyzers errors

…ion change

SamMonoRT · 2023-04-05T20:32:33Z

@markples @JulieLeeMSFT @BrianBohe - Looking at the failure in https://dev.azure.com/dnceng-public/public/_build/results?buildId=226494&view=logs&j=e45de9b4-b3b3-54f9-2ea5-8e56201c788d&t=92db4d17-27b4-5f87-01bd-8cef7a796420:

For such large, extensive changes to test infrastructure, given that the same tests are run for both runtimes, please kick off "/azp run runtime-extra-platforms", which should trigger some more lanes, one of which is the 'linux-arm64 Release AllSubsets_Mono_LLVMFullAot_RuntimeTests llvmfullaot' extra-platforms lane. In that lane, the LLVM AOT cross-compile CoreCLR tests job runs and fails on a couple of tests, so the Send to Helix job doesn't collect any logs. As a result, Mono doesn't seem to have any aot-llvm arm64 runtime test coverage.
An example of that is https://dev.azure.com/dnceng-public/public/_build/results?buildId=226494&view=logs&j=e45de9b4-b3b3-54f9-2ea5-8e56201c788d&t=92db4d17-27b4-5f87-01bd-8cef7a796420
The two tests failing in the above are:

Regression_1.dll:
Assertion at /__w/1/s/src/mono/mono/mini/aot-compiler.c:3956, condition `image_index < MAX_IMAGE_INDEX' not met

GitHub_27678.dll:
Assertion at /__w/1/s/src/mono/mono/mini/memory-access.c:120, condition `size < MAX_INLINE_COPY_SIZE' not met

A PR was merged to fix one of those: [mono][aot] Fix an assert in the aot compiler. #84343
For the second one, GitHub_27678.dll, it seems to be ignoring the condition in the issues.targets file that excludes it for the Mono runtime. But on a deeper look at the failure logs, the file is at .../coreclr/linux.arm64.Release/JIT/Regression/Regression_3/GitHub_27678.dll, whereas the issues.targets entry excludes it from the JitBlue directory, as in: ....JIT/Regression/JitBlue/GitHub_27678

You should work to disable the test correctly for Mono. Also, @trylek, I want to understand the motivation behind the CoreCLR cross-compiler tests running under a Mono AOT-LLVM arm64 lane. Shouldn't these be two different lanes?

markples · 2023-04-05T21:52:22Z

@SamMonoRT Thank you for the info and investigation. I'll add runtime-extra-platforms.

The naming in issues.targets will work the same before and after merging unless certain internal test changes are made (I believe that splitting a single Main into multiple [Fact]s can cause this). However, merged tests cause this AOT step to happen during Regression_3 for all of the tests inside the group before we check for specific tests. I opened #84380, which effectively pulls GitHub_27678 into a separate executable so that the exclusion happens in time.

markples · 2023-04-06T19:26:08Z

@SamMonoRT I just wanted to double-check that you weren't waiting for me on anything else. I think these two were the full impact (and both builds are fixed, so I abandoned my PR; one still fails at runtime, but issues.targets handles that).

BrianBohe added 4 commits March 8, 2023 00:40

Remove test straccess4 - identical to straccess3_cs_d

07c3f14

Remove duplicate csproj files - identical to versions without _cs in …

ba6d22f

…their names

Manually remove main arg from out_of_range_fp_to_int_conversions.

c425491

Remove Main from straccess3 (save as comment for local use)

8900958

ghost assigned markples Mar 10, 2023

dotnet-issue-labeler bot added the area-System.Reflection.Metadata label Mar 10, 2023

markples added test-enhancement Improvements of test source code and removed area-System.Reflection.Metadata labels Mar 10, 2023

markples mentioned this pull request Mar 10, 2023

Testmerging JIT/Directed #81969

Closed

build-analysis bot mentioned this pull request Mar 10, 2023

[release/6.0] Doublelinklist GC failures on Mono #83245

Closed

build-analysis bot mentioned this pull request Mar 10, 2023

IOException running NuGet-Migrations during tests in dotnet CLI first run #80619

Closed

BrianBohe and others added 13 commits March 10, 2023 12:13

Remove switchdefaultonly* Main argument (replace with noinline consta…

b7b6d19

…nt argument)

Remove IL namespace/class declarations (in shift/) that break ILTrans…

8cabb62

…form tool Related PR dotnet#64837

Use $(TestLibraryProjectPath) in 5 ilproj files (helps ILTransform)

1661858

Related PR dotnet#66157

Manually rename ldfldstatic1_il_r

3395e6b

The current dedup strategy groups projects based on suffixes (_r, _d, etc.). This file does not have a matching _d version, which leads ILTransform to rename projects inconsistently.

[cs-main] Remove unused Main arg from arglist\vararg.cs

19f8801

[ILTransform -p] Rename _d/_r ilproj to _il_d/_il_r (but undo debugin…

b237f5e

…fo/tests)

[ILTransform -n] 3 iterations - Deduplicate project names

a836382

[ILTransform -a] Match .assembly names to project names

a0ff951

[ILTransform -prociso] Set RequiresProcessIsolation based on other at…

549e942

…tributes

[ILTransform -collapse-main-sig] Collapse .method Main to one line to…

1446025

… help future ILTransform passes

[ILTransform -public] Make entry points public, add class if entrypoi…

82dc64b

…nt is global

Manually remove "public public" from C# (ILTransform failure on multi…

d5c2e9c

…line declaration)

Manually fix accessibility after exposing main class (other now-publi…

83fdffc

…c methods with private signatures)

BrianBohe added 4 commits March 10, 2023 12:13

Rename IL classes to avoid keywords

2dc230d

Manually update issues.targets (after renaming various projects)

f8aca8d

Manually convert more_tails.cs (and generated IL from it)

bef45f9

Manually fix punning.cs (depended on simple wrapper using "class Prog…

e9f2c99

…ram")

markples force-pushed the merge-directed branch from 1fdcfeb to e9f2c99 Compare March 10, 2023 20:14

BrianBohe approved these changes Mar 11, 2023

trylek approved these changes Mar 21, 2023

Merge remote-tracking branch 'dotnet/main' into merge-directed

aea9c6d

markples marked this pull request as ready for review March 23, 2023 17:29

markples added 3 commits March 23, 2023 14:49

Merge remote-tracking branch 'dotnet/main' into merge-directed

8f1e73e

Finish merge - incorporate changes to csproj/Dir.B.props - fix xunit.…

d01ee5e

…analyzers errors

Rename csproj to be shorter/more consistent with upcoming JIT/Regress…

b4b702a

…ion change

markples merged commit 18a0403 into dotnet:main Mar 24, 2023

markples mentioned this pull request Mar 24, 2023

Merge the remaining groups of runtime tests #71732

Open

33 tasks

JulieLeeMSFT mentioned this pull request Mar 29, 2023

CodeGen infrastructure work planned for .NET 8 #79018

Closed

17 tasks

ghost locked as resolved and limited conversation to collaborators May 7, 2023
