Convert JIT\Directed tests to merged test groups #83256
Tagging subscribers to this area: @dotnet/area-system-reflection-metadata
Issue Details: (based on #81969)
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
LGTM! Let's see if that typo was the only problem.
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
…form tool Related PR dotnet#64837
The current dedup strategy groups projects based on suffixes such as _r and _d. This file does not have a matching _d version, which leads ILTransform to rename projects inconsistently.
… help future ILTransform passes
…line declaration)
…c methods with private signatures)
1fdcfeb to e9f2c99
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
Does it make sense to split HardwareIntrinsics_ro? Would that avoid these timeouts? I don't see any issues related to this PR.
What do you think about this, @trylek?
What is the number of tests in the group and its duration in checked mode (without any stress modes)?
It looks like it has 5834 tests. I see a successful x64 run that took about 5 minutes. Even the timeouts are only showing 5-10 minutes in the test group. This includes timestamps that appear to be after the test script is terminated, so I don't think it's getting stuck. (These tests are generated, etc., so there are a lot of them. @davidwrighton added various things like the test striping mechanism just to handle them at all, so maybe there's more to do there.) There's a comment:
Could this be a case where queue time is counting against a timeout value?
/azp run runtime-coreclr outerloop
Azure Pipelines successfully started running 1 pipeline(s).
Hmm, both of these values seem far beyond what I'd consider reasonable. Please remember that in GC / JIT stress modes the tests are typically two to three orders of magnitude slower, and Helix items taking 5000 minutes are less fun than they seem. Under JIT/Methodical, I tried to keep the group size around 400 tests. We can certainly be more aggressive for tests that are really tiny, but running the entire Pri1 test set as a single Helix item is obviously not desirable for multiple reasons.
@trylek, the tests are set up so that under striping we run only about 600 or so tests, which results in them finishing in a vaguely reasonable timeframe under stress. If we're seeing intermittent failures here, then yep, we've got a real product bug to look into, and what I saw here looked like an intermittent failure that should be investigated.
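For readers unfamiliar with striping, here is a minimal sketch of the idea in Python. It is not the mechanism used in dotnet/runtime, whose details are not given in this thread. It assumes a simple round-robin split, and the names `stripe`, `TOTAL_TESTS`, and `TARGET_PER_STRIPE` are hypothetical. Only the 5834 test count and the "about 600" target come from the comments above.

```python
import math


def stripe(tests, stripe_count):
    """Split tests round-robin into stripe_count groups (hypothetical scheme)."""
    if stripe_count < 1:
        raise ValueError("stripe_count must be at least 1")
    # Slice i takes every stripe_count-th test starting at index i.
    return [tests[i::stripe_count] for i in range(stripe_count)]


TOTAL_TESTS = 5834        # test count quoted in the thread
TARGET_PER_STRIPE = 600   # "about 600 or so tests" per run, per the thread

# Smallest number of stripes that keeps each stripe at or below the target.
stripe_count = math.ceil(TOTAL_TESTS / TARGET_PER_STRIPE)

# Placeholder test names; the real test names are not listed in the thread.
tests = [f"test_{n}" for n in range(TOTAL_TESTS)]
stripes = stripe(tests, stripe_count)
sizes = [len(s) for s in stripes]

print(stripe_count, min(sizes), max(sizes))
```

Under these assumptions each run executes one stripe rather than all 5834 tests, which is what keeps a stress-mode run within a bounded time.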
Thanks @davidwrighton for pointing out the details on striping - I now see the msbuild variable is called I am curious if you can share your reasoning for suspecting a product issue here. Looking at one of the failures (AzDO build https://dev.azure.com/dnceng-public/public/_build/results?buildId=200759&view=results), I see top of log:
bottom of log:
Start @ 22:21:46.449
The failing test is only running for just over a second before the overall 5-minute timeout hits. Recent changes have greatly bumped up this value, so if you are seeing a product issue, then it may be hidden now.
The failures we've seen so far have been seen elsewhere, so I will resolve the conflicts and update this for merging. @trylek, Brian and I have both made changes/reviewed this, so please let me know if there's anything that you'd like to look at more closely. Thanks! |
Looks great to me, thanks Mark!
@markples @JulieLeeMSFT @BrianBohe - Looking at https://dev.azure.com/dnceng-public/public/_build/results?buildId=226494&view=logs&j=e45de9b4-b3b3-54f9-2ea5-8e56201c788d&t=92db4d17-27b4-5f87-01bd-8cef7a796420 failure,
You should work to disable the test correctly for Mono. Also, @trylek, I want to understand the motivation behind the coreclr cross-compiler tests running under a Mono AOT-LLVM arm64 lane. Shouldn't these be two different lanes?
@SamMonoRT Thank you for the info and investigation. I'll add runtime-extra-platforms. The naming in issues.targets will work the same before and after merging unless certain internal test changes are made (I believe that splitting a single |
@SamMonoRT I just wanted to double-check that you weren't waiting for me on anything else. I think these two were the full impact (both builds are fixed, so I abandoned my PR; one still fails at runtime, but issues.targets handles that).