Exception in test infrastructure - System.InvalidOperationException: Collection was modified after the enumerator was instantiated #11063

4creators · 2018-09-11T10:08:36Z

For the second time I have encountered the following error during CI OSX test run.

https://ci.dot.net/job/dotnet_coreclr/job/master/job/x64_checked_osx10.12_innerloop_tst_prtest/5550/

02:51:26   Unhandled Exception: System.InvalidOperationException: Collection was modified after the enumerator was instantiated.
02:51:26      at System.Collections.Generic.Stack`1.Enumerator.MoveNext()
02:51:26      at Xunit.Sdk.DisposalTracker.Dispose() in C:\projects\xunit\src\xunit.execution\Sdk\DisposalTracker.cs:line 26
02:51:26      at Xunit.Sdk.ExtensibilityPointFactory.Dispose() in C:\projects\xunit\src\xunit.execution\Sdk\ExtensibilityPointFactory.cs:line 54
02:51:26      at Xunit.Sdk.TestFramework.Dispose() in C:\projects\xunit\src\xunit.execution\Sdk\Frameworks\TestFramework.cs:line 49
02:51:26      at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
02:51:26   --- End of stack trace from previous location where exception was thrown ---
02:51:26      at System.Threading.ThreadPoolWorkQueue.Dispatch()

The text was updated successfully, but these errors were encountered:

danmoseley · 2019-01-17T17:50:32Z

@ViktorHofer does this look familiar?

ViktorHofer · 2019-01-17T17:52:09Z

that's something that dotnet/arcade#1613 should fix.

ViktorHofer · 2019-11-18T19:17:33Z

@echesakov any numbers how often this happens? have somebody already root-caused the issue? Seems like this lives in xunit/xunit and not in the runner.

danmoseley · 2019-11-18T19:19:10Z

xunit/xunit#1855

safern · 2020-05-15T18:23:32Z

Happened again in: https://dev.azure.com/dnceng/public/_build/results?buildId=646104&view=results

https://helix.dot.net/api/2019-06-17/jobs/6259c638-1c48-444a-9df5-b3d8b89354e4/workitems/PayloadGroup0/console

Anipik · 2020-11-02T21:08:00Z

happened again in #43651

ViktorHofer · 2020-11-03T12:34:54Z

Probably coming from: https://github.com/xunit/xunit/blob/a8614d34999c889e1c8014f679de95d20eae1304/src/xunit.execution/Sdk/DisposalTracker.cs#L26-L27

I see @danmosemsft already linked to the xunit issue: xunit/xunit#1855

stephentoub · 2020-11-09T17:42:17Z

I've been hitting this reasonably frequently, e.g.
https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-44409-merge-8a1a8bc2383a41f093/PayloadGroup0/console.e860a3c4.log?sv=2019-07-07&se=2020-11-29T17%3A18%3A38Z&sr=c&sp=rl&sig=0mS1HKsUz%2FgaWANt6pRwMCfnGXyaOIXCnwq327pVblg%3D

Unhandled exception.   Discovering: Exceptions.ForeignThread.XUnitWrapper
System.InvalidOperationException: Collection was modified after the enumerator was instantiated.
   at System.Collections.Generic.Stack`1.Enumerator.MoveNext()
   at Xunit.Sdk.DisposalTracker.Dispose() in C:\Dev\xunit\xunit\src\xunit.execution\Sdk\DisposalTracker.cs:line 26
   at Xunit.Sdk.TestFramework.Dispose() in C:\Dev\xunit\xunit\src\xunit.execution\Sdk\Frameworks\TestFramework.cs:line 52
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__140_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
Aborted (core dumped)

stephentoub · 2020-11-09T17:43:05Z

I see @danmosemsft already linked to the xunit issue: xunit/xunit#1855

Which is closed.

ViktorHofer · 2020-11-09T18:20:51Z

Which is closed.

As it was fixed in xunit v3 which hasn't shipped yet.

naricc · 2020-11-23T16:31:07Z

I am also seeing this intermittently on Mono runtime test lanes.

radical · 2020-12-03T14:52:42Z

Seeing this in https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-master-21906b723775461185/PayloadGroup0/console.81c9eaa1.log?sv=2019-07-07&se=2020-12-23T13%3A13%3A31Z&sr=c&sp=rl&sig=D65mDzIZzZ36F68UhRgc5vOffnOAUJ8%2BXHelojRWpIA%3D

on Mono Linux x64 Release @ Ubuntu.1804.Amd64.Open

ViktorHofer · 2020-12-27T13:15:55Z

Happened again in https://dev.azure.com/dnceng/public/_build/results?buildId=932232&view=ms.vss-test-web.build-test-results-tab&runId=29520600&resultId=102132&paneView=attachments.

maryamariyan · 2021-09-09T22:38:26Z

@ericstj and I were looking at the linked test failures again and found an interesting pattern. All of the clr test failures shown here were using XUnitWrapper in the logs.

e.g. seen in here:

Unhandled exception.   Discovering: readytorun.r2rdump.XUnitWrapper (method display = ClassAndMethod, method display options = None)
  Discovered:  readytorun.r2rdump.XUnitWrapper (found 1 test case)
  Starting:    readytorun.r2rdump.XUnitWrapper (parallel test collections = on, max threads = 4)
System.InvalidOperationException: Collection was modified after the enumerator was instantiated.
   at System.Collections.Generic.Stack`1.Enumerator.MoveNext()
   at Xunit.Sdk.DisposalTracker.Dispose() in C:\Dev\xunit\xunit\src\xunit.execution\Sdk\DisposalTracker.cs:line 26
   at Xunit.Sdk.ExtensibilityPointFactory.Dispose() in C:\Dev\xunit\xunit\src\xunit.execution\Sdk\ExtensibilityPointFactory.cs:line 53
   at Xunit.Sdk.TestFramework.Dispose() in C:\Dev\xunit\xunit\src\xunit.execution\Sdk\Frameworks\TestFramework.cs:line 52
   at System.Threading.Tasks.Task.<>c.<ThrowAsync>b__127_1(Object state)
   at System.Threading.QueueUserWorkItemCallbackDefaultContext.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
   at System.Threading.Thread.StartCallback()

Are we doing anything unusual in that test infrastructure?

ChadNedzlek · 2021-09-09T22:47:22Z

I think this is just a bug in the XUnit execution engine, which I don't think is under active development anymore. I don't think it's caused by anything in the actual testing.

ericstj · 2021-09-09T22:52:01Z

@maryamariyan is actually looking at fixing that bug, and as part of that fix we'd like to describe why it's happening. Xunit has a static API that will operate on this shared stack, so something is calling that static API from multiple threads. We were going to trace a test exectuion to see why that happens from multiple threads. To find a good test to trace we started looking at the existing repros and that's when we noticed this pattern. It seemed too curious not to ignore.

We'd also like to make sure that a fix for this InvalidOperationException doesn't just result in exposing a new exception because we disposed some shared state that the other thread needs. If we can identify root cause it would be good.

BruceForstall · 2021-09-09T22:55:34Z

That's how the CoreCLR tests work: a set of "XUnitWrapper" assemblies are built with one class for each test with an xunit "Fact" for the test, which gets invoked by xunit, which then spawns an execution of the test. You can see the built wrapper source code in the artifacts, e.g., artifacts\tests\coreclr\windows.x64.Checked\TestWrappers\JIT.CodeGenBringUpTests\JIT.CodeGenBringUpTests.XUnitWrapper.cs

I think the code which creates these wrappers is src\tests\run.proj

@trylek can probably explain the process more

ericstj · 2021-09-09T23:02:58Z

built with one class for each test

I see, so perhaps this pattern is appearing because we're stressing Xunit. It's interesting that we pass many test assemblies to a single invocation. That could be the reason this happens more often with these wrapper tests.

BruceForstall · 2021-09-09T23:12:24Z

Note that we pass -parallel collections to xunit. I think (?) that allows tests invoked by different wrapper assemblies to be run in parallel, but those in a single wrapper assembly must be serialized. I can't remember why we did that; possibly to avoid hitting this issue (as much).

agocke · 2021-09-09T23:37:28Z

I believe -parallel collections means to parallelize every XUnit collection, which is controlled by the XUnit Collection attribute that you can put on test classes. If a class isn't attributed, I think all classes are by-default different collections, so effectively every test class is executed in parallel.

ericstj · 2021-09-09T23:50:13Z

Interesting, that jives with "stressing" xunit. Another characteristic we noticed was that the tests failing were all PayloadGroup0. At first I thought this was just a reporting quirk, but now I noticed that it's actually the work item name:

runtime/src/tests/Common/testgrouping.proj

Lines 119 to 137 in 57bfe47

    
               <XUnitWrapperGrouping Include="$(TestBinDir)baseservices\compilerservices\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)baseservices\exceptions\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)baseservices\multidimmarray\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)baseservices\TieredCompilation\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)baseservices\typeequivalence\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)baseservices\varargs\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)CoreMangLib\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)Exceptions\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)GC\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)hosting\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)ilasm\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)Loader\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)managed\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)readytorun\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)reflection\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)Regressions\**\*.XUnitWrapper.dll; 
        
                                              $(TestBinDir)tracing\**\*.XUnitWrapper.dll" 
        
                                     Exclude="$(TestBinDir)Loader\classloader\Loader.classloader.XUnitWrapper.dll"> 
        
                 <PayloadGroup>PayloadGroup0</PayloadGroup>

ViktorHofer · 2021-09-10T07:58:34Z

@ericstj and I were looking at the linked test failures again and found an interesting pattern. All of the clr test failures shown here were using XUnitWrapper in the logs.

Even though the failure occurs in runtime tests most of the time (which use the xunitwrapper infra) there are also hits in libraries tests.

ericstj · 2021-09-15T19:45:09Z

I'd be very interested in seeing where this hits in the libraries tests. If you have any logs please share them. I couldn't find any in the linked builds.

After looking at the call-stacks here and the tests they repro in, it's all happening due to us running multiple test assemblies at once. The TestFramework object has a 1:1 relationship with test assembly and when that object is disposed it calls the static ExtensibilityFactory.Dispose which modifies a static DisposalTracker field without a lock. These runtime tests use a lot of assemblies with very few tests, so they have a high likelihood of concurrent execution. @bradwilson mentioned that .NETCore xunit runner isn't designed to handle running multiple assemblies at once and we might be the only folks doing it due to our special runner. It might be a good idea to avoid multiple assemblies long term, as it puts us off the happy path and might cause other issues. I can see us dealing with that by either merging the assemblies (they are all generated after all) or making one call per assembly.

We'll still pursue the xunit fix, but I wanted to point out we are doing something rather special here that seems to be causing/exacerbating this problem.

agocke · 2021-09-15T19:49:52Z

@bradwilson mentioned that .NETCore xunit runner isn't designed to handle running multiple assemblies at once

~~I don't understand this. https://xunit.net/docs/running-tests-in-parallel seems to document the option as the default for the XUnit 1 and 2 console runners.~~

Never mind, I misread -- the default is collections.

ericstj · 2021-09-15T19:52:10Z

Test classes/cases work in parallel, but multiple assemblies in parallel does not (is unknown). Apparently it's only working for us because we have a fork of the runner.

agocke · 2021-09-15T19:53:14Z

Well, assemblies is a documented option to the runner, it's just not the default. Maybe it's just less robust?

ViktorHofer · 2021-09-15T19:55:37Z

Apparently it's only working for us because we have a fork of the runner.

Our fork is exactly the same as the upstream xunit.console runner (unsure if it still exists upstream) except for the following changes which we documented here: https://github.com/dotnet/arcade/blob/main/src/Microsoft.DotNet.XUnitConsoleRunner/README.md.

ericstj · 2021-09-15T20:24:36Z

My understanding is that even the upstream runner didn't support multiple test assemblies very well and was a motivating factor for a change in direction in v3. V3's console runner only supports .NETFramework. I believe the runner was in this state because it was originally designed for .NETFramework which provided AppDomain isolation and test-specific dependency control that could allow for isolation of runner:test and test:test dependencies. We didn't have parity for that in .NETCore. Now we have some support with ALC functionality but I suspect it's not equivalent and I believe the decision was already made to change direction.

Maybe it's just less robust?

Yes, though the 2.0 runner supports that commandline option, we might be the only ones using it, and likely are the only ones stressing it to the extent we do in dotnet/runtime. So if we could refactor to single-assembly per invocation we'd be more likely to be on the happy path and avoid these sort of bugs.

bradwilson · 2021-09-15T20:52:41Z

The issue (in v2) is this:

We support running multiple assemblies in parallel, but the only runners that officially support this are the .NET Framework runners (our console and MSBuild runners). There is app domain isolation available there that helps prevent multiple test assemblies from stepping on each other, though there are edge cases inside .NET Framework itself that can still confound (like the fact that "current directory" is a process-wide setting, so calling Directory.SetCurrentDirectory affects all app domains in the process; by and large, app domain isolation does all the isolation heavy lifting).

Most teams--I want to say "everybody except the .NET team" but I don't have the data to back that up--who run .NET Core unit tests use the third party dotnet test functionality (or something that consumes that same plugin, like Visual Studio and VS Code). Whether those things support running multiple assemblies in parallel is up to them, and how they deal with isolation is up to them (the only reasonable option they'd have is process isolation, I assume). We don't provide that functionality because it's outside the scope of the VSTest plugin API.

What the .NET team appears to be doing is using a forked version of a deprecated runner that was a .NET Core version of our console runner (originally designed for the now removed dotnet xunit functionality), which is attempting to run multiple assemblies in the same address space. .NET Core does not have app domain isolation (or any other isolation, for that matter) that can help prevent the kinds of collisions that you've hit. This is why the bug your team is hitting is only being hit by you: because you're the only people trying to write your own runner in this way (or if anybody else is doing so, they've fixed or ignored the problem in their own way without raising it with xUnit.net).

As for v3, the entire design of test projects has changed from "compiles to library" to "compiles to application". The issues with isolation and dependency resolution in .NET Core were a big part of the forcing function. In v3, our supported console runner will allow you to run .NET Core tests in addition to .NET Framework tests (and even parallelize .NET Framework and .NET Core tests together, as well as having significantly better support for varying versions of .NET Framework or .NET Core, allow running 32- and 64-bit tests in parallel, etc.).

ViktorHofer · 2021-09-28T13:23:48Z

According to runfo this isn't happening anymore in main or release/6.0: https://runfo.azurewebsites.net/tracking/issue/111/

As the fix was merged into release/6.0 but not release/6.0-rc2 there are still a few hits for the RC2 branch but those will just naturally go away when RC2 doesn't produce any builds anymore.

Closing.

stephentoub · 2021-09-28T13:24:55Z

Woohoo

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

akoeplinger mentioned this issue May 15, 2020

Update XHarness for latest fixes #36484

Merged

jkotas mentioned this issue Jul 12, 2020

Sync shared compiler files #39162

Merged

jkotas mentioned this issue Jul 28, 2020

Cleanup mscorlib references under src\coreclr\src #39994

Merged

This was referenced Aug 2, 2020

System.InvalidOperationException: Collection was modified after the enumerator was instantiated. xunit/xunit#1855

Closed

Remove netstandard packaging infra #40239

Merged

echesakov mentioned this issue Aug 6, 2020

JIT: Properly handle a switch opt case during early flow opts #40434

Merged

jozkee mentioned this issue Aug 15, 2020

Reverts https://github.com/dotnet/corefx/pull/37583 ("Use clonefile for CopyFile, if available #37583") #40753

Merged

AndyAyersMS mentioned this issue Aug 27, 2020

Fix SPMI to handle replays of BBINSTR jit method contexts #41386

Merged

jkotas mentioned this issue Nov 17, 2020

Special-case zero parameter array allocations in GetParameters #44759

Merged

safern added the blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' label Nov 23, 2020

runfoapp bot mentioned this issue Nov 23, 2020

Infrastructure - Status/Health #702

Closed

ViktorHofer mentioned this issue Dec 16, 2020

[master] Update dependencies from dotnet/runtime #45387

Merged

AndyAyersMS mentioned this issue Dec 17, 2020

RyuJIT: Don't emit cast helpers for (T)array.Clone() #45311

Merged

danmoseley mentioned this issue Jan 8, 2021

[release/5.0] Backport System.Speech to compat pack #46729

Merged

akoeplinger mentioned this issue Jan 15, 2021

[aot] Avoid creating runtime invoke wrappers for methods which return a byreflike type, the return value cannot be boxed. #47033

Merged

josalem mentioned this issue Sep 9, 2021

Test failure tracing/eventpipe/pauseonstart/pauseonstart/pauseonstart.sh #58781

Closed

terrajobst mentioned this issue Sep 9, 2021

Fork Xunit Console Runner to Arcade dotnet/arcade#1613

Closed

5 tasks

am11 mentioned this issue Sep 13, 2021

Share init-compiler with arcade #59018

Merged

premun mentioned this issue Sep 15, 2021

[release/6.0] Update dependencies from dotnet/emsdk dotnet/arcade #59081

Merged

maryamariyan mentioned this issue Sep 16, 2021

Add lock around DisposalTracker xunit/xunit#2382

Merged

jakobbotsch mentioned this issue Sep 16, 2021

Use unsigned casts for array/Span indexers #57970

Merged

ericstj mentioned this issue Sep 17, 2021

[main] Update dependencies from 8 repositories #59195

Merged

maryamariyan mentioned this issue Sep 17, 2021

Update xunit version on dotnet/runtime #59275

Merged

ViktorHofer closed this as completed Sep 28, 2021

ericstj mentioned this issue Sep 30, 2021

Avoid running test assemblies in parallel in the same xunit process #59821

Open

ghost locked as resolved and limited conversation to collaborators Nov 3, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Exception in test infrastructure - System.InvalidOperationException: Collection was modified after the enumerator was instantiated #11063

Exception in test infrastructure - System.InvalidOperationException: Collection was modified after the enumerator was instantiated #11063

4creators commented Sep 11, 2018 •

edited by jaredpar

Loading

danmoseley commented Jan 17, 2019

ViktorHofer commented Jan 17, 2019

ViktorHofer commented Nov 18, 2019

danmoseley commented Nov 18, 2019

safern commented May 15, 2020

Anipik commented Nov 2, 2020

ViktorHofer commented Nov 3, 2020

stephentoub commented Nov 9, 2020

stephentoub commented Nov 9, 2020

ViktorHofer commented Nov 9, 2020

naricc commented Nov 23, 2020

radical commented Dec 3, 2020

ViktorHofer commented Dec 27, 2020

maryamariyan commented Sep 9, 2021

ChadNedzlek commented Sep 9, 2021

ericstj commented Sep 9, 2021 •

edited

Loading

BruceForstall commented Sep 9, 2021 •

edited

Loading

ericstj commented Sep 9, 2021 •

edited

Loading

BruceForstall commented Sep 9, 2021

agocke commented Sep 9, 2021

ericstj commented Sep 9, 2021

ViktorHofer commented Sep 10, 2021

ericstj commented Sep 15, 2021

agocke commented Sep 15, 2021 •

edited

Loading

ericstj commented Sep 15, 2021 •

edited

Loading

agocke commented Sep 15, 2021

ViktorHofer commented Sep 15, 2021

ericstj commented Sep 15, 2021

bradwilson commented Sep 15, 2021 •

edited

Loading

ViktorHofer commented Sep 28, 2021

stephentoub commented Sep 28, 2021

Exception in test infrastructure - System.InvalidOperationException: Collection was modified after the enumerator was instantiated #11063

Exception in test infrastructure - System.InvalidOperationException: Collection was modified after the enumerator was instantiated #11063

Comments

4creators commented Sep 11, 2018 • edited by jaredpar Loading

danmoseley commented Jan 17, 2019

ViktorHofer commented Jan 17, 2019

ViktorHofer commented Nov 18, 2019

danmoseley commented Nov 18, 2019

safern commented May 15, 2020

Anipik commented Nov 2, 2020

ViktorHofer commented Nov 3, 2020

stephentoub commented Nov 9, 2020

stephentoub commented Nov 9, 2020

ViktorHofer commented Nov 9, 2020

naricc commented Nov 23, 2020

radical commented Dec 3, 2020

ViktorHofer commented Dec 27, 2020

maryamariyan commented Sep 9, 2021

ChadNedzlek commented Sep 9, 2021

ericstj commented Sep 9, 2021 • edited Loading

BruceForstall commented Sep 9, 2021 • edited Loading

ericstj commented Sep 9, 2021 • edited Loading

BruceForstall commented Sep 9, 2021

agocke commented Sep 9, 2021

ericstj commented Sep 9, 2021

ViktorHofer commented Sep 10, 2021

ericstj commented Sep 15, 2021

agocke commented Sep 15, 2021 • edited Loading

ericstj commented Sep 15, 2021 • edited Loading

agocke commented Sep 15, 2021

ViktorHofer commented Sep 15, 2021

ericstj commented Sep 15, 2021

bradwilson commented Sep 15, 2021 • edited Loading

ViktorHofer commented Sep 28, 2021

stephentoub commented Sep 28, 2021

4creators commented Sep 11, 2018 •

edited by jaredpar

Loading

ericstj commented Sep 9, 2021 •

edited

Loading

BruceForstall commented Sep 9, 2021 •

edited

Loading

ericstj commented Sep 9, 2021 •

edited

Loading

agocke commented Sep 15, 2021 •

edited

Loading

ericstj commented Sep 15, 2021 •

edited

Loading

bradwilson commented Sep 15, 2021 •

edited

Loading