Exception in test infrastructure - System.InvalidOperationException: Collection was modified after the enumerator was instantiated #11063
Comments
@ViktorHofer does this look familiar? |
that's something that dotnet/arcade#1613 should fix. |
@echesakov any numbers on how often this happens? Has somebody already root-caused the issue? Seems like this lives in xunit/xunit and not in the runner. |
happened again in #43651 |
Probably coming from: https://github.com/xunit/xunit/blob/a8614d34999c889e1c8014f679de95d20eae1304/src/xunit.execution/Sdk/DisposalTracker.cs#L26-L27 I see @danmosemsft already linked to the xunit issue: xunit/xunit#1855 |
I've been hitting this reasonably frequently, e.g.
|
Which is closed. |
As it was fixed in xunit v3 which hasn't shipped yet. |
I am also seeing this intermittently on Mono runtime test lanes. |
@ericstj and I were looking at the linked test failures again and found an interesting pattern. All of the clr test failures shown here were using XUnitWrapper in the logs. e.g. seen in here:
Are we doing anything unusual in that test infrastructure? |
I think this is just a bug in the XUnit execution engine, which I don't think is under active development anymore. I don't think it's caused by anything in the actual testing. |
@maryamariyan is actually looking at fixing that bug, and as part of that fix we'd like to describe why it's happening. Xunit has a static API that will operate on this shared stack, so something is calling that static API from multiple threads. We were going to trace a test execution to see why that happens from multiple threads. To find a good test to trace we started looking at the existing repros and that's when we noticed this pattern. It seemed too curious to ignore. We'd also like to make sure that a fix for this
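To make the failure mode described above concrete, here is a minimal C# sketch of a stack that is modified while it is being enumerated. `SharedTracker`, `LockedTracker`, `Noop`, and the `betweenItems` hook are hypothetical names invented for illustration; they are not xunit's `DisposalTracker` API. The hook stands in for a second thread pushing onto the stack mid-enumeration, so the failure reproduces deterministically instead of intermittently. The locked variant is only one possible mitigation, assumed here for contrast, and is not necessarily the fix that was made upstream.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a tracker that keeps disposables on a shared stack.
// Not xunit's actual code; names and shape are assumptions for illustration.
sealed class SharedTracker
{
    private readonly Stack<IDisposable> _toDispose = new Stack<IDisposable>();

    public void Add(IDisposable item) => _toDispose.Push(item);

    // betweenItems simulates another thread touching the stack mid-enumeration.
    public void DisposeAll(Action betweenItems = null)
    {
        foreach (IDisposable item in _toDispose)
        {
            item.Dispose();
            betweenItems?.Invoke();
        }
        _toDispose.Clear();
    }
}

// One possible mitigation (an assumption): snapshot and clear under a lock,
// then dispose the snapshot outside the lock.
sealed class LockedTracker
{
    private readonly object _gate = new object();
    private readonly Stack<IDisposable> _toDispose = new Stack<IDisposable>();

    public void Add(IDisposable item)
    {
        lock (_gate) { _toDispose.Push(item); }
    }

    public void DisposeAll(Action betweenItems = null)
    {
        IDisposable[] snapshot;
        lock (_gate)
        {
            snapshot = _toDispose.ToArray();
            _toDispose.Clear();
        }
        foreach (IDisposable item in snapshot)
        {
            item.Dispose();
            betweenItems?.Invoke();
        }
    }
}

sealed class Noop : IDisposable
{
    public void Dispose() { }
}

static class Demo
{
    // Pushing during enumeration invalidates the Stack<T> enumerator.
    public static bool UnsafeThrows()
    {
        var tracker = new SharedTracker();
        tracker.Add(new Noop());
        try
        {
            tracker.DisposeAll(() => tracker.Add(new Noop()));
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // The same interleaving is harmless when enumerating a private snapshot.
    public static bool LockedSucceeds()
    {
        var tracker = new LockedTracker();
        tracker.Add(new Noop());
        tracker.DisposeAll(() => tracker.Add(new Noop()));
        return true;
    }

    static void Main()
    {
        Console.WriteLine($"unsafe tracker throws: {UnsafeThrows()}");
        Console.WriteLine($"locked tracker succeeds: {LockedSucceeds()}");
    }
}
```

In a real test run the interleaving comes from thread timing rather than a callback, which would be consistent with the failure showing up only intermittently in CI.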
That's how the CoreCLR tests work: a set of "XUnitWrapper" assemblies are built with one class for each test with an xunit "Fact" for the test, which gets invoked by xunit, which then spawns an execution of the test. You can see the built wrapper source code in the artifacts, e.g., artifacts\tests\coreclr\windows.x64.Checked\TestWrappers\JIT.CodeGenBringUpTests\JIT.CodeGenBringUpTests.XUnitWrapper.cs I think the code which creates these wrappers is src\tests\run.proj @trylek can probably explain the process more |
I see, so perhaps this pattern is appearing because we're stressing Xunit. It's interesting that we pass many test assemblies to a single invocation. That could be the reason this happens more often with these wrapper tests. |
Note that we pass |
I believe |
Interesting, that jibes with "stressing" xunit. Another characteristic we noticed was that the tests failing were all
runtime/src/tests/Common/testgrouping.proj Lines 119 to 137 in 57bfe47
|
Even though the failure occurs in runtime tests most of the time (which use the xunitwrapper infra) there are also hits in libraries tests. |
I'd be very interested in seeing where this hits in the libraries tests. If you have any logs please share them. I couldn't find any in the linked builds. After looking at the call-stacks here and the tests they repro in, it's all happening due to us running multiple test assemblies at once. The We'll still pursue the xunit fix, but I wanted to point out we are doing something rather special here that seems to be causing/exacerbating this problem. |
Never mind, I misread -- the default is |
Running test classes and cases in parallel works, but running multiple assemblies in parallel does not (or at least is not known to). Apparently it's only working for us because we have a fork of the runner. |
Well, |
Our fork is exactly the same as the upstream xunit.console runner (unsure if it still exists upstream) except for the following changes which we documented here: https://github.com/dotnet/arcade/blob/main/src/Microsoft.DotNet.XUnitConsoleRunner/README.md. |
My understanding is that even the upstream runner didn't support multiple test assemblies very well and was a motivating factor for a change in direction in v3. V3's console runner only supports .NETFramework. I believe the runner was in this state because it was originally designed for .NETFramework which provided AppDomain isolation and test-specific dependency control that could allow for isolation of runner:test and test:test dependencies. We didn't have parity for that in .NETCore. Now we have some support with ALC functionality but I suspect it's not equivalent and I believe the decision was already made to change direction.
Yes, though the 2.0 runner supports that command-line option, we might be the only ones using it, and we are likely the only ones stressing it to the extent we do in dotnet/runtime. So if we could refactor to a single assembly per invocation, we'd be more likely to be on the happy path and avoid these sorts of bugs. |
The issue (in v2) is this: We support running multiple assemblies in parallel, but the only runners that officially support this are the .NET Framework runners (our console and MSBuild runners). There is app domain isolation available there that helps prevent multiple test assemblies from stepping on each other, though there are edge cases inside .NET Framework itself that can still confound (like the fact that "current directory" is a process-wide setting, so calling
Most teams--I want to say "everybody except the .NET team" but I don't have the data to back that up--who run .NET Core unit tests use the third party
What the .NET team appears to be doing is using a forked version of a deprecated runner that was a .NET Core version of our console runner (originally designed for the now removed
As for v3, the entire design of test projects has changed from "compiles to library" to "compiles to application". The issues with isolation and dependency resolution in .NET Core were a big part of the forcing function. In v3, our supported console runner will allow you to run .NET Core tests in addition to .NET Framework tests (and even parallelize .NET Framework and .NET Core tests together, as well as having significantly better support for varying versions of .NET Framework or .NET Core, allow running 32- and 64-bit tests in parallel, etc.). |
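The point above about "current directory" being a process-wide setting can be checked with a small C# sketch. The class and method names here are hypothetical, and `Directory.SetCurrentDirectory` is used as an illustration of the kind of call that leaks across threads; the original comment does not name a specific API. The sketch changes the directory on a worker thread and then reads it back from the calling thread.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical demo class; not part of xunit or the dotnet/runtime test infrastructure.
static class CurrentDirectoryDemo
{
    // Returns true if a directory change made on a worker thread
    // is observed from the calling thread.
    public static bool ChangeIsVisibleOnOtherThread()
    {
        string original = Directory.GetCurrentDirectory();
        string uniqueName = "cwd-demo-" + Guid.NewGuid().ToString("N");
        string target = Directory.CreateDirectory(
            Path.Combine(Path.GetTempPath(), uniqueName)).FullName;
        try
        {
            // Change the directory on a worker thread...
            var worker = new Thread(() => Directory.SetCurrentDirectory(target));
            worker.Start();
            worker.Join();

            // ...and observe it from the calling thread. Comparing only the last
            // path segment avoids false negatives from symlinked temp directories.
            string observed = Directory.GetCurrentDirectory();
            return Path.GetFileName(observed) == uniqueName;
        }
        finally
        {
            // Restore first: the current directory cannot be deleted on Windows.
            Directory.SetCurrentDirectory(original);
            Directory.Delete(target);
        }
    }

    static void Main()
    {
        Console.WriteLine($"change visible on other thread: {ChangeIsVisibleOnOtherThread()}");
    }
}
```

Because the setting belongs to the process rather than to a thread or an app domain, two test assemblies sharing one process can interfere with each other through it even when their other state is isolated.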
According to runfo this isn't happening anymore in main or release/6.0: https://runfo.azurewebsites.net/tracking/issue/111/ Because the fix was merged into release/6.0 but not release/6.0-rc2, there are still a few hits for the RC2 branch, but those will naturally go away once RC2 stops producing builds. Closing. |
Woohoo |
For the second time, I have encountered the following error during a CI OSX test run.
https://ci.dot.net/job/dotnet_coreclr/job/master/job/x64_checked_osx10.12_innerloop_tst_prtest/5550/