Roslyn analyzer throws error AD0001 NullReferenceException #3305

Open
3 tasks
akoeplinger opened this issue Jul 8, 2024 · 24 comments

Comments

@akoeplinger
Member

akoeplinger commented Jul 8, 2024

Build

https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=733128

Build leg reported

VMR Vertical Build / Ubuntu2404_DevVersions_x64 / Build

Pull Request

dotnet/sdk#42019

Known issue core information

Fill out the known issue JSON section by following the step by step documentation on how to create a known issue

 {
    "ErrorMessage" : "",
    "BuildRetry": true,
    "ErrorPattern": "error AD0001: Analyzer.*threw an exception of type 'System.NullReferenceException'",
    "ExcludeConsoleLog": false
 }
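
As a rough local check of the `ErrorPattern` above, the sketch below runs it against two log lines quoted elsewhere in this thread. It assumes the pattern is applied as an unanchored regular-expression search over each log line; that is an assumption about how Build Analysis evaluates it, and `matches_known_issue` is a hypothetical helper name, not part of any dnceng tooling.

```python
import re

# ErrorPattern copied from the known-issue JSON above.
ERROR_PATTERN = (
    r"error AD0001: Analyzer.*threw an exception of type "
    r"'System.NullReferenceException'"
)


def matches_known_issue(log_line: str) -> bool:
    """Hypothetical helper: True if the line contains a match for ERROR_PATTERN.

    Assumption: the pattern is used as an unanchored search, not a full match.
    """
    return re.search(ERROR_PATTERN, log_line) is not None


# The AD0001 line reported in this thread for the EFCore project.
nre_line = (
    "CSC : error AD0001: Analyzer "
    "'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' "
    "threw an exception of type 'System.NullReferenceException' with message "
    "'Object reference not set to an instance of an object.'. "
    "[/vmr/src/efcore/src/EFCore/EFCore.csproj]"
)

# The compiler crash variant reported later in this thread.
crash_line = 'error MSB6006: "csc.dll" exited with code 139'

print(matches_known_issue(nre_line))
print(matches_known_issue(crash_line))
```

Because the pattern names `System.NullReferenceException` explicitly, the other AD0001 variants mentioned in this thread (for example `System.InvalidProgramException`) would not be counted by it.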

@dotnet/dnceng

Release Note Category

  • Feature changes/additions
  • Bug fixes
  • Internal Infrastructure Improvements

Release Note Description

Additional information about the issue reported

No response

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=733128
Error message validated: [error AD0001: Analyzer.*threw an exception of type 'System.NullReferenceException']
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 7/8/2024 7:22:28 PM UTC

Report

Build | Definition | Step Name | Console log | Pull Request
2579533 | dotnet-runtime | Build product | Log |
2576798 | dotnet-runtime | Build product | Log |
2573475 | dotnet-runtime | Build product | Log |

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
0 | 0 | 3

@ViktorHofer
Member

Hmm... This error showed up in dotnet/sdk#36807 as well:

CSC : error AD0001: Analyzer 'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' threw an exception of type 'System.NullReferenceException' with message 'Object reference not set to an instance of an object.'. [/vmr/src/efcore/src/EFCore/EFCore.csproj]

But I don't see it in the above table. Are we missing data?

@akoeplinger
Member Author

It's now in the table, probably just took a bit.

@ViktorHofer
Member

Interesting, I was under the impression that this only affects VMR builds but apparently this is more widespread.

@akoeplinger
Member Author

akoeplinger commented Jul 9, 2024

Yeah. I queried Kusto and we hit this 133 times over the last 60 days just in build logs.

@ericstj
Member

ericstj commented Jul 10, 2024

Should we add "BuildRetry": true to the known issue pattern here? I haven't used that feature but it may help.

@akoeplinger
Member Author

Yeah. Done.

@jaredpar
Member

Not sure what is going on but a significant chunk of the hits here seem to be false positives. Just spent 10 minutes digging through builds and can't see the error on 75% of them.

@jaredpar
Member

Many of the builds are attributed to dotnet-runtime when it's actually dotnet-runtime-perf. Also the results look like this ...

image

If the compiler is indeed throwing there it's very hard to dig through to the failure.

@jaredpar
Member

jaredpar commented Jul 19, 2024

Cross posting the analysis from the linked bug on roslyn-analyzers

This is the stack of the NullReferenceException in at least one case:

System.NullReferenceException: Object reference not set to an instance of an object.
at System.Collections.Concurrent.ConcurrentDictionary`2.TryRemoveInternal(TKey key, TValue& value, Boolean matchValue, TValue oldValue)
at Microsoft.CodeQuality.Analyzers.Maintainability.AvoidUnusedPrivateFieldsAnalyzer.<>c__DisplayClass5_0.<Initialize>b__2(OperationAnalysisContext operationContext)
at Microsoft.CodeAnalysis.Diagnostics.AnalyzerExecutor.ExecuteAndCatchIfThrows_NoLock[TArg](DiagnosticAnalyzer analyzer, Action`1 analyze, TArg argument, Nullable`1 info, CancellationToken cancellationToken)

That almost certainly represents this line in the roslyn analyzers code:

IFieldSymbol field = ((IFieldReferenceOperation)operationContext.Operation).Field;
if (field.DeclaredAccessibility == Accessibility.Private)
{
    referencedPrivateFields.TryAdd(field, default);
    // Error is here. 
    maybeUnreferencedPrivateFields.TryRemove(field, out _);
}

Both values here are non-null:

  • maybeUnreferencedPrivateFields: is single assign and initialized to non-null at declaration
  • field: is used above this line several times without null reffing.

That seems like a runtime bug.

@jaredpar
Member

@akoeplinger, @ericstj, @ViktorHofer at least the variation I'm seeing above appears to be a runtime bug. Do we want to use this issue to track that or file a new one?

@ericstj
Member

ericstj commented Jul 22, 2024

I think a new one since the pattern observed here (single log statement with NRE) can't necessarily tie it to the single analyzer. Do you have a dump to help triage the runtime issue or is it just based on the callstack?

@jaredpar
Member

jaredpar commented Jul 22, 2024

can't necessarily tie it to the single analyzer.

The log statements are always single line (for reasons I don't understand). But if you get the matching binlog you can usually see the full stack trace if you dig down into the messages.

Do you have a dump to help triage the runtime issue or is it just based on the callstack?

This is just based on call stacks. It manifests as an exception in the analyzer, and by default the compiler catches those and issues a warning.

The compiler can be configured to fail fast when this happens by setting the following msbuild property

<Features>$(Features);debug-analyzers</Features>

The failures are mostly coming from the runtime pipeline builds, so you'd need to be set up to catch crash dumps on process FailFast. Do that and we should get a dump in a few days.

@jaredpar
Member

Another variation of the NRE looks like this

Exception occurred with following context:
Compilation: Microsoft.CodeAnalysis.Razor.Compiler
IOperation: Invocation
SyntaxTree: /vmr/src/razor/src/Compiler/Microsoft.CodeAnalysis.Razor.Compiler/src/Language/Syntax/Generated/Syntax.xml.Syntax.Generated.cs
SyntaxNode: GetAnnotations() [InvocationExpressionSyntax]@[6078..6094) (208,30)-(208,46)
System.NullReferenceException: Object reference not set to an instance of an object.
at Microsoft.NetCore.Analyzers.Runtime.DetectPreviewFeatureAnalyzer.GetOperationSymbol(IOperation operation)
at Microsoft.NetCore.Analyzers.Runtime.DetectPreviewFeatureAnalyzer.OperationUsesPreviewFeatures(OperationAnalysisContext context, ConcurrentDictionary`2 requiresPreviewFeaturesSymbols, INamedTypeSymbol previewFeatureAttributeSymbol, ISymbol& referencedPreviewSymbol)
at Microsoft.NetCore.Analyzers.Runtime.DetectPreviewFeatureAnalyzer.<>c__DisplayClass33_0.<Initialize>b__1(OperationAnalysisContext context)
at Microsoft.CodeAnalysis.Diagnostics.AnalyzerExecutor.ExecuteAndCatchIfThrows_NoLock[TArg](DiagnosticAnalyzer analyzer, Action`1 analyze, TArg argument, Nullable`1 info, CancellationToken cancellationToken)

That is basically down to this block of code. That code on its own (no inlining) is very hard to see an NRE on. Suspect that there is some amount of inlining going on here.
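
As a cross-check on which call is being analyzed, the `SyntaxNode:` line in the context block above can be decoded. The sketch below is only a triage aid: the regular expression is inferred from the single line shown in this thread, not from any documented format, and `parse_syntax_node` is a hypothetical helper name.

```python
import re

# Assumed layout, inferred from the one example in this thread:
#   SyntaxNode: <text> [<kind>]@[<start>..<end>) (<line>,<col>)-(<line>,<col>)
SYNTAX_NODE_RE = re.compile(
    r"SyntaxNode: (?P<text>.*?) \[(?P<kind>\w+)\]"
    r"@\[(?P<start>\d+)\.\.(?P<end>\d+)\) "
    r"\((?P<line1>\d+),(?P<col1>\d+)\)-\((?P<line2>\d+),(?P<col2>\d+)\)"
)

NUMERIC_FIELDS = ("start", "end", "line1", "col1", "line2", "col2")


def parse_syntax_node(line):
    """Hypothetical helper: split a 'SyntaxNode:' context line into its fields."""
    match = SYNTAX_NODE_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in NUMERIC_FIELDS:
        info[key] = int(info[key])
    return info


# Context line copied from the trace above.
context_line = (
    "SyntaxNode: GetAnnotations() [InvocationExpressionSyntax]"
    "@[6078..6094) (208,30)-(208,46)"
)

node = parse_syntax_node(context_line)
span_length = node["end"] - node["start"]

# The span length should equal the length of the node text if the
# trace really points at the whole invocation expression.
print(node["kind"], node["text"], span_length, len(node["text"]))
```

Here the span length and the length of the text `GetAnnotations()` agree, which is consistent with the analyzer callback having been invoked for that invocation expression as a whole.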

The invocation being analyzed here is the GetAnnotations() call on this line. That means we should be at this point in the code

        private static ISymbol? GetOperationSymbol(IOperation operation)
            => operation switch
            {
                // EXECUTION SHOULD BE HERE
                IInvocationOperation iOperation => iOperation.TargetMethod,
                IObjectCreationOperation cOperation => cOperation.Constructor,

Basically that should be an InvocationOperation and given that it's sealed and only impl of IInvocationOperation it is likely a candidate for inlining. At the same time the TargetMethod is an auto-implemented property and shouldn't ever null ref itself.

This is a very puzzling one to understand. Seems like another case where we'd need the debug-analyzers flag to get a dump and track it down.

@333fred in case he can see any flaw in my analysis here.

@333fred
Member

333fred commented Jul 22, 2024

That is basically down to this block of code. That code on its own (no inlining) is very hard to see an NRE on.

Agreed. Looking at that block, there shouldn't be an opportunity for a null ref there, everything is null-safe. Even inlining in that location shouldn't result in a null-ref; looking through every property called by that method, directly or indirectly, they're auto-props, and the instances they're called on are checked for null beforehand. The only thing in that closure that isn't an extremely simple auto-prop is the call to arrayTypeSymbol.ElementType, but it's also extremely hard to see where that property would ever null-ref; further, it's not a good candidate for inlining, since there are several implementations. Agreed that we need more data to troubleshoot further.

@jaredpar
Member

Note: the results from the aspnetcore-quarantined-pr pipeline aren't actionable. They don't upload any binlogs when the build fails so we can't see what is happening.

@captainsafia who on aspnetcore owns this pipeline?

@captainsafia
Member

@captainsafia who on aspnetcore owns this pipeline?

AFAIK, the build isn't configured to produce binlogs at the moment (ref). @wtgodbe can help with making a change here to produce binlogs for further investigation.

@jaredpar
Member

Started compiling all of the results from looking at the failures into this gist

@jaredpar
Member

AFAIK, the build isn't configured to produce binlogs at the moment (ref). @wtgodbe can help with making a change here to produce binlogs for further investigation.

Note: the aspnetcore-ci pipeline uploads logs but it overwrites them on retry. That means if we hit any of these failures and then retry, the logs of the failure are effectively deleted. So we can't really get any info from any of the aspnetcore pipelines here.

@jaredpar
Member

Have a gist where I've summarized the different errors I'm seeing. Dug into five of them.

  • Roughly three are NRE where it's very hard to see how there could be an NRE
  • One I can only narrow down to a medium-sized method in roslyn-analyzers. That method is hardened against a lot of nulls, but given the size it's harder to say with confidence what is happening here.
  • One is a cast exception in the compiler. In isolation that feels like a compiler bug. In context I wonder if it could be related to the underlying issue we're seeing.

There are 2-3 other analyzers that are producing AD0001 diagnostics that I haven't bothered to dig into.

I think the next steps are to get the owners of dotnet-unified-build and sdk-unified-build to enable compiler crash on analyzer exception and collection of dump logs so we can get a better idea what is going on here.

@jaredpar
Member

Can we please resolve this as a dupe of dotnet/runtime#104123? That is what the runtime team's investigation led them to. The fix is in PR and will be backported to 9.0 P7.

@akoeplinger
Member Author

the issue just got reopened, I also think we should keep this open for a bit so we have Build Analysis tracking

@omajid
Member

omajid commented Aug 14, 2024

I am also running into this (and some variants) when building .NET 9 Preview 7 using the VMR on a number of arm64 platforms, with 9.0.100-preview.7.24380.2 as the build SDK.

Like I mentioned in dotnet/source-build#4555 I am seeing a number of variants of errors:

  • Analyzer 'Microsoft.Interop.Analyzers.CustomMarshallerAttributeAnalyzer' threw an exception of type 'System.InvalidProgramException' with message 'Common Language Runtime detected an invalid program
  • CSC : error AD0001: Analyzer 'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' threw an exception of type 'System.NullReferenceException' with message 'Object reference not set to an instance of an object.'
  • error MSB6006: "csc.dll" exited with code 139
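
To group variants like the three above when scanning many logs, a small classifier along these lines may help. It is a sketch under stated assumptions: the regular expressions are inferred from the messages quoted in this thread, `classify` is a hypothetical helper, and reading exit code 139 as 128 + signal number follows the usual POSIX shell convention, which is assumed (not confirmed here) to apply to how the `csc.dll` exit code is reported.

```python
import re
import signal
from collections import Counter

# Patterns inferred from the messages quoted in this thread.
AD0001_RE = re.compile(
    r"Analyzer '(?P<analyzer>[^']+)' threw an exception of type "
    r"'(?P<exception>[^']+)'"
)
EXIT_CODE_RE = re.compile(r"exited with code (?P<code>\d+)")


def classify(line):
    """Hypothetical helper: reduce a log line to a short grouping key."""
    match = AD0001_RE.search(line)
    if match:
        return f"{match['analyzer']}: {match['exception']}"
    match = EXIT_CODE_RE.search(line)
    if match:
        code = int(match["code"])
        if code > 128:
            # Assumption: codes above 128 encode 128 + signal number.
            try:
                return f"killed by {signal.Signals(code - 128).name}"
            except ValueError:
                pass
        return f"exit code {code}"
    return "unrecognized"


# The three variants listed above (the first message is truncated in the thread).
variants = [
    "Analyzer 'Microsoft.Interop.Analyzers.CustomMarshallerAttributeAnalyzer' "
    "threw an exception of type 'System.InvalidProgramException' with message "
    "'Common Language Runtime detected an invalid program",
    "CSC : error AD0001: Analyzer "
    "'Microsoft.NetCore.CSharp.Analyzers.Runtime.CSharpDetectPreviewFeatureAnalyzer' "
    "threw an exception of type 'System.NullReferenceException' with message "
    "'Object reference not set to an instance of an object.'",
    'error MSB6006: "csc.dll" exited with code 139',
]

summary = Counter(classify(line) for line in variants)
for key, count in summary.items():
    print(count, key)
```

Under that convention, 139 is 128 + 11, and signal 11 is SIGSEGV on Linux, so the third variant would be a compiler process crash rather than a caught analyzer exception.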

@jaredpar
Member

This is a dupe of dotnet/runtime#104123. I don't have permissions to resolve the issue.

@ellahathaway
Member

dotnet/source-build#4576 - I suspect that SB just encountered this error in one of our 9.0 builds
