Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache ShouldAnalyzeTree results for better performance #8407

Closed
wants to merge 2 commits into from

Conversation

sebastien-marichal
Copy link
Contributor

@sebastien-marichal sebastien-marichal commented Nov 24, 2023

Fixes #8406

Solution using Tuple to store the latest SyntaxTree within the RegisterNodeAction of CompilationStartActionContext.

On oqtane framework, before changes:
image

After changes:
image

From RegisterNodeAction point of view:
Before:
image
After:
image

@@ -47,6 +47,9 @@ public abstract class SonarAnalysisContextBase<TContext> : SonarAnalysisContextB
public SonarAnalysisContext AnalysisContext { get; }
public TContext Context { get; }

private readonly ConcurrentDictionary<SyntaxTree, bool> shouldAnalyzeTreeCache = new();
private bool? shouldAnalyzeNullTreeCache = null;
Copy link
Contributor Author

@sebastien-marichal sebastien-marichal Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I use this single bool to cache the result when the SyntaxTree is null as I cannot store null as a key in the ConcurrentDictionary.

|| sonarLintXml.AnalyzeGeneratedCode(Compilation.Language)
|| !tree.IsConsideredGenerated(generatedCodeRecognizer, Compilation, IsRazorAnalysisEnabled()))
&& (tree is null || (!IsUnchanged(tree) && !IsExcluded(sonarLintXml, tree.FilePath)));
public bool ShouldAnalyzeTree(SyntaxTree tree, GeneratedCodeRecognizer generatedCodeRecognizer)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you open a second PR with this change in SonarCompilationStartAnalysisContext

    public void RegisterNodeAction<TSyntaxKind>(GeneratedCodeRecognizer generatedCodeRecognizer, Action<SonarSyntaxNodeReportingContext> action, params TSyntaxKind[] syntaxKinds)
        where TSyntaxKind : struct
    {
        if (HasMatchingScope(AnalysisContext.SupportedDiagnostics))
        {
            var lastShouldAnalyze = default(Tuple<SyntaxTree, bool>);
            Context.RegisterSyntaxNodeAction(x =>
            {
                var tree = x.Node.SyntaxTree;
                var last = lastShouldAnalyze; // Make a local copy of the reference to avoid concurrency issues between the access of Item1 and Item2
                bool shouldAnalyze;
                if (ReferenceEquals(last?.Item1, tree))
                {
                    shouldAnalyze = last!.Item2;
                }
                else
                {
                    // Inlined from "Execute"
                    shouldAnalyze = ShouldAnalyzeTree(x.Node.SyntaxTree, generatedCodeRecognizer)
                        && SonarAnalysisContext.LegacyIsRegisteredActionEnabled(AnalysisContext.SupportedDiagnostics, x.Node.SyntaxTree)
                        && AnalysisContext.ShouldAnalyzeRazorFile(x.Node.SyntaxTree);
                    lastShouldAnalyze = new Tuple<SyntaxTree, bool>(tree, shouldAnalyze);
                }
                if (shouldAnalyze)
                {
                    action(new(AnalysisContext, x));
                }
            }, syntaxKinds);
        }
    }

It would be interesting to see the savings here, as this doesn't come with the overhead of the concurrent dictionaries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this approach is not as good as caching in a dictionary.
image
But that's only from ShouldAnalyzeTree point of view.

From SonarCompilationStartAnalysisContext.RegisterNodeAction point of view, it is faster:

Using your solution, with a tuple:
image
My solution, with a ConcurrentDisctionary:
image

For reference, here it is before any changes (of this PR):
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right in assuming that this is guaranteed that within the same compilation, the SyntaxTrees will be analyzed sequentially?
Once lastShouldAnalyze will change that means analysis on the last SyntaxTree is done?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please open two (draft) PRs so we can compare both solutions.

Am I right in assuming that this is guaranteed that within the same compilation, the SyntaxTrees will be analyzed sequentially?

There is no such guarantee to rely on:

There are no ordering constraints for the invocations of any actions, except those introduced by start and end actions (compilation start, compilation end, symbol start, symbol end, operation block start, operation block end, code bock start, and code block end actions). Invocations of actions of one kind may be interleaved with invocations of actions of another kind. A given host environment might appear to provide a predictable order—for example, invoking syntax tree actions before semantic model actions, or symbol actions before code block actions, but depending on such an ordering is not safe.

Source

It was a shot in the blue by me. We can make an educated guess now for the analyzer runner that is in place in the build context. That runner is likely different from the one running in the IDE context. Your solution is more robust but comes at the cost of:

  • Locking on write (which is somewhat common)
  • Slow on a large number of SyntaxTree per compilation
  • Holds references to all SyntaxTrees at the end of the compilation, even though only a small portion is still needed
  • Does not work well when one of the additional files changes (namely SonarLint.xml).

We may need to come up with a compromise between only covering "latest" (mine) and "all" (yours). With the following characteristics

At the same time, we do not have the time to over-engineer it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will keep this one PR for the tuple solution.
I opened #8411 with the ConcurrentDirectory solution

@@ -47,6 +47,9 @@ public abstract class SonarAnalysisContextBase<TContext> : SonarAnalysisContextB
public SonarAnalysisContext AnalysisContext { get; }
public TContext Context { get; }

private readonly ConcurrentDictionary<SyntaxTree, bool> shouldAnalyzeTreeCache = new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of a Dictionary that holds references to SyntaxTrees on an object, that may outlive the compilation. See also https://github.com/dotnet/roslyn-analyzers/blob/main/src/Microsoft.CodeAnalysis.Analyzers/Microsoft.CodeAnalysis.Analyzers.md#rs1008-avoid-storing-per-compilation-data-into-the-fields-of-a-diagnostic-analyzer for why this isn't a good idea.

Let's keep this PR as a reference, but we should try what I proposed here as well.

Copy link

sonarcloud bot commented Nov 24, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

Copy link

sonarcloud bot commented Nov 24, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

100.0% 100.0% Coverage
0.0% 0.0% Duplication

@sebastien-marichal
Copy link
Contributor Author

Closing this PR in favor of #8414.
The scoped tuple solution tends to have cache misses due to the SyntaxTrees being analyzed concurrently.

@sebastien-marichal sebastien-marichal deleted the sebastien/shouldanalyzetree branch March 6, 2024 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cache RegisterNodeAction checks result per SyntaxTree
2 participants