
Simplify the semantic model caching scheme we use to reuse semantic models during typing. #45565

Merged
merged 52 commits into from
Jul 7, 2020

Conversation

CyrusNajmabadi
Member

@CyrusNajmabadi CyrusNajmabadi commented Jun 30, 2020

This PR will be easier to review one commit at a time.

--

The new model is drastically simpler than the old one, and bases its implementation on the observation that the majority of typing tends to be localized. i.e. users do not type one char, then jump elsewhere and type another thing, then jump elsewhere and type yet another thing**. Instead, they will have a batch of typing in one location, will routinely edit within a single method body, and will eventually go elsewhere to do their next bit of work.

Given that, the semantic model cache we have is simple: we start by retrieving a real semantic model S for a document at time T. That model is used normally. When an edit happens, and all the edits that happened between T and now were intra-method edits, we use S to retrieve a speculative model S' that can be used to answer the questions that typing features need.

In a similar vein, because many features (like all the completion providers) will light up at the same time asking for the same semantic model, we cache S' as well, returning it if no edits have occurred.


--

** Note: that jumping/typing pattern does occur when cycling through search results and potentially replacing matches. However, even for that case, we should be no worse off than we were before.
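In rough terms, the scheme described above could be sketched like this (a simplified illustration only, with invented type and member names — not the PR's actual code; the real service is per-document, language-aware, and must also handle cancellation and concurrent callers):

```csharp
// Sketch only: SemanticModelReuseSketch, ReuseInfo, GetModelForBodyAsync and
// CreateSpeculativeModel are invented names for illustration.
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;

internal sealed class SemanticModelReuseSketch
{
    // Cache entry: the real model S primed at time T, the top-level
    // (non-method-body) semantic version of the project at that time, the
    // body being edited, and the speculative model S' shared by features.
    private sealed record ReuseInfo(
        VersionStamp TopLevelVersion,
        SemanticModel RealModel,
        SyntaxNode? BodyNode,
        SemanticModel? SpeculativeModel);

    private ReuseInfo? _info;

    public async Task<SemanticModel> GetModelForBodyAsync(
        Document document, SyntaxNode body, CancellationToken cancellationToken)
    {
        // Any non-intra-method edit changes the top-level semantic version
        // and invalidates the cached entry.
        var version = await document.Project
            .GetDependentSemanticVersionAsync(cancellationToken).ConfigureAwait(false);

        var info = _info;
        if (info is null || info.TopLevelVersion != version)
        {
            // Prime the cache with a real model S.
            var realModel = await document
                .GetSemanticModelAsync(cancellationToken).ConfigureAwait(false);
            _info = new ReuseInfo(version, realModel!, BodyNode: null, SpeculativeModel: null);
            return realModel!;
        }

        // Many features ask at the same time; if no further edits occurred,
        // hand every caller the same cached S'.
        if (info.SpeculativeModel is not null && info.BodyNode == body)
            return info.SpeculativeModel;

        // Only intra-method edits since T: derive speculative S' from S
        // (done by a language-specific service in the real implementation,
        // e.g. via TryGetSpeculativeSemanticModel).
        var speculative = CreateSpeculativeModel(info.RealModel, body);
        _info = info with { BodyNode = body, SpeculativeModel = speculative };
        return speculative;
    }

    private static SemanticModel CreateSpeculativeModel(SemanticModel model, SyntaxNode body)
        => throw new NotImplementedException(); // language-specific in the real code
}
```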

@CyrusNajmabadi CyrusNajmabadi force-pushed the semanticModelWork branch 2 times, most recently from a4fa12f to 27baafe Compare July 1, 2020 00:37
@CyrusNajmabadi CyrusNajmabadi changed the title Semantic model work Simplify the semantic model caching scheme we use to reuse semantic models during typing. Jul 1, 2020
@@ -9,13 +9,13 @@
using Microsoft.CodeAnalysis.Host.Mef;
using Microsoft.CodeAnalysis.Shared.Extensions;

-namespace Microsoft.CodeAnalysis.SemanticModelWorkspaceService
+namespace Microsoft.CodeAnalysis.SemanticModelReuse
Member Author

renamed namespace and types to be clearer about what purpose they serve.

@CyrusNajmabadi CyrusNajmabadi (Member Author) left a comment

c

@@ -978,71 +979,6 @@ public bool ContainsInMemberBody(SyntaxNode node, TextSpan span)
private static TextSpan GetBlockBodySpan(BlockSyntax body)
=> TextSpan.FromBounds(body.OpenBraceToken.Span.End, body.CloseBraceToken.SpanStart);

public int GetMethodLevelMemberId(SyntaxNode root, SyntaxNode node)
Member Author

these members were errantly placed in ISyntaxFacts. i moved them to a dedicated service for semantic model reuse, as that's all they're intended for.

Member

I'll never complain about the number of methods of SyntaxFacts going down!

@CyrusNajmabadi CyrusNajmabadi marked this pull request as ready for review July 1, 2020 00:59
@CyrusNajmabadi CyrusNajmabadi requested a review from a team as a code owner July 1, 2020 00:59
namespace Microsoft.CodeAnalysis.CSharp.SemanticModelReuse
{
[ExportLanguageService(typeof(ISemanticModelReuseLanguageService), LanguageNames.CSharp), Shared]
internal class CSharpSemanticModelReuseLanguageService : ISemanticModelReuseLanguageService
Member Author

note: there is a workspace-service and a lang-service. the former is what is used for the basic semantic-model caching. The latter is what is used for doing VB/C#-specific work in that process.

{
internal static partial class SyntaxTreeExtensions
{
public static bool IsPreProcessorDirectiveContext(this SyntaxTree syntaxTree, int position, SyntaxToken preProcessorTokenOnLeftOfPosition, CancellationToken cancellationToken)
Member Author

this is just a move.

@@ -81,6 +81,7 @@ internal partial interface ISyntaxFacts
/// preprocessor directive. For example `if` or `pragma`.
/// </summary>
bool IsPreprocessorKeyword(SyntaxToken token);
bool IsPreProcessorDirectiveContext(SyntaxTree syntaxTree, int position, CancellationToken cancellationToken);
Member Author

moved this from ISemanticFacts. This is not a semantic fact; it's a syntax fact.

Member Author

also, the inconsistency between Preprocessor and PreProcessor bugs me. but i'm not changing that now.

@CyrusNajmabadi
Member Author

@jasonmalinowski ptal. thanks!

@jasonmalinowski jasonmalinowski (Member) left a comment

Looks good overall, but I still have one general concern about the perf impact of this. This assumes that some feature already primed the cache with the non-speculative model, but is that actually happening in practice? Consider:

  1. Open a file.
  2. We produce a semantic model for classification since it runs. But classification doesn't prime the cache since it isn't calling into this.
  3. First keystroke happens in a method. Completion is calling this, but this is the call that actually primes the cache. As a side effect, we produced a new semantic model from scratch for this request.
  4. Second keystroke happens. Classification and everything else is still using the full model, and completion is just filtering our existing data. It's not until this commits and a second completion comes up that we get a benefit.

What might be useful even if this isn't a concern: have callers to this API pass in their feature name, and just record a per-feature cache hit/miss rate. Then it'd be easy to see whether this is actually working at all, and if so, which feature is actually getting the win. I could totally imagine it might awkwardly turn out that completion is still saddled with the full cost of having to make a new semantic model, but hey, signature help (which isn't as critical since it's not on a blocking path) gets the benefit.
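The per-feature hit/miss idea could be as small as something like the following (a hypothetical sketch with invented names; a real change would presumably route through Roslyn's existing telemetry/logging infrastructure instead of a standalone dictionary):

```csharp
// Hypothetical sketch of per-feature cache hit/miss counters (invented names).
using System.Collections.Concurrent;

internal static class ReuseCacheTelemetrySketch
{
    private static readonly ConcurrentDictionary<string, (int Hits, int Misses)> s_counts = new();

    // Callers would pass their feature name, e.g. "Completion" or "SignatureHelp",
    // and whether the reuse cache satisfied their request.
    public static void Record(string featureName, bool cacheHit)
        => s_counts.AddOrUpdate(
            featureName,
            addValueFactory: _ => cacheHit ? (1, 0) : (0, 1),
            updateValueFactory: (_, counts) => cacheHit
                ? (counts.Hits + 1, counts.Misses)
                : (counts.Hits, counts.Misses + 1));
}
```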

{
public Task<SemanticModel> GetSemanticModelForNodeAsync(Document document, SyntaxNode node, CancellationToken cancellationToken = default)
public SemanticModelReuseWorkspaceService(Workspace _)
Member

Why the constructor that isn't doing anything with this?

Member

(If I'm guessing it's because you want to imply this service should be per workspace due to event subscriptions but that probably just deserves a comment than me guessing.)

Member Author

basically, we have multiple impls of this: one for the codefix layer, and one for the other layers. the shared code just unilaterally calls new SemanticModelReuseWorkspaceService(workspace). So i need the codefix impl to take in a workspace, even if it doesn't use it.

return null;
}

return previousAccessors[currentAccessors.IndexOf(currentAccessor)];
Member

Potential indexing violation here if, for some reason, GetAccessors didn't return a set of accessors that contained currentAccessor. Not sure if there are some icky broken-code cases, but hopefully we don't get this far?

Member Author

there should be no indexing violation. First, we guarantee we have the same count of accessors. Second, we're getting the index of currentAccessor within currentAccessors. The second part must succeed. i.e. it would/should be impossible for an accessor not to be within the accessor list that contains it.

The debug-asserts are for parts of this code that i don't have full confidence in. i.e. i'm not 100% certain that if the "top level version" stays the same, all these correspondences hold. But i am 100% certain that an accessor must be in its containing accessor list, or else the wheels have fallen off the wagon entirely.
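The invariant being argued here can be made concrete with a small sketch (hypothetical surrounding shape; the only assumption carried over from the discussion is that the caller has already verified both lists have the same count):

```csharp
// Sketch of the lookup under discussion (invented method/parameter names).
using System.Collections.Generic;
using System.Diagnostics;
using Microsoft.CodeAnalysis;

static SyntaxNode MapToPreviousAccessor(
    List<SyntaxNode> previousAccessors,
    List<SyntaxNode> currentAccessors,
    SyntaxNode currentAccessor)
{
    // Precondition (checked earlier in the real code): same accessor count.
    Debug.Assert(previousAccessors.Count == currentAccessors.Count);

    // currentAccessor was taken out of currentAccessors, so IndexOf cannot
    // return -1 unless the tree is fundamentally broken; the positional
    // mapping into previousAccessors is therefore safe.
    var index = currentAccessors.IndexOf(currentAccessor);
    Debug.Assert(index >= 0);

    return previousAccessors[index];
}
```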

SyntaxNode? TryGetContainingMethodBodyForSpeculation(SyntaxNode node);

/// <summary>
/// Given a previous semantic model, and a method-eque node in the current tree for that same document, attempts
Member

Suggested change
/// Given a previous semantic model, and a method-eque node in the current tree for that same document, attempts
/// Given a previous semantic model, and a method-esque node in the current tree for that same document, attempts

return;

var solution = e.NewSolution;
foreach (var (docId, _) in map)
Member

Just do map.Keys? Or is this actually faster for some reason?

Member Author

no. you're correct. i've been coding in go a little too much recently :-/

{
if (!solution.ContainsDocument(docId))
{
_semanticModelMap = ImmutableDictionary<DocumentId, SemanticModelReuseInfo?>.Empty;
Member

I think this code is great, but it obviously looks like a terrifying race since this could potentially run well after a new solution has been added, and somebody has called ReuseExistingSpeculativeModelAsync with a document from that new solution. Given this is a cache and that's really rare, it's fine, but may be worth a comment.

Member

(and to be 100% clear I wouldn't try to address that race in any way, but this does stick out like somebody forgot to use an Interlocked...)

Member Author

i will add a comment.

var document = CreateDocument("", LanguageNames.CSharp);

// trying to get a model for null should return a non-speculative model
var model = await document.ReuseExistingSpeculativeModelAsync(null, CancellationToken.None);
Member

Nit: named parameter. (Or add @akhera99's feature to GitHub, whichever is easier.)

Assert.False(model2.IsSpeculativeSemanticModel);

// Should be the same models.
Assert.Equal(model1, model2);
Member

There's Assert.Same() if you want to explicitly state object instances versus any (potential) Equals override.


// This should be able to get a speculative model using the original model we primed the cache with.
var model2 = await document2.ReuseExistingSpeculativeModelAsync(source.IndexOf("return"), CancellationToken.None);
Assert.True(model2.IsSpeculativeSemanticModel);
Member

There is ParentModel if you want to assert it's parented by the prior one, although I can imagine that might be a bit too much "asserting implementation details" if you want to stay away from that.

Member

Also worth adding an assert that model2.SyntaxTree.ToString() gets you the in-body-edited text back, just to make sure we didn't make a speculative model that's somehow speculating on the wrong thing?


// This should be able to get a speculative model using the original model we primed the cache with.
var model3 = await document3.ReuseExistingSpeculativeModelAsync(source.IndexOf("return"), CancellationToken.None);
Assert.True(model3.IsSpeculativeSemanticModel);
Member

Ditto: ParentModel assertion? (Answer can also be "no" if you'd like.)

var tree1 = await document.GetSyntaxTreeAsync();
var basemethod1 = tree1.FindTokenOnLeftOfPosition(position, CancellationToken.None).GetAncestor<CSharp.Syntax.BaseMethodDeclarationSyntax>();

// Ensure we prime the reuse cache with the true semantic model.
Member

Will raise this concern back on the main review thread.

@CyrusNajmabadi
Member Author

Consider:

This analysis is spot on. However, it focuses overly on just the first edit in a file/body. For that edit, it is correct that you pay the full initial cost of getting that primed semantic model. But importantly, from that point on, as long as you are editing within that body (and not doing anything to throw off the top-level-version info), that primed semantic model will be used. I typed out several full statements in a method body and got a 100% hit rate on that semantic model for all edits post-priming. This is the value here: we pay the price for the first semantic model (which we're always going to pay, because at least someone is going to ask for semantics), but then get to reuse it for a run of edits, which matches how people actually write code.

@CyrusNajmabadi CyrusNajmabadi merged commit 840dc9b into dotnet:master Jul 7, 2020
@ghost ghost added this to the Next milestone Jul 7, 2020
@CyrusNajmabadi CyrusNajmabadi deleted the semanticModelWork branch July 7, 2020 19:01
@CyrusNajmabadi
Copy link
Member Author

Thank you for the deep and thorough review @jasonmalinowski !
