
Increase lock granularity for CommandInfo cache #1166

Merged · 3 commits merged into PowerShell:development on Mar 13, 2019

Conversation

@rjmholt (Contributor) commented Mar 5, 2019:

PR Summary

Introduces a concurrent dictionary allowing different caller threads to get CommandInfo objects for different commands without blocking each other.

Alternative implementation of #1162.
Closes #1162

More details:

  • Spins the CommandInfo cache out into its own object, so that the thread-management complexity lives in one place
  • Uses a single concurrency primitive (ConcurrentDictionary<K, V>.GetOrAdd()), leaving very little opportunity to introduce problems later (see the sketch after this list)
  • Does not use tasks
  • Has the same performance characteristics as the task-based implementation
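
For illustration, here is a minimal, self-contained sketch of the GetOrAdd-based caching described above. It is not the PR's actual code: the class and method names, the string key combining command name and command type, and the Get-Command invocation are assumptions made for the example.

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Management.Automation;

internal class CommandInfoCache
{
    // One entry per (command name, command type) pair; PowerShell command names are case-insensitive.
    private readonly ConcurrentDictionary<string, CommandInfo> _cache =
        new ConcurrentDictionary<string, CommandInfo>(StringComparer.OrdinalIgnoreCase);

    public CommandInfo GetCommandInfo(string commandName, CommandTypes? commandType)
    {
        // GetOrAdd is the only concurrency primitive: callers asking for different
        // commands never block each other. Under a race the value factory may run
        // more than once for the same key, but only one result is kept in the cache.
        string key = commandType.HasValue
            ? commandName + "|" + commandType.Value
            : commandName;
        return _cache.GetOrAdd(key, _ => GetCommandInfoInternal(commandName, commandType));
    }

    private static CommandInfo GetCommandInfoInternal(string cmdName, CommandTypes? commandType)
    {
        using (var ps = System.Management.Automation.PowerShell.Create())
        {
            ps.AddCommand("Get-Command")
              .AddParameter("Name", cmdName)
              .AddParameter("ErrorAction", "SilentlyContinue");
            if (commandType.HasValue)
            {
                ps.AddParameter("CommandType", commandType.Value);
            }
            // Get-Command emits CommandInfo objects; null means the command was not found.
            return ps.Invoke<CommandInfo>().FirstOrDefault();
        }
    }
}

Because the dictionary itself handles the locking, no explicit lock around the whole cache is needed, which is where the finer lock granularity in the PR title comes from.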

PR Checklist

return commandInfoCache[key];
}

var commandInfo = GetCommandInfoInternal(cmdletName, commandType);
@bergmeister (Collaborator) commented Mar 5, 2019:


If you look carefully, the name variable is used to create the key, but the cmdletName variable is what gets passed into this function. name and cmdletName might not always be the same, yet only name is passed into the new GetCommandInfo() function. I think you'd have to pass the CommandLookupKey into it instead to preserve the legacy behaviour. We have found it very hard to make any change to this method without breaking something else, which is why it is marked as legacy. Unless you can argue that the two are always the same, I'd be careful changing it (even if tests are green).

@bergmeister (Collaborator) commented:

Thanks for the efforts; I see now what you mean. Yes, it looks more elegant to me; I didn't know one could do caching with lambda expressions this way as well (that's why I originally thought I had to use Task). I'd have preferred it if the changeset had been kept smaller. It looks OK-ish, but I have one comment about trying to preserve the legacy behaviour to reduce the risk of a potential regression.

@rjmholt (Contributor, Author) commented Mar 6, 2019:

> Thanks for the efforts; I see now what you mean. Yes, it looks more elegant to me; I didn't know one could do caching with lambda expressions this way as well (that's why I originally thought I had to use Task). I'd have preferred it if the changeset had been kept smaller. It looks OK-ish, but I have one comment about trying to preserve the legacy behaviour to reduce the risk of a potential regression.

Fair points, I'll undo the unnecessary changes

/// <returns>Returns null if the command does not exist.</returns>
private static CommandInfo GetCommandInfoInternal(string cmdName, CommandTypes? commandType)
{
    using (var ps = System.Management.Automation.PowerShell.Create())
A member commented:

There is a chance to further improve the perf by using a static RunspacePool, instead of creating/disposing a Runspace for every call to this method.
Of course, you will need to decide the size of the pool carefully based on the typical scenarios.

The code would be something like this:

static RunspacePool rps = RunspaceFactory.CreateRunspacePool(minRunspaces: 1, maxRunspaces: 5);
...
private static CommandInfo GetCommandInfoInternal(string cmdName, CommandTypes? commandType)
{
    using (var ps = System.Management.Automation.PowerShell.Create())
    {
        ps.RunspacePool = rps;
        ps.AddCommand("Get-Command").Invoke();
        ...
    }
}

A Collaborator replied:

Yes, I have tried something like this already in the past but could not measure any significant difference in run time. I think this is because all rules run in separate threads, so the rule that takes the longest is the weakest link in the chain and therefore the bottleneck.

@daxian-dbw (Member) replied Mar 13, 2019:

Using a RunspacePool here is mainly for two purposes:

  1. Make this method run faster
    • Runspace.Open is expensive.
    • Module auto-loading and the cache populated during command discovery can be reused.
    • A RunspacePool may introduce some synchronization among threads (when more threads enter this method than there are Runspaces in the pool), but each thread performs only one invocation, so the added synchronization wouldn't be that bad.
  2. Reduce the GC pressure
    • Opening and disposing a Runspace in every thread that enters this method would produce a lot of short-lived objects.

As for how much it improves the overall run time, maybe not much -- measurement speaks the truth :) But it looks like low-hanging fruit to me.
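
To make the suggestion concrete, here is a hedged, self-contained sketch of what a shared static RunspacePool could look like. The pool bounds, the class name PooledCommandLookup, and the Get-Command invocation are illustrative assumptions, not the project's actual code.

using System.Linq;
using System.Management.Automation;
using System.Management.Automation.Runspaces;

internal static class PooledCommandLookup
{
    // Created and opened once for the process, so Runspace.Open is not paid on every lookup.
    private static readonly RunspacePool s_runspacePool = CreatePool();

    private static RunspacePool CreatePool()
    {
        RunspacePool pool = RunspaceFactory.CreateRunspacePool(minRunspaces: 1, maxRunspaces: 5);
        pool.Open();
        return pool;
    }

    private static CommandInfo GetCommandInfoInternal(string cmdName, CommandTypes? commandType)
    {
        using (var ps = System.Management.Automation.PowerShell.Create())
        {
            // Borrow a runspace from the shared pool instead of creating and disposing one per call.
            ps.RunspacePool = s_runspacePool;
            ps.AddCommand("Get-Command")
              .AddParameter("Name", cmdName)
              .AddParameter("ErrorAction", "SilentlyContinue");
            if (commandType.HasValue)
            {
                ps.AddParameter("CommandType", commandType.Value);
            }
            return ps.Invoke<CommandInfo>().FirstOrDefault();
        }
    }
}

If more threads call in than the pool has runspaces, Invoke waits until a runspace is free, which matches the limited synchronization described in point 1 above.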

@bergmeister (Collaborator) left a comment:

Thanks, looks good to me. I am happy to close my PR #1162 in favour of this one.

bergmeister and others added 2 commits March 12, 2019 18:48
Co-Authored-By: rjmholt <rjmholt@gmail.com>
@JamesWTruher merged commit f34d956 into PowerShell:development on Mar 13, 2019
bergmeister pushed a commit to bergmeister/PSScriptAnalyzer that referenced this pull request Mar 22, 2019
* Redo command info cache changes

* Fix comment typo

Co-Authored-By: rjmholt <rjmholt@gmail.com>

* Trigger CI