-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SE: Cache NumberConstraint #7251
Conversation
1f0fb77
to
e26c500
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of concurrent dictionary can probably be improved, as well as ambiguity from lack of verb mitigated or solved.
analyzers/src/SonarAnalyzer.Common/SymbolicExecution/Constraints/NumberConstraint.cs
Outdated
Show resolved
Hide resolved
} | ||
else if (min.HasValue || max.HasValue) | ||
if (min.HasValue && max.HasValue && min.Value > max.Value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity: what is the rationale behind having a swap?
Do we already have scenarios where the caller of From
passes the arguments in reversed order?
Isn't that a problem for the caller of From
, to ensure that it is passing arguments in the right order?
Wouldn't a fail-fast approach be safer? When the min <= max
is broken in the caller, it may reveal bugs at the call site.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to play safe and be sure that Min <= Max
contract of NumberConstraint
is always kept.
var key = new CacheKey(min, max); | ||
return cache.TryGetValue(key, out var constraint) | ||
? constraint | ||
: cache.GetOrAdd(key, new NumberConstraint(min, max)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a concurrency "issue" here: something could happen to cache
between TryGetValue
and GetOrAdd
.
Nothing bad actually, due to the GetOrAdd
getting when somebody has added the value in the meantime.
Regardless, why not using just GetOrAdd
, without TryGetValue
? If you don't want to build a NumberConstraint
every single time, there is an overload of ConcurrentDictionary
for that:
return cache.GetOrAdd(new CacheKey(min, max), key => new NumberConstraint(min, max));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid allocations of the lambda that needs to be passed. These tends to create a lot of objects on the critical paths.
Yes, this can go out of sync a little, by creating a few useless instances. Practically, it's not a problem because it doesn't matter who is the winner instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lambda is de-sugared into a lazy-initialized singleton (e.g. here ).
The only price paid after the first call is checking whether the static readonly
field is initialized or not, which is very little.
Even assuming closure of min
and max
, that would boil down to setting a mutable field on the singleton:
[System.Runtime.CompilerServices.NullableContext(1)]
public void WithoutStatic(ConcurrentDictionary<string, string> cache)
{
<>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
<>c__DisplayClass0_.min = 2;
cache.GetOrAdd("keyVal", new Func<string, string>(<>c__DisplayClass0_.<WithoutStatic>b__0));
}
And we can avoid this, since GetOrAdd
conveniently passes the key as input parameter of the lambda, allowing for:
return cache.GetOrAdd(new CacheKey(min, max), static key => new NumberConstraint(key.Min, key.Max));
And that is even faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still creates new Func<string, string>
that needs to be garbage collected. I think Martin was measuring the count of objects when he was doing the SymbolicValue caching.
public static readonly NumberConstraint One = new(1, 1); | ||
private const int CacheLimit = 100_000; // Measured projects produced around 1000 distinct values. | ||
|
||
private static ConcurrentDictionary<CacheKey, NumberConstraint> cache = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering if we really need a ConcurrentDictionary
.
We read from this cache a lot more than we write, so ConcurrentDictionary
makes sense, since reads are basically lock-free. However, writes require locking which is quite complex since locks are at bucket level.
Moreover, a normal Dictionary
+ object lock will be worse.
So, I was wondering if there is a way to have a lock-free GetOrAdd
and Clear
too, paying the cost of a technically wrong answer and a new instance in the unlikely event that two threads want to add two NumberConstraints
with the same CacheKey
at the same time.
After all, when we pass the CacheLimit
, we build new instances anyway...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory we could use thread-unsafe dictionary. I've never tested how thread-unstable the default collections are so I'll play safe with the synchronized one.
The numbers showed that there's around 1000 values for larger projects. That's a reasonable price to pay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's keep this in mind if we have to improve performance down the line. ConcurrentDictionary
is clever and easy to use, but also very complex internally. There are also insidious things in it, like Keys
returning a new collection at every mutation of the dictionary.
analyzers/src/SonarAnalyzer.Common/SymbolicExecution/Constraints/NumberConstraint.cs
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of concurrent dictionary can probably be improved, as well as ambiguity from lack of verb mitigated or solved.
Kudos, SonarCloud Quality Gate passed! |
Kudos, SonarCloud Quality Gate passed! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The race condition is understood and not an issue, so the PR looks good to me.
var key = new CacheKey(min, max); | ||
return cache.TryGetValue(key, out var constraint) | ||
? constraint | ||
: cache.GetOrAdd(key, new NumberConstraint(min, max)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lambda is de-sugared into a lazy-initialized singleton (e.g. here ).
The only price paid after the first call is checking whether the static readonly
field is initialized or not, which is very little.
Even assuming closure of min
and max
, that would boil down to setting a mutable field on the singleton:
[System.Runtime.CompilerServices.NullableContext(1)]
public void WithoutStatic(ConcurrentDictionary<string, string> cache)
{
<>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
<>c__DisplayClass0_.min = 2;
cache.GetOrAdd("keyVal", new Func<string, string>(<>c__DisplayClass0_.<WithoutStatic>b__0));
}
And we can avoid this, since GetOrAdd
conveniently passes the key as input parameter of the lambda, allowing for:
return cache.GetOrAdd(new CacheKey(min, max), static key => new NumberConstraint(key.Min, key.Max));
And that is even faster.
public static readonly NumberConstraint One = new(1, 1); | ||
private const int CacheLimit = 100_000; // Measured projects produced around 1000 distinct values. | ||
|
||
private static ConcurrentDictionary<CacheKey, NumberConstraint> cache = new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's keep this in mind if we have to improve performance down the line. ConcurrentDictionary
is clever and easy to use, but also very complex internally. There are also insidious things in it, like Keys
returning a new collection at every mutation of the dictionary.
Fixes #7156
Uses similar trick as cache of
SymbolicConstraint
Stats of existing NumberConstraints from build of 3 projects:
NumberConstraint_Akka.txt
NumberConstraint_Runtime.txt
NumberConstraint_WCF.txt