Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SE: Cache NumberConstraint #7251

Merged
merged 2 commits into from
May 26, 2023
Merged

SE: Cache NumberConstraint #7251

merged 2 commits into from
May 26, 2023

Conversation

pavel-mikula-sonarsource
Copy link
Contributor

@pavel-mikula-sonarsource pavel-mikula-sonarsource commented May 23, 2023

Fixes #7156

Uses similar trick as cache of SymbolicConstraint

Stats of existing NumberConstraints from build of 3 projects:
NumberConstraint_Akka.txt
NumberConstraint_Runtime.txt
NumberConstraint_WCF.txt

Copy link
Contributor

@antonioaversa antonioaversa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of concurrent dictionary can probably be improved, as well as ambiguity from lack of verb mitigated or solved.

}
else if (min.HasValue || max.HasValue)
if (min.HasValue && max.HasValue && min.Value > max.Value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: what is the rationale behind having a swap?
Do we already have scenarios where the caller of From passes the arguments in reversed order?
Isn't that a problem for the caller of From, to ensure that it is passing arguments in the right order?
Wouldn't a fail-fast approach be safer? When the min <= max is broken in the caller, it may reveal bugs at the call site.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to play safe and be sure that Min <= Max contract of NumberConstraint is always kept.

var key = new CacheKey(min, max);
return cache.TryGetValue(key, out var constraint)
? constraint
: cache.GetOrAdd(key, new NumberConstraint(min, max));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a concurrency "issue" here: something could happen to cache between TryGetValue and GetOrAdd.
Nothing bad actually, due to the GetOrAdd getting when somebody has added the value in the meantime.
Regardless, why not using just GetOrAdd, without TryGetValue? If you don't want to build a NumberConstraint every single time, there is an overload of ConcurrentDictionary for that:

return cache.GetOrAdd(new CacheKey(min, max), key => new NumberConstraint(min, max));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid allocations of the lambda that needs to be passed. These tends to create a lot of objects on the critical paths.

Yes, this can go out of sync a little, by creating a few useless instances. Practically, it's not a problem because it doesn't matter who is the winner instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambda is de-sugared into a lazy-initialized singleton (e.g. here ).

The only price paid after the first call is checking whether the static readonly field is initialized or not, which is very little.

Even assuming closure of min and max, that would boil down to setting a mutable field on the singleton:

    [System.Runtime.CompilerServices.NullableContext(1)]
    public void WithoutStatic(ConcurrentDictionary<string, string> cache)
    {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
        <>c__DisplayClass0_.min = 2;
        cache.GetOrAdd("keyVal", new Func<string, string>(<>c__DisplayClass0_.<WithoutStatic>b__0));
    }

And we can avoid this, since GetOrAdd conveniently passes the key as input parameter of the lambda, allowing for:

return cache.GetOrAdd(new CacheKey(min, max), static key => new NumberConstraint(key.Min, key.Max));

And that is even faster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still creates new Func<string, string> that needs to be garbage collected. I think Martin was measuring the count of objects when he was doing the SymbolicValue caching.

public static readonly NumberConstraint One = new(1, 1);
private const int CacheLimit = 100_000; // Measured projects produced around 1000 distinct values.

private static ConcurrentDictionary<CacheKey, NumberConstraint> cache = new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if we really need a ConcurrentDictionary.
We read from this cache a lot more than we write, so ConcurrentDictionary makes sense, since reads are basically lock-free. However, writes require locking which is quite complex since locks are at bucket level.
Moreover, a normal Dictionary + object lock will be worse.

So, I was wondering if there is a way to have a lock-free GetOrAdd and Clear too, paying the cost of a technically wrong answer and a new instance in the unlikely event that two threads want to add two NumberConstraints with the same CacheKey at the same time.

After all, when we pass the CacheLimit, we build new instances anyway...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we could use thread-unsafe dictionary. I've never tested how thread-unstable the default collections are so I'll play safe with the synchronized one.

The numbers showed that there's around 1000 values for larger projects. That's a reasonable price to pay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's keep this in mind if we have to improve performance down the line. ConcurrentDictionary is clever and easy to use, but also very complex internally. There are also insidious things in it, like Keys returning a new collection at every mutation of the dictionary.

Copy link
Contributor

@antonioaversa antonioaversa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of concurrent dictionary can probably be improved, as well as ambiguity from lack of verb mitigated or solved.

@sonarcloud
Copy link

sonarcloud bot commented May 25, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@sonarcloud
Copy link

sonarcloud bot commented May 25, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

100.0% 100.0% Coverage
0.0% 0.0% Duplication

Copy link
Contributor

@antonioaversa antonioaversa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The race condition is understood and not an issue, so the PR looks good to me.

var key = new CacheKey(min, max);
return cache.TryGetValue(key, out var constraint)
? constraint
: cache.GetOrAdd(key, new NumberConstraint(min, max));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambda is de-sugared into a lazy-initialized singleton (e.g. here ).

The only price paid after the first call is checking whether the static readonly field is initialized or not, which is very little.

Even assuming closure of min and max, that would boil down to setting a mutable field on the singleton:

    [System.Runtime.CompilerServices.NullableContext(1)]
    public void WithoutStatic(ConcurrentDictionary<string, string> cache)
    {
        <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
        <>c__DisplayClass0_.min = 2;
        cache.GetOrAdd("keyVal", new Func<string, string>(<>c__DisplayClass0_.<WithoutStatic>b__0));
    }

And we can avoid this, since GetOrAdd conveniently passes the key as input parameter of the lambda, allowing for:

return cache.GetOrAdd(new CacheKey(min, max), static key => new NumberConstraint(key.Min, key.Max));

And that is even faster.

public static readonly NumberConstraint One = new(1, 1);
private const int CacheLimit = 100_000; // Measured projects produced around 1000 distinct values.

private static ConcurrentDictionary<CacheKey, NumberConstraint> cache = new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's keep this in mind if we have to improve performance down the line. ConcurrentDictionary is clever and easy to use, but also very complex internally. There are also insidious things in it, like Keys returning a new collection at every mutation of the dictionary.

@pavel-mikula-sonarsource pavel-mikula-sonarsource merged commit 62a8860 into master May 26, 2023
@pavel-mikula-sonarsource pavel-mikula-sonarsource deleted the Pavel/Cache branch May 26, 2023 06:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SE: Cache NumberConstraint
2 participants