Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issues with in / nin filter operators and EF Core (Cache pollution) #6091

Closed
1 task done
nikolai-mb opened this issue Apr 26, 2023 · 4 comments · Fixed by #6711
Closed
1 task done

Performance issues with in / nin filter operators and EF Core (Cache pollution) #6091

nikolai-mb opened this issue Apr 26, 2023 · 4 comments · Fixed by #6711
Labels
Area: Data Issue is related to filtering, sorting, pagination or projections 🌶️ hot chocolate

Comments

@nikolai-mb
Copy link
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Product

Hot Chocolate

Describe the bug

While trying to reproduce a memory leak we are experiencing in production, i came across some strange performance metrics for queries using in / nin filter operators.

Queries with simple filters such as id: { eq: 1 } versus id: { in: [ 1 ] } have around 10x difference in throughput and allocations. In our case we have queries that use in filter with around 10 integer id's.

I have been able to reproduce the issue using both EF Core SQL Server and EF Core In Memory database providers

Steps to reproduce

Compare performance between a query using a filter based on eq/neq and in/nin. For example: "eq: 1" vs "in: [ 1 ]"

I created a simple repro project for demonstrating the performance difference here: HotChocolateMemoryLeakRepro. I originally believed that this could be the cause of the memory leak we are experiencing, but i have not been able to reproduce any leaks thus far so the name of the repro project is not really accurate.

The repro project is using the In Memory provider and seeds some "CarBrand" entries when the project is started.

The following query can be used in the repro to simulate load:
query { simulateLoad(createMemoryLeak: false) }

Setting the createMemoryLeak parameter to false, will send 100 000 requests with this query:
query { carBrands(where: { id: { eq: 1 } }) { id name } }

Setting the parameter to true, will send the same number of requests with this query:
query { carBrands(where: { id: { in: [ 1 ] } }) { id name } }

During my repro testing i also used dotnet-counters with System.Runtime and Microsoft.EntityFrameworkCore counters

100 000 requests with EQ filter yields these results:

eq-batches
eq-counters

100 000 requests with IN filter yields these results:

in-batches
in-counters

There is a big difference in allocations between the two runs and the EF Core query cache hit rate is 100% for the EQ queries but 0% for the IN queries. This could probably explain the almost 10x allocation and throughput difference if EF Core is compiling a query expression tree for every singe query with IN filter. Not sure if there are any underlying issues that's the root cause for this.

Relevant log output

No response

Additional Context?

No response

Version

13.0.5

@michaelstaib michaelstaib added 🌶️ hot chocolate Area: Data Issue is related to filtering, sorting, pagination or projections labels Apr 27, 2023
@nikolai-mb
Copy link
Contributor Author

After doing some digging, the problem seem to be related to how HC applies the .Contains filter to the Expression in the FilterExpressionBuilder.

Setting the log value for Microsoft.EntityFrameworkCore.Query to "Debug" makes the problem quite obvious, as EF will output a plan compilation entry for every query with a IN or NIN filter.

I ended up rolling my own implementation for the In expression builder which looks like this:

public static class CustomFilterExpressionBuilder
{
    private static readonly ConcurrentDictionary<Type, Type[]> _inGenericTypeMapping = new();
    private static readonly ConcurrentDictionary<Type, MethodInfo> _inGenericMethodMapping = new();

    public static Expression In(
        Expression property,
        Type genericType,
        object? parsedValue)
    {
        var genericMethod = _inGenericMethodMapping.GetOrAdd(genericType, static type =>
            typeof(CustomFilterExpressionBuilder)
                .GetMethod(nameof(InGeneric), BindingFlags.Static | BindingFlags.Public)!
                .MakeGenericMethod(type));

        return (Expression) genericMethod.Invoke(null, new[] {property, genericType, parsedValue})!;
    }
    
    public static Expression InGeneric<T>(
        Expression property,
        Type genericType,
        IEnumerable<T> parsedValue)
    {
        Expression<Func<IEnumerable<T>>> parsedValueLambda = () => parsedValue;
        var parsedValueExpression = parsedValueLambda.Body;

        var typeArguments = _inGenericTypeMapping.GetOrAdd(genericType, static type => new[] {type});

        return Expression.Call(
            typeof(Enumerable),
            nameof(Enumerable.Contains),
            typeArguments,
            parsedValueExpression,
            property);
    }
}

The solution above seems to solve the problem and EF Core now generates property parameterized queries as well as utilizing OpenJSON (EF Core 8 SQL Server feature) correctly.

The EF Core docs for Dynamically-constructed queries states:

Expression API with constant: Dynamically build the expression with the Expression API, using a constant node. This is a frequent mistake when dynamically building expression trees, and causes EF to recompile the query each time it's invoked with a different constant value (it also usually causes plan cache pollution at the database server).

This statement seems to be the root of the issue, as the default HC expression builder for IN/NIN uses a constant expression.

Any thoughts @michaelstaib or @PascalSenn ? I'm not really sure how to proceed from here as this seem to be a EF Core specific issue. I'm fine with rolling my own fix for this, but i would assume many others are experiencing the same performance penalty unknowingly.

@PascalSenn
Copy link
Member

@nikolai-mb thanks for the issue, repro and possible solution!. Do you want to do a PR? Then we see the diff and all our tests run against it too.

@nikolai-mb
Copy link
Contributor Author

Sure! I'll try and get a PR going at some point through the weekend when i find the time.

@nikolai-mb
Copy link
Contributor Author

Took me a bit longer than i expected to get time to dig into this, but i finally got a PR up and going.
Managed to find a much cleaner variant than i initially proposed which now only contains 5 line changes.

Please take a look when appropriate @PascalSenn

@nikolai-mb nikolai-mb changed the title Possible performance issues with in / nin filter operators and EF Core Performance issues with in / nin filter operators and EF Core (Cache pollution) Nov 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Data Issue is related to filtering, sorting, pagination or projections 🌶️ hot chocolate
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants