Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DjangoConnectionField slice: use max_limit first, if set #965

Merged
merged 7 commits into from
Jun 6, 2020

Conversation

pcraciunoiu
Copy link
Contributor

@pcraciunoiu pcraciunoiu commented May 19, 2020

I found an interesting performance issue, where if I query without first or last, and there is a setting for RELAY_CONNECTION_MAX_LIMIT, the resolver attempts to pull in the entire queryset.

This is way overkill when there are thousands of items, so it seems wise to use max_limit when available.

I opted for optional kwarg in resolve_connection to maintain backwards compatibility if anyone overwrote this method, but it would be cleaner to require it, since the calling method connection_resolver always has it.

On a query that has multiple nested DjangoConnectionFields, I imagine this would reduce queries and data over the SQL wire by a lot.

@pcraciunoiu pcraciunoiu changed the title Use max_limit first, if set. DjangoConnectionField slice: use max_limit first, if set May 19, 2020
@jkimbo
Copy link
Member

jkimbo commented May 21, 2020

This makes sense @pcraciunoiu but can you fix up the tests and add a new one to assert this new behaviour?

@pcraciunoiu
Copy link
Contributor Author

@jkimbo updated tests to show the changes (basically with and without max limit). And also fixed a bug in my approach.

@pcraciunoiu
Copy link
Contributor Author

pcraciunoiu commented Jun 3, 2020

@jkimbo any thoughts on this?

Apparently before this fix, DjangoConnectionField and DjangoFilterConnectionField did not respect the max limit, and returned all the items in the queryset. I can write a test for that as well, but just FYI... it is a breaking change for people who were relying on this (IMO incorrect) behavior.

This is a backwards incompatible change. I'd be open to putting it behind a flag/setting, e.g.

GRAPHENE = {
    "RELAY_CONNECTION_MAX_LIMIT": 50,
    "RELAY_CONNECTION_FIELD_ENABLE_LIMIT": True,
}

But IMO the max limit should actually be respected by default, as it's used in the constructor:

class DjangoConnectionField(ConnectionField):
    def __init__(self, *args, **kwargs):
        self.on = kwargs.pop("on", False)
        self.max_limit = kwargs.pop(
            "max_limit", graphene_settings.RELAY_CONNECTION_MAX_LIMIT
        )

I have confirmed in my projects that these fields return e.g. 1,000 items, even if my limit is set to the default of 100. And testing against this branch, the limit is respected.

Copy link
Member

@jkimbo jkimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcraciunoiu IMO the incorrect behaviour is a bug so I don't consider this a breaking change. If you could write a test to assert this new behaviour that would be good though. That we can know in the future if it breaks again.

]


def test_should_return_max_limit(graphene_settings):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails without changes to max_limit or ... in resolve_connection:

_____________________________________ test_should_return_max_limit ______________________________________

graphene_settings = <graphene_django.settings.GrapheneSettings object at 0x7fb4b4f54550>

    def test_should_return_max_limit(graphene_settings):
        graphene_settings.RELAY_CONNECTION_MAX_LIMIT = 4
        reporters = [Reporter(**kwargs) for kwargs in REPORTERS]
        Reporter.objects.bulk_create(reporters)
    
        class ReporterType(DjangoObjectType):
            class Meta:
                model = Reporter
                interfaces = (Node,)
    
        class Query(graphene.ObjectType):
            all_reporters = DjangoConnectionField(ReporterType)
    
        schema = graphene.Schema(query=Query)
        query = """
            query AllReporters {
                allReporters {
                    edges {
                        node {
                            id
                        }
                    }
                }
            }
        """
    
        result = schema.execute(query)
        assert not result.errors
>       assert len(result.data["allReporters"]["edges"]) == 4
E       AssertionError: assert 6 == 4
E        +  where 6 = len([{'node': {'id': 'UmVwb3J0ZXJUeXBlOjE='}}, {'node': {'id': 'UmVwb3J0ZXJUeXBlOjI='}}, {'node': {'id': 'UmVwb3J0ZXJUeXBl...': {'id': 'UmVwb3J0ZXJUeXBlOjQ='}}, {'node': {'id': 'UmVwb3J0ZXJUeXBlOjU='}}, {'node': {'id': 'UmVwb3J0ZXJUeXBlOjY='}}])

@pcraciunoiu
Copy link
Contributor Author

@jkimbo updated with a test, take a peek?

@jkimbo jkimbo merged commit c002034 into graphql-python:master Jun 6, 2020
@jkimbo
Copy link
Member

jkimbo commented Jun 6, 2020

Thanks @pcraciunoiu !

@pcraciunoiu
Copy link
Contributor Author

@jkimbo looks like we do need to do a query count to get next page info. I fixed that + another bug (seems that after wasn't working) and added a test for both in #986

@jkimbo
Copy link
Member

jkimbo commented Jun 25, 2020

Released in v2.11.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants