Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert field resolver logic to fix poor query performance #1401

Conversation

sjdemartini
Copy link
Collaborator

@sjdemartini sjdemartini commented Apr 29, 2023

This reverts the change to convert_field_to_djangomodel introduced in #1315 for the reasons discussed here
https://github.com/graphql-python/graphene-django/pull/1315/files#r1015659857. As mentioned there, without reverting this code, "queries are forced every time an object is resolved, making an exponential number of queries when nesting without any possibility of optimizing".

That regression prevented graphene-django-optimizer from working with graphene-django v3.0.0b9+ (where this change first was published), as discussed in
#1356 (comment), tfoxy/graphene-django-optimizer#86, and tfoxy/graphene-django-optimizer#83 (comment).

For now, this marks the two tests that depended on this problematic code as "expected to fail", and perhaps they can be reintroduced if there's a way to support this logic in a way that does not prevent select_related and prefetch_related query-optimization and introduce nested N+1s.

As mentioned here #1315 (comment), the query performance regression in graphene-django is blocking upgrade to graphene-django v3 for many users, and fixing this would allow many to begin upgrading and contributing to keep graphene-django going.

Thanks again @firaskafri for pushing graphene-django along, and I appreciate your consideration in merging/publishing this! 🙏


I have tested this change with graphene-django-optimizer, and it should resolve the v3.0.0+ compatibility question/issue for that library mentioned here #1356 and filed here tfoxy/graphene-django-optimizer#86.

This reverts the change to `convert_field_to_djangomodel` introduced in
graphql-python#1315 for the
reasons discussed here
https://github.com/graphql-python/graphene-django/pull/1315/files#r1015659857.
As mentioned there, without reverting this code, "queries are forced
every time an object is resolved, making an exponential number of
queries when nesting without any possibility of optimizing".

That regression prevented `graphene-django-optimizer` from working with
`graphene-django` v3.0.0b9+ (where this change first was published), as
discussed in
graphql-python#1356 (comment),
tfoxy/graphene-django-optimizer#86, and
tfoxy/graphene-django-optimizer#83 (comment).

For now, this marks the two tests that depended on this problematic code
as "expected to fail", and perhaps they can be reintroduced if there's a
way to support this logic in a way that does not prevent
`select_related` and `prefetch_related` query-optimization and introduce
nested N+1s.

As mentioned here
graphql-python#1315 (comment),
this is blocking upgrade to graphene-django v3 for many users, and
fixing this would allow many to begin upgrading and contributing to keep
graphene-django going.
@jaw9c
Copy link
Collaborator

jaw9c commented Apr 30, 2023

Thanks for taking the time to look into this @sjdemartini! It's great to see more people active here after a while :)

It's a bit worrying to remove/merge given those tests failing, if people are relying on using the get_queryset method as an authorization layer, this will cause them issues.

@sjdemartini
Copy link
Collaborator Author

sjdemartini commented May 1, 2023

Thanks for taking a look and commenting @jaw9c! Agreed it's nice to see graphene-django coming back!

I understand the concern, but I think this recent regression is a larger problem. Here are some thoughts:

  • This get_queryset implementation causes exponentially many SQL queries when you are using nested objects. Seeing as nested object queries are a primary benefit of GraphQL, the implementation seems to be an untenable way for the graphene-django package to behave, particularly if there's no workaround to optimize queries.
  • Based on the discussions I linked in the PR description, this change has been problematic for several people and is blocking their upgrades to v3.
  • This implementation was not historically the behavior and was only introduced in v3.0.0b9.
    • Also, based on PePy stats, there are still significantly more downloads of v2 compared to v3.0.0b9+, so more users are relying on behavior prior to this change.
    • (For what it's worth, the original issue filed by the creator of the PR had no upvotes and only one other user's comment Inconsistency in the way ForeignKeys resolutions are treated #1111, so I wouldn't suspect it to be a widely desired approach. That author also said "I have a quick fix for this problem but I think it degrades performances by doing an additional query to the database...I would love to get some feedback as it seems pretty hacky."—And it's not merely an additional query, but an additional query per nested object, i.e. N+1s at each level as it fans out.)
  • There are also almost half as many monthly downloads of graphene-django-optimizer as graphene-django v3, and it's presumably the most popular third party graphene-django package. But that package is fully broken by this resolver code and as such is incompatible with v3.0.0b9+, with no recourse to resolve (no pun intended). (So all of those users must be blocked by this code, and none of them are relying on it for authorization.)
  • Merging this and releasing as a minor version change with clear warnings in the release notes would be sufficient. Anyone who happened to need this functionality could stay on >=3.0.0b9, <3.1.0 for now, and/or could then put forward an alternative approach that does not cause dangerous N+1 SQL query patterns.
  • graphene-django-permissions could be an alternative approach for those who wish to enforce authorization.
    • Coincidentally I published this package for graphene-django model/object authorization last year. It's designed specifically to work with graphene-django-optimizer or alternative select_related/prefetch_related queries, without degrading query performance. (I actually even mentioned in the README why relying on get_queryset is not a feasible approach in terms of performance, and that predates the get_queryset logic I'm reverting here.)

Thanks again for discussing!

@sjdemartini
Copy link
Collaborator Author

btw I just noticed there was an attempt to improve the performance of this code in #1391, which is maybe a good starting place to explore this later, but I tested that code locally with graphene-django-optimizer and it still causes 8 performance-related test failures in that project unfortunately (same as without that PR).

@firaskafri
Copy link
Collaborator

Hey guys, have you tested #1391?

@mahdyhamad
Copy link
Collaborator

@sjdemartini Thank you for taking the time to investigate more about this issue.

I have worked on this PR #1391 as a work around to reduce DB queries from 2 queries to 1 each time a FK/1:1 field is resolved in an attempt to enhance the performance a little while keeping the functionality proposed in #1315.

Regarding graphene-django-optimizer, I did not take it into consideration. After investigation, you are correct that the fix proposed in the PR does not work with graphene-django-optimizer as the field will be resolved independently and will not take advantage of select_related/prefetch_related.

Copy link
Collaborator

@mahdyhamad mahdyhamad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still tests (skipped) testing non-exiting functionality that should be removed

@sjdemartini
Copy link
Collaborator Author

Thanks for the additional discussion!

There are still tests (skipped) testing non-exiting functionality that should be removed

@mahdyhamad I've gone ahead and deleted the two tests I'd marked with @pytest.mark.xfail. If you meant something different, let me know, and I'll update whichever way you think is cleanest.

@firaskafri firaskafri self-requested a review May 1, 2023 18:08
@mahdyhamad
Copy link
Collaborator

mahdyhamad commented May 2, 2023

Sounds as a good idea to keep some temporary documentation of this discussion 💯
But I think it should be removed later to keep the tests clean

Thank you @sjdemartini

@firaskafri
Copy link
Collaborator

@jaw9c any thoughts? I think this is good to, and we can address this in the release notes to avoid breaking code.

@sjdemartini
Copy link
Collaborator Author

Thanks for the stamp firaskafri!

I've also just added a new unit-test that validates select_related and prefetch_related query-optimizations are respected in resolver code, which should help ensure that if/when we add back functionality like the CustomField resolver being reverted here, it is still performant.

For instance, adding back the resolver code now gives a test failure showing:

Failed: Expected to perform 2 queries but 11 were done

Queries:
========
SELECT "tests_article"."id", "tests_article"."headline", "tests_article"."pub_date", "tests_article"."pub_date_time", "tests_article"."reporter_id", "tests_article"."editor_id", "tests_article"."lang", "tests_article"."importance", "tests_reporter"."id", "tests_reporter"."first_name", "tests_reporter"."last_name", "tests_reporter"."email", "tests_reporter"."a_choice", "tests_reporter"."reporter_type", T3."id", T3."first_name", T3."last_name", T3."email", T3."a_choice", T3."reporter_type" FROM "tests_article" INNER JOIN "tests_reporter" ON ("tests_article"."reporter_id" = "tests_reporter"."id") INNER JOIN "tests_reporter" T3 ON ("tests_article"."editor_id" = T3."id") ORDER BY "tests_article"."headline" ASC
SELECT ("tests_film_reporters"."reporter_id") AS "_prefetch_related_val_reporter_id", "tests_film"."id", "tests_film"."genre", "tests_filmdetails"."id", "tests_filmdetails"."location", "tests_filmdetails"."film_id" FROM "tests_film" INNER JOIN "tests_film_reporters" ON ("tests_film"."id" = "tests_film_reporters"."film_id") LEFT OUTER JOIN "tests_filmdetails" ON ("tests_film"."id" = "tests_filmdetails"."film_id") WHERE "tests_film_reporters"."reporter_id" IN (1, 2)
SELECT "tests_reporter"."id", "tests_reporter"."first_name", "tests_reporter"."last_name", "tests_reporter"."email", "tests_reporter"."a_choice", "tests_reporter"."reporter_type" FROM "tests_reporter" WHERE "tests_reporter"."id" = 2 LIMIT 21
...

@sjdemartini sjdemartini force-pushed the revert-problematic-queryset-resolve branch from e0f1814 to f3f6e08 Compare May 2, 2023 16:09
…related

This test passes after reverting the `CustomField` resolver change
introduced in
graphql-python#1315, but fails
with that resolver code present. For instance, adding back the resolver
code gives a test failure showing:

```
Failed: Expected to perform 2 queries but 11 were done
```

This should ensure there aren't regressions that prevent
query-optimization in the future.
@sjdemartini sjdemartini force-pushed the revert-problematic-queryset-resolve branch from f3f6e08 to 5e8952d Compare May 2, 2023 17:08
@sjdemartini
Copy link
Collaborator Author

Whoops, didn't realize black was configured, sorry about that; just pushed the reformatted version of the test code! (Also opened a PR to include the already-configured pre-commit in the standard dev setup #1403 so that others won't make the same mistake as me 😄)

@jaw9c
Copy link
Collaborator

jaw9c commented May 3, 2023

@sjdemartini Sounds good to me, thought I'd raise since the tests seem worrying but agree that it's something that we can work towards. I'm not convinced it's a particularly good strategy to use the get_queryset as the main authorisation technique, but is very django-esque.

@firaskafri I think it would be worth noting this change in the release notes 👍

Comment on lines +20 to +24
NOTE: For now, we do not expect this get_queryset method to be called for nested
objects, as the original attempt to do so prevented SQL query-optimization with
`select_related`/`prefetch_related` and caused N+1 queries. See discussions here
https://github.com/graphql-python/graphene-django/pull/1315/files#r1015659857
and here https://github.com/graphql-python/graphene-django/pull/1401.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@firaskafri firaskafri merged commit 20a6cec into graphql-python:main May 3, 2023
sjdemartini added a commit to sjdemartini/graphene-django-optimizer that referenced this pull request May 3, 2023
sjdemartini added a commit to sjdemartini/graphene-django-permissions that referenced this pull request May 3, 2023
The new 3.0.2 release includes a fix for the SQL query performance
regression
https://github.com/graphql-python/graphene-django/releases/tag/v3.0.2
(graphql-python/graphene-django#1401) which was
causing test failures on this branch.
@sjdemartini sjdemartini deleted the revert-problematic-queryset-resolve branch June 11, 2023 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants