Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query planner plugins #3177

Merged
merged 21 commits into from
Jun 14, 2023
Merged

Query planner plugins #3177

merged 21 commits into from
Jun 14, 2023

Conversation

Geal
Copy link
Contributor

@Geal Geal commented May 30, 2023

Fixes #3150
Fixes #3199
Related: apollographql/apollo-rs#420

This reintroduces the query planner plugins we had in the past, but with a different API, and for internal use only for now, until we are sure about the API's shape.

Plugin API

The plugins need access to a compiler instance to observe and modify the query, so the caching query planner adds it as part of the query planner request (which means that now the cache and bridge get different request types). For now, if a plugin needs to modify the query, it must serialize the modified version to a string then reparse it, but the goal is to support modification right inside the query: apollographql/apollo-rs#420
That compiler will hold both the schema's type info, and the query.

Once the requests reaches the bridge query planner, it will generate:

  • a query plan from the filtered query
  • selections from the filtered query (the Query object used for response formatting)
  • selections from the original query

Thus execution will depend on the filtered query, so if some fields are added or removed, it wil change the generated query plan. The plan and selections will then be cached in the same way as before.

Details on selections

Query planner plugins can modify the query before sending it through
query planning. They might require different sets of fields on the same
entity, and the query plan might add its own list of fields (keys for
federated queries).
As an example, We could have a User entity with its email field used as
key, and want to remove private information from the query.

So we could have the query:

{
  topProducts {
    name
    reviews {
      author {
        email
        username
      }
    }
  }
}

So here we would filter the query as follows:

{
  topProducts {
    name
    reviews {
      author {
        username
      }
    }
  }
}

But since the email is used as key, it would appear in the JSON object
that accumulates response data.
To avoid sending non requested data to the client, we have the response
formatting phase, that uses selections extracted from the query, to pick
only what is needed from the JSON object, and send it to the client.

Here we need to apply the selections of the filtered query, to make sure
the email is not returned to the client (the original query would let it
go through). But we also need to apply the selections from the original
query, to have the null propagation algorithm apply and return a
response in the shape that matches the client query.

There is probably a better way to do it than applying selections twice,
but here we are sure the behaviour will be correct, and so far the
formatting phase is fast enough, we can spend a bit more time there

Checklist

Complete the checklist (and note appropriate exceptions) before a final PR is raised.

  • Changes are compatible[^1]
  • Documentation[^2] completed
  • Performance impact assessed and acceptable
  • Tests added and passing[^3]
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

[^1]. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.
[^2]. Configuration is an important part of many changes. Where applicable please try to document configuration examples.
[^3]. Tick whichever testing boxes are applicable. If you are adding Manual Tests:
- please document the manual testing (extensively) in the Exceptions.
- please raise a separate issue to automate the test and label it (or ask for it to be labeled) as manual test

Geal added 3 commits May 30, 2023 11:50
We may need to observe and edit the query between the query plan caching
and the query planner, so this brings back the query planner plugins we
had initially
@github-actions

This comment has been minimized.

@router-perf
Copy link

router-perf bot commented May 30, 2023

CI performance tests

  • const - Basic stress test that runs with a constant number of users
  • no-graphos - Basic stress test, no GraphOS.
  • step - Basic stress test that steps up the number of users over time
  • reload - Reload test over a long period of time at a constant rate of users
  • xlarge-request - Stress test with 10Mb request payload
  • large-request - Stress test with a 1Mb request payload
  • xxlarge-request - Stress test with 100Mb request payload

Geal added 2 commits May 30, 2023 14:47
Query planner plugins can modify the query before sending it through
query planning. They might require different sets of fields on the same
entity, and the query plan might add its own list of fields (keys for
federated queries).
As an example, We could have a User entity with its email field used as
key, and want to remove private information from the query.

So we could have the query:

{
  topProducts {
    name
    reviews {
      author {
        email
        username
      }
    }
  }
}

So here we would filter the query as follows:

{
  topProducts {
    name
    reviews {
      author {
        username
      }
    }
  }
}

But since the email is used as key, it would appear in the JSON object
that accumulates response data.
To avoid sending unrequested data to the client, we have the response
formatting phase, that uses selections extracted from the query, to pick
only what is needed from the JSON object, and send it to the client.

Here we need to apply the selections of the filtered query, to make sure
the email is not returned to the client (the original query would let it
go through). But we also need to apply the selections from the original
query, to have the null propagation algorithm apply and return a
response in the shape that matches the client query.

There is probably a better way to do it than applying selections twice,
but here we are sure the behaviour will be correct, and so far the
formatting phase is fast enough, we can spend a bit more time there
@Geal Geal changed the title Geal/query planner plugins query planner plugins May 30, 2023
@Geal Geal changed the title query planner plugins Query planner plugins May 30, 2023
@Geal Geal marked this pull request as ready for review May 30, 2023 15:33
the compiler is not created inside a blocking spawned task so now it
appears again in the spans. Salsa's events are filtered in the router,
but were not in the test harness, which failed some tests
@Geal Geal requested review from a team, garypen, SimonSapin, BrynCooke and o0Ignition0o and removed request for BrynCooke May 30, 2023 16:19
Geal added 6 commits May 31, 2023 10:11
we want the filtered query to appear under the original query's Studio
entry, and it is indexed by the `statsReportKey` field in the usage
reporting structure. We call the bridge again with the original query to
generate that signature without going through the entire planning
process again
@SimonSapin
Copy link
Contributor

The plugins need access to a compiler instance to observe and modify the query, so the caching query planner adds it as part of the query planner request (which means that now the cache and bridge get different request types).

This sounds similar to #3200, how do the two interact?

@Geal
Copy link
Contributor Author

Geal commented Jun 6, 2023

This sounds similar to #3200, how do the two interact?

once #3200 is merged, this should be updated to use the compiler coming from the supergraph request

);

let mut planner =
CachingQueryPlanner::new(delegate, schema, &configuration, IndexMap::new()).await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hard-codes the map of query planner plugins to empty, which makes most of this PR dead code. I’d feel better with at least one smoke test showing a query planner plugin in action but we can do that in a follow-up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, adding a test to exercise the API would be useful

/// [`Context`] for the request.
#[derive(Clone, Derivative)]
#[derivative(Debug)]
pub(crate) struct CachingRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks identical to struct Request above, why a separate struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before merging #3200, the caching request did not have the compiler. But it's better to keep the types separated, because at some point, CachingRequest will have an additional field to change the cache key

apollo-router/src/query_planner/bridge_query_planner.rs Outdated Show resolved Hide resolved
apollo-router/src/query_planner/bridge_query_planner.rs Outdated Show resolved Hide resolved
Geal and others added 3 commits June 13, 2023 12:00
Co-authored-by: Jeremy Lempereur <jeremy.lempereur@iomentum.com>
Co-authored-by: Simon Sapin <simon@apollographql.com>
SimonSapin added a commit that referenced this pull request Jun 13, 2023
We expect to use this inside query planner plugins:
#3177
@Geal Geal enabled auto-merge (squash) June 14, 2023 08:11
@Geal Geal merged commit ae50056 into dev Jun 14, 2023
@Geal Geal deleted the geal/query-planner-plugins branch June 14, 2023 09:33
SimonSapin added a commit that referenced this pull request Jun 15, 2023
We expect to use them inside query planner plugins:
#3177

<!-- start metadata -->

**Checklist**

Complete the checklist (and note appropriate exceptions) before a final
PR is raised.

- [x] Changes are compatible[^1]
- [x] Documentation[^2] completed
- [ ] Performance impact assessed and acceptable
- Tests added and passing[^3]
    - [x] Unit Tests
    - [ ] Integration Tests
    - [ ] Manual Tests

**Exceptions**

*Note any exceptions here*

**Notes**

[^1]. It may be appropriate to bring upcoming changes to the attention
of other (impacted) groups. Please endeavour to do this before seeking
PR approval. The mechanism for doing this will vary considerably, so use
your judgement as to how and when to do this.
[^2]. Configuration is an important part of many changes. Where
applicable please try to document configuration examples.
[^3]. Tick whichever testing boxes are applicable. If you are adding
Manual Tests:
- please document the manual testing (extensively) in the Exceptions.
- please raise a separate issue to automate the test and label it (or
ask for it to be labeled) as `manual test`

---------

Co-authored-by: Geoffroy Couprie <apollo@geoffroycouprie.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants