Apollo DataSources REST: Request Parallelization #1451

Closed
Saeris opened this issue Jul 31, 2018 · 4 comments

Comments


Saeris commented Jul 31, 2018

Hey there! Not sure whether to label this a bug report or a feature request. The gist of it is, when multiple requests are in flight to a REST API using apollo-datasource-rest, they all run in sequence, rather than in parallel, which inflates the response time for some queries.

To better illustrate this, I've thrown together a basic example forked from fullstack-workshop-server. In the screenshot from Engine below, you can see what I'm talking about:

[Screenshot from Engine: the cascade of requests]

And here's the fork itself:

https://github.com/Saeris/fullstack-workshop-server/tree/parallel-requests-example

Basically here's what's happening:

  1. First we execute a query to fetch a list of popular movies.
  2. Because the movie data returned from this REST call is incomplete, we map over the movie IDs and make separate calls to another endpoint to fetch the full movie details. We use a feature of the REST API to append the cast information to our response, so we don't have to make a third call for each movie.

What should happen here is that once we have all the movie IDs, we should fetch all of their data in parallel to optimize the response time of our server. That way, resolving movies: [Movie]! in our schema will only take as long as the longest of the 20 requests takes to come back, which in the screenshot above would be 6.86ms instead of the total 29.2ms it took to resolve the whole cascade.
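To make it concrete, the shape of what I'd expect is something like this (just a sketch — the data source and its method names are illustrative, not necessarily what's in the workshop repo):

```js
// Sketch only: fetch the list, then fan the detail requests out in parallel.
// `moviesAPI`, `getPopularMovies`, and `getMovieById` are illustrative names.
module.exports = {
  Query: {
    movies: async (_, __, { dataSources }) => {
      const popular = await dataSources.moviesAPI.getPopularMovies();
      // Kick off every detail request at once and await them together, so the
      // total time is bounded by the slowest single request rather than the sum.
      return Promise.all(
        popular.map(({ id }) => dataSources.moviesAPI.getMovieById(id))
      );
    },
  },
};
```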

Obviously, the deeper you go in a schema like this, the more time could be saved by running requests in parallel.

I'm not entirely certain as to how this can be accomplished. This just came up as I was picking apart the code for apollo-datasource-rest to solve an issue I'd been having with caching in Lambda.

Also, please note that even though we're clearly hitting the cache for repeat requests for this query (you'll notice the response time shrink considerably), Engine does not appear to be receiving any caching information. I believe I've got the server configured correctly to support it, so I don't know what's wrong there. If you inspect the network requests being sent in Playground, you'll see that cacheControl is included in the extensions along with the other tracing data.


martijnwalraven commented Jul 31, 2018

There is nothing in apollo-datasource-rest keeping it from performing parallel requests, and it is in fact performing parallel requests in your example. The waterfall you're seeing is a bit misleading, because the traces for individual movies aren't actually representing fetches but simply invocations of the resolvers. That's why the timings are so tiny (less than a millisecond to several milliseconds). So the waterfall is a result of Node execution being single threaded.

In your example, both fetching the list of popular movies and fetching individual movies happen in the movies resolver: https://github.com/Saeris/fullstack-workshop-server/blob/563e428a7453f44fbaf00890ba562edd55329310/src/resolvers/query.js#L17-L18
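For reference, the data source side of that pattern looks roughly like this (a sketch — the class name, base URL, and endpoint paths are illustrative). Since this.get() simply returns a promise, detail requests mapped with Promise.all in the resolver are already issued concurrently:

```js
const { RESTDataSource } = require('apollo-datasource-rest');

// Sketch only — class name, base URL, and endpoint paths are illustrative.
class MoviesAPI extends RESTDataSource {
  constructor() {
    super();
    this.baseURL = 'https://api.themoviedb.org/3/';
  }

  getPopularMovies() {
    return this.get('movie/popular');
  }

  // Appends cast info to the response so a third call per movie isn't needed.
  getMovieById(id) {
    return this.get(`movie/${id}`, { append_to_response: 'credits' });
  }
}
```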

@martijnwalraven

> Also, please note that even though we're clearly hitting the cache for repeat requests for this query (you'll notice the response time shrink considerably), Engine does not appear to be receiving any caching information. I believe I've got the server configured correctly to support it, so I don't know what's wrong there. If you inspect the network requests being sent in Playground, you'll see that cacheControl is included in the extensions along with the other tracing data.

The caching you're seeing is caching of REST responses, which is separate from the whole query caching Engine shows. @cacheControl has no effect on data source caching, and Apollo Server currently doesn't cache whole responses.


Saeris commented Jul 31, 2018

> The caching you're seeing is caching of REST responses, which is separate from the whole query caching Engine shows. @cacheControl has no effect on data source caching, and Apollo Server currently doesn't cache whole responses.

So in the original example from that repo, the @cacheControl defined on the Cast type doesn't actually do anything then? I guess all I'm trying to point out here is that the examples and the documentation are a bit misleading about how caching works. While I was able to gather this much from taking apart apollo-datasource-rest today, I don't recall reading anywhere in the documentation that the directives don't work with data sources. As we discussed on Slack, the TTL value comes from the request/response headers unless manually specified. Perhaps it's a good idea to add that to the documentation? I'm sure other people have been confused about this as well. I've been asked a few times on Twitter if I've gotten caching to work yet, and I think others were under the same assumption that I was: that if Engine isn't showing the cache being hit, then there's something wrong with the server configuration.
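For anyone else who lands here, manually specifying a TTL looks roughly like this (a sketch — it assumes the cacheOptionsFor hook in apollo-datasource-rest; by default the TTL comes from the REST response's caching headers, and schema-level @cacheControl has no effect on it):

```js
const { RESTDataSource } = require('apollo-datasource-rest');

// Sketch only — assumes the cacheOptionsFor hook in apollo-datasource-rest.
// Without it, the REST response cache TTL is derived from the response's
// Cache-Control headers; schema-level @cacheControl does not affect it.
class MoviesAPI extends RESTDataSource {
  constructor() {
    super();
    this.baseURL = 'https://api.themoviedb.org/3/';
  }

  // Force a 60-second TTL for every cached REST response from this source.
  cacheOptionsFor() {
    return { ttl: 60 };
  }

  getMovieById(id) {
    return this.get(`movie/${id}`);
  }
}
```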

> There is nothing in apollo-datasource-rest keeping it from performing parallel requests, and it is in fact performing parallel requests in your example. The waterfall you're seeing is a bit misleading, because the traces for individual movies aren't actually representing fetches but simply invocations of the resolvers. That's why the timings are so tiny (less than a millisecond to several milliseconds). So the waterfall is a result of Node execution being single threaded.

Yeah that is pretty misleading. I wonder if that UI could be improved because to me that seemed like a clear sign that I was doing something wrong.


abernix commented Jul 2, 2019

Thanks for opening this originally! We've expanded and changed our documentation quite a bit since this was originally opened, and we've introduced a full-response cache within Apollo Server that takes the place of the previous-generation Apollo Engine Proxy (a black-box Go binary) which once served a similar purpose, but is open-source and written in JavaScript. This new full-response cache is documented here and respects @cacheControl hints.
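For anyone finding this later, enabling that full-response cache looks roughly like this (a sketch — it assumes Apollo Server 2.x with the apollo-server-plugin-response-cache package; the schema here is only a placeholder):

```js
import { ApolloServer, gql } from 'apollo-server';
import responseCachePlugin from 'apollo-server-plugin-response-cache';

// Sketch only — assumes Apollo Server 2.x; the schema is a placeholder.
const typeDefs = gql`
  type Cast @cacheControl(maxAge: 60) {
    name: String
  }
  type Query {
    cast: [Cast]
  }
`;

const resolvers = {
  Query: {
    cast: () => [{ name: 'Example' }],
  },
};

const server = new ApolloServer({
  typeDefs,
  resolvers,
  // The plugin caches whole query responses according to @cacheControl hints.
  plugins: [responseCachePlugin()],
  cacheControl: { defaultMaxAge: 5 },
});

server.listen().then(({ url }) => console.log(`Server ready at ${url}`));
```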

Furthermore, the data sources documentation has come along since this was first opened as well.

It seems like any remaining concern here is with Apollo Engine itself rather than Apollo Server, and that would be best surfaced to the Apollo Engine team using the Support tooling within Apollo Engine. I'm sure the Apollo Platform team would very much appreciate feedback on how cache results from the apollo-server-plugin-response-cache plugin are portrayed within Apollo Engine, so please do pass that along to them, but I think we can close this on the open-source side of things.

Thanks again!
