RTK Query - query level customization/normalization of cache key #1668

ethanresnick · 2021-10-28T22:59:57Z

The Problem

Imagine a query where args is an options object, and some of the keys have default values. For example, maybe there's a getPosts query that has args of { query: string, recentOnly?: boolean }; when the caller omits recentOnly in the args, the query function acts as though it were false.

The result is that there are two ways to fetch the same data:

useGetPostsQuery({ query: "react" }); // recentOnly is implicity false
useGetPostsQuery({ query: "react", recentOnly: false }); // recentOnly provided explicitly

Conceptually, both of these hooks fetch the same data, but, right now, those two calls will generate different cache keys. This presents a few issues:

Most obviously, the cache just won't quite as work as well. I.e., if one component calls useGetPostsQuery({ query: "react" }), and another does useGetPostsQuery({ query: "react", recentOnly: false }), the second component won't be able benefit from the cached response from the first hook invocation. Moreover, these two cached responses can easily get out of sync (e.g., if one's polling for updates and the other isn't), when there's really no need for them to.
I think the bigger issue, though, is that now manual cache updates are unreliable. The cached data to update might exist under either of the two cache keys, or both, depending on how exactly the query was invoked. Any code doing manual/optimistic updates has to make sure to account for all possible cache keys, which is bug-prone. (Plus, as there are more optional properties in args, the number of possible cache keys for the same data explodes.)
Because the cached responses are not linked to one another and can get out of sync (e.g., if the manual/optimistic update code does not account for all possible cache keys), refactors that will seem to developers like they ought to be safe — like replacing one version of the hook call from my example with the other — will end up causing breaking changes.

Proposed Solution

This is a classic case where two different representations (i.e., { query: "react" } and { query: "react", recentOnly: false }) actually mean the same thing, so the query ought to be able to define sort of normalization step so that they resolve to the same cache key.

I propose extending the API for defining a query to support a normalizeQueryArgs property, like this:

getPosts: build.query<Post[], { query: string, recentOnly: boolean }>({
   query: (args) => ({ url: `/posts?limit=${args.recentOnly? "50" : "all"}` }),
   normalizeQueryArgs: (rawArgs: { query: string, recentOnly?: boolean }) => ({ recentOnly: false, ...rawArgs })
})

The idea is that the argument passed to a useXXXQuery hook, or to a direct initiate call, would go through normalizeQueryArgs, with the result value being passed to the query function. The cache key generation logic would stay identical to today, but it would now be based on a normalized representation of the arguments.

Potential Alternatives

I can think of a couple alternatives here:

First, RTK query's docs could simply warn of these caching issues, and say that any query where args allows multiple values to have the same meaning is an anti-pattern. Then, users would have to avoid/refactor such queries.

My hypothetical getPosts query might get refactored by splitting it into two queries, like getLatestPosts({ query: ".." }) and getAllTimePosts({ query: "..." }). Or, maybe it would be refactored so that recentOnly is always required, so that there are no longer two different cases (missing + false) with the same meaning.

The problem with this approach, I think, is that queries with optional arguments are useful, and their refactored forms are worse/less ergonomic. Consider splitting getPosts into getLatestPosts and getAllTimePosts: if the only difference between getLatestsPosts and getAllTimePosts is the value used for the ?limit query parameter when calling the API (as in my example above), then having two queries is going to introduce a lot of needless duplication (in transformResponse, providesTags, etc). Meanwhile, making keys that could have default values (like recentOnly) required just makes every caller have to do more typing, and requires updating every query call site if a new could-have-been-optional key is added to the options object.

Another approach could be handling this in the global serializeQueryArgs function, but that doesn't seem right, as that function applies to all queries (so it'd need conditional logic based on which query's arguments it's dealing with). It's also defined very far away from the place where each query defines its arguments, so it'd be hard to keep it correct.

The text was updated successfully, but these errors were encountered:

phryneas · 2021-10-29T06:39:33Z

I think we have two very different base views of the "equality" here:

In your

useGetPostsQuery({ query: "react" }); // recentOnly is implicity false

I would say that no, it is not implicitly false. It is undefined, and not part of the argument. It is a very different object than

useGetPostsQuery({ query: "react", recentOnly: false });

Both are falsy values, but only one is false.

You suddenly have four different values here for recentOnly, not two:

true
false
undefined
not given

My suggestion would be to just go down to two again:

true
not given

So if you are using TypeScript,

{ query: string, recentOnly?: true }

Generally, that concept of "args" normalization is interesting, and something that I'm willing to consider in the future (or a per-endpoint definition of serializeQueryArgs 🤔) but since we are very close to a release to 1.7 it will not make it in there - and the release of 1.8 is pretty far away after that (we release 2 or 3 versions a year), so you won't have any immediate benefit from that.

So, right now I'd go either with a type definition that only allows for these two values, or if you are not using TypeScript, maybe writing a wrapper around your query hook to use:

const useMyGetPostsQuery(args, opts) {
  return useGetPostsQuery({...defaultArgs, ...args}, opts)
}

lukascivil · 2021-11-18T23:36:17Z

@phryneas I had a question here because I fell into a similar problem. I need to send parameters to the endpoint but I don't want them to participate in the cache.

What if something like that existed? makes sense?

// Same calls with same key
useGetPostsQuery({ query: "react" });

useGetPostsQuery({ query: "react" }, { extraArg: { recentOnly?: true } }); // we don't care about recentOnly on cache key

I don't know if the architecture supports it or can support it, just a tip.

agomai · 2021-11-19T13:47:47Z

@phryneas I also have a use case that would greatly benefit from this feature.

Consider an endpoint that has a forceUpdate flag / query parameter, that can optionally be used to force fetching fresh data from an IoT device (instead of intermediate backend caches). The data itself is exactly the same, only it is potentially fresher. Hence we really like to store it using the same cache key. I was expecting something like this to exclude forceUpdate from the cache key:

getSomethings: build.query<Something[], { id: string, forceUpdate: boolean }>({
   query: ({ id, forceUpdate = false }) => ({ url: `/something/id${forceUpdate ? '?force-update' : ''}` }),
   serializeQueryArgs: ({ endpointName, queryArgs }) => {
     return `${endpointName(${queryArgs .id}})`
  }
})

The normalizeQueryArgs alternative proposed above would also work, but having serializeQueryArgs on the root level inspired me to expect it on the endpoint level as well.

In understand that sadly this feature cannot make it into 1.7. As this is important to our adoption of RTK Query: is using the root serializeQueryArgs and excluding forceUpdate from there a feasible workaround at the moment? Are there any other solutions for not taking forceUpdate into account for computing the cache key?

frankleng · 2022-05-26T05:17:30Z

working around by doing

serializeQueryArgs: ({ endpointName, queryArgs }) => endpointName

in the createApi call.

This was hard to find, - had assumed that invalidating tags would suffice, but in Redux Dev Tool you can see multiple reducer paths based on the args. In our case, the arg was a timestamp, used to fetch new data since.
None of the refetches would cause the data selected to refresh, even tho network was successful.

phryneas · 2022-05-26T07:28:50Z

@frankleng that's not a good solution since it won't allow you to have two queries with different arguments for the same endpoint to be active at the same time (like {page:2} and {page:3}) and a lot of the "caching functionality" will be disabled.

Generally it is actually wanted to have multiple cache entries. Could you maybe start a new discussion with some code and a description of what you actually want to do?

frankleng · 2022-05-27T23:25:48Z

@phryneas I didn't think so either. but there is currently no need for us to cache based on args.
we do a lot of fetching based on last updated timestamp - it'll be a different timestamp every time. no need to cache.

perhaps a better way is to create another api instance. one with default serialization, one that only serializes endpointName.
ideally we want to define serialization per endpoint as pointed out by others.

phryneas · 2022-05-28T06:41:59Z

It is a 100% valid use case to have useSomeQuery(1) and useSomeQuery(2) on screen at the same time, so the arg has to be part of the cache key. Is there any particular reason why you have a problem with the arg being part of the cache key, apart from aesthetics?

frankleng · 2022-05-31T01:11:07Z

@phryneas thanks.
I'm using

 const [refetch, data] = useLazyQuery();
  
// runs on user action
 await refetch(timestamp) // partial fetch via timestamp

we need to disable cache because actual response of refetch(5hrs ago) will change over time.
strange part was data from refetch(last fetched time) never refreshed, tho the network call had the correct response. it did create a new reducer path in the store, correct args and data.

markerikson · 2022-08-11T03:11:31Z

Update: the plan for 1.9 is to address this by adding support for per-endpoint cache key serialization in a serializeQueryArgs option. (We already had this as a global for the whole API, but it can now be overridden per-endpoint too.)

I just merged #2493 , which should add that.

As a quick example, say you are passing a,b,c,d into a query, but only want a and c used as the cache args. One possible implementation might be:

createApi({
  baseQuery,
  endpoints: (build) => ({
    getSomeData: build.query({
      query: (a, b, c, d) => `/a/b/c/d`,
      serializeQueryArgs: ({queryArgs, endpointDefinition, endpointName}) => {
        // Only need a and c, but again showing for clarity
        const {a, b, c, d} = queryArgs; 
        return defaultSerializeQueryArgs({queryArgs: {a, c}, endpointDefinition, endpointName})
      }
    })
  })
})

so, if you called this as useGetSomeDataQuery({a: 1, b: 2, c: 3, d: 4}), that should end up with a final cache key of... getSomeData('{a:1,c:3}), I think, and thus skipping b and d.

Looking at this example, there's probably a couple ways to make it a bit shorter, but the important thing is it will be feasible to do this.

If anyone wants to try this out, you can install the preview build from #2401 - let us know how it works out?

markerikson · 2022-08-14T01:58:50Z

I'm going to close this as the PR has been merged. If anyone has problems or feels that the implemented serializeQueryArgs option isn't sufficient, please comment and let us know!

phryneas added the enhancement New feature or request label Oct 29, 2021

phryneas added this to the 1.8 milestone Oct 29, 2021

markerikson mentioned this issue Feb 5, 2022

RTK 1.8 and 1.9 planning #2003

Closed

markerikson closed this as completed Aug 14, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RTK Query - query level customization/normalization of cache key #1668

RTK Query - query level customization/normalization of cache key #1668

ethanresnick commented Oct 28, 2021

phryneas commented Oct 29, 2021

lukascivil commented Nov 18, 2021 •

edited

Loading

agomai commented Nov 19, 2021

frankleng commented May 26, 2022 •

edited

Loading

phryneas commented May 26, 2022

frankleng commented May 27, 2022 •

edited

Loading

phryneas commented May 28, 2022

frankleng commented May 31, 2022 •

edited

Loading

markerikson commented Aug 11, 2022

markerikson commented Aug 14, 2022

RTK Query - query level customization/normalization of cache key #1668

RTK Query - query level customization/normalization of cache key #1668

Comments

ethanresnick commented Oct 28, 2021

The Problem

Proposed Solution

Potential Alternatives

phryneas commented Oct 29, 2021

lukascivil commented Nov 18, 2021 • edited Loading

agomai commented Nov 19, 2021

frankleng commented May 26, 2022 • edited Loading

phryneas commented May 26, 2022

frankleng commented May 27, 2022 • edited Loading

phryneas commented May 28, 2022

frankleng commented May 31, 2022 • edited Loading

markerikson commented Aug 11, 2022

markerikson commented Aug 14, 2022

lukascivil commented Nov 18, 2021 •

edited

Loading

frankleng commented May 26, 2022 •

edited

Loading

frankleng commented May 27, 2022 •

edited

Loading

frankleng commented May 31, 2022 •

edited

Loading