Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RTK Query: Subscription leak in React using useQuery() with skip parameter #3182

Closed
mjwvb opened this issue Feb 16, 2023 · 13 comments · Fixed by #3188
Closed

RTK Query: Subscription leak in React using useQuery() with skip parameter #3182

mjwvb opened this issue Feb 16, 2023 · 13 comments · Fixed by #3188
Labels
bug Something isn't working
Milestone

Comments

@mjwvb
Copy link

mjwvb commented Feb 16, 2023

There seems to be a subscription leak in RTK Query (1.9.2) with React (18.2).

When using useQuery / useQuerySubscription with the skip parameter, some of the subscriptions randomly won't go away when the component/hook unmounts or the args change. This leak is very common in our application and causes issues with some of our code in the onCacheEntryAdded endpoint lifecycle method, as its lifecycle is never ended for the leaked subscriptions.

I managed to reproduce the leak in the pokemon example sandbox by brute forcing different subscription mounts on the same endpoint (with keepUnusedDataFor=1) while toggling the skip parameter on/off at a 10ms interval. By pressing the mount button in the sandbox you can initiate the brute force action and watch the active list of subscriptions. You can see that some of the subscriptions will never go away and have become ghost subscriptions. (Of course I turned off the api calls to not ddos the pokemon database.)

Obviously the sandbox is not a real world use case, but it shows there's something wrong. In our own application it happens by simply having many different endpoints in a complex/busy react environment. However it's worth noting that we can somehow fix it by turning off the autoBatchEnhancer, unsure why...

Edit dry-river-ltbu89

subscription leak

@markerikson markerikson added this to the 1.9.x milestone Feb 16, 2023
@markerikson
Copy link
Collaborator

Interesting! Don't immediately have an idea why this would be happening, but thanks for the report and the repro!

@markerikson markerikson added the bug Something isn't working label Feb 16, 2023
@mjwvb
Copy link
Author

mjwvb commented Feb 16, 2023

Thanks for the quick reply! I'm actually a little bit unsure whether this is the actual cause of the subscription leakage in our specific situation, as it is fixed by disabling the autoBatchEnhancer. However the conditions and the result is exactly the same so it is related at least.

@markerikson
Copy link
Collaborator

Huh. At first glance, there's very different behavior between Chrome and Firefox here. With Chrome, clicking "Unmount" makes the subscription count revert back to about 1. With Firefox, it only goes down a couple and leaves most of them in place.

@markerikson
Copy link
Collaborator

@mjwvb : hmm. I actually see this happening in the sandbox, without having the autoBatchEnhancer turned on...

@mjwvb
Copy link
Author

mjwvb commented Feb 19, 2023

In practice the leakage occurs only very randomly and with multiple useQuery() simultaneously when the event loop / cpu is too busy, so very hard to reproduce. I suspect the autoBatchEnhancer does something with the processing order of things, so the chances of a leak happening is much higher when it's turned on.

I noticed very different behavior as well, e.g. happens more often on windows chrome than on macos chrome. Maybe it has to do with the interval timing and how efficient everything is processing.

@markerikson
Copy link
Collaborator

Been trying to dig through these two replays. I did use a local RTK test build from the #3188 branch. That actually fixes this issue on Chrome, but I still see the problem in FF.

Here's the recordings:

@markerikson
Copy link
Collaborator

markerikson commented Feb 20, 2023

@mjwvb : good news! I think I've got a fix, and the actual reason why this is happening.

The skipToken fix in #3188 is part of it, but the other buggy bit is this line here:

usePossiblyImmediateEffect((): void | undefined => {
promiseRef.current = undefined
}, [subscriptionRemoved])

Notice that it's always clearing out this promise ref any time subscriptionRemoved changes.

The cause sequence appears to be:

  • Start with an existing subscription
  • Set {skip: true}
  • Change the query arg and {skip: false} at the same time
  • Query hook runs with subscriptionRemoved: true, clears promise ref, initiates subscription (good)
  • Component re-renders (status, etc)
  • Query hook runs with subscriptionRemoved: false, clears promise ref, initiates subscription (bad)

The problem is that we shouldn't be clearing out the promise ref if the subscription hasn't been removed.

Adding if (subscriptionRemoved) inside that effect fixes things.

update

What's really weird is that the behavior was far worse in Firefox than Chrome. Not only did FF run the updates slower (Chrome was 3x faster), in Chrome only 1-2 subscriptions would be left over, whereas in FF all but 1-2 would be left over.

I was able to record replays in both FF Win and Chrome Linux, and set up the same print statements in both. Did a lot of very manual comparisons between the two to narrow down where things were diverging.

Finally figured out the subscriptionRemoved bit I described above, made the change, did another local build, and I could see the subscriptions all go away as expected.

Then I was able to turn off most of my print statements, focus on the ones in the UI components, and nail down the sequence that triggered it. That let me write a meaningful test that repro'd the bug, and verified that the test failed first and the fix worked.

I'll point out that there's no way I could have solved this without A) your repro project, and B) the app that I work on, https://replay.io :)

Here's those recordings again if anyone wants to look at the debugging I was doing (you can turn on "my" print statements via the Pause sidebar panel):

@markerikson
Copy link
Collaborator

@mjwvb : could you try out the CodeSandbox preview build from #3188 and let me know if that fixes the issue on your side?

@mjwvb
Copy link
Author

mjwvb commented Feb 20, 2023

Nice! Tested the sandbox in multiple browsers and it really seems to be fixed right now. I also installed the specific commit in our application where the issue originally occurred, and now I can keep the autoBatchEnhancer turned on and have no issues at all.

Thanks for fixing the issue so quickly! Such leaks are always a struggle to pinpoint. Having a tool like replay.io seems like a blessing for those kind of issues.

@markerikson
Copy link
Collaborator

@mjwvb : awesome! I've got a couple other small tweaks I want to investigate, then I'll push at 1.9.3 with this fix sometime later today.

@harry-gocity
Copy link

Just wanted to say thanks both @mjwvb for the repro on this and @markerikson for the fix - we've had this issue come up in a very similar and subtle scenario that I've been scratching my head over and struggling to create a repro for.

In our case, we have a set of queries in our app that use args selected from the store, and a mutation implemented with a queryFn. In the mutation queryFn, prior to returning success data we change the same store values that are used by the queries. When we trigger the mutation in the app, we unwrap the result and if successful immediately route to a different page, where the other queries are not used. But somehow, in between the dispatch inside queryFn and the actual router push which unmounts everything, the queries were all refetching with the updated values from the store, and the subscriptions were persisting across the route change, causing problems on the next page.

Bumping to 1.9.5 fixes this for us 🙌🏻

@gtrak
Copy link

gtrak commented Sep 30, 2023

I'm seeing something like this on 1.9.5, react 18.2, independent of the skip param.

What's happening is when using polling, often polling is not updated to use new args on an args change. I did a bit of digging and found that clearing behavior is related to the number of subscriptions in #1933 , and I am seeing more than one subscription in redux state with a single active useQuery hook on the page. I confirmed all the params and args state was correct. I was able to work around it by doing my own polling with a custom hook.

Could there be more to fix around subscriptions?

@phryneas
Copy link
Member

@gtrak could you create a reproduction or replay of that and open a new issue please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants