Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential performance issues with using forwardRef #13456

Closed
markerikson opened this issue Aug 22, 2018 · 32 comments
Closed

Potential performance issues with using forwardRef #13456

markerikson opened this issue Aug 22, 2018 · 32 comments
Labels
Resolution: Stale Automatically closed due to inactivity Type: Needs Investigation

Comments

@markerikson
Copy link
Contributor

markerikson commented Aug 22, 2018

Do you want to request a feature or report a bug?

Bug (assuming that a potential drop in perf counts as a "bug")

What is the current behavior?

I've been setting up a rudimentary perf benchmark harness for React-Redux over in reduxjs/react-redux#983 , so that we can compare performance between builds of React-Redux as we work on version 6. The harness runs one or more benchmark apps in headless Puppeteer, running them once to capture FPS values and a second time to capture Chrome trace reports automatically.

Currently, there's only one benchmark app: a "stock ticker"-type stress test that measures FPS by using the fps-emitter package (which relies on requestAnimationFrame). It's rough, but it does show meaningful differences in FPS values as we swap between different React-Redux builds and change the number of connected components in the benchmark.

We have two WIP candidate PRs for React-Redux v6: reduxjs/react-redux#995 and reduxjs/react-redux#1000 . Initial testing showed that PR 995 was almost as fast as our existing 5.0.7 version, while PR 995 was slower in some runs. However, my initial push of the 995 branch did not include use of React.forwardRef, while the 1000 branch already had that added.

I spent Saturday adding forwardRef and a couple other pieces of functionality to the 995 branch, then re-ran the benchmarks. Per ourcomments in the 983 issue, we saw that the 995 branch had suddenly gotten slower, and that removing forwardRef from the 1000 branch sped things up by a few FPS.

This is admittedly a very artificial stress test scenario, but it's also likely that React-Redux apps can have hundreds or thousands of connected components at once, so the potential slowdown seems concerning.

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem. Your bug will get fixed much faster if we can run your code and it doesn't have dependencies other than React. Paste the link to your JSFiddle (https://jsfiddle.net/Luktwrdm/) or CodeSandbox (https://codesandbox.io/s/new) example below:

The current perf benchmark repo is at https://github.com/markerikson/benchmark-react-redux-perf . It uses UMD builds of React-Redux so that it can easily switch which build is being used. Several UMD builds are already committed there in the root of the repo. 6.0-mark is the 995 branch, and 6.0-greg is the 1000 branch. They can be reproduced by building their respective branches from the PRs. I've been hand-copying the UMD build artifact from my React-Redux clone over to this benchmark folder and renaming it to whatever seems appropriate.

Clone the repo, yarn to install, yarn perf to build and run the benchmark. Number of connected components can be modified in src/constants.js (requires rebuilding). The harness determines which UMD build versions to run based on an array in perfBenchmark.js, around line 11 (const VERSIONS = ["5.0.7", "6.0-mark"]).

What is the expected behavior?

That there would be minimal to no impact in perf when forwardRef is used to wrap a component such as a HOC, and that HOC is used widely in an application.

Which versions of React, and which browser / OS are affected by this issue? Did this work in previous versions of React?

I've been testing this with Puppeteer 1.6.2, on Windows 10. It looks like the locally installed version of Chromium is 69.0.3494.0 .

Current version of React is 16.4.2, UMD build (as React-Redux UMD expects that both React and Redux are also global variables). I have not tested against a prior version of React, as we are specifically targeting 16.4+ with this next version of React-Redux.

@markerikson
Copy link
Contributor Author

I'll paste in the comments between @cellog and myself from Discord as we noticed this behavior on Saturday:

Pasting some comments between myself and Greg on Discord:

[8:07 PM] acemarke:oh, wow
[8:08 PM] acemarke: the commits I just pushed are way slower than the prior version
[8:08 PM] acemarke: I think a lot of this has to do with the forwardRef and stuff
[8:08 PM] acemarke: dropped from 24-26 to 17-19
[8:08 PM] acemarke: at 2500 components
[8:08 PM] acemarke: could also be that my implementation isn't very good

[8:10 PM] CelloG:dammit :/
[8:10 PM] CelloG: I'll try removing forwardRef and see what happens
[8:11 PM] CelloG: if that speeds it up we need to add back withRef
[8:20 PM] CelloG: you're right!

[8:27 PM] CelloG: should probably tell the React team
[8:27 PM] CelloG: that's a huge issue, it should be performance-agnostic if it isn't passing an actual ref
[8:48 PM] CelloG: Ok, running the latest version now. This only uses forwardRef if withRef is set to the text "forwardRef" so we can catch folks using withRef set to true and warn them to not rely on getWrappedInstance since it don't exist no more
[8:49 PM] CelloG: And my implementation is now doing better. Let me see how it matches to 5.0.7
[8:51 PM] acemarke: Yeah, I was thinking we'd introduce a new forwardRef option or something
[8:54 PM] CelloG: ok, I just pushed the commit
[9:01 PM] CelloG: this version I get slightly faster than your old one, but still 2 fps slower than 5.0.7
[9:01 PM] CelloG: but we are in the ballpark. I'm going to re-run and see if there is any change
[8:20 PM] CelloG: after disabling forwardRef, mine is faster
[8:20 PM] CelloG: 2 fps
[8:21 PM] CelloG: than 995

@cellog
Copy link

cellog commented Aug 22, 2018

to clarify, after disabling forwardRef unless explicitly asked for, the FPS on 1000 doubled, so it was a significant performance increase.

This only happened when running large numbers of "stocks." At 500 stocks, the performance of all 3 (5.0.7, and the 2 patches) was 60 fps. The performance hits occurred at 2500-3500 "stocks."

I should also note that it showed up not just in the fps measure, but in chrome traces (also generated by the benchmark app in a separate run). The turnaround time for processing state updates was double the length in milliseconds when passed through forwardRef.

Again for clarity it's important to note that no refs were actually passed into the benchmark app, so this isn't specific to ref handling, but to forwardRef when no refs are passed in.

@gaearon
Copy link
Collaborator

gaearon commented Aug 22, 2018

Is there a benchmark we can use that reproduces the problem with just React (no React Redux)?

@cellog
Copy link

cellog commented Aug 22, 2018

Sure, the redux part of the benchmark is minimal, just needs to be stripped out. Just need to do it, will post when it's ready

@cellog
Copy link

cellog commented Aug 23, 2018

https://github.com/cellog/benchmark-react-redux-perf/tree/remove-redux is good to go, you can run it with env variable FORWARD set to 1 to enable forwardRef, leave it out to disable it.

I haven't figured out how to modify the benchmark perfBenchmark.js to pass in these env variables, perhaps that isn't needed since you can do the trace manually in chrome?

Because this version is basically redux 4, it's quite slow, so I changed number of components (in constants.js to just 1500. Adjust that at will to make it smaller.

@gaearon
Copy link
Collaborator

gaearon commented Aug 23, 2018

What should I be measuring? If I just record traces from the production build, I see this:

screen shot 2018-08-23 at 02 45 56

versus this:

screen shot 2018-08-23 at 02 46 16

Is this the kind of difference you're referring to?

@markerikson
Copy link
Contributor Author

markerikson commented Aug 23, 2018

In our case, we were measuring overall FPS while the app was running. The yarn perf script should print out recorded FPS from Puppeteer after running the benchmark for 30s, as well as recorded Chrome perf stats from a second run with tracing enabled.

Here's an example of the kinds of numbers we were seeing recently, copied and pasted from issue 1000:

I just ran another perf benchmark for 5.0.7, 995, and 1000 , set to 2500 connected components. Here's the FPS (avg / min-max) and Chrome trace time split results for each one (30-second test runs):

FPS Scripting Painting Rendering
5.0.7 28.26 (24-30) 17326 9569 2072
995 24.26 (19-28) 19067 8246 1767
1000 15.8 (14-17) 22401 5589 1187

In general, what we're seeing is that wrapping our components in forwardRef drops the overall app FPS by several digits, no matter which internal implementation we're using.

I know it's not the best possible benchmark, but it's all I've managed to put together so far.

@gaearon
Copy link
Collaborator

gaearon commented Aug 23, 2018

How does it compare to wrapping in a functional component instead of forwardRef?

@markerikson
Copy link
Contributor Author

Okay, I've got some benchmark numbers to add. The builds, in order:

  • 5.0.7: current build
  • 6.0-995-ed7a9-norefs : 995 branch, before any of the forwardRef stuff was added
  • 6.0-995-0d036-forwardref: 995 branch, with forwardRef and related changes
  • 6.0-995-577ef-funccomp: same as previous, but using a functional component instead of a forwardRef
  • 6.0-995-7f724-nowrap: same as previous, but no wrapping - just exporting Connect
  • 6.0-1000-9782b-forwardref: 1000 branch, always using a forwardRef
  • 6.0-1000-4855c-optref: 1000 branch, with forwardRef made opt-in
Build Avg FPS Scripting Rendering Painting
5.0.7 30.27 (29-31) 17041 9792 2098
6.0-995-ed7a9-norefs 25.20 (24-27) 19016 8316 1757
6.0-995-0d036-forwardref 19.15 (18-20) 21320 6532 1370
6.0-995-577ef-funccomp 20.28 (19-22) 21013 6768 1442
6.0-995-7f724-nowrap 23.14 (21-25) 19499 7969 1665
6.0-1000-9782b-forwardref 15.73 (14-17) 22128 5861 1230
6.0-1000-4855c-optref 27.27 (26-29) 18314 8807 1877

I'm attaching a zip file with the captured Chrome perf traces for reference. (Note that per the benchmark repo, the "Avg FPS" and the Chrome trace are captured on separate runs for each build, so that the trace capture doesn't slow down the FPS.)

react issue 13456 - React-Redux perf benchmark traces.zip

Per the 995 branch results, it looks like both using forwardRef and using a functional component around the rest of connect slow things down noticeably compared to not having that extra component wrapper, but forwardRef is a bit slower than a functional component.

@gaearon
Copy link
Collaborator

gaearon commented Aug 25, 2018

I think it's worth looking into how worse the results are if you add more component layers (e.g. wrap in extra N <div>s or extra N passthrough functional components). If those show similar slowdown it means your application code is so "thin" that even an extra reconciliation layer starts to matter. In this case it's not a problem with forwardRef per se. Even if we optimize a few milliseconds here and there, you can still increase the number of rows until you find the next spot where adding a reconciliation layer is prohibitively high.

In other words I'm arguing that if adding a forwardRef is expensive enough that it shows up in your bench, it might be that adding any layer is too expensive. At that point you might as well reconsider using React altogether for this part of your app, or consider using virtualization. What's the point if you can't add one more component to the tree without it going over your budget?

@markerikson
Copy link
Contributor Author

Yep, that's the weakness with only having one benchmark app so far that only measures one primary metric. I'd like to get us set up with several other benchmark apps as well that would run some more scenarios.

@cellog
Copy link

cellog commented Aug 25, 2018

Yes, but part of what we are trying to measure is how heavy react-redux is in a stressful environment, like a stock ticker app, so little things are naturally amplified.

This benchmark was helpful in finding several unnecessary slowdowns, and also proved decisively that using unstable_observedBits has a dramatic positive effect, blowing the performance of 5.0.7 out of the water.

However, it does highlight that forwardRef is expensive in this situation. What I think is useful from a React perspective is the fact that forwardRef is also expensive in our benchmark even if no refs are present, which presents an opportunity to look at whether that is a necessary expense. I concluded that it was better to avoid forwardRef unless the user explicitly asks for it (which is similar to how react-redux works now anyways) based on what we learned.

If you conclude "yes, the risk of expense is worth it" then the issue can be closed. Or perhaps mark it as a documentation issue if you want to note for library authors that forwardRef does have some expense in certain environments with lots of components updating rapidly.

@jquense
Copy link
Contributor

jquense commented Aug 25, 2018

However, it does highlight that forwardRef is expensive in this situation

I think what Dan is trying to tease out, tho is whether forwardRef is the the expensive thing, or is the slow down the result of having more components. If it's forwardRef's fault there may be something to optimize, if it's the latter tho, the question may be, is the app complex enough to approximate a complex react app. e.g. if you replaced every forwardRef with an extra HoC or div would the slow down be comparable?

@markerikson
Copy link
Contributor Author

Per the benchmarks I pasted a bit earlier, it looks like forwardRef is a bit more expensive than a trivial functional component (19.15 FPS vs 20.28 FPS in the run that I did, vs 23.14 FPS for neither.)

@cellog
Copy link

cellog commented Aug 25, 2018

Right, if you assume that forwardRef should be as expensive as a component with all the life cycle weight, then yes.

@gaearon
Copy link
Collaborator

gaearon commented Aug 25, 2018

Not with lifecycle weight, but I'd expect it to be roughly equivalent to a functional component. Because in a way it is one — you do get a render function there. Which also means that if you don't use lifecycles, you can move the rendering logic into the forwardRef callback, thus removing the layer inside of it. Then you'd go back to one layer.

@gaearon
Copy link
Collaborator

gaearon commented Aug 25, 2018

If it's not equivalent to functional component it would be interesting to profile what makes it different. It's not obvious to me.

@markerikson
Copy link
Contributor Author

Yeah, problem is that <Connect> is a class component which absolutely needs lifecycle stuff.

I can try re-running the benchmark a couple times. There's obviously not a lot of difference between wrapping in a forwardRef and wrapping in a functional component implementation-wise, but it's enough of a difference that it seems like there's actually something different.

@gaearon
Copy link
Collaborator

gaearon commented Aug 26, 2018

What does Connect currently use lifecycle for?

@markerikson
Copy link
Contributor Author

Hmm. So in my PR 995, I'm using an outer/inner component approach and doing the bulk of the work in the inner component's gDSFP. The outer component is just binding a couple methods in its constructor: PR 995 connectAdvanced

In Greg's PR 1000... huh. Okay, also only binding a couple methods in the constructor: PR 1000 connectAdvanced

In both cases, a class method is being given as the render prop for the context consumer, and it's being bound so it can safely access this.props. Also, a memoized selector function is being created per-component-instance, and the selector reference is being saved as an instance field on the component.

So, not actually a lot, but I don't immediately see a way to do with a functional component.

@cellog
Copy link

cellog commented Aug 26, 2018

I see what you're suggesting. If there is a way to create a per-instance cache with a functional component, I would happily convert the top-level component to one, and use forwardRef on it, then benchmark that

@cellog
Copy link

cellog commented Aug 26, 2018

OK, I tried it. There's no way to create a per-instance cache outside of a class component (that I can figure) so that won't work.

Basically, in order to prevent unnecessary re-render on an unimportant part of state updating, we have to use something that can save state, which a functional component can't. There are other ways to solve this with an API change for the future, but we are focusing on getting something out that legacy apps can use as a path to migrate to the new API without a complete rewrite

@gaearon
Copy link
Collaborator

gaearon commented Aug 26, 2018

When do you want to release that? We have something in the works that might help these cases but it’ll take a few months to get it out.

@cellog
Copy link

cellog commented Aug 26, 2018

Depends on when we get a more comprehensive benchmark suite up and running. Is there a PR to follow or is it still internal maybe-will-happen?

@gaearon
Copy link
Collaborator

gaearon commented Aug 26, 2018

It will definitely happen but I don't expect to definitely see a PR until the end of October.

@cellog
Copy link

cellog commented Aug 26, 2018

Ok :) probably best to pretend it doesn't exist until then unless you are allowed to give a hint how we can prepare the code to be ready for it. All very mysterious and exciting, are you all releasing a new iPhone?

@huseyinkucukdal
Copy link

Ok :) probably best to pretend it doesn't exist until then unless you are allowed to give a hint how we can prepare the code to be ready for it. All very mysterious and exciting, are you all releasing a new iPhone?

thank you

@tomlagier
Copy link

tomlagier commented Oct 19, 2018

Just wanted to call out that we ran into this issue - we're trying to get a sticky header to stick on scroll, but we're missing a frame or two which causes a noticeable flicker. A chunk of that seems to be because of forwardRef:

screen shot 2018-10-19 at 2 01 08 pm

I'll be testing some ways of doing this outside of React but it seemed like there were some issues doing this the "React" way. Though, any time you're async rendering and you -have- to hit a certain frame timing you're sort of setting yourself up for pain.

edit: decided to just ditch IE11 support and go for position:sticky so there isn't anything blocking us here, still figured I'd leave this as an observation on forwardRef perf.

@natew
Copy link

natew commented May 3, 2019

I'm mostly curious at to why the difference seems so large between development and production. I have a virtual table that renders a decent amount of items, even in a small window. In production, it's basically fine for me. In development, I see this:

image

And so on for ~50 items. As you can see forwardRef is most of the work here, which seems odd.

@stale
Copy link

stale bot commented Jan 10, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution.

@stale stale bot added the Resolution: Stale Automatically closed due to inactivity label Jan 10, 2020
@stale
Copy link

stale bot commented Jan 17, 2020

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!

@gaearon
Copy link
Collaborator

gaearon commented Apr 1, 2020

@natew My guess is it's the User Timing API itself being slow as hell. We've removed it on master because of how misleading it is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Stale Automatically closed due to inactivity Type: Needs Investigation
Projects
None yet
Development

No branches or pull requests

7 participants