[system] Add browser benchmark #22923
Conversation
scripts/perf.js
var start = performance.now();
const page = await browser.openPage(`http://${SERVER}:${PORT}/${APP}?${testCase}`);
const end = performance.now();
Is there a way we can benchmark what's happening in the browser and send that back? Otherwise we'll add a lot of noise from what the browser is doing that is unrelated to the actual benchmark.
Seems like there's plenty of existing APIs for that: https://addyosmani.com/blog/puppeteer-recipes/#runtime-perf-metrics
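For illustration, a minimal sketch of what collecting Puppeteer's runtime metrics could look like (the URL handling and helper name here are placeholders, not the actual script):

// Sketch: collect Chrome runtime metrics with Puppeteer instead of timing the page load from the outside.
const puppeteer = require('puppeteer');

async function collectMetrics(url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'load' });
  // page.metrics() returns ScriptDuration, LayoutDuration, RecalcStyleDuration, TaskDuration, etc. (in seconds).
  const metrics = await page.metrics();
  await browser.close();
  return metrics;
}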
This looks much better, let me refactor to use the metrics()
It might also be interesting to compare these metrics() to React's Profiler component (see also How to use profiling in production).
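For reference, a minimal sketch of wrapping a test case in React's Profiler; TestCase here is a placeholder for the benchmarked component, and the onRender callback receives the same fields shown in the results pasted further down:

import * as React from 'react';
import { Profiler } from 'react';

// Placeholder for the actual benchmarked component, e.g. the emotion boxes.
const TestCase = () => <div>boxes go here</div>;

// onRender receives the same fields shown in the Profiler results below.
function handleRender(id, phase, actualDuration, baseDuration, startTime, commitTime, interactions) {
  console.log({ id, phase, actualDuration, baseDuration, startTime, commitTime, interactions });
}

export default function ProfiledCase() {
  return (
    <Profiler id="box-emotion" onRender={handleRender}>
      <TestCase />
    </Profiler>
  );
}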
Refactored using:
const perf = await page.evaluate(_ => {
  const { loadEventEnd, navigationStart } = performance.timing;
  return loadEventEnd - navigationStart;
});
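As a side note, performance.timing is deprecated; a sketch of the equivalent measurement with the Navigation Timing Level 2 entry would be:

const perf = await page.evaluate(() => {
  // The 'navigation' entry replaces the deprecated performance.timing object.
  const [nav] = performance.getEntriesByType('navigation');
  // duration spans from the start of navigation (startTime 0) to loadEventEnd.
  return nav.duration;
});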
Let me compare the numbers now with React's Profiler component :)
I've added the Profiler on the box-emotion example. Profiler result:
actualDuration: 19.274999969638884
baseDuration: 15.385000151582062
commitTime: 119.98000007588416
id: "box-emotion"
interactions: Set(0) {}
phase: "mount"
startTime: 98.64500002004206
perf result (10 different runs; this includes the component as well):
Box emotion:
69.00ms
68.00ms
90.00ms
78.00ms
44.00ms
44.00ms
57.00ms
56.00ms
56.00ms
56.00ms
-------------
Avg: 61.80ms
Let me try this on more examples to see if the relative differences are similar.
Here are more numbers:
- box-emotion
Box emotion:
69.00ms
67.00ms
77.00ms
93.00ms
43.00ms
57.00ms
56.00ms
56.00ms
57.00ms
43.00ms
-------------
Avg: 61.80ms
actualDuration: 18.609998980537057
baseDuration: 15.669998829253018
commitTime: 110.84500001743436
id: "box-emotion"
interactions: Set(0) {}
phase: "mount"
startTime: 90.1650000596419
- box-styled-components
Box styled-components:
67.00ms
63.00ms
64.00ms
80.00ms
97.00ms
61.00ms
61.00ms
47.00ms
48.00ms
48.00ms
-------------
Avg: 63.60ms
actualDuration: 15.089999069459736
baseDuration: 12.439999729394913
commitTime: 96.09999996609986
id: "naked-styled-components"
interactions: Set(0) {}
phase: "mount"
startTime: 79.29499994497746
- box @material-ui/styles
Box @material-ui/styles:
123.00ms
122.00ms
89.00ms
103.00ms
103.00ms
85.00ms
121.00ms
123.00ms
84.00ms
84.00ms
-------------
Avg: 103.70ms
actualDuration: 44.76000100839883
baseDuration: 41.65500064846128
commitTime: 130.44999993871897
id: "box-material-ui-system"
interactions: Set(0) {}
phase: "mount"
startTime: 83.49999994970858
- naked-styled-components
Box styled-components:
67.00ms
63.00ms
64.00ms
80.00ms
97.00ms
61.00ms
61.00ms
47.00ms
48.00ms
48.00ms
-------------
Avg: 63.60ms
actualDuration: 15.089999069459736
baseDuration: 12.439999729394913
commitTime: 96.09999996609986
id: "naked-styled-components"
interactions: Set(0) {}
phase: "mount"
startTime: 79.29499994497746
I am not sure how to compare the two sets of numbers, as they are quite different... From both measurements I can conclude, for example, that @material-ui/styles' Box is slower than the rest, but not much else... We could also try rendering more instances of the components; currently I am rendering 100 boxes. @eps1lon what do you think?
I'd keep both: Browser metrics and React metrics.
Added with 02fc972
No bundle size changes comparing 72ceb2d...2df44b7
The solution looks simple and seems to do the job. Nice. Regarding the folder location: does it mean that we should move /packages/material-ui-benchmark outside of the packages folder in the future?
Co-authored-by: Olivier Tassinari <olivier.tassinari@gmail.com>
I understood what you meant. I meant median as well (hence percentile). Median is not a type of average.
@eps1lon I guess it depends on the definition. In https://en.wikipedia.org/wiki/Average, an average is defined as "a single number taken as representative of a list of numbers" 🤷‍♂️ 🙃
You conveniently skipped the first part of that quote: "colloquial", which is not the same as a definition. With "mean" I was being colloquial and meant the "arithmetic mean". But I'm not sure why you deflected to "definitions of average". We want percentiles in benchmarks, not a discussion of what an average is.
Move the script to the
To be honest, on this one I am interested in the relative difference between the scenarios we are running. If we have some 30–40ms noise on each scenario, it won't affect the relative difference. Should we add this noop scenario as a reference, or run it and subtract its result from all examples? I wouldn't change anything regarding this at the moment, to be honest. And we can experiment with adding the
Done 👍
Agreed, we should use the browser benchmark for scenarios that run some React code in the browser. I will do another iteration to clean up the examples and leave only the relevant ones in each of the benchmarks we currently have.
I would go with the simplest thing at the start, and if we need to we can alter it in the future. What do you think?
If we see a problem with this, that would be the way to go. However, at this moment, until we see some potential problems, I would keep the
Again, if we use this for the relative difference between the scenarios in one run, the numbers should be stable between them in my opinion. I wouldn't use this scenario for comparing against a scenario running in a different environment anyway.
Added the median as well as the average; I think for a start those two numbers should be enough. We are printing all results when running the scenario anyway. In addition, I've changed the components scenario to run 1000 instances of the component instead of 100 so that the numbers show a bigger difference (I tried 10000 as well, but the time for running them was too long and the relative values were similar, so I decided to stay with 1000).
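A minimal sketch of how the average and median could be computed from the collected samples (the function and variable names are illustrative, not the actual script):

// `samples` is an array of per-run durations in milliseconds.
function mean(samples) {
  return samples.reduce((sum, value) => sum + value, 0) / samples.length;
}

function median(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}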
I am interested in the relative difference between the scenarios we are running. If we have some 30 - 40ms noise on each scenario it won't affect the relative difference.
Definitely, relative difference works great, with one limit (more in the next comment). I guess the next person looking at the results needs to be aware that they can't do any absolute comparison between the results.
Should we add this noop scenario as a reference
I think that the main value of including the noop scenario is to raise awareness of the above (don't do an absolute comparison) and to know when your test case is slow enough. If the first test case runs for 5ms ± 1ms and the second test case runs for 4ms ± 1ms, but you have a noop that takes 30ms ± 5ms, the uncertainty will likely be too high to do any comparison: 35ms ± 6ms vs. 34ms ± 6ms.
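To illustrate with the numbers above (assuming the uncertainties simply add when the noop overhead is included):

// Illustrative numbers from the comment above, in ms.
const noop = { value: 30, uncertainty: 5 };
const first = { value: 5, uncertainty: 1 };
const second = { value: 4, uncertainty: 1 };

// With the noop overhead included, both the values and the uncertainties add,
// so the two totals (35 ± 6 and 34 ± 6) overlap and can no longer be told apart.
const firstTotal = { value: first.value + noop.value, uncertainty: first.uncertainty + noop.uncertainty };
const secondTotal = { value: second.value + noop.value, uncertainty: second.uncertainty + noop.uncertainty };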
I would go with the simplest thing at the start, and if we need to we can alter it in the future. What do you think?
Agree, definitely for later (or even never).
I have tried to improve the precision of the measurement in mnajdova#10, to reduce the uncertainty. If we take the noop case, we go from 34ms ±9ms to 4.63ms ±0.25ms (I have used the mean for the baseline value and the difference between the slowest and fastest run divided by two for the ± uncertainty). This was achieved using the following levers:
@eps1lon do you know how we can measure up to the composite stage? https://developers.google.com/web/fundamentals/performance/rendering But maybe we don't want to do that, as it would only introduce uncertainty without giving us a measurement of something we can optimize.
I've added
@eps1lon @oliviertassinari merging this one, had to rebase twice this morning because of conflicts in
This PR adds a browser benchmark for the system performance. Try it by running the following command at the root of the project:
It will build the benchmark/ project, serve the root, and start the scripts/benchmark.js script.