
Statistical estimation of the Lighthouse score distribution parameters: Covariance matrices? #12014

Closed
koraa opened this issue Jan 28, 2021 · 2 comments


koraa commented Jan 28, 2021

Hi,

in my continuing quest (see #11570 for previous work) to statistically model the Lighthouse performance score, I have found that many of the individual performance scores are correlated. This graph should immediately make clear what I mean. Since a lot of the performance measurements are proxies for CPU performance and the like, this is not very surprising.

The graph above shows the correlation of individual scores on an empty page (Lighthouse score close to one), but this holds even for Lighthouse scores that are relatively close to average (0.2-0.8), as this graph shows.

Currently I estimate the distribution of the mean for each individual score essentially by applying the central limit theorem, derive confidence intervals from that, and combine these under the assumption that the variables are uncorrelated. The results are nice, but they could be improved by a structured treatment of score correlations.
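For concreteness, here is a minimal sketch of that procedure in Python; the data and the weights are purely illustrative stand-ins, not Lighthouse's actual audit weights:

```python
import numpy as np
from scipy import stats

# Illustrative stand-in for repeated Lighthouse runs:
# shape (n_runs, n_metrics), each column one metric score in [0, 1].
rng = np.random.default_rng(0)
scores = rng.beta(8, 2, size=(50, 3))

n = scores.shape[0]
means = scores.mean(axis=0)
# CLT: the sample mean is approximately normal with variance s^2 / n.
sems = scores.std(axis=0, ddof=1) / np.sqrt(n)
z = stats.norm.ppf(0.975)
per_metric_ci = np.stack([means - z * sems, means + z * sems], axis=1)

# Combine into an overall score as a weighted sum (weights are made up).
w = np.array([0.5, 0.3, 0.2])
combined_mean = w @ means
# Under the no-correlation assumption the variances simply add:
#   Var(sum_i w_i X_i) = sum_i w_i^2 Var(X_i)
combined_sem = np.sqrt(np.sum((w * sems) ** 2))
combined_ci = (combined_mean - z * combined_sem,
               combined_mean + z * combined_sem)
```

If the metrics are positively correlated, the cross terms w_i w_j Cov(X_i, X_j) that this drops are positive, so the combined interval comes out too narrow, which is exactly what a covariance matrix would fix.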

So I am wondering: is there any previous work you can refer me to with regard to the correlation between scores? Any insight you could offer as to the magnitude of the correlations?

Ideally there would be a correlation matrix available. I could generate one over the data I have available, but I suspect the correlations will be specific to my test data; a wider population of websites would have to be used…
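For what it's worth, estimating that matrix from repeated runs is a one-liner; the sketch below (same illustrative `scores` array as above) also shows how the covariance matrix would enter the combined-score variance:

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.beta(8, 2, size=(50, 3))   # stand-in for repeated runs

# Columns are the variables (metrics), hence rowvar=False.
corr = np.corrcoef(scores, rowvar=False)
cov = np.cov(scores, rowvar=False)

# With the full covariance matrix, the combined-score variance is the
# quadratic form Var(w^T X) = w^T Sigma w -- no independence needed.
w = np.array([0.5, 0.3, 0.2])           # illustrative weights again
combined_var = w @ cov @ w
```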

@patrickhulce
Collaborator

Always super interesting to see what you're up to in this area :)

Depending on your exact goals, the global correlations might not be all that useful to you. The correlation between performance metrics changes drastically depending on choices the page makes. Some examples...

  • When the site does all its work to reach FCP (a traditional HTML/CSS render-blocking page)...
    • The correlation between FCP, LCP, Speed Index, and TTI will be 1
    • The correlation between TBT/CLS and everything else will be essentially 0
  • When the site follows a client-side rendering model...
• The correlation between FCP and TTI will be far weaker (you'll always have the baseline correlation that comes from developers who build sites with poor performance tending to build sites with poor performance across the board, plus the fact that TTI = max(FCP, last CPU work); see the sketch after this list).
• The correlation between TBT and TTI will be fairly positive (old investigations that I don't remember well and can't find now put it somewhere in the ~0.4-0.6 range, I think?).
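To make the max(FCP, last CPU work) point concrete, here is a toy simulation of the two regimes; all the numbers are invented and only illustrate the mechanism, not real Lighthouse data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Shared "page/device slowness" factor, standing in for the baseline
# "slow sites tend to be slow across the board" effect.
slowness = rng.lognormal(0, 0.3, n)

# Regime 1: render-blocking page -- all work happens before first paint,
# so TTI coincides with FCP and their correlation is ~1.
fcp_blocking = slowness * rng.lognormal(0, 0.05, n)
tti_blocking = fcp_blocking
print(np.corrcoef(fcp_blocking, tti_blocking)[0, 1])  # ~1.0

# Regime 2: client-side rendering -- a large, mostly independent chunk
# of CPU work lands after FCP, and TTI = max(FCP, end of last CPU task).
fcp_csr = slowness * rng.lognormal(0, 0.05, n)
cpu_end = slowness * rng.lognormal(1.0, 0.5, n)   # independent noise dominates
tti_csr = np.maximum(fcp_csr, cpu_end)
print(np.corrcoef(fcp_csr, tti_csr)[0, 1])        # well below 1
```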

If you're still interested in global correlations from a broader dataset, I'd suggest looking into querying HTTPArchive as a starting point. If you end up with any big takeaways, I'd love to hear about them!
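A rough sketch of such a query via the BigQuery Python client; the table name and JSON paths are my assumptions based on HTTPArchive's published Lighthouse tables, so check them against the current schema before relying on this:

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

# Table name and JSON paths are assumptions; verify against the
# current HTTPArchive schema before use.
SQL = """
SELECT
  CAST(JSON_EXTRACT_SCALAR(report, '$.audits.first-contentful-paint.numericValue') AS FLOAT64) AS fcp,
  CAST(JSON_EXTRACT_SCALAR(report, '$.audits.interactive.numericValue') AS FLOAT64) AS tti,
  CAST(JSON_EXTRACT_SCALAR(report, '$.audits.total-blocking-time.numericValue') AS FLOAT64) AS tbt
FROM `httparchive.lighthouse.2021_01_01_mobile`
LIMIT 100000
"""

client = bigquery.Client()             # needs a GCP project with billing enabled
df = client.query(SQL).to_dataframe()  # requires pandas to be installed
print(df.dropna().corr())              # empirical correlation matrix
```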

@paulirish
Member

Since this thread, we've had additional research in this area, though it's not immediately linkable. I think this thread is complete, but if there is additional interest I can rustle up some analyses.
