FCP may be impacted by whether the load event fires in simulated throttling #9627
Update: this may be caused by the page's uncaught error.
Thank you very much for filing with such detailed trace data and investigation @kkdev163! Much appreciated! You are very, very close in your estimation, and you've found a clever bug in simulation!

The issue comes down to our optimization that skips very small tasks and the fact that we always include the first ParseHTML event in the FCP estimate. This ends up accidentally pulling in a JS file that shouldn't be necessary for the first paint (see lighthouse-core/computed/page-dependency-graph.js lines 104 to 112 and lighthouse-core/computed/metrics/lantern-first-contentful-paint.js lines 100 to 107 at d735bfd).

Anyone on core feel confident enough after our lantern sesh yesterday to take a stab at fixing this one? ;) I think we just want to always keep the tasks with the first ParseHTML/Layout/Paint events regardless of how long they were. Alternatively, we could just remove our optimization on task length and see how slow things get.
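For illustration, a minimal sketch of what that suggested fix could look like, assuming a simplified task shape and a hypothetical `selectCpuTasks` helper; this is not the actual Lighthouse source:

```js
// Sketch only: simplified task shape and a hypothetical helper, not the real page-dependency-graph.js code.
const MINIMUM_TASK_DURATION_OF_INTEREST = 10 * 1000; // assumed short-task cutoff, in microseconds
const KEY_EVENTS = new Set(['ParseHTML', 'Layout', 'Paint']);

/**
 * @param {Array<{name: string, dur: number, children?: Array<{name: string}>}>} tasks
 *   Top-level main-thread tasks in timestamp order.
 * @return {Array<object>} Tasks worth turning into CPU graph nodes.
 */
function selectCpuTasks(tasks) {
  const seen = new Set(); // key event names whose first occurrence has already been passed
  return tasks.filter(task => {
    const names = [task.name, ...(task.children || []).map(child => child.name)];
    const firstTimers = names.filter(name => KEY_EVENTS.has(name) && !seen.has(name));
    firstTimers.forEach(name => seen.add(name));
    // Always keep the task containing the first ParseHTML/Layout/Paint event...
    if (firstTimers.length > 0) return true;
    // ...otherwise fall back to the usual "skip very small tasks" optimization.
    return task.dur >= MINIMUM_TASK_DURATION_OF_INTEREST;
  });
}
```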
Hi @patrickhulce, I adopted your advice to 'just remove optimization on task length'. Generally speaking, after adopting this, some pages' FCP, FMP, and Speed Index values were reduced and the rate of normal test results increased, but there are still some bad cases like the one below, and it genuinely increases the test wait time. We use the test config of --onlyCategory=performance and --max-wait-for-load=10000; before adopting this, our pages could complete a test within 30s, while afterwards some pages' test time may increase to 50s.

Since this advice helps a lot, we adopted it in production, and we added a safety strategy to reduce the bad-case rate: when we get a test result, we compare it with the history of previous tests; if it has increased by more than 50%, we ignore the result and test again, but if we receive more than 3 bad cases in a period, we consider that the page's real performance may have a problem, so we save the bad case into the database. After adopting this strategy, our metrics are more stable.

But today I found another bad case in the database, one with a very high performance score; the test result series is like the one below. I also saved these tests' HTML, JSON, trace, and devtools log files, which you can download from this link. I hope this will help you fix this bug.
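For reference, a rough sketch of the retry strategy described above. `runLighthouse`, the `history` store, and the exact thresholds are placeholders for the monitoring system, not part of Lighthouse; reading the FCP assumes the LHR's `audits['first-contentful-paint'].numericValue` field.

```js
// Hypothetical monitoring-side logic, not Lighthouse code.
const MAX_RETRIES = 3;        // "more than 3 bad cases" before we treat it as a real regression
const REGRESSION_RATIO = 1.5; // "increased by more than 50%" versus the historical median

async function measureWithRetry(url, runLighthouse, history) {
  const baseline = median(history.fcpValues(url)); // FCP values from previous runs of this page
  let lastResult;
  for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    lastResult = await runLighthouse(url);
    const fcp = lastResult.audits['first-contentful-paint'].numericValue;
    if (!baseline || fcp <= baseline * REGRESSION_RATIO) {
      history.save(url, lastResult); // within the expected range: accept this run
      return lastResult;
    }
    // Outlier compared to history: discard the result and test again.
  }
  // Every attempt regressed, so the page may have a real problem: save the bad case for review.
  history.saveBadCase(url, lastResult);
  return lastResult;
}

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted.length ? sorted[Math.floor(sorted.length / 2)] : undefined;
}
```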
Thanks very much for the thorough investigation and data @kkdev163! Glad to hear you were able to address a few of the problems; we'll look into the files you provided 👍
Also @brendankenny, looks like some bad news for totally removing the short task optimization.
Ah, yeah, though I guess not unexpected.
@kkdev163 it looks like that 93 is a result of the premature ending of the trace. The main JS bundle request took 7.1s to load on that run, which is way too long for all of the default network and CPU quiet thresholds to catch it. If you've coupled this with a 10s max-wait-for-load, the trace can easily end before that request finishes.

FWIW, there seems to be some crazy-high variance in download times for that page's resources, which is inevitably going to translate to some performance score variance. The main app bundle takes 31ms to download on some runs and 5+ seconds on other runs. I'm not sure where these tests are being run, but they don't seem to have very reliable network capacity.
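If the early trace end is the culprit, one possible mitigation is a custom config that extends the defaults and gives slow resources more time before the run is cut off. The pass-level quiet-threshold options below exist in the default config, but the values here are only illustrative, not a tuned recommendation:

```js
// custom-config.js - illustrative values only.
module.exports = {
  extends: 'lighthouse:default',
  settings: {
    maxWaitForLoad: 45 * 1000, // allow up to 45s for the load event instead of the 10s used above
  },
  passes: [{
    passName: 'defaultPass',
    networkQuietThresholdMs: 5000, // wait longer for the network to go quiet
    cpuQuietThresholdMs: 5000,     // and for the CPU to go quiet
  }],
};
```

A config like this can be passed to the CLI with `--config-path=./custom-config.js`.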
@patrickhulce These tests all run in independent Docker containers hosted on a Linux machine. Every test runs in a new, clean container, and we cap the maximum test parallelism at 8 (the number of CPU cores). Do you think the network is impacted by the parallelism or by the Docker environment?
Wait, does this mean it's running 8 parallel tests with only 8 cores? I would definitely not do that. At absolute most, 1 run per 2 cores, but even then I'd expect some additional introduced variance. For our stability tests, we only ran 1 parallel run per ~4-8 cores, preferring physically separate machines when possible. We also don't run in Docker containers, so we don't fully understand the characteristics there, but we suspect there might be these sorts of differences.
@patrickhulce yes, we run 8 parallel tests with only 8 cores. I will adjust my config according to your advice, thanks. |
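As an aside, a rough sketch of capping concurrency at one run per two cores; `runLighthouse` here is a placeholder for however each containerized test is launched, not an actual API:

```js
// Hypothetical scheduler for the monitoring system, not part of Lighthouse.
const os = require('os');

async function runAll(urls, runLighthouse) {
  const concurrency = Math.max(1, Math.floor(os.cpus().length / 2)); // at most 1 run per 2 cores
  const queue = [...urls];
  const results = [];

  async function worker() {
    while (queue.length > 0) {
      const url = queue.shift();              // take the next page off the shared queue
      results.push(await runLighthouse(url)); // run one test at a time per worker
    }
  }

  await Promise.all(Array.from({length: concurrency}, worker)); // start N workers and wait for all
  return results;
}
```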
Provide the steps to reproduce
My test results can be downloaded from https://webapm.nos-jd.163yun.com/lighthouse%2F%2Ftmplh3.zip; the archive contains the results of 6 test runs (JSON, HTML, and trace files). The first test's FCP is 0.8s; the others are larger than 1.6s.
What is the current behavior?
We built a page monitoring system based on the Lighthouse CLI. In this system, we test each page at a 20-minute interval, and recently we found that some pages' metric data is not stable: sometimes the FCP value may be double the normal value.
I chose a very simple page like the one above to reproduce this behavior, and found that sometimes the FCP is 0.8s and sometimes it is 1.6s. I was very curious about the reason, so I also saved the trace files. I found that when the FCP is 1.6s, some network requests are not finished in the trace and the load event does not fire normally.
Because we use simulated throttling, the FCP is impacted both by the FCP in the trace and by the page dependency node graph. I guess the main reason the FCP varies so much is that sometimes network requests do not complete, so the computed network dependency nodes vary.
I also changed the throttling method to 'devtools'; I think this method uses the real FCP rather than a simulated result. I hoped this would be fine, but when I change to devtools, the page cannot be measured within 10s and sometimes I also encounter a NO_DCL error.
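For reference, this is roughly how the devtools throttling method can be selected when driving Lighthouse from Node; the option names mirror the CLI's `--throttling-method` and `--max-wait-for-load`, and the values are only illustrative:

```js
// Sketch assuming the lighthouse and chrome-launcher packages; error handling omitted.
const lighthouse = require('lighthouse');
const chromeLauncher = require('chrome-launcher');

async function runWithDevtoolsThrottling(url) {
  const chrome = await chromeLauncher.launch({chromeFlags: ['--headless']});
  const {lhr} = await lighthouse(url, {
    port: chrome.port,
    onlyCategories: ['performance'],
    throttlingMethod: 'devtools', // applied throttling instead of the default 'simulate'
    maxWaitForLoad: 45 * 1000,    // give slow requests more time before the run is cut off
  });
  await chrome.kill();
  return lhr;
}
```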
What is the expected behavior?
I'm curious why some network requests do not complete within 10s; they are static CDN resources and should complete very quickly.
And should the FCP simulation algorithm be adjusted to reduce the impact of the load event?
I think this problem is mainly caused by the fact that we are testing in Docker images. We use this Docker environment to test.
Can you give me some advice on how to keep the metrics from varying so much?
Environment Information
Related issues