Performance: Try refactoring get_hooked_blocks() to check impact #5399

Closed

gziolo
Member

@gziolo gziolo commented Oct 4, 2023

Trac ticket: https://core.trac.wordpress.org/ticket/59383

Important: This is an experiment! The focus should be on testing the performance as I don't think it's worth landing it if there is no positive impact.

I tried refactoring get_hooked_blocks to see whether that would have any noticeable impact on the performance of block themes like Twenty Twenty-Three or Twenty Twenty-Four. Keep in mind that there are no active hooked blocks in WordPress core, so by default get_hooked_blocks won't be fully battle-tested for performance unless custom blocks are registered using Block Hooks. However, there is a test theme in the codebase that is used to verify the feature in unit tests.

The most important consideration in this PR is that get_hooked_blocks iterates over every block type in the registry, so I looked for the best way to minimize the number of get_hooked_blocks calls. Before this PR, get_hooked_blocks was called 4 times for every parsed block when processing every template, template part, and pattern. With this PR, get_hooked_blocks gets called once per template, template part, or pattern.
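
The difference in call counts can be illustrated with a toy model. This is a Python sketch, not WordPress code: the `Registry` class, both `process_*` functions, and the block names are hypothetical, and it assumes the four calls per block correspond to the four relative positions a hooked block can take.

```python
from dataclasses import dataclass

# Assumption: one lookup per relative position explains "4 times per parsed block".
POSITIONS = ("before", "after", "first_child", "last_child")


@dataclass
class Registry:
    # Hypothetical shape: block type name -> {anchor block name: relative position}
    block_hooks: dict
    scans: int = 0

    def get_hooked_blocks(self):
        """Scan every registered block type and group hooked blocks by anchor."""
        self.scans += 1
        hooked = {}
        for block_type, hooks in self.block_hooks.items():
            for anchor, position in hooks.items():
                hooked.setdefault(anchor, {}).setdefault(position, []).append(block_type)
        return hooked


def process_per_block(registry, blocks):
    """Old strategy: a full registry scan per position for every parsed block."""
    inserted = []
    for block in blocks:
        for position in POSITIONS:
            hooked = registry.get_hooked_blocks()
            inserted += hooked.get(block, {}).get(position, [])
    return inserted


def process_per_template(registry, blocks):
    """New strategy: scan once per template and reuse the result for every block."""
    hooked = registry.get_hooked_blocks()
    inserted = []
    for block in blocks:
        for position in POSITIONS:
            inserted += hooked.get(block, {}).get(position, [])
    return inserted
```

In this model both strategies insert the same hooked blocks, but the number of registry scans drops from 4 × (number of parsed blocks) to 1 per template.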

Some additional improvements are also applied, like calling get_hooked_blocks only once when returning the list of all registered patterns. The last modification proposed brings the implementation closer to what we had before Block Hooks. When we ensure that there are no registered block hooks or filters that could impact them, we omit the parsing and serializing step.
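
The short-circuit described in the last sentence might look roughly like the following. It is a minimal Python sketch under stated assumptions, not the PR's implementation: `inject_hooked_blocks`, `parse`, `serialize`, and the `|`-separated content format are all made up for illustration.

```python
def parse(content):
    """Stand-in parser: treats content as a '|'-separated list of block names."""
    return content.split("|")


def serialize(blocks, hooked_blocks):
    """Stand-in serializer: emits each block followed by any blocks hooked after it."""
    out = []
    for block in blocks:
        out.append(block)
        out.extend(hooked_blocks.get(block, []))
    return "|".join(out)


def inject_hooked_blocks(content, hooked_blocks, has_hook_filters,
                         parse=parse, serialize=serialize):
    """Return template content, doing the parse/serialize round trip only when needed."""
    if not hooked_blocks and not has_hook_filters:
        # Nothing can be injected, so the original markup is returned untouched.
        return content
    return serialize(parse(content), hooked_blocks)
```

With no hooked blocks and no filters, the content passes through unchanged and neither stand-in is called, which mirrors the behavior from before Block Hooks were introduced.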


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

@gziolo
Member Author

gziolo commented Oct 5, 2023

I'm experimenting with local benchmarking by following the instructions from https://make.wordpress.org/performance/handbook/measuring-performance/benchmarking-php-performance-with-server-timing/. I created a test post with the content copied from that same page, and I added 5 comments. I'm using this branch with the Twenty Twenty-Four theme, without any changes applied.

Screenshot 2023-10-05 at 14 10 14

I executed the following command for every configuration:

$ npm run research -- benchmark-server-timing -u http://localhost:8889/2023/10/05/benchmarking/ -n 100 -p

This branch:

╔════════════════════════════════╤════════════════════════════════════════════════╗
║ URL                            │ http://localhost:8889/2023/10/05/benchmarking/ ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Success Rate                   │ 100%                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p10)            │ 299.47                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p25)            │ 304.17                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p50)            │ 309.68                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p75)            │ 321.71                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p90)            │ 334.47                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p10) │ 2.29                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p25) │ 2.34                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p50) │ 2.4                                            ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p75) │ 2.5                                            ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p90) │ 2.65                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p10)       │ 121.82                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p25)       │ 123.19                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p50)       │ 126.38                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p75)       │ 131.51                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p90)       │ 148.44                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p10)              │ 172.68                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p25)              │ 174.57                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p50)              │ 178.34                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p75)              │ 183.06                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p90)              │ 195.96                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p10)                 │ 295.72                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p25)                 │ 300.27                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p50)                 │ 306.02                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p75)                 │ 317.77                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p90)                 │ 330.03                                         ║
╚════════════════════════════════╧════════════════════════════════════════════════╝

trunk:

╔════════════════════════════════╤════════════════════════════════════════════════╗
║ URL                            │ http://localhost:8889/2023/10/05/benchmarking/ ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Success Rate                   │ 100%                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p10)            │ 309.98                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p25)            │ 314.62                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p50)            │ 320.08                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p75)            │ 331.87                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p90)            │ 339.95                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p10) │ 2.28                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p25) │ 2.33                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p50) │ 2.39                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p75) │ 2.47                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p90) │ 2.56                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p10)       │ 124.09                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p25)       │ 125.53                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p50)       │ 128.85                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p75)       │ 133.42                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p90)       │ 145.98                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p10)              │ 180.57                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p25)              │ 183.41                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p50)              │ 187.57                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p75)              │ 191.97                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p90)              │ 197.58                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p10)                 │ 305.84                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p25)                 │ 310.75                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p50)                 │ 316.17                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p75)                 │ 327.64                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p90)                 │ 335.31                                         ║
╚════════════════════════════════╧════════════════════════════════════════════════╝

With the Like Button plugin installed and activated, the block gets hooked into the list of comments:

Screenshot 2023-10-05 at 14 21 30

This branch:

╔════════════════════════════════╤════════════════════════════════════════════════╗
║ URL                            │ http://localhost:8889/2023/10/05/benchmarking/ ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Success Rate                   │ 100%                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p10)            │ 306.48                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p25)            │ 307.52                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p50)            │ 313.17                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p75)            │ 318.99                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p90)            │ 345.78                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p10) │ 2.32                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p25) │ 2.38                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p50) │ 2.43                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p75) │ 2.46                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p90) │ 2.62                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p10)       │ 122.81                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p25)       │ 123.48                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p50)       │ 125.1                                          ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p75)       │ 130.05                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p90)       │ 151.25                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p10)              │ 179.55                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p25)              │ 180.21                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p50)              │ 181.49                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p75)              │ 186.12                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p90)              │ 193.58                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p10)                 │ 302.96                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p25)                 │ 303.97                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p50)                 │ 309.2                                          ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p75)                 │ 314.67                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p90)                 │ 341.14                                         ║
╚════════════════════════════════╧════════════════════════════════════════════════╝

trunk:

╔════════════════════════════════╤════════════════════════════════════════════════╗
║ URL                            │ http://localhost:8889/2023/10/05/benchmarking/ ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Success Rate                   │ 100%                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p10)            │ 316.24                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p25)            │ 319.83                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p50)            │ 324.47                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p75)            │ 333.79                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p90)            │ 347.55                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p10) │ 2.35                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p25) │ 2.42                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p50) │ 2.47                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p75) │ 2.55                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p90) │ 2.68                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p10)       │ 125.74                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p25)       │ 127.1                                          ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p50)       │ 130.59                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p75)       │ 134.96                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p90)       │ 152.34                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p10)              │ 185.23                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p25)              │ 186.22                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p50)              │ 188.85                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p75)              │ 194.69                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p90)              │ 198.27                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p10)                 │ 312.81                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p25)                 │ 315.23                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p50)                 │ 320.94                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p75)                 │ 330.09                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p90)                 │ 343.06                                         ║
╚════════════════════════════════╧════════════════════════════════════════════════╝

This branch with 5 hooked block types active on a page:

╔════════════════════════════════╤════════════════════════════════════════════════╗
║ URL                            │ http://localhost:8889/2023/10/05/benchmarking/ ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Success Rate                   │ 100%                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p10)            │ 316.35                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p25)            │ 318.65                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p50)            │ 324.93                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p75)            │ 332.4                                          ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p90)            │ 346.4                                          ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p10) │ 2.4                                            ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p25) │ 2.44                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p50) │ 2.49                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p75) │ 2.56                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p90) │ 2.64                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p10)       │ 126.92                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p25)       │ 128.04                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p50)       │ 130                                            ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p75)       │ 134.51                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p90)       │ 154.47                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p10)              │ 185.22                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p25)              │ 185.9                                          ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p50)              │ 189.12                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p75)              │ 191.98                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p90)              │ 195.85                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p10)                 │ 312.78                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p25)                 │ 315.15                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p50)                 │ 320.73                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p75)                 │ 327.07                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p90)                 │ 341.57                                         ║
╚════════════════════════════════╧════════════════════════════════════════════════╝

trunk:

╔════════════════════════════════╤════════════════════════════════════════════════╗
║ URL                            │ http://localhost:8889/2023/10/05/benchmarking/ ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Success Rate                   │ 100%                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p10)            │ 326.67                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p25)            │ 331.53                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p50)            │ 343.61                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p75)            │ 355.72                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ Response Time (p90)            │ 375.21                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p10) │ 2.36                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p25) │ 2.42                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p50) │ 2.51                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p75) │ 2.67                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-load-alloptions-query (p90) │ 3.08                                           ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p10)       │ 128.71                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p25)       │ 132                                            ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p50)       │ 137.56                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p75)       │ 145.26                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-before-template (p90)       │ 157.85                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p10)              │ 189.3                                          ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p25)              │ 192.69                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p50)              │ 199.64                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p75)              │ 205.35                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-template (p90)              │ 213.65                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p10)                 │ 323                                            ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p25)                 │ 326.38                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p50)                 │ 339.35                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p75)                 │ 350.69                                         ║
╟────────────────────────────────┼────────────────────────────────────────────────╢
║ wp-total (p90)                 │ 370.99                                         ║
╚════════════════════════════════╧════════════════════════════════════════════════╝
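
To make the six tables easier to compare, the wp-total (p50) rows can be reduced to a relative difference per configuration. The snippet below is a small sketch: the values are copied from the tables above, while the configuration labels and the `percent_faster` helper are names chosen here for illustration.

```python
# wp-total (p50) in ms, copied from the tables above: (this branch, trunk).
WP_TOTAL_P50 = {
    "baseline": (306.02, 316.17),
    "Like Button plugin": (309.2, 320.94),
    "5 hooked block types": (320.73, 339.35),
}


def percent_faster(pr_ms, trunk_ms):
    """Relative improvement of the PR over trunk, as a percentage of trunk."""
    return (trunk_ms - pr_ms) / trunk_ms * 100


for label, (pr_ms, trunk_ms) in WP_TOTAL_P50.items():
    saved = trunk_ms - pr_ms
    print(f"{label}: {saved:.2f} ms saved, {percent_faster(pr_ms, trunk_ms):.1f}% faster")
```

Each configuration was measured once with 100 requests, so these differences describe a single local run rather than a statistically confirmed effect.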

@gziolo gziolo force-pushed the update/get-hooked-blocks-performance-test branch from 0e4b342 to 122d4e5 Compare October 6, 2023 09:43
@gziolo
Member Author

gziolo commented Oct 6, 2023

I did another test based on the approach used by @felixarntz in #5413 (comment).

I used the benchmark-server-timing command, with 100 runs each, and I did that 3 times (i.e. 3 medians, each based on 100 runs). Below is a summary of the wp-total metrics from my data.

TT3 Hello world post (with Like Button plugin active):

222.09ms (PR) vs 227.4ms (trunk)
221.11ms (PR) vs 226.27ms (trunk)
221.57ms (PR) vs 226.69ms (trunk)

The same tests with TT3 Hello world post, the Like Button plugin active, and 4 additional hooked blocks registered that I copied from the theme's folder created for unit tests:

224.77ms (PR) vs 229.48ms (trunk)
224.65ms (PR) vs 229.81ms (trunk)
224.24ms (PR) vs 229.44ms (trunk)

The same tests with TT4 Hello world post, the Like Button plugin active, and 4 additional blocks registered defining block hooks as above:

261.45ms (PR) vs 269.97ms (trunk)
261.63ms (PR) vs 271.49ms (trunk)
261.64ms (PR) vs 271.07ms (trunk)

The results are consistently better, but only by a small margin.
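To put a number on that margin, here is a minimal sketch that converts the medians listed above into a relative improvement over trunk. The figures are copied from this comment; the scenario labels and function names are made up for illustration and are not part of any benchmarking tool.

```python
from statistics import mean

# (PR, trunk) wp-total medians in milliseconds, copied from the three scenarios above.
RESULTS = {
    "TT3": [(222.09, 227.4), (221.11, 226.27), (221.57, 226.69)],
    "TT3 + 4 hooked blocks": [(224.77, 229.48), (224.65, 229.81), (224.24, 229.44)],
    "TT4 + 4 hooked blocks": [(261.45, 269.97), (261.63, 271.49), (261.64, 271.07)],
}


def percent_faster(pr_ms, trunk_ms):
    """Relative improvement of the PR over trunk, as a percentage of trunk."""
    return (trunk_ms - pr_ms) / trunk_ms * 100


def summarize(results):
    """Average the per-run improvement for each scenario, rounded to one decimal."""
    return {
        scenario: round(mean(percent_faster(pr, trunk) for pr, trunk in runs), 1)
        for scenario, runs in results.items()
    }


if __name__ == "__main__":
    for scenario, gain in summarize(RESULTS).items():
        print(f"{scenario}: {gain}% faster")
```

The same `percent_faster` formula (difference divided by the trunk value) is consistent with the percentages quoted in the other benchmark comments in this thread.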

By the way, I'm using npm run env:start with Docker. I assume that's the reason why the response time is about 3 times higher on my machine compared to the results reported by Felix.

@gziolo gziolo requested review from felixarntz and joemcgill October 6, 2023 10:12
@gziolo gziolo self-assigned this Oct 6, 2023
@felixarntz
Member

@gziolo @ockham Sharing my benchmark summary here.

I used the benchmark-server-timing command, with 100 runs each, and I did that 3 times (i.e. 3 medians, each based on 100 runs). All benchmarks were run without any plugins active (other than Performance Lab for the Server-Timing headers). Below is a summary of the wp-total metrics from my data.

TL;DR Unlike #5413 (which was rather neutral), here I can see notable benefits, in some cases massive benefits! 🎉

With TT4 home page:

  • 84.97ms (PR) vs 107.8ms (trunk) --> 21.2% faster 🎉
  • 84.85ms (PR) vs 110.21ms (trunk) --> 23.0% faster 🎉
  • 84.87ms (PR) vs 107.8ms (trunk) --> 21.3% faster 🎉

With TT3 home page:

  • 77.87ms (PR) vs 79.2ms (trunk) --> 1.7% faster 👍
  • 78.3ms (PR) vs 79.4ms (trunk) --> 1.4% faster 👍
  • 78.91ms (PR) vs 78.95ms (trunk) --> 0.1% faster 🆗

Especially for TT4, that looks incredible, and brings it much closer in performance to TT3, despite its much richer layout and content.

To make sure it doesn't have a negative impact on classic themes, I also benchmarked with TT1, hoping it would be pretty much a zero-sum game, which it indeed is.

With TT1 home page:

  • 49ms (PR) vs 48.67ms (trunk) --> 0.7% slower 🆗
  • 48.23ms (PR) vs 48.45ms (trunk) --> 0.5% faster 🆗
  • 48.64ms (PR) vs 48.42ms (trunk) --> 0.5% slower 🆗

The difference here is tiny and goes in both directions, so we can assume it is mostly due to variance.

I wanted to see whether more complex block theme content means more benefit, so I also ran a benchmark against the 3P theme "Frost".

With Frost home page:

  • 74.21ms (PR) vs 76.1ms (trunk) --> 2.5% faster 👍
  • 74.22ms (PR) vs 75.96ms (trunk) --> 2.3% faster 👍
  • 74.57ms (PR) vs 75.74ms (trunk) --> 1.5% faster 👍

Interestingly, the benefit here is not nearly as high as with TT4, but it's still notably positive, close to TT3.


I am curious why TT4 sees such a high benefit here. Since I don't have much understanding of the logic behind all this, can you shed some light on this? Do you have any idea? Based on the impact, this looks totally worth prioritizing even this late in the 6.4 cycle, particularly as it could have a major impact on TT4 performance, which would alleviate some of the "problems" related to https://core.trac.wordpress.org/ticket/59465.

@joemcgill
Member

I'm seeing similarly large performance improvements when I profile this PR against trunk with XHProf while the Twenty Twenty-Four (TT4) theme is active.

TL;DR

On the homepage of TT4, this change reduces the total number of calls to get_hooked_blocks() from 620 to 16, decreasing the inclusive wall time (IWT) from 23,182 µs to 1,293 µs. This represents a ~2% improvement in total execution time.
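As a quick sanity check on those profile numbers, the relative reductions can be worked out as below. This is only arithmetic on the figures quoted above; the helper name is made up for illustration.

```python
def percent_reduction(before, after):
    """How much smaller `after` is than `before`, as a percentage of `before`."""
    return (before - after) / before * 100


# get_hooked_blocks() figures for the TT4 homepage, as quoted above.
CALLS_TRUNK, CALLS_PR = 620, 16
IWT_TRUNK_US, IWT_PR_US = 23_182, 1_293

calls_reduction = percent_reduction(CALLS_TRUNK, CALLS_PR)
iwt_saved_us = IWT_TRUNK_US - IWT_PR_US
iwt_reduction = percent_reduction(IWT_TRUNK_US, IWT_PR_US)

if __name__ == "__main__":
    print(f"calls: {calls_reduction:.1f}% fewer")
    print(f"inclusive wall time: {iwt_saved_us} µs saved ({iwt_reduction:.1f}% less)")
```

The function itself becomes far cheaper in aggregate, while the ~2% figure reflects how small a share of the whole request it accounted for in this profile.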

Trunk

XHProf run of trunk

This PR

Note: Prior to running profiles, I merged trunk back into the branch to ensure no other difference would affect the measurements.

XHProf run of this PR

@dmsnell
Member

dmsnell commented Oct 6, 2023

| 2.0 GHz EPYC server | 5.0 GHz i9 Laptop plugged in with fans a'blazing |
| --- | --- |
| [image: between-stats-gziolo-patch] | [image: between-stats-gziolo-laptop] |

nice marked improvement. obviously there's a bigger impact on the slower server CPU, but the percentage improvement is about the same.

@gziolo
Member Author

gziolo commented Oct 9, 2023

> I am curious why TT4 sees such a high benefit here. Since I don't have much understanding of the logic behind all this, can you shed some light on this? Do you have any idea? Based on the impact, this looks totally worth prioritizing even this late in the 6.4 cycle, particularly as it could have a major impact on TT4 performance, which would alleviate some of the "problems" related to https://core.trac.wordpress.org/ticket/59465.

I spent some time looking at how TT4 is assembled to better understand the results we see. It was interesting to discover how this theme is structured for the homepage. The template used is a simple one-line reference to a pattern shipped in the theme:

<!-- wp:pattern {"slug":"twentytwentyfour/template-home"} /-->

The pattern itself isn't very complex, but it references two template parts and another pattern:

<!-- wp:template-part {"slug":"header"} /-->
<!-- wp:group {"tagName":"main","style":{"spacing":{"blockGap":"0","margin":{"top":"0"}}},"layout":{"type":"default"}} -->
<main class="wp-block-group" style="margin-top:0">
<!-- wp:pattern {"slug":"twentytwentyfour/home"} /-->
</main>
<!-- /wp:group -->
<!-- wp:template-part {"slug":"footer","area":"footer","tagName":"footer"} /-->

When peeking at the nested pattern, we can see even more patterns used:

<!-- wp:pattern {"slug":"twentytwentyfour/hero"} /-->
<!-- wp:pattern {"slug":"twentytwentyfour/feature-grid"} /-->
<!-- wp:pattern {"slug":"twentytwentyfour/features-with-images"} /-->
<!-- wp:pattern {"slug":"twentytwentyfour/testimonial-centered"} /-->
<!-- wp:pattern {"slug":"twentytwentyfour/posts-featured"} /-->
<!-- wp:pattern {"slug":"twentytwentyfour/cta"} /-->

The footer template part also references a pattern:

<!-- wp:pattern {"slug":"twentytwentyfour/footer"} /-->

I didn't dig deeper, but this way, we can see that we have at least the following files referenced that need to get processed when seeking places to inject hooked blocks:

  • 1 template
  • 2 template parts
  • 9 patterns
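The counting above can be reproduced with a small sketch that follows pattern and template-part references recursively. The markup is copied from the snippets in this comment; the template name "home", the `FILES` mapping, and all function names are hypothetical, and files whose content isn't shown here (such as the header template part) are treated as leaves without further references.

```python
import json
import re
from collections import Counter

# Matches self-closing block comments such as
# <!-- wp:pattern {"slug":"twentytwentyfour/home"} /-->
REFERENCE = re.compile(r'<!--\s+wp:(pattern|template-part)\s+(\{.*?\})\s+/-->')

# Hypothetical in-memory stand-in for the theme files quoted above.
FILES = {
    ("template", "home"): '<!-- wp:pattern {"slug":"twentytwentyfour/template-home"} /-->',
    ("pattern", "twentytwentyfour/template-home"): "\n".join([
        '<!-- wp:template-part {"slug":"header"} /-->',
        '<!-- wp:pattern {"slug":"twentytwentyfour/home"} /-->',
        '<!-- wp:template-part {"slug":"footer","area":"footer","tagName":"footer"} /-->',
    ]),
    ("pattern", "twentytwentyfour/home"): "\n".join(
        f'<!-- wp:pattern {{"slug":"twentytwentyfour/{name}"}} /-->'
        for name in ("hero", "feature-grid", "features-with-images",
                     "testimonial-centered", "posts-featured", "cta")
    ),
    ("template-part", "footer"): '<!-- wp:pattern {"slug":"twentytwentyfour/footer"} /-->',
}


def find_references(markup):
    """Return (kind, slug) pairs for every pattern or template-part reference."""
    return [(kind, json.loads(attrs)["slug"]) for kind, attrs in REFERENCE.findall(markup)]


def collect(kind, slug, files, seen=None):
    """Walk references depth-first and return the set of files that get touched."""
    seen = set() if seen is None else seen
    if (kind, slug) in seen:
        return seen
    seen.add((kind, slug))
    for child_kind, child_slug in find_references(files.get((kind, slug), "")):
        collect(child_kind, child_slug, files, seen)
    return seen


def count_files(files, root=("template", "home")):
    """Count the touched files per kind, starting from the root template."""
    return Counter(kind for kind, _ in collect(*root, files))


if __name__ == "__main__":
    print(dict(count_files(FILES)))
```

Each of these files is one unit of work for the block hooks preprocessing, which is why the number of files matters more than the number of blocks after this change.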

@joemcgill shared the data from profiling the get_hooked_blocks calls in this branch. The number 16 is very close to the number of files that need to be processed in the theme, and this is precisely what the proposed changes are targeting. The patch has the biggest effect in all the cases where no hooked blocks are registered (and nothing is attached to the related hooked_block_types filter). In particular, for patterns it's possible to completely avoid calling parse_blocks and traverse_and_serialize_blocks (the newly introduced helper that extends the capabilities of serialize_blocks) when we know that no processing is necessary:

$content = $pattern['content'];
if ( ! empty( $block_hooks ) || has_filter( 'hooked_block_types' ) ) {
	$blocks               = parse_blocks( $content );
	$before_block_visitor = make_before_block_visitor( $pattern, $block_hooks );
	$after_block_visitor  = make_after_block_visitor( $pattern, $block_hooks );
	$content              = traverse_and_serialize_blocks( $blocks, $before_block_visitor, $after_block_visitor );
}

In the case of templates and template parts, we still need to keep the existing logic for injecting the theme slug into template part blocks used in these templates, but we can completely skip the part related to block hooks:

$before_block_visitor = '_inject_theme_attribute_in_template_part_block';
$after_block_visitor  = null;
$block_hooks          = get_hooked_blocks();
if ( ! empty( $block_hooks ) || has_filter( 'hooked_block_types' ) ) {
	$before_block_visitor = make_before_block_visitor( $template, $block_hooks );
	$after_block_visitor  = make_after_block_visitor( $template, $block_hooks );
}
$blocks            = parse_blocks( $template_content );
$template->content = traverse_and_serialize_blocks( $blocks, $before_block_visitor, $after_block_visitor );


Now, it would probably be enough to apply only these changes if we cared only about TT4 without any hooked blocks registered, as make_before_block_visitor and make_after_block_visitor are no longer executed in that case. These two functions were responsible for the excessive number of get_hooked_blocks calls (~600 as reported).

However, the refactoring also covers the cases where a site does use Block Hooks. The way the feature works can be summarized as follows: every template, template part, and pattern used on the front end needs to be preprocessed before being used to render the full page. Under the hood, the content of every template gets parsed into its block representation, and the new helper function traverse_and_serialize_blocks serializes it back to the format saved on disk. Two callbacks get passed to it, which for Block Hooks are created by make_before_block_visitor and make_after_block_visitor. There are four positions where blocks can be inserted (before, after, firstChild, and lastChild), so for some parsed blocks these callbacks can end up looking up hooked blocks 4 times in total. Some templates and patterns have quite complex structures, for example:

https://github.com/WordPress/wordpress-develop/blob/trunk/src/wp-content/themes/twentytwentyfour/patterns/footer.php

This is where the traversal might require more processing because of the number of blocks returned by the block parser. Before this patch, every callback would call get_hooked_blocks internally, which led to excessive numbers like the ones @joemcgill reported when profiling trunk. In the revised version, the computed list of hooked blocks gets passed to the callbacks, so we essentially remove the need to compute the same list every time.
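The effect of passing the precomputed list into the visitor factories can be illustrated with a simplified model. This is not the WordPress implementation: the registry contents and block names are invented, one visitor per insertion position is a simplification of the before/after visitor pair, and the nested anchor → position → names shape is an assumption made for the sketch.

```python
# Hypothetical registry: block type name -> {anchor block: relative position}.
REGISTRY = {
    "example/like-button": {"core/comment-template": "lastChild"},
    "example/login-out": {"core/navigation": "after"},
    "core/paragraph": {},
}
POSITIONS = ("before", "after", "firstChild", "lastChild")

call_count = 0


def get_hooked_blocks():
    """Scan every registered block type and group hooked blocks by anchor and position."""
    global call_count
    call_count += 1
    hooked = {}
    for name, hooks in REGISTRY.items():
        for anchor, position in hooks.items():
            hooked.setdefault(anchor, {}).setdefault(position, []).append(name)
    return hooked


def make_visitor_per_call(position):
    """Trunk-like factory: the visitor rebuilds the list on every visit."""
    def visitor(block_name):
        return get_hooked_blocks().get(block_name, {}).get(position, [])
    return visitor


def make_visitor_precomputed(hooked_blocks, position):
    """PR-like factory: the caller computes the list once and the visitor captures it."""
    def visitor(block_name):
        return hooked_blocks.get(block_name, {}).get(position, [])
    return visitor


def count_calls(block_names, precompute):
    """Traverse the blocks and report (registry scans, insertions found)."""
    global call_count
    call_count = 0
    if precompute:
        hooked = get_hooked_blocks()
        visitors = [make_visitor_precomputed(hooked, p) for p in POSITIONS]
    else:
        visitors = [make_visitor_per_call(p) for p in POSITIONS]
    found = [(name, hooked_name)
             for name in block_names
             for visit in visitors
             for hooked_name in visit(name)]
    return call_count, found
```

In this model, the per-call variant scans the registry once per block per position, while the precomputed variant scans it once per traversal and finds the same insertions.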

@ockham
Contributor

ockham commented Oct 9, 2023

Thanks a lot @felixarntz and @joemcgill for benchmarking and profiling, and @gziolo for explaining the reasons for the improvements we're seeing!

Looks like we have a winner 😄 -- I've closed my #5406 and #5413 which have done little to nothing to help improve performance.

I'll review this PR now so we can get it merged in time for Beta 3 😊

  * @return callable A function that returns the serialized markup for the given block,
  *                  including the markup for any hooked blocks before it.
  */
-function make_before_block_visitor( $context ) {
+function make_before_block_visitor( $context, $block_hooks ) {
Contributor

We could change the argument order maybe? 🤔 $block_hooks is kinda essential for the whole thing, whereas $context is only needed for the filter.

Member Author

@gziolo gziolo Oct 9, 2023

I think I had some issues with variable shadowing or too close similarity with another variable. However, I will take another try to keep the $hooked_blocks as a param passed to the function.

Sure, I can move the param to the first position. It's late in the process, but if we would get agreement from the release squad, we could also consider renaming the helper function to be more specific now it requires the list of hooked blocks.

Contributor

> I think I had some issues with variable shadowing or too close similarity with another variable. However, I will take another try to keep the $hooked_blocks as a param passed to the function.

Ah, didn't realize! Thank you, it'd be cool to keep it 😄

> Sure, I can move the param to the first position. It's late in the process, but if we would get agreement from the release squad, we could also consider renaming the helper function to be more specific now it requires the list of hooked blocks.

You mean make_before_block_visitor (and make_after_block_visitor)? I’m struggling to come up with a good name for those 🤔 Arguably, they are still visitor factories — they just also accept a list of hooked blocks now…

Do you have anything specific in mind that should be reflected by their names?

Member Author

I addressed all the feedback with 21efb9e.

I don't have a clear vision of how to rename these helpers, so maybe it's fine to leave them as they are.

Member

Thanks for the discussion here, I am also in favor of the current approach with $hooked_blocks as the first parameter, so LGTM.

Note that the function is marked private, which still gives us some additional flexibility if we want to change things in the future.

Contributor

@ockham ockham left a comment

Thanks again! I left a couple of notes, but nothing substantial standing in the way of landing this PR 😊

@spacedmonkey
Member

For context, I worked on a similar ticket that would cache / only parse once the block pattern. See #5421. I wonder if there are any lessons left there and we could add some of this to this PR.

@ockham
Contributor

ockham commented Oct 9, 2023

> For context, I worked on a similar ticket that would cache / only parse once the block pattern. See #5421. I wonder if there are any lessons left there and we could add some of this to this PR.

I think that's a promising optimization! I was thinking to keep the change separate so we can benchmark it to see its impact in an isolated manner, but maybe it's fine to err on the "optimistic" side and include it in this PR. Curious to hear everyone's thoughts!

@gziolo
Member Author

gziolo commented Oct 9, 2023

> For context, I worked on a similar ticket that would cache / only parse once the block pattern. See #5421. I wonder if there are any lessons left there and we could add some of this to this PR.

My hypothesis (take it with a grain of salt) is that on the front end it might not have the expected impact: there are over 50 patterns in TT4, yet we see only 16 calls of get_hooked_blocks reported when profiling the homepage. I read this as meaning that only the patterns used in templates get processed, and it doesn't look like they are used more than once. It might be a different story in the admin, though. I would definitely follow up with some further validation of that PR to learn more about how these patterns are consumed on the front end. I might simply be wrong with my working hypothesis. In general, I agree with @spacedmonkey that the whole process of loading theme patterns is a good candidate for further optimization.

@gziolo gziolo force-pushed the update/get-hooked-blocks-performance-test branch from 122d4e5 to 21efb9e Compare October 9, 2023 13:16
@spacedmonkey
Member

I think that #5421 and this PR do very similar things. Instead of parsing the blocks every time, it is only done on demand. My PR parses the blocks once and caches the result. This PR only does it if there are block hooks. Very similar.

@ockham
Contributor

ockham commented Oct 9, 2023

> I think that #5421 and this PR do very similar things. Instead of parsing the blocks every time, it is only done on demand. My PR parses the blocks once and caches the result. This PR only does it if there are block hooks. Very similar.

They look rather different to me, TBH 🤔 I'll try to elaborate:

Note that hooked blocks aren’t only inserted into patterns, but also into templates and template parts (where it happens upon loading from the respective block theme files — so caching at read time isn’t needed there).

This PR does two things:

  • It runs get_hooked_blocks only once per template (or template part, or pattern), rather than for each block encountered. This is likely the biggest factor in reducing overhead when there are hooked blocks.
  • It only even applies parsing, hooked block insertion, and re-serializing if there are any hooked blocks at all. This is an optimization on top of the other one for the probably "default" case of no hooked blocks whatsoever.

However, it doesn't cache patterns' markup after hooked blocks have been inserted.
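For contrast, the "parse once and cache" idea could look roughly like the sketch below. It is a hypothetical illustration using Python's `functools.lru_cache`, not the code from #5421, and `parse_pattern` is a stand-in for a real block parser.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def parse_pattern(content):
    """Stand-in parser: split the markup into non-empty lines, cached by content."""
    return tuple(line for line in content.splitlines() if line.strip())


def render_twice(content):
    """Process the same pattern content twice; the second call is served from the cache."""
    first = parse_pattern(content)
    second = parse_pattern(content)
    return first, second
```

Caching like this only pays off when the same content is processed more than once per request, whereas skipping the work when there are no hooked blocks helps even when every file is processed exactly once.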

@gziolo
Member Author

gziolo commented Oct 9, 2023

All checks are green. I will wait until tomorrow morning to ensure that the feedback from @spacedmonkey is taken into account. If anyone with commit access would like to land this patch today after reaching consensus, I would be more than happy to see that happen. Thank you everyone for guiding me through the performance analysis and for your invaluable help in confirming the results initially reported 🙌🏻

@spacedmonkey
Member

TT3 theme - 1000 runs.

| | Trunk | PR |
| --- | --- | --- |
| Response Time (median) | 85.46 | 85.45 |
| wp-load-alloptions-query (median) | 0.65 | 0.65 |
| wp-before-template (median) | 35.14 | 34.83 |
| wp-template (median) | 46.88 | 46.95 |
| wp-total (median) | 82.02 | 81.91 |

TT4 theme - 1000 runs.

| | Trunk | PR |
| --- | --- | --- |
| Response Time (median) | 100.63 | 95.83 |
| wp-before-template (median) | 38.47 | 37.67 |
| wp-template (median) | 58.42 | 54.4 |
| wp-total (median) | 97.15 | 92.2 |

I am not seeing much of an improvement with TT3, but I am seeing a benefit with TT4.

@hellofromtonya
Contributor

I'm seeing comments / discussions about function parameter order and function naming. Is there consensus to keep the order and naming in this PR?

I ask because: once it ships in Core, it's really, really hard to make changes due to BC.

@ockham
Contributor

ockham commented Oct 9, 2023

> I'm seeing comments / discussions about function parameter order and function naming. Is there consensus to keep the order and naming in this PR?
>
> I ask because: once it ships in Core, it's really, really hard to make changes due to BC.

AFAICT, Grzegorz implemented my suggestions with regard to the param order. As for the callback naming, neither of us could come up with a better name, so we were thinking to leave as-is.

Member

@felixarntz felixarntz left a comment

@gziolo PR looks great to me, also thank you for clarifying why the impact of this PR on TT4 performance specifically is so high: Looks like the impact is greater the more patterns are used by a page.

Just to double check, I ran another benchmark with the latest state of the PR against the TT4 home page, and it's still looking as excellent as before: 84.61ms instead of 107.65ms for total load time!

For reference, when benchmarking this with the "Hello world!" post on TT4, the PR achieves 97.2ms instead of 99.9ms, so a small but still notable improvement.

@ockham
Contributor

ockham commented Oct 9, 2023

Thanks again everybody, really great collab on this one!

Committed to Core in https://core.trac.wordpress.org/changeset/56805.

@ockham ockham closed this Oct 9, 2023
@gziolo gziolo deleted the update/get-hooked-blocks-performance-test branch October 9, 2023 17:26
@ockham
Contributor

ockham commented Oct 10, 2023

Looks like we introduced a regression: WordPress/gutenberg#55202 😕

Work on a fix is underway in #5450.
