-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance: Try refactoring get_hooked_blocks()
to check impact
#5399
Performance: Try refactoring get_hooked_blocks()
to check impact
#5399
Conversation
Performance results from the CI job https://github.com/WordPress/wordpress-develop/actions/runs/6405613311/attempts/2?pr=5399: The first time, it didn't look so promising https://github.com/WordPress/wordpress-develop/actions/runs/6405613311/attempts/1?pr=5399: I will re-trigger it a few more times, and I will keep updating this comment. https://github.com/WordPress/wordpress-develop/actions/runs/6405613311/attempts/3?pr=5399 https://github.com/WordPress/wordpress-develop/actions/runs/6405613311/attempts/4?pr=5399 https://github.com/WordPress/wordpress-develop/actions/runs/6405613311/attempts/5?pr=5399 |
I'm playing with local benchmarking by following instructions from https://make.wordpress.org/performance/handbook/measuring-performance/benchmarking-php-performance-with-server-timing/. I created a test post with the content copied from the same page. I also added 5 comments. I'm using this branch, which uses the Twenty-Four theme without any changes applied. I executed the following command for every configuration:
This branch:
With the Like Button plugin installed and activated. The block gets hooked into the list of comments: This branch:
This branch with 5 hooked block types active on a page:
|
0e4b342
to
122d4e5
Compare
I did another test based on the approach used by @felixarntz in #5413 (comment).
TT3 Hello world post (with Like Button plugin active): 222.09ms (PR) vs 227.4ms (trunk) The same tests with TT3 Hello world post, the Like Button plugin active, and 4 additional hooked blocks registered that I copied from the theme's folder created for unit tests: 224.77ms (PR) vs 229.48ms (trunk) The same tests with TT4 Hello world post, the Like Button plugin active, and 4 additional blocks registered defining block hooks as above: 261.45ms (PR) vs 269.97ms (trunk) The result is consistently better, but only a tiny fraction. By the way, I'm using |
@gziolo @ockham Sharing my benchmark summary here. I used the TL;DR Other than #5413 (which was rather neutral), here I can see notable benefits, in some cases massive benefits! 🎉 With TT4 home page:
With TT3 home page:
Especially for TT4, that looks incredible, and brings it much closer in performance to TT3, despite the so much richer layout and content. To make sure it doesn't have negative impact on classic themes, I also benchmarked with TT1 - in hopes it's pretty much a zero sum game, which it indeed is. With TT1 home page:
The difference here is so tiny and goes into both directions, so we can assume this to be mostly due to variance. I wanted to see whether more complex block theme content means more benefit, so I also ran a benchmark against the 3P theme "Frost". With Frost home page:
Interestingly, the benefit here is by far not as high as TT4, but it's still notably positive, close to TT3. I am curious why TT4 sees such a high benefit here. Since I don't have much understanding of the logic behind all this, can you shed some light on this? Do you have any idea? Based on the impact, this looks totally worth prioritizing even this late in the 6.4 cycle, particularly as it could have a major impact on TT4 performance, which would alleviate some of the "problems" related to https://core.trac.wordpress.org/ticket/59465. |
I spent some time trying to understand how TT4 got assembled to better understand the results we see. It's a very interesting discovery to see how this theme got structured for the homepage. The template used is a simple one line reference to a pattern shipped in the theme:
The pattern itself isn't very complex, but it references two template parts and another pattern: wordpress-develop/src/wp-content/themes/twentytwentyfour/patterns/template-home-business.php Lines 11 to 19 in eb0ae56
When peeking at the nested pattern, we can see even more patterns used: wordpress-develop/src/wp-content/themes/twentytwentyfour/patterns/page-01-business-home.php Lines 13 to 18 in eb0ae56
The footer template part also references a pattern:
I didn't dig deeper, but this way, we can see that we have at least the following files referenced that need to get processed when seeking places to inject hooked blocks:
@joemcgill shared the data from profiling around wordpress-develop/src/wp-includes/class-wp-block-patterns-registry.php Lines 163 to 169 in 122d4e5
In the case of templates and template parts, we still need to keep the existing logic for injecting the theme slug into template part blocks used in these templates, but we can completely skip the part related to block hooks: wordpress-develop/src/wp-includes/block-template-utils.php Lines 552 to 560 in 122d4e5
Now, it probably would be enough to apply these changes if we would care only about TT4 without any hooked blocks registered, as However, the refactoring applied also covers all these cases when a site has to handle Block Hooks. The way the feature works can be simplified to the following, every template, template part, and pattern used on the front end needs to be preprocessed before being used to render the full page. Under the hoods, the content of every template gets parsed into the block representation, and the new helper function This is where the traversing might require more processing because of the number of blocks that get returned from the block parser. Before this patch, every callback would call |
Thanks a lot @felixarntz and @joemcgill for benchmarking and profiling, and @gziolo for explaining the reasons for the improvements we're seeing! Looks like we have a winner 😄 -- I've closed my #5406 and #5413 which have done little to nothing to help improve performance. I'll review this PR now so we can get it merged in time for Beta 3 😊 |
src/wp-includes/blocks.php
Outdated
* @return callable A function that returns the serialized markup for the given block, | ||
* including the markup for any hooked blocks before it. | ||
*/ | ||
function make_before_block_visitor( $context ) { | ||
function make_before_block_visitor( $context, $block_hooks ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could change the argument order maybe? 🤔 $block_hooks
is kinda essential for the whole thing, whereas $context
is only needed for the filter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, $hooked_blocks
maybe? See https://github.com/WordPress/wordpress-develop/pull/5399/files#r1350125889 😬
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I had some issues with variable shadowing or too close similarity with another variable. However, I will take another try to keep the $hooked_blocks
as a param passed to the function.
Sure, I can move the param to the first position. It's late in the process, but if we would get agreement from the release squad, we could also consider renaming the helper function to be more specific now it requires the list of hooked blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I had some issues with variable shadowing or too close similarity with another variable. However, I will take another try to keep the
$hooked_blocks
as a param passed to the function.
Ah, didn't realize! Thank you, it'd be cool to keep it 😄
Sure, I can move the param to the first position. It's late in the process, but if we would get agreement from the release squad, we could also consider renaming the helper function to be more specific now it requires the list of hooked blocks.
You mean make_before_block_visitor
(and make_after_block_visitor
)? I’m struggling to come up with a good name for those 🤔 Arguably, they are still visitor factories — they just also accept a list of hooked blocks now…
Do you have anything specific in mind that should be reflected by their names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed all the feedback with 21efb9e.
I don't have a clear vision how to rename these helpers so maybe it's fine to leave them as they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the discussion here, I am also in favor of the current approach with $hooked_blocks
as the first parameter, so LGTM.
Note that the function is marked private, that would still give us some additional flexibility for the future if we wanted to change things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again! I left a couple of notes, but nothing substantial standing in the way of landing this PR 😊
For context, I worked on a similar ticket that would cache / only parse once the block pattern. See #5421. I wonder if there are any lessons left there and we could add some of this to this PR. |
I think that's a promising optimization! I was thinking to keep the change separate so we can benchmark it to see its impact in an isolated manner, but maybe it's fine to err on the "optimistic" side and include it in this PR. Curious to hear everyone's thoughts! |
My hypothesis (take it with a grain of salt) is that for the front end, it might not have an expected impact, as it looks like there are over 50 patterns in TT4, and we see 16 calls of |
122d4e5
to
21efb9e
Compare
I think that #5421 and this PR do very similar things. Instead of parsing the block everytime, it is only done on demand. My PR parses the blocks once and caches the result. This PR only does it if there are block hooks. Very similar. |
They look rather different to me, TBH 🤔 I'll try to elaborate: Note that hooked blocks aren’t only inserted into patterns, but also into templates and template parts (where it happens upon loading from the respective block theme files — so caching at read time isn’t needed there). This PR does two things:
However, it doesn't cache patterns' markup after hooked blocks have been inserted. |
All checks are green. I will wait until tomorrow morning to ensure that the feedback from @spacedmonkey is taken into account. If anyone with the committ access would like to land this patch today after reaching the consensus, I would be more than happy to see that happen. Thank you everyone for guiding me through the performance analysis and your invaluable help in confirming the results initially reported 🙌🏻 |
TT3 theme - 1000 runs.
TT4 theme - 1000 runs.
I am not seeing much of an improve with TT3, but I am seeing a benefit from TT4. |
I'm seeing comments / discussions about function parameter order and function naming. Is there consensus to keep the order and naming in this PR? I ask because: once it ships in Core, really really hard to make changes due to BC. |
AFAICT, Grzegorz implemented my suggestions with regard to the param order. As for the callback naming, neither of us could come up with a better name, so we were thinking to leave as-is. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gziolo PR looks great to me, also thank you for clarifying why the impact of this PR on TT4 performance specifically is so high: Looks like the impact is greater the more patterns are used by a page.
Just to double check, I ran another benchmark with the latest state of the PR against the TT4 home page, and it's still looking as excellent as before: 84.61ms instead of 107.65ms for total load time!
For reference, when benchmarking this with the "Hello world!" post on TT4, the PR achieves 97.2ms instead of 99.9ms, so a small but still notable improvement.
Thanks again everybody, really great collab on this one! Committed to Core in https://core.trac.wordpress.org/changeset/56805. |
Looks like we introduced a regression: WordPress/gutenberg#55202 😕 Work on a fix is underway in #5450. |
Trac ticket: https://core.trac.wordpress.org/ticket/59383
Important: This is an experiment! The focus should be on testing the performance as I don't think it's worth landing it if there is no positive impact.
I tried refactoring
get_hooked_blocks
to see if that would have any noticeable impact on the performance of the block themes like Twenty Twenty-Three or Twenty Twenty-Four. Let's keep in mind that there are no active hooked blocks in WordPress core, so it means that by default,get_hooked_blocks
won't be fully battle-tested in the context of performance without registering custom blocks using Block Hooks. However, there is a test theme present in the codebase that is used to verify the feature in unit tests.The most important consideration in this PR was related to the fact that
get_hooked_blocks
iterates on every block type in the registry. I was seeking the most optimal way to minimize the number ofget_hooked_blocks
calls. Before this PR,get_hooked_blocks
was called 4 times per every parsed block when processing every template, template part, and pattern. With this PR,get_hooked_blocks
gets called once per template, template part, or pattern.Some additional improvements are also applied, like calling
get_hooked_blocks
only once when returning the list of all registered patterns. The last modification proposed brings the implementation closer to what we had before Block Hooks. When we ensure that there are no registered block hooks or filters that could impact them, we omit the parsing and serializing step.This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.