Return inner HTML before and after inner blocks when parsing and fix … #8760
…rendering of inner blocks - WordPress#8214
@youknowriad @mtias @gziolo @aduth Could you please look over this and let me know if this can be considered? The JS parsing fails for column and columns blocks and will need some work to produce the same output as the PHP parser. I can have a look if you think it's worth continuing with.
How does this work with many layers of nested blocks? I'm not too familiar with the inner workings of the parser (or any parser, for that matter) but it seems like maybe an approach where blocks are rendered from the inside out, with the rendered blocks passed up the chain, might be a more stable way of handling it?
This will work fine with many layers of nested blocks. It uses recursion to render any inner blocks. There's a test that includes 3 layers of nesting with a paragraph block inside a column block inside a columns block.
$block['innerHTMLBeforeInnerBlocks'] = implode( '', $innerHTMLBeforeInnerBlocks );
$block['innerHTMLAfterInnerBlocks'] = implode( '', $innerHTMLAfterInnerBlocks );
}
return $block;
hm…it seems like we are trying to manually edit automatically-generated code here. @brucepearson did you get a chance to look at the spec grammar which is run through php-pegjs to generate this file?
Thanks @dmsnell. Is the concept OK that I've applied here? I can then translate to the grammar file.
if ( isset( $block['innerBlocks'] ) && count( $block['innerBlocks'] ) ) {
$raw_content = $block['innerHTMLBeforeInnerBlocks'];
foreach ( $block['innerBlocks'] as $inner_block ) {
$raw_content .= gutenberg_render_block( $inner_block );
although we shouldn't expect deeply nested blocks we may want to consider the possibility of their existence. suppose someone made a post with 1000 levels of nesting - that could theoretically cause WordPress to crash on account of overflowing the call stack.
I'd recommend considering one of two changes to the behavior here:
- set a limit on the recursion and abort with some defined behavior when we're too deep
- flip the recursion on its head and recurse using a trampoline structure to track the recursion in an alternate data structure or control structure to eliminate the risk of stack overflow
it should be noted that this work overlaps what's happening in #8083. if we get a fast internal PHP parser we might dramatically rewrite this function.
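The second option above (tracking the traversal in an explicit data structure instead of the call stack) can be sketched as follows. This is only an illustration in JavaScript, not code from this PR: `renderBlockIteratively` and `maxDepth` are hypothetical names, and the block shape assumes the `innerHTMLBeforeInnerBlocks` and `innerHTMLAfterInnerBlocks` properties proposed here. It also folds in the first option by skipping any subtree nested deeper than `maxDepth`.

```javascript
// Hypothetical sketch: render a block tree without recursion.
// Assumed block shape (from this PR's proposal):
// { innerHTML, innerHTMLBeforeInnerBlocks, innerHTMLAfterInnerBlocks, innerBlocks }
function renderBlockIteratively( rootBlock, maxDepth = 100 ) {
	let output = '';
	// Each frame is either a block still to be rendered or a string to emit.
	const stack = [ { block: rootBlock, depth: 0 } ];

	while ( stack.length > 0 ) {
		const frame = stack.pop();

		if ( typeof frame.text === 'string' ) {
			output += frame.text;
			continue;
		}

		const { block, depth } = frame;
		const innerBlocks = block.innerBlocks || [];

		if ( innerBlocks.length === 0 ) {
			// Leaf block: its innerHTML is the whole content.
			output += block.innerHTML || '';
			continue;
		}

		if ( depth >= maxDepth ) {
			// Defined behavior when nesting is too deep: drop the subtree.
			continue;
		}

		output += block.innerHTMLBeforeInnerBlocks || '';
		// Pushed first so it is emitted last, after all inner blocks.
		stack.push( { text: block.innerHTMLAfterInnerBlocks || '' } );
		// Push in reverse so the first inner block is popped first.
		for ( let i = innerBlocks.length - 1; i >= 0; i-- ) {
			stack.push( { block: innerBlocks[ i ], depth: depth + 1 } );
		}
	}

	return output;
}
```

Because the pending work lives in an array rather than in call frames, the nesting depth is bounded by available memory rather than by the call stack size. Whether dropping an over-deep subtree is the right fallback is a design choice this sketch does not settle.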
@dmsnell Is it worth continuing with this, considering you might change the PHP parser? The intention of this change is to fix rendering of inner blocks, which doesn't happen at all right now.
sorry for the delay here @brucepearson - I don't want to say too much about this method in the current state of the PR because I think we have to step back and make a change to the way we wrote the code up - actually it can help to take a few steps back and write up from a high level what you are doing and how it solves the problem (no code 😉)
unfortunately we can't just change the generated parser code - if we do that it will immediately disappear once we regenerate it. additionally it's hard to make assertions about how those changes will affect the parse. since the PEG parser places constraints on how we parse we might end up triggering unrelated bugs if we do a kind of deep surgery on it.
instead we need to think about what changes we want to make from a semantic point of view - what is wrong with the specification (the grammar)?
on the other hand, if we adopt the parser implementation in #8083 then it may not be an issue. I don't believe that we will discover that the problem lies in the grammar but rather in the limited parser implementation in PHP that has been serving us well up until now. we haven't had a spec-compliant parser in PHP because the auto-generated one was too slow to be practical. if we change that we should be able to handle full nesting and dynamic blocks without any exceptional cases or code snippets.
does that help? I'll try to be more responsive with this PR this week.
@dmsnell I'm working on adding translation support for Gutenberg to WPML. Basically what happens is:
Once the strings are extracted out of the blocks we can translate each string separately and the translator does not need to know about blocks at all. When translation is complete we need to build the translated page.
This works fine except for the inner blocks. To render the inner blocks we need to render the pre-inner HTML, then the inner blocks and then the post-inner HTML.
@brucepearson - that's a great description and really helps me (and I'm sure others) understand your use-case. I'd like to see if we can't get a new and performant parser merged into the project - such as in #8083 - so that you won't have to do any adjustments to make the server-side parse work for you.
additionally and sadly we're going to have to address something else in here that after some inspection I think may be a bit difficult: we can't assume anything about the ordering of inner blocks. in the patch here I see us trying to scan linearly in the following sequence: HTML before first inner block, list of inner blocks, HTML after last inner block. we very much expect a variety of sequences however, including HTML between inner blocks.
one of the earliest discussions around nesting involved what to do with inner blocks. as it stands today we cannot infer after the parse in what positions the inner blocks were found - we only get an array of them in the order we found them but no placemarkers. @mtias and @aduth I know discussed this point as well.
as it stands we may still be able to somehow add those markers without breaking the existing format of the parser - say by adding new attributes.
though this may sound frustrating it's not even our only option. there are a number of ways we should be able to address a need like this including one thing I'd like to see at some point: a block visitor where we can register a function to run over each block for additional processing.
I think I'll pause now and let you catch up and respond 😄
@dmsnell I see there's a new parser in Gutenberg 3.8. Are there plans to improve handling of the inner blocks any time soon?
@brucepearson thanks for popping in again here!
yes! I'd expect the process to work like this…
I've played around and had working code to provide a new property to the blocks alongside the existing ones:
parse( '<!-- wp:my/block -->a<!-- wp:my/inner -->b<!-- /wp:my/inner -->c<!-- /wp:my/block -->' ) === [
{
blockName: 'my/block',
attrs: {},
innerHTML: 'ac',
innerBlocks: [
{ blockName: 'my/inner', attrs: {}, innerHTML: 'b', innerBlocks: [] }
],
innerBlockIndices: [
1
]
}
]
The idea is that when the render callback in PHP has access to not only the inner blocks but also to the indices from whence they came then PHP is free to process and reassemble the original block with the limitations that are currently there. Thoughts?
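As a rough illustration of how such indices could be used, the sketch below splices each inner block back into its parent's `innerHTML`. It is written in JavaScript rather than PHP, and it rests on an assumption that the example above suggests but does not state: each entry in `innerBlockIndices` is a character offset into `innerHTML`, listed in ascending order and matched by position with `innerBlocks`. The function name `reassembleFromIndices` is hypothetical.

```javascript
// Hypothetical sketch: rebuild a block's content from innerHTML plus
// the offsets at which its inner blocks were found.
// Assumption: innerBlockIndices[ i ] is the offset into innerHTML where
// innerBlocks[ i ] originally sat, and offsets are in ascending order.
function reassembleFromIndices( block ) {
	const innerHTML = block.innerHTML || '';
	const innerBlocks = block.innerBlocks || [];
	const indices = block.innerBlockIndices || [];

	let output = '';
	let cursor = 0;

	innerBlocks.forEach( ( inner, i ) => {
		const offset = indices[ i ];
		// Emit the HTML between the previous inner block and this one,
		// then the inner block itself (reassembled the same way).
		output += innerHTML.slice( cursor, offset ) + reassembleFromIndices( inner );
		cursor = offset;
	} );

	// Emit whatever HTML follows the last inner block.
	return output + innerHTML.slice( cursor );
}

const example = {
	blockName: 'my/block',
	attrs: {},
	innerHTML: 'ac',
	innerBlocks: [
		{ blockName: 'my/inner', attrs: {}, innerHTML: 'b', innerBlocks: [] },
	],
	innerBlockIndices: [ 1 ],
};

console.log( reassembleFromIndices( example ) ); // prints "abc"
```

This yields the content in source order without the block comment delimiters; emitting those as well would be a separate step.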
@dmsnell The plan looks good. One further point... I'm currently using 'hackish' code to do this:
Do you already have this planned?
There are numerous needs to process posts and block content from its structured form without demanding that plugin authors implement their own parsing systems. Since the new default parser was implemented in #8083 the server-side parse is now fast enough to consider doing full parses of our documents and with that brings the idea that we can filter block content from the parser itself.
In this patch I'm exploring an API to allow extending the parser's behavior by post-processing blocks as they enter the parser's output array. This new filter gives the ability to transform all of the block's properties as they finish parsing. In the case of inner blocks the filter runs as the inner blocks have finished their own nesting. In the case of top-level blocks the filter runs after all inner content has finished parsing.
One use case is in #8760 where we want to replace the HTML parts of blocks while preserving other structure. Another use case could be removing specific inner blocks or content based on the current user requesting a post.
This filter exposes a kind of visitor pattern for the nested parse.
> **THIS IS AN INCOMPLETE PATCH DO NOT MERGE**
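The ordering described in that patch (inner blocks are filtered once their own nesting is finished, and a parent only after all of its inner content) amounts to a post-order traversal. A minimal sketch of that ordering, in JavaScript and applied to an already-parsed tree rather than inside the parser itself, might look like this. `filterBlocks` and the callback signature are hypothetical and are not the API proposed in #10108.

```javascript
// Hypothetical sketch of a post-order block filter: every inner block is
// passed through `filter` before the block that contains it.
// The input tree is not mutated; a filtered copy is returned.
function filterBlocks( blocks, filter ) {
	return blocks.map( ( block ) => {
		// Finish the nested blocks first...
		const innerBlocks = filterBlocks( block.innerBlocks || [], filter );
		// ...then hand the block, with its filtered children, to the callback.
		return filter( { ...block, innerBlocks } );
	} );
}

// Usage: replace the HTML parts of each block while preserving structure.
const parsedBlocks = [
	{
		blockName: 'my/block',
		attrs: {},
		innerHTML: 'ac',
		innerBlocks: [
			{ blockName: 'my/inner', attrs: {}, innerHTML: 'b', innerBlocks: [] },
		],
	},
];

const visitOrder = [];
const translated = filterBlocks( parsedBlocks, ( block ) => {
	visitOrder.push( block.blockName );
	return { ...block, innerHTML: block.innerHTML.toUpperCase() };
} );

console.log( visitOrder ); // [ 'my/inner', 'my/block' ]
console.log( translated[ 0 ].innerHTML ); // AC
```

Running the callback bottom-up means a parent's filter can rely on its inner blocks already being in their final form, which is the property the string-replacement use case in #8760 would need.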
@brucepearson I've started working on this in #10463 such that the new parser is used for […]. in #10108 I started exploring a block-level filter which I think also comes into play. my idea is that we'll have to end up doing the full parse in […]. if you don't mind could you look at the imagined interface in #10108 and provide your feedback on that API, how you would imagine wanting to hook into the block process, how you would imagine writing your plugin to do the replacements on render?
I don't think replace on render would work for us. It has a few problems:
I do believe that the filter could work as long as there's a function to render the parsed blocks back to the post content in the format for the Gutenberg editor. eg.
@dmsnell I noticed in 4.0.0 that
Attempt three at including positional information from the parse to enable isomorphic reconstruction of the source `post_content` after parsing.

See alternate attempts: #11082, #11309
Motivated by: #7247, #8760, Automattic/jetpack#10256
Enables: #10463, #10108

## Abstract

Add new `innerContent` property to each block in parser output indicating where in the innerHTML each innerBlock was found.

## Status

- will update fixtures after design review indicates this is the desired approach
- all parsers passing new tests for fragment behavior

## Summary

Inner blocks, or nested blocks, or blocks-within-blocks, can exist in Gutenberg posts. They are serialized in `post_content` in place as normal blocks which exist in between another block's comment delimiters.

```html
<!-- wp:outerBlock -->
Check out my
<!-- wp:voidInnerBlock /-->
and my other
<!-- wp:innerBlock -->
with its own content.
<!-- /wp:innerBlock -->
<!-- /wp:outerBlock -->
```

The way this gets parsed leaves us in a quandary: we cannot reconstruct the original `post_content` after parsing because we lose the origin location information for each inner block since they are only passed as an array of inner blocks.

```json
{
	"blockName": "core/outerBlock",
	"attrs": {},
	"innerBlocks": [
		{
			"blockName": "core/voidInnerBlock",
			"attrs": {},
			"innerBlocks": [],
			"innerHTML": ""
		},
		{
			"blockName": "core/innerBlock",
			"attrs": {},
			"innerBlocks": [],
			"innerHTML": "\nwith its own content.\n"
		}
	],
	"innerHTML": "\nCheck out my\n\nand my other\n\n"
}
```

At this point we have parsed the blocks and prepared them for attaching into the JavaScript block code that interprets them but we have lost our reverse transformation.

In this PR I'd like to introduce a new mechanism which shouldn't break existing functionality but which will enable us to go back and forth isomorphically between the `post_content` and first stage of parsing. If we can tear apart a Gutenberg post and reassemble it then it will let us do structurally-informed processing of the posts without needing to be aware of all the block JavaScript.

The proposed mechanism is a new property as a **list of HTML fragments with `null` values interspersed between those fragments where the blocks were found**.

```json
{
	"blockName": "core/outerBlock",
	"attrs": {},
	"innerBlocks": [
		{
			"blockName": "core/voidInnerBlock",
			"attrs": {},
			"innerBlocks": [],
			"blockMarkers": [],
			"innerHTML": ""
		},
		{
			"blockName": "core/innerBlock",
			"attrs": {},
			"innerBlocks": [],
			"blockMarkers": [],
			"innerHTML": "\nwith its own content.\n"
		}
	],
	"innerHTML": "\nCheck out my\n\nand my other\n\n",
	"innerContent": [ "\nCheck out my\n", null, "\nand my other\n", null, "\n" ]
}
```

Doing this allows us to replace those `null` values with their associated block (sequentially) from `innerBlocks`.

## Questions

- Why not use a string token instead of an array?
  - See #11309. The fundamental problem with the token is that it could be valid content input from a person and so there's a probability that we would fail to split the content accurately.
- Why add the `null` instead of leaving basic array splits like `[ 'before', 'after' ]`?
  - By inspection we can see that without an explicit marker we don't know if the block came before or after or between array elements. We could add empty strings `''` and say that blocks exist only _between_ array elements but the parser code would have to be more complicated to make sure we appropriately add those empty strings. The empty strings are a bit odd anyway.
- Why add a new property?
  - Code already depends on `innerHTML` and `innerBlocks`; I don't want to break any existing behaviors and adding is less risky than changing.
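The sequential replacement of `null` values can be sketched in a few lines. This is an illustrative JavaScript version under stated assumptions, not the implementation from the PR: `renderInnerContent` is a hypothetical name, and every block (including inner ones) is assumed to carry its own `innerContent` array, with an empty array for a void block.

```javascript
// Hypothetical sketch: rebuild a block's inner content by walking
// innerContent and substituting each null with the next inner block.
// Assumption: every block carries innerContent; a void block has [].
function renderInnerContent( block ) {
	let nextInner = 0;

	return block.innerContent
		.map( ( chunk ) =>
			chunk === null
				? renderInnerContent( block.innerBlocks[ nextInner++ ] )
				: chunk
		)
		.join( '' );
}

const outerBlock = {
	blockName: 'core/outerBlock',
	innerBlocks: [
		{ blockName: 'core/voidInnerBlock', innerBlocks: [], innerContent: [] },
		{
			blockName: 'core/innerBlock',
			innerBlocks: [],
			innerContent: [ '\nwith its own content.\n' ],
		},
	],
	innerContent: [ '\nCheck out my\n', null, '\nand my other\n', null, '\n' ],
};

console.log( JSON.stringify( renderInnerContent( outerBlock ) ) );
```

The result contains the HTML fragments and inner block content in their original order. Reproducing `post_content` exactly would additionally require re-emitting each block's comment delimiters and attributes, which this sketch leaves out.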
@brucepearson and @dmsnell - what's the status of this one?
@gziolo @brucepearson the parser in core splits the inner HTML into a list of HTML and in #11334 we added the change which breaks apart the […] by default
Let's err on the side of closing. I've tested and commented as such at #8214 (comment). If further work is needed, the issue can be reopened or new ones created.
Description
Fix rendering of inner blocks - #8214
gutenberg_render_block does not render inner blocks.

To Reproduce
Use the Gutenberg editor to add 2 columns. You end up with something like this:
Use gutenberg_parse_blocks and then gutenberg_render_block to re-generate the content.

Expected behaviour
$new_content should be the same as $content but it's not. The inner blocks of columns are not rendered.

How has this been tested?
This code has been tested as described above.
This has been covered by the PHPUnit class Test_Render.

Types of changes
Introduced new innerHTMLBeforeInnerBlocks and innerHTMLAfterInnerBlocks to the block parser. This data is set only when there are inner blocks.
Updated gutenberg_render_block to use this data to properly render the inner blocks.
Added tests to cover this.

Checklist: