-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parser: Add new list of HTML fragments to parse output #11334
Conversation
6c62b3f
to
7a39478
Compare
To clarify, a point of having the Input <!-- wp:block -->before<!-- wp:void /-->between<!-- wp:void /-->after<!-- /wp:block --> Output innerContent: [ 'before', null, 'between', null, 'after' ] Or even: Input <!-- wp:block -->before<!-- wp:void /--><!-- wp:void /-->between<!-- wp:void /-->after<!-- /wp:block --> Output innerContent: [ 'before', null, null, 'between', null, 'after' ] ? Not speaking to any practical implications of the parser implementation, but the editor implementation of nested blocks was very much based on the assumption that inner blocks occur at most once in a single sequence order, so it is not aware of (and could not easily support) interspersed inner blocks. This can be seen in the documentation of the
As far as the parser is concerned, I could grant that it not care to be tailored to this editor implementation detail, even going so far as to applaud that it could support some future editor enhancement toward this goal. All the same, it's a misalignment worth pointing out, and I haven't really seen compelling evidence that the InnerBlocks limitations provide an insufficient base upon which to build nested blocks (where more advanced cases can create nested arrangements, e.g. Columns -> Column). In all then, and even in acknowledging some redundancy between The editor today would have to assume that the first and last strings are the content surrounding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are test failures (which, at a glance, appear mostly to do with mere addition of the property where relevant) but conceptually and code-wise this would have my approval.
Thanks @aduth - to summarize a conversation we had about having one location vs having each inner block location marked: I'm not going to enforce any behaviors through the parser. Even though this change means that the parser is capable of handling the case where inner blocks are interspersed with HTML it doesn't mean they have to. Having separate locations is the "default" behavior of the code whereas forcing a single location for the list of inner blocks would require more special-casing and complexity in the parser. I'd like to leave this simpler version in especially since it gives us the options later to support broader |
327f01c
to
857b02f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some merge conflicts to resolve.
} | ||
], | ||
"innerHTML": "\n\t<div class=\"wp-block-column\">\n\t\t\n\t\t\n\t</div>\n\t" | ||
"innerHTML": "\n\t<div class=\"wp-block-column\">\n\t\t\n\t\t\n\t</div>\n\t", | ||
"innerContent": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
"innerContent": [ | ||
"\n\t<div class=\"wp-block-column\">\n\t\t", | ||
null, | ||
"\n\t\t", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to think if the serializer behavior of newline-delimited blocks could have an impact in the ability to re-serialize, but I'm assuming the editor will disregard this as far as parsing/rehydrating the top-level columns block (which does and will continue to operate on innerHTML
?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. we may have to take some additional care to make sure that the serializer matches behavior with the parser, but since it's been working so far I'm guessing it will continue to work for a while.
9a584f6
to
84ba8d5
Compare
rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍 Nice work here.
$this->blockName = $name; | ||
$this->attrs = $attrs; | ||
$this->innerBlocks = $innerBlocks; | ||
$this->innerHTML = $innerHTML; | ||
$this->innerContent = $innerContent; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New assignment introduces misalignment on the =
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will follow-up in a new PR, thanks for spotting this!
❤️ |
@@ -5,12 +5,14 @@ | |||
"align": "right" | |||
}, | |||
"innerBlocks": [], | |||
"innerHTML": "\n<figure class=\"wp-block-audio alignright\">\n <audio controls=\"\" src=\"https://media.simplecast.com/episodes/audio/80564/draft-podcast-51-livePublish2.mp3\"></audio>\n</figure>\n" | |||
"innerHTML": "\n<figure class=\"wp-block-audio alignright\">\n <audio controls=\"\" src=\"https://media.simplecast.com/episodes/audio/80564/draft-podcast-51-livePublish2.mp3\"></audio>\n</figure>\n", | |||
"innerContent": [ "\n<figure class=\"wp-block-audio alignright\">\n <audio controls=\"\" src=\"https://media.simplecast.com/episodes/audio/80564/draft-podcast-51-livePublish2.mp3\"></audio>\n</figure>\n" ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing that these were manually edited?
It's meant to be generated via npm run fixtures:regenerate
, which on master is producing local changes because of stylistic differences of the generated result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's right @aduth - I was unaware of fixtures:regenerate
when I wrote this
Attempt three at including positional information from the parse to enable isomorphic reconstruction of the source
post_content
after parsing.See alternate attempts: #11082, #11309
Motivated by: #7247, #8760, Automattic/jetpack#10256
Enables: #10463, #10108
Abstract
Add new
innerContent
property to each block in parser output indicating where in the innerHTML each innerBlock was found.Status
Summary
Inner blocks, or nested blocks, or blocks-within-blocks, can exist in Gutenberg posts. They are serialized in
post_content
in place as normal blocks which exist in between another block's comment delimiters.The way this gets parsed leaves us in a quandary: we cannot reconstruct the original
post_content
after parsing because we lose the origin location information for each inner block since they are only passed as an array of inner blocks.At this point we have parsed the blocks and prepared them for attaching into the JavaScript block code that interprets them but we have lost our reverse transformation.
In this PR I'd like to introduce a new mechanism which shouldn't break existing functionality but which will enable us to go back and forth isomorphically between the
post_content
and first stage of parsing. If we can tear apart a Gutenberg post and reassemble then it will let us to structurally-informed processing of the posts without needing to be aware of all the block JavaScript.The proposed mechanism is a new property as a list of HTML fragments with
null
values interspersed between those fragments where the blocks were found.Doing this allows us to replace those
null
values with their associated block (sequentially) frominnerBlocks
.Questions
Why not use a string token instead of an array?
Why add the
null
instead of leaving basic array splits like[ 'before', 'after' ]
?''
and say that blocks exist only between array elements but the parser code would have to be more complicated to make sure we appropriately add those empty strings. The empty strings are a bit odd anyway.Why add a new property?
innerHTML
andinnerBlocks
; I don't want to break any existing behaviors and adding is less risky than changing.