-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTML API: Improve implementation of adoption agency algorithm #6983
base: trunk
Are you sure you want to change the base?
HTML API: Improve implementation of adoption agency algorithm #6983
Conversation
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
Since the HTML Processor started visiting all nodes in a document, both real and virtual, the breadcrumb accounting became a bit complicated and it's not entirely clear that it is fully reliable. In this patch the breadcrumbs are rebuilt separately from the stack of open elements in order to eliminate the problem of the stateful stack interactions and the post-hoc event queue. Breadcrumbs are greatly simplified as a result, and more verifiably correct, in this construction.
The HTML Processor internally throws an exception when it reaches HTML that it knows it cannot process, but this exception is not made available to calling code. It can be useful to extract more knowledge about why it gave up, especially for debugging purposes. In this patch, more context is added to the WP_HTML_Unsupported_Exception and the last exception is made available to calling code, if it asks.
f97c923
to
8cbfce1
Compare
8cbfce1
to
e97f678
Compare
Since the HTML Processor started visiting all nodes in a document, both real and virtual, the breadcrumb accounting became a bit complicated and it's not entirely clear that it is fully reliable. In this patch the breadcrumbs are rebuilt separately from the stack of open elements in order to eliminate the problem of the stateful stack interactions and the post-hoc event queue. Breadcrumbs are greatly simplified as a result, and more verifiably correct, in this construction.
The HTML Processor internally throws an exception when it reaches HTML that it knows it cannot process, but this exception is not made available to calling code. It can be useful to extract more knowledge about why it gave up, especially for debugging purposes. In this patch, more context is added to the WP_HTML_Unsupported_Exception and the last exception is made available to calling code, if it asks.
…rithm. As part of work to add more spec support to the HTML API, this patch fills out the active format reconstruction algorithm so that more HTML can be supported in situations requiring that reconstruction, for example, when a formatting element such as an A tag or a CODE tag is implicitly closed. See Core-61576
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN:
To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
@sirreal mind sharing any thoughts on this? I don't think it adds much other than more descriptive bailing messages, but it does structure the algorithm more and include more comments. my thinking is that this might be useful to guide us when we spend more focus on trying to handle these situations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there adoption cases handled in this PR that weren't handled before? All of the examples I'm trying bail for different reasons.
Now that we have insert_virtual_token
can we support some more cases?
@@ -5312,32 +5318,31 @@ public function reset_insertion_mode(): void { | |||
* | |||
* @see https://html.spec.whatwg.org/#adoption-agency-algorithm | |||
*/ | |||
private function run_adoption_agency_algorithm(): void { | |||
private function run_adoption_agency_algorithm(): ?string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would need a @return
tag about the strings and their meanings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type and phpdoc tweaks on walk_down.
@@ -622,12 +622,23 @@ public function remove_node( WP_HTML_Token $token ): bool { | |||
* see WP_HTML_Open_Elements::walk_up(). | |||
* | |||
* @since 6.4.0 | |||
* @since 6.7.0 Accepts $below_this_node to start traversal below a given node, if it exists. | |||
* | |||
* @param ?WP_HTML_Token $below_this_node Start traversing below this node, if provided and if the node exists. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix PHPdoc:
* @param ?WP_HTML_Token $below_this_node Start traversing below this node, if provided and if the node exists. | |
* @param WP_HTML_Token|null $below_this_node Start traversing below this node, if provided and if the node exists. |
*/ | ||
public function walk_down() { | ||
$count = count( $this->stack ); | ||
public function walk_down( $below_this_node = null ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public function walk_down( $below_this_node = null ) { | |
public function walk_down( WP_HTML_Token $below_this_node = null ) { |
This is difficult to interpret and interacts with active format reconstruction in interesting ways. I think my preference would be to work on that first (#6982) then come back to this. |
…ml-api/improve-adoption-agency-algorithm
Trac ticket: Core-61576
Status
This still needs work. With the added support, no new tests run. Does it do anything? We need to figure out.
Description
Improves support for the adoption agency algorithm in HTML, which re-parents nodes that appear in a different order than their corresponding tag in HTML, or when encountering certain closing elements for which no opening element exists.
This depends on resolving the issue: what do we do with nodes that are implicitly created after arriving at another tag?
html5lib
Test Suite