WP_HTML_Tag_Processor: Inject dynamic data to block HTML markup in PHP #42485

Merged
merged 40 commits into from
Sep 23, 2022
Changes from 16 commits
4887c5c
Introduce `WP_HTML_Walker` for reliably modifying HTML attributes.
dmsnell Aug 12, 2022
3f148db
Add TODO for `/` in attribute area
dmsnell Aug 12, 2022
6d089f2
Add TODO for handling character references in `class` attribute
dmsnell Aug 12, 2022
2dbfa03
Add TODO for attribute escaping
dmsnell Aug 12, 2022
fadd51b
Ignore slashes when parsing attribute names
adamziel Aug 15, 2022
17879e8
Remove test_get_tag_returns_raw_open_tag_name as get_tag always retur…
adamziel Aug 15, 2022
bdd56a6
Correctly check the indices of <!-- characters in skip_script_data -
adamziel Aug 15, 2022
8a48fab
Fix linting issues
gziolo Aug 16, 2022
e9a71d6
Improve existing unit tests
gziolo Aug 16, 2022
9bcfd3a
Don't close the walker in the __toString() method
adamziel Aug 18, 2022
f6d1bcc
Code style update
adamziel Aug 18, 2022
541d86b
Update phpunit/html/wp-html-walker-test.php
adamziel Aug 19, 2022
cdd8ae0
Update phpunit/html/wp-html-walker-test.php
adamziel Aug 19, 2022
1852cf7
Update lib/experimental/html/class-wp-html-walker.php
adamziel Aug 19, 2022
6c4647d
Update lib/experimental/html/class-wp-html-walker.php
adamziel Aug 19, 2022
4554dd3
Update lib/experimental/html/class-wp-html-walker.php
dmsnell Sep 22, 2022
e2eb423
Refactor early-abort to remove level of nesting
dmsnell Sep 22, 2022
68b55c4
Rewrite code to separate stages of computation
dmsnell Sep 22, 2022
f664c08
Expand name of $c to $character
dmsnell Sep 22, 2022
147647b
typo: type of integer is `int`
dmsnell Sep 22, 2022
d8b8ddd
typo: type of integer is `int`
dmsnell Sep 22, 2022
a325fa3
Reorder the test methods
adamziel Sep 23, 2022
a4fb703
Add the missing error messages
adamziel Sep 23, 2022
c017420
Reorder tests to keep set_attribute tests grouped together
adamziel Sep 23, 2022
07041ee
Update phpunit/html/wp-html-walker-test.php
adamziel Sep 23, 2022
63f4e34
Update phpunit/html/wp-html-walker-test.php
adamziel Sep 23, 2022
1021d9b
Update phpunit/html/wp-html-walker-test.php
adamziel Sep 23, 2022
e0910be
Explain the test_set_attribute_takes_priority_over_add_class test
adamziel Sep 23, 2022
273c502
Merge branch 'add/html-tokenizer-2' of github.com:WordPress/gutenberg…
adamziel Sep 23, 2022
19a8546
Rename test_works_with_single_quote_marks to test_parses_html_attribu…
adamziel Sep 23, 2022
c96f08b
Update the test names to suggest the failure mode
adamziel Sep 23, 2022
218afe4
Update phpunit/html/wp-html-walker-test.php
adamziel Sep 23, 2022
81d43e2
Add covers annotations
adamziel Sep 23, 2022
53143f8
Rename WP_HTML_Walker to WP_HTML_Tag_Processor
adamziel Sep 23, 2022
bddc2db
Remove the is_closed method, as the idea of a closed walker does not …
adamziel Sep 23, 2022
f299bf0
Update lib/experimental/html/class-wp-html-text-replacement.php
adamziel Sep 23, 2022
13c3270
Update lib/experimental/html/class-wp-html-attribute-token.php
adamziel Sep 23, 2022
5c2d080
Rename $w to $p
adamziel Sep 23, 2022
509684c
Merge branch 'add/html-tokenizer-2' of github.com:WordPress/gutenberg…
adamziel Sep 23, 2022
d9d3bab
Rename phpdoc references to $w to $p
adamziel Sep 23, 2022
86 changes: 86 additions & 0 deletions lib/experimental/html/class-wp-html-attribute-token.php
@@ -0,0 +1,86 @@
<?php
/**
* HTML Walker: Attribute token structure class.
*
* @package WordPress
* @subpackage HTML
* @since 6.1.0
*/

/**
* Data structure for the attribute token that allows a drastic improvement in performance.
Review comment (Member):
@adamziel and @dmsnell, is the note about the performance implications valid? Should we also include the same note for WP_Class_Name_Update and WP_Text_Replacement? I know you discussed that extensively, but I'm not sure which part made the biggest difference.

Reply (Contributor Author):
It is valid. I think that in @dmsnell's testing it reduced memory usage by about 90% compared to array(), which sounds almost unbelievable.

Reply (Member):
Yes, we can update these comments across the board. array() was surprisingly inefficient with memory.

*
* @since 6.1.0
*
* @see WP_HTML_Walker
*/
class WP_HTML_Attribute_Token {
Review comment (Contributor):
Please lock down this code so it is not accessible from plugins or other parts of WordPress. It should only be used for the cases described in this PR.

/**
* Attribute name.
*
* @since 6.1.0
* @var string
*/
public $name;

/**
* Byte offset in the input HTML where the attribute value starts.
*
* @since 6.1.0
* @var int
*/
public $value_starts_at;

/**
* How many bytes the value occupies in the input HTML.
*
* @since 6.1.0
* @var int
*/
public $value_length;

/**
* The string offset where the attribute name starts.
*
* @since 6.1.0
* @var int
*/
public $start;

/**
* The string offset after the attribute value or its name.
*
* @since 6.1.0
* @var int
*/
public $end;

/**
* Whether the attribute is a boolean attribute with value `true`.
*
* @since 6.1.0
* @var bool
*/
public $is_true;

/**
* Constructor.
*
* @since 6.1.0
*
* @param string $name Attribute name.
* @param int $value_start Byte offset where the attribute value starts.
* @param int $value_length Number of bytes the attribute value spans.
* @param int $start The string offset where the attribute name starts.
* @param int $end The string offset after the attribute value or its name.
* @param bool $is_true Whether the attribute is a boolean attribute with true value.
*/
public function __construct( $name, $value_start, $value_length, $start, $end, $is_true ) {
$this->name = $name;
$this->value_starts_at = $value_start;
$this->value_length = $value_length;
$this->start = $start;
$this->end = $end;
$this->is_true = $is_true;
}
}
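The token above stores offsets into the input HTML rather than copies of the attribute text. A minimal sketch of how such offsets could be used to read a value back is shown below. `Example_Attribute_Token` and `example_attribute_value()` are hypothetical stand-ins written for this illustration, not part of the PR, and the offsets are computed by hand with `strpos()` rather than by the walker.

```php
<?php
// Hypothetical stand-in mirroring the fields of WP_HTML_Attribute_Token,
// defined here only so the sketch runs outside WordPress.
class Example_Attribute_Token {
	public $name;
	public $value_starts_at;
	public $value_length;
	public $start;
	public $end;
	public $is_true;

	public function __construct( $name, $value_start, $value_length, $start, $end, $is_true ) {
		$this->name            = $name;
		$this->value_starts_at = $value_start;
		$this->value_length    = $value_length;
		$this->start           = $start;
		$this->end             = $end;
		$this->is_true         = $is_true;
	}
}

// Hypothetical helper: read an attribute value lazily from the source HTML.
function example_attribute_value( $html, $token ) {
	if ( $token->is_true ) {
		return true; // Boolean attribute without a value, e.g. `hidden`.
	}
	return substr( $html, $token->value_starts_at, $token->value_length );
}

$html = '<div class="wide" hidden>';

// Offsets located by hand for the illustration; a parser would record them while scanning.
$class_start = strpos( $html, 'class' );
$value_start = strpos( $html, 'wide' );
$class_token = new Example_Attribute_Token(
	'class',
	$value_start,
	strlen( 'wide' ),
	$class_start,
	$value_start + strlen( 'wide' ) + 1, // One past the closing quote.
	false
);

$hidden_start = strpos( $html, 'hidden' );
$hidden_end   = $hidden_start + strlen( 'hidden' );
// Assumption: a valueless attribute gets a zero-length value at the end of its name.
$hidden_token = new Example_Attribute_Token( 'hidden', $hidden_end, 0, $hidden_start, $hidden_end, true );

$class_value  = example_attribute_value( $html, $class_token );
$hidden_value = example_attribute_value( $html, $hidden_token );
$class_source = substr( $html, $class_token->start, $class_token->end - $class_token->start );
```

Because only integers and the name are stored, the value string is materialized only when something asks for it.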
56 changes: 56 additions & 0 deletions lib/experimental/html/class-wp-html-text-replacement.php
@@ -0,0 +1,56 @@
<?php
/**
* HTML Walker: Text replacement class.
*
* @package WordPress
* @subpackage HTML
* @since 6.1.0
*/

/**
* Data structure describing a replacement of existing content from start to end; it allows a drastic improvement in performance.
*
* @since 6.1.0
*
* @see WP_HTML_Walker
*/
class WP_HTML_Text_Replacement {
Review comment (Contributor):
As above, please change so this code is not accessible.

/**
* Byte offset into document where replacement span begins.
*
* @since 6.1.0
* @var int
*/
public $start;

/**
* Byte offset into document where replacement span ends.
*
* @since 6.1.0
* @var int
*/
public $end;

/**
* Span of text to insert in document to replace existing content from start to end.
*
* @since 6.1.0
* @var string
*/
public $text;

/**
* Constructor.
*
* @since 6.1.0
*
* @param int $start Byte offset into document where replacement span begins.
* @param int $end Byte offset into document where replacement span ends.
* @param string $text Span of text to insert in document to replace existing content from start to end.
*/
public function __construct( $start, $end, $text ) {
$this->start = $start;
$this->end = $end;
$this->text = $text;
}
}
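A replacement record like the one above can be applied to a document in a single pass. The sketch below shows one possible way to do that, assuming that `$end` is an exclusive byte offset and that spans do not overlap. `Example_Text_Replacement` and `example_apply_replacements()` are hypothetical names introduced for this illustration and are not taken from the PR.

```php
<?php
// Hypothetical stand-in mirroring the fields of WP_HTML_Text_Replacement,
// defined here only so the sketch runs outside WordPress.
class Example_Text_Replacement {
	public $start;
	public $end;
	public $text;

	public function __construct( $start, $end, $text ) {
		$this->start = $start;
		$this->end   = $end;
		$this->text  = $text;
	}
}

// Hypothetical helper: apply non-overlapping replacements in one left-to-right pass.
// Assumption: $end is the byte offset just past the replaced span.
function example_apply_replacements( $html, array $replacements ) {
	usort(
		$replacements,
		function ( $a, $b ) {
			return $a->start - $b->start;
		}
	);

	$output = '';
	$cursor = 0;
	foreach ( $replacements as $replacement ) {
		// Copy the untouched text before this span, then the new text.
		$output .= substr( $html, $cursor, $replacement->start - $cursor ) . $replacement->text;
		$cursor  = $replacement->end;
	}

	// Copy whatever follows the last replaced span.
	return $output . substr( $html, $cursor );
}

$html = '<img src="old.png" alt="">';

$src_start = strpos( $html, 'old.png' );
$alt_start = strpos( $html, 'alt="' ) + strlen( 'alt="' );

// Deliberately passed out of order to show that sorting by start offset handles it.
$updated = example_apply_replacements(
	$html,
	array(
		new Example_Text_Replacement( $alt_start, $alt_start, 'Logo' ), // Zero-length span: pure insertion.
		new Example_Text_Replacement( $src_start, $src_start + strlen( 'old.png' ), 'new.png' ),
	)
);
```

Collecting the edits first and writing the output once avoids re-copying the whole document for every change.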