Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Tag Processor] Merge the test files into a single file #44593

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -1,15 +1,15 @@
<?php
/**
* Unit tests covering WP_HTML_Tag_Processor functionality.
* This file takes about 100ms to run because it does not load
* any WordPress libraries:
* This file can be run for rapid development without loading any
* WordPress libraries as follows:
*
* ```
* ./vendor/bin/phpunit --no-configuration ./phpunit/html/wp-html-tag-processor-test.php
adamziel marked this conversation as resolved.
Show resolved Hide resolved
* ```
*
* Put all new WP_HTML_Tag_Processor tests here, and only add new cases to
* wp-html-tag-processor-test-wp.php when they cannot run without WordPress.
* There might be minor discrepancies between the command above and a full
* CI run with WordPress loaded – the latter is always right.
*
* @package WordPress
* @subpackage HTML
Expand Down Expand Up @@ -277,10 +277,10 @@ public function test_set_attribute_on_a_non_existing_tag_does_not_change_the_mar
* @dataProvider data_set_attribute_escapable_values
* @covers set_attribute
*/
public function test_set_attribute_prevents_xss( $attribute_value ) {
public function test_set_attribute_prevents_xss( $value_to_set, $expected_result ) {
$p = new WP_HTML_Tag_Processor( '<div></div>' );
$p->next_tag();
$p->set_attribute( 'test', $attribute_value );
$p->set_attribute( 'test', $value_to_set );

/*
* Testing the escaping is hard using tools that properly parse
Expand All @@ -297,23 +297,26 @@ public function test_set_attribute_prevents_xss( $attribute_value ) {
preg_match( '~^<div test=(.*)></div>$~', (string) $p, $match );
list( , $actual_value ) = $match;

$this->assertEquals( $actual_value, '"' . esc_attr( $attribute_value ) . '"' );
$this->assertEquals( $actual_value, '"' . $expected_result . '"' );
}

/**
* Data provider with HTML attribute values that might need escaping.
*/
public function data_set_attribute_escapable_values() {
return array(
array( '"' ),
array( '&quot;' ),
array( '&' ),
array( '&amp;' ),
array( '&euro;' ),
array( "'" ),
array( '<>' ),
array( '&quot";' ),
array( '" onclick="alert(\'1\');"><span onclick=""></span><script>alert("1")</script>' ),
array( '"', '&quot;' ),
array( '&quot;', '&quot;' ),
array( '&', '&amp;' ),
array( '&amp;', '&amp;' ),
array( '&euro;', '&euro;' ),
array( "'", '&#039;' ),
array( '<>', '&lt;&gt;' ),
array( '&quot";', '&amp;quot&quot;;' ),
array(
'" onclick="alert(\'1\');"><span onclick=""></span><script>alert("1")</script>',
'&quot; onclick=&quot;alert(&#039;1&#039;);&quot;&gt;&lt;span onclick=&quot;&quot;&gt;&lt;/span&gt;&lt;script&gt;alert(&quot;1&quot;)&lt;/script&gt;',
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still uncomfortable with re-encoding the tests of esc_attr() in our test here. this adds a deep coupling that can come back and frustrate people trying to make unrelated changes.

for instance, in our actual code and discussion of the tag processor we have essentially delegated responsibility for attribute encoding and sanitization to esc_attr(). we don't do additional work and we don't depend on any particular behavior other than its own guarantees.

why should we then extend additional constraints in the tests that we never wanted to make in the code or in conversation? we're not actually testing if this prevents XSS anyway - we're testing that it runs through esc_attr(), and I'd rather leave the XSS prevention testing to that function.

ultimately this isn't the biggest problem and our code is littered with deeply-coupled integration tests like it, but at the same token our test suite is already in a place where it's frustrating how often tests fail probabilistically and where it can be hard to understand or figure out what's wrong when seemingly-unrelated tests fail.

if we're going to take responsibility for XSS protection and incorporate that code ourselves (which I don't think we'll do) then I think it would be more appropriate to test here. if we're going to delegate to esc_attr() then we should either lean on its tests to cover the scenarios, or if it doesn't have them, lean on the history and reporting that has happened over the years to vet it.

" -> &quot;, for example, is neither an XSS attack nor evidence that we're protected. it's a good demonstration that we encode the HTML entities and that we won't break out of the syntax; &quot; -> &quot; does neither and really appears to assert an internal behavior of esc_attr() and nothing else.

it would be nice if it were easier and natively-available in PHP to create test spies, which is what I was emulating by comparing against esc_attr( $value ), but every time I've tried that it involved user-space libraries with particular restrictions.

if we want to retain some of these tests then our naming is confusing. we could name it test_set_attribute_properly_escapes_html_entities() and then give explanations for the constraints we are setting on it, which would free someone to swap out esc_attr() while maintaining our requirements. I don't think our requirements are as strict as shown here.

array(
    'escapes double-quote to avoid breaking out of HTML' => array( '"', '&quot;' ),
    'avoids double-encoding existing HTML entities' => array( '&quot;', '&quot;' ),
);

I'm not sure if any of the other constraints is really there. the example with onclick seems to be covered by the assertion on quotes alone.

lots of thoughts 😆

Copy link
Contributor Author

@adamziel adamziel Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmsnell I agree – I only did that to have an automated check in place to confirm that esc_attr is indeed used in all the different cases. Ideally we'd have a spy there, but it won't work if we load WordPress. Or maybe it would? We could register the esc_attr call and augment the return value to something like ESC_ATTR( $original_input ) using a filter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well blimey, we can do this with the attribute_escape filter

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in 80452d8
should be easy to revert if we need to.
I removed the comment at the top about running this in isolation since we can't anymore; not if we want to assert that we called esc_attr() and also verify behaviors of esc_attr(). this is fine.

Copy link
Contributor Author

@adamziel adamziel Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Looks good to me, too. This PR needs a rebase but should be good otherwise – I'll look into that

);
}

Expand Down
91 changes: 0 additions & 91 deletions phpunit/html/wp-html-tag-processor-wp-test.php

This file was deleted.