Add sanitization reporting #912
Changes from 14 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -506,13 +506,25 @@ public static function get_amp_component_scripts() { | |
* Start output buffering. | ||
*/ | ||
public static function start_output_buffering() { | ||
ob_start( array( __CLASS__, 'finish_output_buffering' ) ); | ||
ob_start( array( __CLASS__, 'finish_buffer_add_header' ) ); | ||
} | ||
|
||
/** | ||
* Finish output buffering. | ||
* Get the result of the output buffering, and add a header. | ||
* | ||
* @todo Do this in shutdown instead of output buffering callback? | ||
* @param string $output Buffered output. | ||
* @return string Finalized output. | ||
*/ | ||
public static function finish_buffer_add_header( $output ) { | ||
$markup = self::finish_output_buffering( $output ); | ||
AMP_Mutation_Utils::add_header(); | ||
We'll want to limit this so the header is only added when a user specifically requests this additional information. For example, there should be a nonce that must be present to authorize the reporting.
Thanks, @westonruter. That's a good idea to add the header only for users with a nonce.
Applied with a nonce, details to come shortly.
|
||
return $markup; | ||
} | ||
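The review thread on `add_header()` asks that the header be sent only when an authorized nonce is present. A minimal sketch of such a gate follows. The query-parameter name `amp_validate_nonce`, the function name, and the injected verifier are assumptions for illustration, not the plugin's code. In WordPress the verifier would typically wrap `wp_verify_nonce()`; it is injected here so the snippet runs on its own.

```php
<?php
/**
 * Decide whether validation reporting was explicitly requested and authorized.
 *
 * Hypothetical helper: the parameter name and signature are illustrative only.
 *
 * @param array    $query        Request query variables (for example, $_GET).
 * @param callable $verify_nonce Callback returning a truthy value for a valid nonce.
 * @return bool Whether the validation header may be sent.
 */
function amp_sketch_should_report( array $query, callable $verify_nonce ) {
	if ( empty( $query['amp_validate_nonce'] ) || ! is_string( $query['amp_validate_nonce'] ) ) {
		return false; // No nonce supplied: behave like a normal request.
	}
	return (bool) call_user_func( $verify_nonce, $query['amp_validate_nonce'] );
}

// Stand-in verifier for demonstration; real code would call wp_verify_nonce().
$verifier = function ( $nonce ) {
	return 'expected-nonce' === $nonce;
};
```

With a gate like this, both the header and the `mutation_callback` argument could be skipped entirely for ordinary visitors, so the extra tracking cost is only paid on authorized requests.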
|
||
/** | ||
* Finish output buffering. | ||
* | ||
* @global int $content_width | ||
* @param string $output Buffered output. | ||
* @return string Finalized output. | ||
|
@@ -523,6 +535,7 @@ public static function finish_output_buffering( $output ) { | |
$dom = AMP_DOM_Utils::get_dom( $output ); | ||
$args = array( | ||
'content_max_width' => ! empty( $content_width ) ? $content_width : AMP_Post_Template::CONTENT_MAX_WIDTH, // Back-compat. | ||
'mutation_callback' => 'AMP_Mutation_Utils::track_removed', | ||
Per above, this should be conditional based on whether an authorized nonce is present.
Good point. This is now only added if
||
); | ||
|
||
$assets = AMP_Content_Sanitizer::sanitize_document( $dom, self::$sanitizer_classes, $args ); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,212 @@ | ||
<?php | ||
/** | ||
* Class AMP_Mutation_Utils | ||
* | ||
* @package AMP | ||
*/ | ||
|
||
/** | ||
* Class AMP_Mutation_Utils | ||
* | ||
* @since 0.7 | ||
*/ | ||
class AMP_Mutation_Utils { | ||
What does Mutation stand for here? It could be because I am a native French speaker, but
You're right that
This commit changes the name of the classes.
From an architectural perspective, it is interesting that all methods are
It would be much better to not have all of these methods It's currently calling
amp_post_meta_box() might be a good model for this.
Sure, this is not really a pressing need or a blocker, though; it doesn't have to be addressed in this PR.
||
|
||
/** | ||
* The argument if an attribute was removed. | ||
* | ||
* @var array. | ||
*/ | ||
const ATTRIBUTE_REMOVED = 'removed_attr'; | ||
|
||
/** | ||
* The argument if a node was removed. | ||
* | ||
* @var array. | ||
*/ | ||
const NODE_REMOVED = 'removed'; | ||
|
||
/** | ||
* Key for the markup value in the REST API endpoint. | ||
* | ||
* @var string. | ||
*/ | ||
const MARKUP_KEY = 'markup'; | ||
|
||
/** | ||
* The attributes that the sanitizer removed. | ||
* | ||
* @var array. | ||
*/ | ||
public static $removed_attributes; | ||
|
||
/** | ||
* The nodes that the sanitizer removed. | ||
* | ||
* @var array. | ||
*/ | ||
public static $removed_nodes; | ||
|
||
/** | ||
* Tracks when a sanitizer removes an attribute or node. | ||
* | ||
* @param DOMNode|DOMElement $node The node in which there was a removal. | ||
* @param string $removal_type The removal: 'removed_attr' for an attribute, or 'removed' for a node or element. | ||
* @param string $attr_name The name of the attribute removed (optional). | ||
* @return void. | ||
*/ | ||
public static function track_removed( $node, $removal_type, $attr_name = null ) { | ||
if ( ( self::ATTRIBUTE_REMOVED === $removal_type ) && isset( $attr_name ) ) { // phpcs:ignore WordPress.NamingConventions.ValidVariableName.NotSnakeCaseMemberVar | ||
self::$removed_attributes = self::increment_count( self::$removed_attributes, $attr_name ); | ||
} elseif ( ( self::NODE_REMOVED === $removal_type ) && isset( $node->nodeName ) ) { | ||
self::$removed_nodes = self::increment_count( self::$removed_nodes, $node->nodeName ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.NotSnakeCaseMemberVar | ||
} | ||
} | ||
|
||
/** | ||
* Tracks when a sanitizer removes an attribute or node. | ||
* | ||
* @param array $histogram The count of attributes or nodes removed. | ||
* @param string $key The attribute or node name removed. | ||
It is a wonder Travis didn't pick up the misaligned comments; it might only sniff variable alignment.
Should this also align the comments, like:
Yes
||
* @return array $histogram The incremented histogram. | ||
*/ | ||
public static function increment_count( $histogram, $key ) { | ||
$current_value = isset( $histogram[ $key ] ) ? $histogram[ $key ] : 0; | ||
$histogram[ $key ] = $current_value + 1; | ||
return $histogram; | ||
} | ||
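To make the counting behavior concrete, here is a standalone copy of the same logic applied to a few removals. The function name `increment_count_sketch` and the sample node and attribute names are made up for illustration. Note that the histogram starts as `null`, mirroring the uninitialized static properties, and becomes an array on the first increment.

```php
<?php
// Standalone copy of the counting logic, for illustration only.
function increment_count_sketch( $histogram, $key ) {
	$current_value     = isset( $histogram[ $key ] ) ? $histogram[ $key ] : 0;
	$histogram[ $key ] = $current_value + 1;
	return $histogram;
}

$removed = null; // Mirrors the uninitialized static property.
foreach ( array( 'script', 'onclick', 'script' ) as $name ) {
	$removed = increment_count_sketch( $removed, $name );
}
// $removed is now array( 'script' => 2, 'onclick' => 1 ).
```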
|
||
/** | ||
* Gets whether a node was removed in a sanitizer. | ||
* | ||
* @return boolean. | ||
*/ | ||
public static function was_node_removed() { | ||
return ! empty( self::$removed_nodes ); | ||
} | ||
|
||
/** | ||
* Processes markup, to determine AMP validity. | ||
* | ||
* Passes $markup through the AMP sanitizers. | ||
* Also passes a 'mutation_callback' to keep track of stripped attributes and nodes. | ||
* | ||
* @param string $markup The markup to process. | ||
* @return void. | ||
*/ | ||
public static function process_markup( $markup ) { | ||
$args = array( | ||
'content_max_width' => ! empty( $content_width ) ? $content_width : AMP_Post_Template::CONTENT_MAX_WIDTH, | ||
'mutation_callback' => 'AMP_Mutation_Utils::track_removed', | ||
); | ||
AMP_Content_Sanitizer::sanitize( $markup, amp_get_content_sanitizers(), $args ); | ||
} | ||
|
||
/** | ||
* Registers the REST API endpoint for validation. | ||
* | ||
* @return void. | ||
*/ | ||
public static function amp_rest_validation() { | ||
This isn't currently used, though it would be helpful for Gutenberg block validation.
||
register_rest_route( 'amp-wp/v1', '/validate', array( | ||
'methods' => 'POST', | ||
'callback' => array( __CLASS__, 'validate_markup' ), | ||
'args' => array( | ||
self::MARKUP_KEY => array( | ||
'validate_callback' => array( __CLASS__, 'validate_arg' ), | ||
), | ||
), | ||
'permission_callback' => array( __CLASS__, 'permission' ), | ||
) ); | ||
} | ||
|
||
/** | ||
* The permission callback for the REST request. | ||
* | ||
* @return boolean $has_permission Whether the current user has the permission. | ||
*/ | ||
public static function permission() { | ||
return current_user_can( 'edit_posts' ); | ||
} | ||
|
||
/** | ||
* Validate the markup passed to the REST API. | ||
* | ||
* @param WP_REST_Request $request The REST request. | ||
* @return array|WP_Error. | ||
*/ | ||
public static function validate_markup( WP_REST_Request $request ) { | ||
$json = $request->get_json_params(); | ||
if ( empty( $json[ self::MARKUP_KEY ] ) ) { | ||
return new WP_Error( 'no_markup', 'No markup passed to validator', array( | ||
'status' => 404, | ||
) ); | ||
} | ||
|
||
return self::get_response( $json[ self::MARKUP_KEY ] ); | ||
} | ||
|
||
/** | ||
* Gets the AMP validation response. | ||
* | ||
* If $markup isn't passed, | ||
* It will return the validation errors the sanitizers found in rendering the page. | ||
* | ||
* @param string $markup To validate for AMP compatibility (optional). | ||
* @return array $response The AMP validity of the markup. | ||
*/ | ||
public static function get_response( $markup = null ) { | ||
if ( isset( $markup ) ) { | ||
self::process_markup( $markup ); | ||
} | ||
$response = array( | ||
'has_error' => self::was_node_removed(), | ||
'removed_nodes' => self::$removed_nodes, | ||
'removed_attributes' => self::$removed_attributes, | ||
It would be useful if the processed markup were also returned here. It could then be used for previewing, for example.
Hi @westonruter, that would help. Do you think we should return the processed markup only if we're validating a limited amount of markup, like a single Gutenberg block? We could detect this by whether
If
Yes, if markup is supplied to validate, then it should get returned.
Thanks, @westonruter. These commits output the
||
); | ||
self::reset_removed(); | ||
return $response; | ||
} | ||
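The review thread on `get_response()` concludes that the processed markup should be returned whenever markup is supplied for validation. One way that could look is sketched below as a standalone function. The function name and the `processed_markup` key are assumptions; the thread does not show the key the follow-up commits actually use.

```php
<?php
/**
 * Build a validation response, optionally including the processed markup.
 *
 * Hypothetical sketch: the 'processed_markup' key is an assumed name.
 *
 * @param array|null  $removed_nodes      Histogram of removed nodes.
 * @param array|null  $removed_attributes Histogram of removed attributes.
 * @param string|null $processed_markup   Sanitized markup, if markup was supplied.
 * @return array The response data.
 */
function amp_sketch_build_response( $removed_nodes, $removed_attributes, $processed_markup = null ) {
	$response = array(
		'has_error'          => ! empty( $removed_nodes ),
		'removed_nodes'      => $removed_nodes,
		'removed_attributes' => $removed_attributes,
	);
	if ( null !== $processed_markup ) {
		// Only present when the caller supplied markup to validate.
		$response['processed_markup'] = $processed_markup;
	}
	return $response;
}

$with_markup    = amp_sketch_build_response( array( 'script' => 1 ), null, '<p>Hello</p>' );
$without_markup = amp_sketch_build_response( null, null );
```

Omitting the key for full-page validation keeps the response header small, since an entire document would otherwise be serialized into it.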
|
||
/** | ||
* Reset the stored removed nodes and attributes. | ||
* | ||
* After testing if the markup is valid, | ||
* these static values will remain. | ||
* So reset them in case another test is needed. | ||
* | ||
* @return void. | ||
*/ | ||
public static function reset_removed() { | ||
self::$removed_nodes = null; | ||
self::$removed_attributes = null; | ||
} | ||
|
||
/** | ||
* Validate the argument in the REST API request. | ||
* | ||
* It would be ideal to simply pass 'is_string' in register_rest_route(). | ||
* But it always returned false. | ||
* | ||
* @param mixed $arg The argument to validate. | ||
* @return boolean $is_valid Whether the argument is valid. | ||
*/ | ||
public static function validate_arg( $arg ) { | ||
return is_string( $arg ); | ||
} | ||
|
||
/** | ||
* Output AMP validation data in the response header of a frontend GET request. | ||
* | ||
* This must be called before the document output begins. | ||
* Because the document is buffered, | ||
* The sanitizers run after the 'send_headers' action. | ||
* So it's not possible to call this function on that hook. | ||
* | ||
* @return void. $header The filtered response header. | ||
*/ | ||
public static function add_header() { | ||
This isn't actually used yet. I think it was a foundation for validating plugins on activation (#842).
||
header( sprintf( 'AMP-Validation-Error: %s', wp_json_encode( self::get_response() ) ) ); | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,6 +11,9 @@ | |
</properties> | ||
</rule> | ||
|
||
<rule ref="WordPress.Files.FileName.InvalidClassFileName"> | ||
<exclude-pattern>tests/*</exclude-pattern> | ||
</rule> | ||
Almost all of the test files begin with
||
<rule ref="WordPress.Arrays.MultipleStatementAlignment.LongIndexSpaceBeforeDoubleArrow"> | ||
<exclude-pattern>tests/test-tag-and-attribute-sanitizer.php</exclude-pattern> | ||
</rule> | ||
|
We can leave this as finish_output_buffering, I think.
Thanks, this commit changes the name back to finish_output_buffering().