Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sure the WP embeds security process is applied #4226

Merged
merged 1 commit into from
Feb 21, 2018

Conversation

imath
Copy link
Contributor

@imath imath commented Jan 2, 2018

Description

In #3548 @PareshRadadiya suggested a css fix to remove the link above the WordPress embed posts. @swissspidy advised to have a look at the wp-embed.js script and @aduth suggested to inject this script into the Sandbox component instead of adding some css rules.
After exploring another issue about the fact it is not possible to embed posts from the same site, it appears the security mechanism @swissspidy is probably thinking about when saying "it does a bit more", is not applied during REST requests made with the WP_oEmbed_Controller class that Gutenberg uses to fetch WordPress embeds.
The current PR is my attempt to fix both issues: the remaining top link and "self" embeds.

How Has This Been Tested?

I've tested this PR on a regular AMP on my Macbook on a Multisite config.

Screenshots (jpeg or gifs if applicable):

self-embed

Self embeds can be previewed with this PR and the above blockquote is removed as the security process is applied.

external-embeds

Embeds from other WordPress sites are still failing because, imho, the WordPress site does not include one of the function of this PR: gutenberg_filter_oembed_result(). I've added inline comments above this function.

Types of changes

Fixes self embedding WP posts and the remaining blockquote above the iframe for these kind of WP Embeds.

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.
  • My code follows has proper inline documentation.

@imath
Copy link
Contributor Author

imath commented Jan 2, 2018

i'll remove the multi spaces found in blocks/library/embed/index.js. But about the fact URL is not defined, should I use a comment to tell the linter to ignore or use something else like wpApiSettings.root.replace( '/wp-json', '')

@aduth
Copy link
Member

aduth commented Jan 3, 2018

But about the fact URL is not defined

Generally, you could simply be explicit about accessing URL on the window global, i.e. window.URL.

However, URL does not fall within the required browser support, specifically lacking support for IE11:

https://caniuse.com/#feat=url

That said, Node supports the same URL API via the url built-in:

https://nodejs.org/api/url.html#url_constructor_new_url_input_base

You should be able to import this directly, which will be polyfilled by Webpack.

import { URL } from 'url';

If you run into issues, it might be that the WHATWG URL API is too new† to have been polyfilled by Webpack, in which case you can lean on the legacy url.parse function.

† I don't know that it's necessarily new, but I learned only today that url.parse was considered legacy, as it was what I'm accustomed to using. 😄

@@ -142,6 +142,8 @@ export default class Sandbox extends Component {
}
`;

const wpEmbedScript = 'wp-embed' === this.props.type && window._wpEmbedScript ? ( <script type="text/javascript" src={ window._wpEmbedScript }></script> ) : '';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than making the Sandbox component aware of types of scripts, I wonder if it should instead accept a children prop that allows additional elements to be injected into the sandboxed content.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there were such a property, i guess it should be an object with at least two keys eg { head: 'to add styles', footer: 'to add JavaScripts' } as so far there's only some specific code for Video Embeds.

Btw, i've cleaned up the mess i put when i rebased earlier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there were such a property, i guess it should be an object with at least two keys

Perhaps. Though, to the best of my knowledge, loading a stylesheet at the end of a page is totally valid.

https://stackoverflow.com/a/21749882/995445

Alternatively, assuming appending styles and scripts are the common use case, we could specify styles and scripts as array props of URL strings, e.g.:

<Sandbox scripts={ [ window._wpEmbedScript ] } />

Related prior art: https://github.com/Automattic/wp-calypso/tree/master/client/lib/embed-frame-markup

If there were such a property

We can/should introduce it here, since it's as part of the introduction of the requirement to include an additional script in the sandbox code.

@@ -100,7 +100,14 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
event.preventDefault();
}
const { url } = this.props.attributes;
const apiURL = addQueryArgs( wpApiSettings.root + 'oembed/1.0/proxy', {
const rootUrl = new URL( wpApiSettings.root );
let action = 'proxy';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A ternary could work well here:

const action = includes( url, rootUrl.hostname ) ? 'embed' : 'proxy';

See also: https://lodash.com/docs/4.17.4#includes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, thanks a lot 👍

@@ -114,6 +121,10 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
return;
}
response.json().then( ( obj ) => {
if ( obj.html && -1 !== obj.html.indexOf( 'class="wp-embedded-content"' ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Complementing the above remark, _.includes also automatically handles falsey values, so this could be simplified to:

if ( includes( obj.html, 'class="wp-embedded-content"' ) ) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, i will include this for sure. Thanks.

@@ -114,6 +121,10 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
return;
}
response.json().then( ( obj ) => {
if ( obj.html && -1 !== obj.html.indexOf( 'class="wp-embedded-content"' ) ) {
obj.type = 'wp-embed';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mutation of obj here makes me a bit uneasy, especially since we pass the object around (see below this.getPhotoHtml( obj )). I might instead suggest making this part of the destructuring and reassignment of type:

let { html, type } = obj;
if ( /* ... */ ) {
	type = 'wp-embed';
}

Problem here is that you might run into a prefer-const ESLint warning since html is not reassigned. Generally speaking, I'd be in favor of enabling the "destructuring": "all" option, but otherwise you might either have to avoid the destructuring for html or destructure separately:

const { html } = obj;
let { type } = obj;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, i understand your concern, i'll try to "destructure separately".

lib/compat.php Outdated
return $data;
}

$data['html'] = wp_filter_oembed_result( $data['html'], (object) $data, $_GET['url'] );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: This function could be made more concise and have fewer return paths if we just moved this single line into the negation of the previous condition:

<?php

function gutenberg_filter_oembed_result( $data ) {
	if ( defined( 'REST_REQUEST' ) && false !== REST_REQUEST && ! empty( $data['html'] ) && ! empty( $_GET['url'] ) ) {
		$data['html'] = wp_filter_oembed_result( $data['html'], (object) $data, $_GET['url'] );
	}

	return $data;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks a lot.

lib/compat.php Outdated

return $data;
}
add_filter( 'oembed_response_data', 'gutenberg_filter_oembed_result', 14, 1 );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fourth argument here is redundant, since 1 is the default value if omitted.

https://developer.wordpress.org/reference/functions/add_filter/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you arrive at the priority of 14?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove the fourth argument. About the priority: wp_filter_oembed_result() must happen after get_oembed_response_data_rich() which already hooks at 10 oembed_response_data. So 11 or upper works. 14 leaves some space for others to eventually filter in between.

@imath
Copy link
Contributor Author

imath commented Jan 4, 2018

Thanks a lot for your recommandations and your explanations about the URL thing. I'll update the PR asap.

@imath
Copy link
Contributor Author

imath commented Jan 4, 2018

@aduth i've applied the changes but i guess i put a mess trying to rebase my branch, sorry! I'll try to see how to fix this to only show my 3 latest commits.

Copy link
Member

@aduth aduth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious... why isn't this script included in the markup generated from the embed endpoint? Can or should we consider updating / filtering the endpoint response to include it? (Maybe context-specific?)

@@ -142,6 +142,8 @@ export default class Sandbox extends Component {
}
`;

const wpEmbedScript = 'wp-embed' === this.props.type && window._wpEmbedScript ? ( <script type="text/javascript" src={ window._wpEmbedScript }></script> ) : '';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there were such a property, i guess it should be an object with at least two keys

Perhaps. Though, to the best of my knowledge, loading a stylesheet at the end of a page is totally valid.

https://stackoverflow.com/a/21749882/995445

Alternatively, assuming appending styles and scripts are the common use case, we could specify styles and scripts as array props of URL strings, e.g.:

<Sandbox scripts={ [ window._wpEmbedScript ] } />

Related prior art: https://github.com/Automattic/wp-calypso/tree/master/client/lib/embed-frame-markup

If there were such a property

We can/should introduce it here, since it's as part of the introduction of the requirement to include an additional script in the sandbox code.

@imath
Copy link
Contributor Author

imath commented Jan 6, 2018

@aduth I think the wp-embed.js script is not included in the markup generated from the embed endpoint, because it can deal with all the embedded WordPress posts of the page. In the following example:

4226patch

I'm using changes added by f17f6ed

I think we should probably do like this, instead of including wp-embed.js in each WordPress embed blocks.

I still need to improve this latest commit to avoid the nested ternary.

const { html } = obj;
let { type } = obj;

if ( includes( html, 'class="wp-embedded-content" data-secret' ) ) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the external WordPress site returns a response not containing the 'data-secret' attribute (eg: previous version of WordPress once the code we temporarily added in lib/compat.php will be included in WordPress core), then the response is treated like any other embed content.

Copy link
Member

@aduth aduth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Embeds from other WordPress sites are still failing because, imho, the WordPress site does not include one of the function of this PR: gutenberg_filter_oembed_result().

Can you clarify what you mean by this?

I think we should probably do like this, instead of including wp-embed.js in each WordPress embed blocks.

Personally I don't have a huge issue with wp-embed.js being loaded independently in each iframe, but I'm also overly cautious about dangerouslySetInnerHTML 😄 Noting that this is the same markup that we'd expect to be shown on the front of the site anyways, I think it's reasonable.

An initial thought I had was not being sure how well the embed script handles dynamically-added content (since the markup is added after the initial page load). From my testing this appears to work well though.

@@ -181,6 +189,21 @@ function getEmbedBlockSettings( { title, icon, category = 'embed', transforms, k
const parsedUrl = parse( url );
const cannotPreview = includes( HOSTS_NO_PREVIEWS, parsedUrl.host.replace( /^www\./, '' ) );
const iframeTitle = sprintf( __( 'Embedded content from %s' ), parsedUrl.host );
const embedWrapper = 'wp-embed' !== type ? (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: In a condition or ternary with both if and else parts, I tend to encourage the positive form as the first case, since "if positive else negative" reads more naturally than "if negative else positive".

lib/compat.php Outdated
@@ -228,6 +228,25 @@ function gutenberg_get_post_type_capabilities( $user, $name, $request ) {
return $value;
}

/**
* Make sure oEmbed REST Requests apply the WP Embed security mechanism for WordPress embeds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a bit light on rationale for why the security mechanism should be applied. Might be worth elaborating or adding a link to context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 6dd9e9a i've added a link to the relative Core ticket

lib/compat.php Outdated
*
* TODO: This is a temporary solution. This code should be included in WordPress Core.
*
* @since ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're here you can go ahead and set this to the next release version number (2.1.0).

@imath
Copy link
Contributor Author

imath commented Jan 17, 2018

Thanks a lot for your review @aduth

I've applied your recommandation in latest commit.

Can you clarify what you mean by this?

I'll try!

  1. Let's say Site A is using a Gutenberg version including this PR. When you embed a post from Site A into Site A, the WP Embed will work and the link at the top will be removed by wp-embed.json because it finds the data-secret attribute of the blockquote.

  2. Now in Site A, if i try to embed a content of a WordPress site B that uses a Gutenberg that is not including this PR, the data-secret attribute will be missing because gutenberg_filter_oembed_result() is not in WordPress site B yet. That's why for this case, the link above the Embed html will still be there. As soon as WordPress site B updates Gutenberg to include this PR, then the above link should be removed just like in case 1)

Copy link
Member

@aduth aduth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you resolve the merge conflict?

Will also wait to see how the build runs; I'm guessing the last one ran during Travis's downtime this past week.

Otherwise looks good 👍

lib/compat.php Outdated
/**
* Make sure oEmbed REST Requests apply the WP Embed security mechanism for WordPress embeds.
*
* @see https://core.trac.wordpress.org/ticket/32522
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a comment be made to this ticket (or a new ticket) describing what's being added here and why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I just opened this ticket in WordPress Core Trac.

@imath
Copy link
Contributor Author

imath commented Jan 21, 2018

@aduth I just updated the PR to solve the merge conflict.

@swissspidy
Copy link
Member

Out of curiosity, are you using rebase (vs. merge) to resolve merge conflicts? I'm getting a notification for e84b88df2b11a7a80945b4655b2d1b3bb6c0d1d1 over and over again.

Regarding the changes, I left a comment on that Trac ticket as I think this could be fixed in JS alone.

Copy link
Member

@pento pento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my comment on the core ticket, changing the oembed API result isn't an option.

However, on digging into this some more, I believe that Core's WP_oEmbed_Controller::get_proxy_item() is not correct here. $data is retrieved from WP_oEmbed::get_data(), but the HTML should be filtered, to match existing behaviour.

So, we'll need a new ticket on core.trac to fix the WP_oEmbed_Controller::get_proxy_item() behaviour.

The workaround in compat.php should look something like this:

function gutenberg_filter_oembed_result( $response, $handler, $request ) {
	if ( 'GET' === $request->get_method() && '/oembed/1.0/proxy' === $request->get_route() ) {
		$response->html = apply_filters( 'oembed_result', _wp_oembed_get_object()->data2html( $response, $_GET['url'] ), $_GET['url'], array() );
	}

	return $response;
}
add_filter( 'rest_request_after_callbacks', 'gutenberg_filter_oembed_result', 10, 3 );

And so that the resize message from the embed is recognised, the Sandbox htmlDoc should include:

<script type="text/javascript" src="/wp-includes/js/wp-embed.min.js" />

@imath
Copy link
Contributor Author

imath commented Jan 24, 2018

Thanks a lot @swissspidy for your feedback & @pento for your review, I’ll update the PR as required asap.

@imath
Copy link
Contributor Author

imath commented Jan 25, 2018

@pento I've applied your patch in 080ca75 but i had to add another condition block for self embeds (posts embedded in the same WordPress site) which was the subject of this PR initialy.

@aduth The good new is, thanks to Pento, the blockquote above self WordPress site embeds or external WordPress site embeds is now removed in both cases !

4226

Copy link
Member

@pento pento left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs some testing, but it allows us to drop the switching from embed/index.js. Everything goes through the proxy endpoint.

function gutenberg_filter_oembed_result( $response, $handler, $request ) {
	if ( 'GET' !== $request->get_method() ) {
		return $response;
	}

	if ( is_wp_error( $response ) && 'oembed_invalid_url' !== $response->get_error_code() ) {
		return $response;
	}

	// External embeds.
	if ( '/oembed/1.0/proxy' === $request->get_route() ) {
		if ( is_wp_error( $response ) ) {
			// It's possible a local post, so lets try and retrieve it that way.
			$post_id = url_to_postid( $_GET['url'] );
			$data = get_oembed_response_data( $post_id, apply_filters( 'oembed_default_width', 600 ) );

			if ( ! $data ) {
				// Not a local post, return the original error.
				return $response;
			}
			$response = (object) $data;
		}

		// Make sure the HTML is run through the oembed santisation routines.
		$response->html = wp_oembed_get( $_GET['url'], $_GET );
	}

	return $response;
}

lib/compat.php Outdated

// Internal embeds.
} elseif ( '/oembed/1.0/embed' === $request->get_route() ) {
$response['html'] = apply_filters( 'oembed_result', _wp_oembed_get_object()->data2html( (object) $response, $_GET['url'] ), $_GET['url'], array() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is removing the <script> from the public embed response, which is necessary for non-WordPress sites to be able to handle WordPress embeds. We shouldn't be filtering the response for this endpoint.

@imath
Copy link
Contributor Author

imath commented Jan 26, 2018

Hi @pento I confirm. I just updated the PR with your code and tested it and all WordPress embedded posts (self of external) are displayed the right way. Thanks a lot for your help.

About this failing test, i have no idea about how to fix the checksum integrity thing 😢

@swissspidy
Copy link
Member

Probably can be fixed by clearing the cache on Travis and re-running the tests.

lib/compat.php Outdated
// External embeds.
if ( '/oembed/1.0/proxy' === $request->get_route() ) {
if ( is_wp_error( $response ) ) {
// It's possible a local post, so lets try and retrieve it that way.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*possibly

@imath
Copy link
Contributor Author

imath commented Jan 27, 2018

Thanks for the feedback @swissspidy i've fixed the typo issue, and tests are now ok 😃

@imath
Copy link
Contributor Author

imath commented Feb 4, 2018

Hi @pento
As you haven't edited your review since my last PR update, does it mean you think there are still changes/improvements to be made ?

@swissspidy swissspidy requested a review from pento February 4, 2018 10:18
@swissspidy
Copy link
Member

I just requested a new review to make sure :-)

@imath
Copy link
Contributor Author

imath commented Feb 8, 2018

I've just updated the PR to resolve conflicts, just in case you find some time to progress on this.

imath added a commit to imath/gutenblocks that referenced this pull request Feb 13, 2018
Untill WordPress/gutenberg#4226 is fixed, this will make sure the data-secret mechanism is also applied to WordPress embeds in Gutenberg.
@pento
Copy link
Member

pento commented Feb 21, 2018

@imath: I was on vacation for the last few weeks. 🙂

If you could update your branch for the merge conflict, let's go ahead and get this merged.

…ism.

- Include the wp-embed.js script only once into the WP Editor.
- Add a temporary filter to make sure the HTML is run through the oembed sanitisation routines.
- Use a specific wrapper for WordPress Embeds.
@imath
Copy link
Contributor Author

imath commented Feb 21, 2018

Hi @pento, thanks a lot for your update (@aduth told me you were afk for a little while yes 😉 ). I've just updated my branch to fix the merge conflict.

@swissspidy
Copy link
Member

New core ticket: https://core.trac.wordpress.org/ticket/45142

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants