Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Standardize YouTube embed rewriting? #2390

Open
RByers opened this issue Feb 23, 2017 · 36 comments
Open

Standardize YouTube embed rewriting? #2390

RByers opened this issue Feb 23, 2017 · 36 comments
Labels
compat Standard is not web compatible or proprietary feature needs standardizing document conformance interop Implementations are not interoperable with each other topic: embed and object topic: media

Comments

@RByers
Copy link

RByers commented Feb 23, 2017

Apparently chromium, Gecko and WebKit all have code for rewriting YouTube flash embeds to use HTML instead (so they work when flash is disabled, such as on mobile):

If this is now effectively required for web compat, perhaps we should work to align our implementations and specify it in HTML just to make this a little more predictable / sane? For automated testing purposes, perhaps we should include in the definition an additional hostname like youtube.web-platform.test?

@mounirlamouri can you share any data on how much of the web benefits from this? I imagine there may be other cases where we want to ask whether this is worth doing. So we may be setting a precedent here. I assume we'd only ever consider something like this when it's has truly massive positive impact on the user experience. I think it's worth having some discussion on what magnitude of usage qualifies (eg. as a fraction of page views, and/or a fraction of HTTP Archive pages).

@annevk annevk added compat Standard is not web compatible or proprietary feature needs standardizing topic: media labels Feb 23, 2017
@mounirlamouri
Copy link
Member

According to our data, this is affecting somewhere between 1-3% of our users on desktop and less than that on mobile (probably half). It's not huge because it's mostly for old websites, which is also the reason why mobile is less affected.

Is there any other data you would like to see?

@zcorpan
Copy link
Member

zcorpan commented Feb 24, 2017

From https://nerdydata.com/ ("We index approximately 160 million websites") searching in "Deep Web":

src="//www.youtube.com/v/
Showing 1 - 50 (of approx. 22,090 found)

src="http://www.youtube.com/v/
Showing 1 - 50 (of approx. 281,530 found) 

src="https://www.youtube.com/v/
Showing 1 - 50 (of approx. 151,810 found) 

data="//www.youtube.com/v/
Showing 1 - 50 (of approx. 3,290 found) 

data="http://www.youtube.com/v/
Showing 1 - 50 (of approx. 28,200 found) 

data="https://www.youtube.com/v/
Showing 1 - 50 (of approx. 1,175 found) 

Adding these together and assuming the worst case (1 match is 1 site), that is ~0.3%.

@zcorpan
Copy link
Member

zcorpan commented Feb 24, 2017

For comparison, we have <image>:

   <dt>A start tag whose tag name is "image"</dt>
   <dd>
    <!-- As of 2005-12, studies showed that around 0.2% of pages used the <image> element. -->
    <p><span>Parse error</span>. Change the token's tag name to "img" and reprocess it. (Don't
    ask.)</p>
   </dd>

@RByers
Copy link
Author

RByers commented Feb 24, 2017

Thanks guys! As expected, this is pretty compelling data that this is important enough to justify some sort of fix. Here's some HTTP Archive data about the popularity of different embed origins/types (of the top 500k sites). So YouTube is present on ~0.27% of top sites, with histats.com behind that at about a tenth of that at 0.029%.

Rank	Origin	Type	Page count
1	http://www.youtube.com	application/x-shockwave-flash	1087	 
2	https://www.youtube.com	application/x-shockwave-flash	234	 
3	http://s10.histats.com	application/x-shockwave-flash	143	 
4	http://player.youku.com	application/x-shockwave-flash	102	 
5	http://www.xatech.com	application/x-shockwave-flash	98	 
6	http://	application/x-shockwave-flash	71	 
7	http://www.clocklink.com	application/x-shockwave-flash	69	 
8	http://blog.naver.com	application/x-shockwave-flash	66	 
9	http://www.dailymotion.com	application/x-shockwave-flash	61	 
10	http://img.i2i.jp	application/x-shockwave-flash	49	 
11	http://www.	application/x-shockwave-flash	49	 
12	http://www.smartadserver.com	application/x-shockwave-flash	48	 
13	http://home.uchat.co.kr	application/x-shockwave-flash	45	 
14	http://content.pop6.com	application/x-shockwave-flash	43	 
15	http://embed.scribblelive.com	application/x-shockwave-flash	41	
16	http://www.ustream.tv	application/x-shockwave-flash	32	 
17	http://chat.53kf.com	application/x-shockwave-flash	31	 
18	http://www.flickr.com	application/x-shockwave-flash	28	 
19	http://static-hw.xvideos.com	application/x-shockwave-flash	27	 
20	http://static.video.qq.com	application/x-shockwave-flash	27	 
21	http://pic.img80.com	application/x-shockwave-flash	26	 
22	http://vimeo.com	application/x-shockwave-flash	26	 
23	http://s3.rotaban.ru	application/x-shockwave-flash	24	 
24	https://embed-ssl.wistia.com	application/x-shockwave-flash	23	 
25	http://s4.rotaban.ru	application/x-shockwave-flash	23	 
26	http://h2.flashvortex.com	application/x-shockwave-flash	21	 
27	http://www.youtube-nocookie.com	application/x-shockwave-flash	19	 
28	http://c.brightcove.com	application/x-shockwave-flash	19	 
29	http://ws.amazon.com	application/x-shockwave-flash	19	 
30	http://localtimes.info	application/x-shockwave-flash	19	 

Query:

SELECT 
  REGEXP_EXTRACT(body, r'<embed[^>]*src=[\'"]?(http[s]?://[^/ "\']*)') AS embedsrc,
  REGEXP_EXTRACT(body, r'<embed[^>]*type=[\'"]?([^ "\']*)') AS type,
  COUNT(DISTINCT bodies.page) as count
  FROM [httparchive:har.2017_01_15_chrome_requests_bodies] 
  AS bodies  
JOIN (
  SELECT 
    page, url,
    JSON_EXTRACT(payload, '$._contentType') AS contentType
  FROM [httparchive:har.2017_01_15_chrome_requests]
) AS requests ON requests.url=bodies.url AND requests.page=bodies.page
WHERE REGEXP_MATCH(body, r'<embed[^>]*src=[\'"]?http[s]?://')
GROUP BY embedsrc, type
ORDER BY count DESC
LIMIT 5000

@rniwa
Copy link

rniwa commented Feb 25, 2017

I would be surprised this was required for compatibility.

@miketaylr
Copy link
Member

I would be surprised this was required for compatibility.

If there's no Flash plugin on the device or platform, this is how you get the embedded videos to keep working, right?

@zcorpan
Copy link
Member

zcorpan commented Feb 27, 2017

Couldn't most of the "rewriting" be done by YouTube instead of in the browser engine, though? That is, if the only hack in the browser engine is type="application/x-shockwave-flash" -> type="text/html" for youtube.com/v/.

(And maybe eventually, do such type rewriting for any URL...?)

@RByers
Copy link
Author

RByers commented Mar 2, 2017

Yeah, we've been discussing a bit a new generic web platform feature that could enable YouTube (and everyone else in a similar position) to solve this problem themselves (basically upgrading a flash embed to an iframe). It's not clear to me what the exact requirements would be for YouTube to be able to use such a feature. If people here feel that is worth pursuing, I can see if we can find someone at Google to drive this.

@qdot
Copy link

qdot commented Sep 8, 2017

Update: @mounirlamouri has pointed out that flash embeds are now completely deprecated by youtube. We've still got per-browser solutions to rewriting. Should we revisit this?

@annevk annevk added the interop Implementations are not interoperable with each other label Sep 9, 2017
@karlcow
Copy link
Member

karlcow commented Oct 30, 2023

Is there a counter somewhere for

<embed src="https://www.youtube.com/…"
       type="application/x-shockwave-flash"
…

@gregorypappas
Copy link

gregorypappas commented Nov 14, 2023

Hi @karlcow, use counters for embeds that are eligible for rewriting were added to Firefox a few months ago. See bug 1847295 for more info 🙂

https://mozilla.github.io/usecounters/#channel=nightly&group=YOUTUBEFLASHEMBED

@zcorpan
Copy link
Member

zcorpan commented Nov 20, 2023

Can Chromium also implement a use counter?

@RByers
Copy link
Author

RByers commented Nov 20, 2023

Can Chromium also implement a use counter?

I am embarrassed to see that I didn't add a UseCounter for this years ago. Absolutely, we can not only add a counter, but also make an informed opinion on our ability to remove this logic from chromium. Bug 1503916 assigned to myself, will plan to land the counter today.

Also note that Chromium's impl isn't just about YouTube anymore, but also dailymotion.com and vimeo.com. I assume we'd want to tackle all three at once, right? I'm adding a UseCounter for all such embed rewrites, but if it turns out to be too large to break, then we could always dig in and look at each independently (though perhaps if we're going to spec one, we should just spec them all?).

aarongable pushed a commit to chromium/chromium that referenced this issue Nov 20, 2023
To support the standards discussion here: whatwg/html#2390

Bug: 1503916
Change-Id: I886a5a1c3d87507de672c3dc60d010a6ed3c22dd
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5046750
Commit-Queue: Rick Byers <rbyers@chromium.org>
Reviewed-by: Dave Tapuska <dtapuska@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1227007}
@zcorpan
Copy link
Member

zcorpan commented Nov 21, 2023

OK, great. As far as I can tell, Gecko only rewrites for Youtube (here). I'll check if we've received web compat issues for dailymotion/vimeo embeds.

@RByers
Copy link
Author

RByers commented Nov 24, 2023

The counter has landed but probably too late to request a merge into Chrome 120, so it won't hit stable until Chrome 121 late January. Regardless data from early release channels should start showing up here soon. It's wildly unreliable but I have initial data from Chrome Canary on Windows - on the order of 0.001% of page loads, inline with the Firefox data. I'd guess that means it's too common still for us to just remove, but it's not impossible we could do so with more work. In particular, now that I have a rough idea of magnitude and that this is indeed a candidate for potential removal, I can opt this counter into anonymized URL collection - with any luck we could find >99% of the usage is one or two sites suggesting removal is potentially easy.

If we can't immediately remove it would you want to spec it, or maybe since it's been unspecced for so long we just wait another 6 months or something to get a sense for the trend line?

@gregorypappas
Copy link

Individual use counters for dailymotion/vimeo/youtube would be nice. My feeling, based on absolutely nothing, is that unshipping vimeo/dailymotion embed rewriting right now is viable. This would bring all the implementations closer together, allowing for easier specification if necessary.

I personally would like to see this hack completely removed from the web, so +1 for waiting another 6 months or so to get a sense for the trend line.

@RByers
Copy link
Author

RByers commented Nov 24, 2023

Regardless of the Usecounter values, I'm personally uncomfortable with Google Chrome having a workaround just for YouTube - even if it is, by far, the largest use. I'd prefer to either remove them all or keep them all in Chromium.

@zcorpan
Copy link
Member

zcorpan commented Nov 28, 2023

Let's see if it's possible to remove all of them, or simplify (maybe treat Flash type as text/html and let the affected sites serve HTML or redirect?).

@zcorpan
Copy link
Member

zcorpan commented Jan 16, 2024

https://chromestatus.com/metrics/feature/timeline/popularity/4740 now shows about 0.002% for December and January. No hits in "Adoption of the feature on top sites".

@RByers

In particular, now that I have a rough idea of magnitude and that this is indeed a candidate for potential removal, I can opt this counter into anonymized URL collection - with any luck we could find >99% of the usage is one or two sites suggesting removal is potentially easy.

Nice, what's the status of this?

@RByers
Copy link
Author

RByers commented Jan 16, 2024

I just checked and there's not enough data from Chrome 121 beta to identify any popular origins hitting this regularly enough (need >50 users per day for our privacy policy). Chrome 121 hits stable in early Feb, so hopefully we'll get some hits around mid Feb.

@zcorpan
Copy link
Member

zcorpan commented Feb 28, 2024

@RByers mid-Feb has passed. :) Is there enough data now for anonymized URL collection?

@RByers
Copy link
Author

RByers commented Mar 9, 2024

Yep, thanks for the ping. We do indeed have good data now, and I've done an analysis and shared some examples in the chromium issue. The usage appears to have quite a long tail (top 10 origins account only for 2% of the usage we can measure), and there are lots of examples like blogs and forum posts which are unlikely to get fixed but where the user experience would be significantly diminished by breaking the embeds. As a result, I don't believe we can reasonably remove this behaviour from Chromium, sorry! I also don't see any evidence of downward trend in the few months of UseCounter data we now have. So I don't expect this to just resolve with time over the next few years.

So my recommendation is that we standardize this rewrite. Thoughts?

@zcorpan
Copy link
Member

zcorpan commented Mar 12, 2024

Thanks!

I agree. @gregorypappas and I also looked at a few in the "top sites" of https://chromestatus.com/metrics/feature/timeline/popularity/4740 , it looks like there are various education and government sites affected also.

WebKit tried to remove rewriting in https://bugs.webkit.org/show_bug.cgi?id=224453 but brought it back in https://bugs.webkit.org/show_bug.cgi?id=237182 cc @achristensen07 @cdumez

Since Gecko and WebKit seem to be fine with only YouTube rewrite, not Vimeo or Dailymotion, that seems like the minimum to standardize to address the web compat problem. Is that a problem for Chromium even if you can point to the spec?

@zcorpan
Copy link
Member

zcorpan commented Mar 12, 2024

@RByers
Copy link
Author

RByers commented Mar 14, 2024

Oh nice stats. From my perspective it doesn't make the system really any more complex to have a list of rewrite rules rather than a single one, and there's clearly non-trivial usage of Vimeo too. But if there's no other implementation interested in adding the Vimeo or Dailymotion support, then yeah WHATWG has to just standardize the only thing with multiple-implementor interest.

Chromium will keep the others as non-standard extensions. I just want to go on record saying I don't think it's a great look for the HTML spec to special case a Google property alone, and certainly not something Google would endorse or encourage 😄 . There was a general web compat problem with embedded video when the web evolved to not depend on flash for video, and Chromium is happy to mitigate that compat problem for any video provider with non-trivial usage of their legacy flash embed.

@zcorpan
Copy link
Member

zcorpan commented Mar 15, 2024

Well now you're being reasonable, @RByers! 😄

@gregorypappas thoughts about supporting vimeo and dailymotion rewrites? @achristensen07 @cdumez same for WebKit?

@gregorypappas
Copy link

thoughts about supporting vimeo and dailymotion

It wouldn't be hard, and doing so would mildly increase interoperability, so we might as well just do it. As noted in earlier comments, I still oppose this mechanism in principal, and would hate to see these hacks remain in implementations forever. So even when standardized, I think we should always be eyeing the removal of this stuff if usage decreases enough, even if it takes 20 years :)

@annevk
Copy link
Member

annevk commented Mar 17, 2024

Does/will everyone at least log some kind of console error similar to synchronous XMLHttpRequest?

@RByers
Copy link
Author

RByers commented Sep 10, 2024

Does/will everyone at least log some kind of console error similar to synchronous XMLHttpRequest?

Sorry I missed this. I'll do you one better, in Chromium I'll add a deprecation report (which has an associated console error) so that sites watching for usage of deprecated functionality in Chrome in the wild will get a hit on this. In practice we've found that this (and console errors) don't tend to actually drive a meaningful reduction in usage, but I still think it's the right thing to be doing. With luck there will be a point some years in the future where we can say the web doesn't need this anymore (we're just long past the point of hoping that'll come soon IMHO).

@zcorpan
Copy link
Member

zcorpan commented Sep 12, 2024

Gecko seems to have had a console warning for the YouTube embed rewrite since at least 2016.

https://searchfox.org/mozilla-central/source/dom/locales/en-US/chrome/dom/dom.properties#257

More things we could do:

  • Make the rewrite-triggering embed cases non-conforming HTML, and update conformance checkers.
  • Update the MDN page for embed to cover what authors should do if they have an old-style embed, which can be linked to in console warnings and conformance checker errors.

@zcorpan
Copy link
Member

zcorpan commented Sep 13, 2024

I first thought this could be testable by intercepting with service workers, but per https://w3c.github.io/ServiceWorker/#implementer-concerns it seems that is not the case. Maybe that can be changed now that plugins are no longer a thing?

It is detectable whether a rewrite happened or not, e.g. window.length is different. But I assume we want to test more aspects than "did a rewrite happen".

Adding new hostnames as @RByers suggested (youtube.web-platform.test) is an option, but it seems nicer to be able to test the real URLs.

@annevk
Copy link
Member

annevk commented Sep 19, 2024

We shouldn't change how service workers deal with object and embed for the purposes of this feature. There's much more to revamp about how object and embed work and that's tracked by dedicated issues.

window.length being different is a longstanding issue about iframe elements being detectable across the shadow root boundary. I wouldn't want to enshrine that even further by making it impossible to fix due to tests. In fact, I'd kinda lean towards testing the opposite...

@domenic
Copy link
Member

domenic commented Sep 20, 2024

Are shadow roots involved here? I thought the rewrite was at the HTML parser level.

@zcorpan
Copy link
Member

zcorpan commented Sep 20, 2024

We shouldn't change how service workers deal with object and embed for the purposes of this feature. There's much more to revamp about how object and embed work and that's tracked by dedicated issues.

OK, I can file an issue for service workers so object and embed handling can be evaluated on its own. Edit: I commented here w3c/ServiceWorker#249 (comment)

window.length being different is a longstanding issue about iframe elements being detectable across the shadow root boundary. I wouldn't want to enshrine that even further by making it impossible to fix due to tests. In fact, I'd kinda lean towards testing the opposite...

WebKit's implementation uses an internal shadow root: https://github.com/WebKit/WebKit/blob/main/Source/WebCore/html/HTMLPlugInElement.cpp#L273-L282 & https://github.com/WebKit/WebKit/blob/main/Source/WebCore/Modules/plugins/YouTubePluginReplacement.cpp#L82-L104

That WebKit uses an internal shadow root seems like an implementation detail. The object or embed element isn't in a shadow root, and per spec those elements can create a browsing context like iframe. If we want to hide browsing contexts in shadow roots, I think that should be when the author-visible element is in a shadow root.

I can also see an argument to hide YouTube-rewritten elements generally, in that it would more closely emulate the Flash-era behavior (which I assume did not affect window.length).

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Oct 22, 2024
…early beta r=emilio

Use counters in Firefox and Chromium are showing that
this mechanism can't be disabled in release.

Let's re-enable in nightly and early beta.

See whatwg/html#2390 for more info

Differential Revision: https://phabricator.services.mozilla.com/D226355
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this issue Oct 23, 2024
…early beta r=emilio

Use counters in Firefox and Chromium are showing that
this mechanism can't be disabled in release.

Let's re-enable in nightly and early beta.

See whatwg/html#2390 for more info

Differential Revision: https://phabricator.services.mozilla.com/D226355
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this issue Oct 24, 2024
…early beta r=emilio

Use counters in Firefox and Chromium are showing that
this mechanism can't be disabled in release.

Let's re-enable in nightly and early beta.

See whatwg/html#2390 for more info

Differential Revision: https://phabricator.services.mozilla.com/D226355
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compat Standard is not web compatible or proprietary feature needs standardizing document conformance interop Implementations are not interoperable with each other topic: embed and object topic: media
Development

No branches or pull requests

10 participants