Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sniffing considerations #169

Open
padenot opened this issue Apr 6, 2021 · 12 comments
Open

Sniffing considerations #169

padenot opened this issue Apr 6, 2021 · 12 comments
Labels
extension Interface changes that extend without breaking. image issues related to image decoding and encoding

Comments

@padenot
Copy link
Collaborator

padenot commented Apr 6, 2021

After long periods of time during which having a mime-types were mandatory to render audio or video, the HTML spec now says that sniffing should be implemented (with very specific steps from MIMESNIFF), and is in fact required for example for decodeAudioData in the Web Audio API, since there is no mime type information on a byte buffer (unlike with a URL or a Blob).

Here, we're in a novel situation. There is a bit more chances to have a mime type, because authors have probably (but not necessarily) just demuxed a byte stream, and they must have a good idea of what's in it, but in the end it's really just a series of bytes, for the video or audio case.

For images, less so. Authors are used to sniffing and never really touching the byte stream, so this is a bit of a problem.

Additionally, requiring a mime type means that in terms of layering, web codecs cannot be used to implement AudioContext.decodeAudioData or the <img> tag without having an author-provided implementation of MIMESNIFF.

@dalecurtis
Copy link
Contributor

dalecurtis commented Apr 6, 2021

The decision to drop mime sniffing was made in https://github.com/dalecurtis/image-decoder-api/issues/1

@domenic @jyasskin

@chcunningham
Copy link
Collaborator

chcunningham commented Apr 9, 2021

Summary of discussion in this week's editors call:
@padenot is anticipating the feedback that sniffing is required if WebCodecs were to be used to reimplement <video> or <img>. My counter points were: reimplementing <video> requires a number of higher level things (e.g. demuxing) which are not the responsibility of codecs. Additionally, I recommend that anyone storing assets anywhere store some metadata like their mime type to facilitate capability detection via ImageDecoder.isTypeSupported() (avoid the broken image experience).

@mathiasbynens
Copy link

Which repo contains the proposed spec text for ImageDecoder.isTypeSupported? The flagged Chromium implementation seems to be promise-based, which differs from isTypeSupported on MediaSource — is this intentional? https://github.com/dalecurtis/image-decoder-api/issues/6

@dalecurtis
Copy link
Contributor

Image types almost universally come with their mime-type attached or are 1:1 with their extension, so just adding sniffing for the handful of cases where the extension and mime-type are unavailable wasn't compelling when we discussed this previously.

The sniffing code for images is all incredibly simple too (just looking at first 16 bytes), so I don't see much loss in having the client implement:
https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/renderer/platform/image-decoders/image_decoder.cc;l=95;drc=043d32542213b601bf5852f6cd5950c949f411fa

@chcunningham
Copy link
Collaborator

Which repo contains the proposed spec text for ImageDecoder.isTypeSupported? The flagged Chromium implementation seems to be promise-based, which differs from isTypeSupported on MediaSource — is this intentional? dalecurtis/image-decoder-api#6

ImageDecoder was incubated in its own repository, but is now moving to WebCodecs. The text in PR #152 (preview) reflects the latest plans.

Making isTypeSupported() async is intentional. Lets discuss motivations in your comment on the PR.

@padenot padenot added the image issues related to image decoding and encoding label Apr 29, 2021
@padenot
Copy link
Collaborator Author

padenot commented Apr 29, 2021

Not sniffing would prevent implementing <img> without having to look at the bytes of the resources, which means it's not really possible to layer (as it's always possible to layer by doing lots of work in script). Not all resources are served with a mime type, and not all resources have an extension or even a file name.

For images, if sniffing is (1) easy and (2) frequently necessary for authors to implement, it's then clear this needs to be added in the spec and implementations to make it easy for authors. I'll note that determining that an HEIF is an AVIF (that one can display on some shipping browser today in their stable version) and not an HEIC (that is a very common image format that no browser regardless of their version can display today) is not sniffing a few bytes: https://source.chromium.org/chromium/chromium/src/+/master:third_party/blink/renderer/platform/image-decoders/avif/avif_image_decoder.cc;drc=221e331b49dfefadbc6fa40b0c68e6f97606d0b3;l=504

For videos and audio, sniffing is notoriously non-trivial, but this specification is not about demuxers (in the audio/video case), so this is a non-issue.

@dalecurtis
Copy link
Contributor

Ultimately I defer to @domenic and @jyasskin here since they are most familiar with current policies on sniffing from our side.

As a matter of strategy I'd say we should keep it as required until we actually get complaints or requests to loosen it, since it's always possible to loosen this requirement later without impacting existing clients.

A few is 144 in the linked case and I'd argue you actually only need to look at the major brand (first 12 bytes), but your point is taken that it's definitely not as simple as older formats.

@chcunningham chcunningham added the extension Interface changes that extend without breaking. label May 12, 2021
@chcunningham
Copy link
Collaborator

Triage note: marking 'extension' as desire for sniffing would probably be signaled by simply omitting the 'type' member from the ImageDecoderInit.

@padenot
Copy link
Collaborator Author

padenot commented May 17, 2021

An important point on something a bit tangential w3ctag/design-reviews#633 (comment). I'll note that unlike what is being discussed there, the data is already in clear in the process and inspectable.

@baumanj
Copy link

baumanj commented May 18, 2021

A few is 144 in the linked case and I'd argue you actually only need to look at the major brand (first 12 bytes), but your point is taken that it's definitely not as simple as older formats.

Having had a lot of discussions about what's required in the brands of ISOBMFF files recently, I don't think it is sufficient to limit the check to the major_brand (though that would suffice in most cases). One of the features offered by the compatible brands arrangement is to allow a single file to be legally interperable as different types. While I don't see that as necessarily desirable or common in the case of AVIF, I do think it's technically legal and since there's only one major brand, assuming that avif will occur there and not solely in the compatible_brands array would reject files which are valid per the specs.

@dalecurtis
Copy link
Contributor

Thanks for the insight. compatible_brands is still part of the ftyp at the front of the file, so it doesn't change the difficulty of sniffing implied by my comment though.

@baumanj
Copy link

baumanj commented May 19, 2021

Digging further, 144 bytes isn't a sufficient limit either. It's an arbitrary one in the chromium code. Firefox uses 512 since the size of an ftyp box is unbounded*, and the only limit that the mimesniff spec declares is 1445 bytes in total for sniffing. Updating the mimesniff spec to have a more reasonable limit on MP4 sniffing would be beneficial, but the overall point is that from a security standpoint, it's not reasonable to assume that our sniffing reads will be limited to 12 bytes in all MP4 cases.

* ftyp is a Box, which typically expresses its size as a unsigned 32-bit int, but there are facilities for extending that to a 64-bit value or until the end of the file (See ISO/IEC 14496-12:2020 § 4.2.2). That said, the mimesniff algorithm for MP4 assumes a 32-bit size, which is totally reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension Interface changes that extend without breaking. image issues related to image decoding and encoding
Projects
None yet
Development

No branches or pull requests

5 participants