-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sample Flags - Correctness vs Reality #221
Comments
Attempting to change the mp4 container to get around bad content is I believe a moot exercise as browsers are moving away from using any container informations to determine a frame type, but instead parsing the raw bit stream. |
@jyavenard to be clear I'm not trying to "get around" bad content. I'm trying to strike a balance of producing content which is spec compliant to a fault (many extraneous bytes that no one uses) but includes all the information necessary for playback with a decode error rate hopefully approaching zero percent. For example, we had a test mux done that verbosely defined every sample-flag to correct values. However, the first-sample-flag and its value were left in accidentally, and this confused the box parsing code in Edge (which is spec compliant). The content wouldn't play and failed silently. This content however played in every other browser implementation. If all MSE implementations are already moving away from reading MP4 metadata for keyframe-ness, would that be something that could be included as part of the specification for ISOBMFF parsing for MSE? We were looking for examples of others who were producing good content at the MP4 level and could not find an open or closed source encoder which encoded ISOBMFF content that either defined verbose sample flags or, through another mechanism, correctly defined every sync-sample as such. It appears there is a defacto industry standard to not define the MP4 metadata completely, to ensure that the first sample flag is correct and that the rest are set to the unknown values. |
@itsjamie Did you find any examples of others producing good content at the MP4 level? |
@michaeljweaver it's likely there is buffered range gap produced as a result. Chrome stable recently has ramped up to 99% of sessions using spec-compliant buffering-by-PTS (not DTS) ranges. That buffering relies on correct keyframeness metadata to manage GOP presentation intervals, and since so much of content previously had mp4 metadata for avc keyframeness that mismatched the reality embedded in the actual AVC coded frames, at the same time as switching to buffering-by-PTS, Chrome also now trusts the AVC keyframe metadata over the MP4. |
Thanks @wolenetz, yes you are correct there is a buffered range gap produced as a result. There are two frames that are not buffered correctly. On further investigation, I think the MSE stream is SAP Type 2, where the decode order is IBBPBB, but the presentation order is BBIBBP. So the PTS of the first two B frames precedes that of the random access point. |
@michaeljweaver yes, as noted by #187, the MSE spec doesn't guide implementations around handling the buffering/overlap removal/etc of the portion of a SAP-Type-2 GOP's presentation interval prior to that GOP's keyframe. In Chrome, in a continuous (in DTS) series of coded frames that comprise one or more SAP-Type-2 GOPs that are themselves directly adjacent (that is, the GOPs are adjacent and sequential and don't overlap) in PTS intervals, then they will be buffered continuously, though the very first GOP's start time will be reported as the that GOP's keyframe PTS. This is a compromise made in the implementation to support known SAP-Type-2 streams while the spec remains ambiguous. Hopefully, this is enough information to unblock you. If not, please file a https://crbug.com (and if you can populate the component as Internals>Media>Source, that will reach me more directly). |
@wolenetz thanks, I have raised https://bugs.chromium.org/p/chromium/issues/detail?id=954115 |
@michaeljweaver Thank you for reporting. Let's continue the discussion of this specific case on that crbug. |
Closing the loop, @michaeljweaver, it was indeed SAP-Type-2 GOPs causing the problem. Especially when the GOPs are appended out-of-order (e.g. GOP3, then GOP1, then GOP2, etc.), in the worst case, each SAP-Type-2 GOP's preceding nonkeyframes may be of sufficient duration to prevent gap coalescence in some MSE implementations like Chrome as a result of #187 not being fixed. Workaround in Chrome, at least, would be to append them in order GOP1,2,3, etc.: so long as the DTS sequence is continuous across those GOPs (and the GOPs presentation intervals don't overlap), then they should be buffered continuously without gaps. A longer term spec fix as an alternative or even complementary to #187 is #160. I expect to put forth a proposal to solve at least the standardization of the gap tolerance/coalescence portion of #160 soon. |
In debugging a few issues, I've started to look at how encoders are defining sample_flags at the ISOBMFF level, and noticed through working with content and talking with others in debugging this content that often times the data was incorrect. So I looked at what would go into defining the information as correct as possible.
However, what I found was that many encoders were taking advantage of
default_sample_flags
in thetfhd
box, andfirst_sample_flags
in thetrun
box.This works for achieving playback, but through talking with @wolenetz and debugging a rise of failures related to our MP4 metadata in the Chrome implementation, we were notified that some frames were incorrectly marked as not keyframes when they in fact were.
In our case, verbosely defining the sample_flags for every sample entry lead to a box size increase of ~10% (600 bytes) per segment and was also of an unknown value as some demuxers only look for the first keyframe currently.
I believe that currently multiple
trun
boxes could be defined for each GOP, but in my investigation, I didn't see any encoder that output fragmented MP4 in this way. Curious if anyone knew why?Alternatively, I wanted to run by here, and if it makes sense raise for a revision to 14496-12 for including into a future revision of the spec.
The proposal is to add a new flag to the
trun
box,indexed-sample-flags-present
. It would be treated as mutually exclusive tosample-flags-present
, andfirst-sample-flags-present
.It would be an optional field + variable field array combination of the type:
I believe it would require a new minor_compliance brand in the same way as
default-base-is-moof
. This would allow a singletrun
box to define specific samples that had non-default flags easily. I believe striking a balance.Thank you for any consideration of this.
The text was updated successfully, but these errors were encountered: