[WIP] Metadata #104
The idea seems OK to me, though I'm nervous about a large metadata registry ballooning the specification. I'd recommend citing an external source for the semantics, storage, and vocabulary of metadata values where possible, and avoiding internal definitions. This approach does seem interesting for handling round-trips of DPX color data in DPX->FFV1->DPX scenarios. I also suggest posting a summary of this concept to the CELLAR working group.
There are streams where width/height change mid-stream, so ideally a lossless format should not put w/h in a global header. The version check is also bad, as micro version depends on version: the check misbehaves after version 3. A better way is to combine the integer and fractional version parts before comparing. The metadata format has issues too; there's no way to store a key as a string, for example. And you mix count and size in the pseudo-code: one is unused, the other looks unintentionally used twice.
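A minimal sketch of the "combine before comparing" suggestion, not the check from the PR itself. The function names and the threshold (3, 5) are hypothetical and only serve the illustration; the point is that testing version and micro_version independently gives the wrong answer as soon as version goes past 3.

```python
def naive_at_least(version, micro_version, min_version, min_micro):
    # Fragile form: each part is tested on its own, so a newer major
    # version with a small micro_version is wrongly rejected.
    return version >= min_version and micro_version >= min_micro


def combined_at_least(version, micro_version, min_version, min_micro):
    # Combine both parts into one ordered key before comparing
    # (tuples compare lexicographically in Python).
    return (version, micro_version) >= (min_version, min_micro)


# Hypothetical threshold: "feature available from version 3, micro_version 5".
print(naive_at_least(4, 0, 3, 5))     # the independent test rejects version 4.0
print(combined_at_least(4, 0, 3, 5))  # the combined test accepts it
```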
Branch updated from 6099a67 to 11f341e.
From the feedback I understand you are not against the idea itself. The current state of the PR was more a proposal of the idea than a cleaned-up patch, so that I would not spend too much time on it if it was not the right direction. So now I am fixing the issues raised and asking for validation of the technical part; then I will expand the text around the modifications.
Most containers do not support changes mid-stream (AFAIK AVI, MKV and MP4 don't), and the ones that do are extended with e.g. an intermediate layer providing the complete configuration record of the new stream configuration. A lot of things could change mid-stream (bit depth, color space...), so I don't think that having an exception for width/height, without the same possibility for all other fields, makes a lot of sense. Anyway, if it is blocking I changed to
We didn't define a fractional part; I am reluctant to add a complete definition only for that.
there is,
Typo, fixed. At the same time, I changed
@dericed I am thinking of either just putting the algorithm with no definition of any metadata id (we put them in a separate document), or listing the metadata names, linking to the Matroska specs, and centralizing the definitions in Matroska, as Matroska is in CELLAR too.
Please review again (principle and technical part) and let's agree on the principle, then I will expand it.
My 22 cents:
Got it, just remarking that everything could change (e.g. we could want to keep the same slice size, so we need 4x more slices in the UHD production, so we need to change that mid-stream too).
It is sometimes problematic, e.g. when the frame rate is indicated in the bitstream (in that case the container frame rate is preferred, as the time stamps come from it). Anyway, I think it is a player policy and I don't see how to "force" that in a spec.
Definitely something we need to work on, by e.g. permitting a completely new Configuration Record mid-stream (to be discussed).
My suggestion is that a player SHOULD. (At the Berlin symposium I proposed container before codec as well, then I changed my mind. I guess I said that on the CELLAR list. Not blocking at all on my side, I was just sharing some experience I gathered in our playgrounds.)
Branch updated from 11f341e to af285de.
I added a line about it, for debate.
About the principle: I think it's useful in principle, but the currently proposed WIP changes feel rushed and hacked together ATM. I am in favor of it in principle, but it must be improved, and I am not sure that what we will need and want will be possible before v4.
w=h=0 (provided by container) is not helping, considering that changing w/h should happen at the codec level. For most existing containers especially, changing parameters needs to be at the codec level. And even if a future container allows storing "per frame" w/h, that still won't be enough, I suspect. There's more needed at the codec level.
String metadata keys are not supported in your proposal. You call it "metadata meaning" / "metadata_id"; the list is finite and fixed, so a user cannot store things that are not listed in it.
As for string values, there's a binary case supported, but strings should probably be a type different from binary data. Also, with strings the encoding needs to be specified. I assume that would always be UTF-8; this should be mentioned in the spec. The spec should mention what to do about invalid UTF-8 encodings; this may also need an expansion of the security section.
Also, if you want to make the stream independent of the container (by adding w/h), you also need timebase and timestamps or something equivalent.
The encoding in your proposal allows storing 1D arrays. It also lists matrices, but how would a user-defined matrix work, as in "mymatrix" and 6 elements? That could be a 3x2 or a 2x3 matrix.
And with string metadata comes the question of language. A title (which is metadata) is in a language, and there may be multiple titles in multiple languages.
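To make the invalid UTF-8 question concrete, here is a small sketch of one possible decoder policy: decode strictly and reject anything malformed. The policy and the function name are assumptions for illustration only; the proposal does not state what a decoder should do.

```python
def decode_metadata_string(raw: bytes):
    """Return the text if `raw` is well-formed UTF-8, else None.

    Hypothetical "reject" policy: overlong forms, encoded surrogates and
    truncated sequences are all refused instead of being repaired.
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


print(decode_metadata_string("FFV1 \u00e9".encode("utf-8")))  # well-formed
print(decode_metadata_string(b"\xc0\xaf"))      # overlong encoding of "/"
print(decode_metadata_string(b"\xed\xa0\x80"))  # encoded surrogate
print(decode_metadata_string(b"\xe2\x82"))      # truncated sequence
```

Other policies (replacing bad bytes, or skipping the whole item) are equally possible; whichever is chosen would have to be written down in the spec.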
I, at least, need it now (color metadata is often lost during trans-wrapping), more than optimization of the compression, and as it does not break v3 decoding I think it is valuable to have in v3. It is definitely not needed for v3 standardization; it is just that I heard there is no work on improving FFV1 in parallel to standardization, so I am proposing patches for improving FFV1 with what is in my pending list (and for the moment I see no need for a break in the specs, so no v4 stuff for this kind of change).
It would allow having width/height in the bitstream without preventing your use case. I don't understand the issue; please suggest a method for transporting width/height without issue for anyone.
I don't understand what is not working:
The idea is to be very light, and there is actually no need for more. I could still add a new type "char", but I don't see why we would need it, as "ur" is good for UTF-32.
As said, UTF-32 is possible, "ur" can handle 32-bit values.
IMO that belongs to the UTF-8 (actually UTF-32) specification; Matroska does not define it either, and a decoder can just skip the content.
Timebase stuff is the role of the container; even MPEG-TS does that (making info about width/height mandatory in the codec, as there is no place for that elsewhere).
That would be in the definition of the metadata. I don't define any 1D array, just a sequence of elements (with a dedicated count). It is generic, and the goal is not to have complex stuff at the codec level. Matroska does the same, having few types (signed, unsigned, string as UTF-8 since they don't have "Range Coder" stuff, and binary used for any 2D/3D array).
I don't expect to transport a title or any localized metadata. The idea is not to recreate Matroska, just to transport video-related metadata, like a lot of other video formats do (H.264 VUI and SEI etc...).
You seem not to understand what I mean. Let me give you a very silly example
In an application that supports only string-based metadata, binary will always have to be converted to a string in some form, base64 encoding or whatever. Now if there is no information on whether data is a string or binary, that would likely become messy in an implementation.
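A rough sketch of why an explicit type matters for a string-only application. The tag values, the `MetadataValue` class and the helper name are all hypothetical; nothing here is taken from the proposal.

```python
import base64
from dataclasses import dataclass

# Hypothetical type tags; the proposal as discussed has no such distinction.
TYPE_STRING = 0
TYPE_BINARY = 1


@dataclass
class MetadataValue:
    type_tag: int
    payload: bytes


def to_display_string(value: MetadataValue) -> str:
    # With a tag, a string-only application knows which conversion to apply.
    if value.type_tag == TYPE_STRING:
        return value.payload.decode("utf-8")
    # Binary data has to be wrapped somehow; base64 is one option.
    return base64.b64encode(value.payload).decode("ascii")


# The same six bytes, shown two different ways depending on the tag.
print(to_display_string(MetadataValue(TYPE_STRING, b"libfoo")))
print(to_display_string(MetadataValue(TYPE_BINARY, b"libfoo")))
```

Without the tag, the application would have to guess which of the two outputs is the right one.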
OK, but what you say here about MPEG is not correct. MPEG-TS or PS does not in general store timestamps for every frame at the container level. The frame rate and frame duration from the codec layer have to be used to find per-frame timestamps in MPEG-1/2 (this becomes more complex after MPEG-2). In this sense the codec-layer frame rate represents the timebase, and the frame/field repeat flags represent the timestamps.
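As a deliberately simplified illustration of that last point (not a full MPEG-2 timing model): assume every coded frame lasts two field periods, or three when its repeat_first_field flag is set. The function name and the input list are made up for the example.

```python
from fractions import Fraction


def frame_start_times(frame_rate, repeat_first_field_flags):
    """Derive per-frame start times from codec-layer information only.

    Simplified assumption: a frame lasts 2 field periods, or 3 when its
    repeat_first_field flag is set (as used for 3:2 pulldown).
    """
    field_period = 1 / (2 * frame_rate)  # codec frame rate acts as the timebase
    t = Fraction(0)
    starts = []
    for repeat in repeat_first_field_flags:
        starts.append(t)
        t += field_period * (3 if repeat else 2)
    return starts


# Three frames at 30000/1001 fps, the second one with a repeated field.
for start in frame_start_times(Fraction(30000, 1001), [False, True, False]):
    print(start)
```

No container timestamp is consulted anywhere: the timeline comes entirely from the frame rate and the repeat flags.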
As we can add content at the end of Parameters() in version 3, we don't need to wait for version 4 to add metadata that does not impact the frame bitstream.
I suggest adding the frame size (width, height) in order to be independent from the container (some containers, e.g. MPEG-TS, do not have a place for such information) and self-describing, and also adding a metadata part (same reason: self-describing and losing nothing even when transwrapping).
The design goal is to let a decoder skip metadata items it does not know (i.e. if we add a new metadata id, we don't need to increase the version or micro_version number), as all metadata items are optional and not needed for decoding (e.g. matrix coefficients are just set as unknown if not present or not supported by the decoder, as is the case today).
metadata_size would be 1 most of the time; I expect exceptions for character strings, e.g. library name (it would be the count of Unicode characters in this metadata).
Names of metadata come from Matroska.
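A minimal sketch of the "skip what you don't know" idea. It does not reproduce the pseudo-code of this PR and ignores the range coder entirely: it works on an already-decoded list of integers. The layout (a count, then for each item an id, a size, and that many values), the `metadata_count` name and the id numbering are assumptions for illustration; only the two names are real Matroska element names.

```python
# Hypothetical id numbering; the names are Matroska element names.
KNOWN_IDS = {1: "MatrixCoefficients", 2: "TransferCharacteristics"}


def parse_metadata(symbols):
    """Parse an assumed layout: count, then (id, size, size values) per item."""
    it = iter(symbols)
    known, skipped = {}, []
    metadata_count = next(it)
    for _ in range(metadata_count):
        metadata_id = next(it)
        metadata_size = next(it)
        # The values are always consumed, so an unknown item cannot
        # desynchronize the parsing of the items that follow it.
        values = [next(it) for _ in range(metadata_size)]
        name = KNOWN_IDS.get(metadata_id)
        if name is None:
            skipped.append(metadata_id)  # unknown id: ignore the content
            continue
        known[name] = values
    return known, skipped


# Three items; the middle one (id 99, four values) is unknown to this decoder.
stream = [3, 1, 1, 9, 99, 4, 76, 105, 98, 120, 2, 1, 16]
print(parse_metadata(stream))
```

Because the size is read before the values, an older decoder can step over an id added later without any change to version or micro_version.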
Work in progress, for feedback and agreement about the way to do it, before I expand it.