[WIP] Support for encodings using floating-point values #197
base: master
Please don't. Instead, add a pixel format. If adding support in some part like swscale is too hard, then just skip that part, but do not add more hacks like a downconversion to 16-bit integers.
colorspace_type is the wrong field to indicate float formats. Also, I would not drop the "integer" from the transform: whatever is done, it needs to be an integer transform, else precision and rounding become an issue with floats and lossless coding.
I was planning to split the work in order to reduce the number of changes in the corresponding patch, but I am fine with adding 16-bit float support in FFmpeg at the same time if I don't have to do the swscale work at the same time, so:
Here it makes the patch easier. I was afraid of this task and of having my patch rejected if I didn't do it, which is why I was planning to do it the way it is done in EXR.
I don't get that, how? IIUC, adding a new field will make old decoders just skip it, not what we would want. using
The idea is definitely the opposite, I'll keep the word and add a "MUST consider float values as integers" somewhere else.
Does a version 3 decoder decode it to something meaningful? Version 4 is not final yet, so any decoder attempting to decode it accepts potential failure. If you want to add a method of more fine-grained "file support" detection, that's fine; it wouldn't be a bad thing. But please don't hack new features into semantically wrong fields.
Instead of one float flag in the header, we could add fields that specify the number of bits in the mantissa, whether there is a sign bit, and the range of exponents (larger than 1 and how much detail around 0). This would be a superset of single- and double-precision IEEE floats, and should also improve speed, as fewer "always 0" planes would be stored.
I strongly support the idea of implementing floating-point formats in version 4 only, avoiding any attempt at a “hack” for version 3, and doing it consistently from scratch.
Any update on this?
The 16-bit float to 32-bit float conversion in the EXR decoder, done with tables, is not lossy; it's lossless: no data is lost.
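To make the proposal above a bit more concrete, here is a minimal sketch of what such header fields could describe. It is not part of the FFV1 specification: the FloatLayout name and its three fields are hypothetical, and the exponent range idea (how far above 1, how much detail around 0) is left out. It only shows that IEEE half, single and double precision fit the same three numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatLayout:
    """Hypothetical header fields describing a binary floating-point sample."""
    sign_bits: int      # 1 if a sign bit is stored, else 0
    exponent_bits: int  # width of the biased exponent field
    mantissa_bits: int  # width of the stored fraction (without the implicit 1)

    @property
    def total_bits(self):
        return self.sign_bits + self.exponent_bits + self.mantissa_bits

    def split(self, bits):
        """Split a raw bit pattern into (sign, exponent, mantissa) fields."""
        mantissa = bits & ((1 << self.mantissa_bits) - 1)
        exponent = (bits >> self.mantissa_bits) & ((1 << self.exponent_bits) - 1)
        sign_shift = self.mantissa_bits + self.exponent_bits
        sign = (bits >> sign_shift) & ((1 << self.sign_bits) - 1)
        return sign, exponent, mantissa


# The three IEEE 754 binary formats expressed with the same fields.
HALF = FloatLayout(sign_bits=1, exponent_bits=5, mantissa_bits=10)
SINGLE = FloatLayout(sign_bits=1, exponent_bits=8, mantissa_bits=23)
DOUBLE = FloatLayout(sign_bits=1, exponent_bits=11, mantissa_bits=52)
```

For example, HALF.split(0x3C00) splits the half-precision pattern of 1.0 into sign 0, biased exponent 15 and mantissa 0.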
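The losslessness of widening 16-bit floats to 32-bit floats can be checked exhaustively. The sketch below is only an illustration: it uses Python's struct module rather than the table-based code of the EXR decoder, the helper names are made up for this example, and NaN patterns are skipped because struct may not keep NaN payloads.

```python
import struct


def is_nan_pattern(bits):
    """True for half-precision NaNs: exponent all ones, mantissa non-zero."""
    return (bits & 0x7C00) == 0x7C00 and (bits & 0x03FF) != 0


def half_survives_float32(bits):
    """True if a 16-bit float pattern comes back unchanged after a float32 trip."""
    raw = struct.pack("<H", bits)
    (value,) = struct.unpack("<e", raw)                          # half -> Python float
    (widened,) = struct.unpack("<f", struct.pack("<f", value))   # store as float32
    return struct.pack("<e", widened) == raw                     # narrow back to half


# Every non-NaN half-precision bit pattern, including infinities and subnormals.
non_nan = [b for b in range(0x10000) if not is_nan_pattern(b)]
all_lossless = all(half_survives_float32(b) for b in non_nan)
```

all_lossless ends up True: every non-NaN half value is exactly representable as a 32-bit float, so narrowing it again gives back the original bits.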
I don't have any specific update on this. On our side we went with an awful but working hack: storing EXR 16-bit float as integer and signaling EXR 16-bit float through a side channel (compression is in practice like that of 16-bit integer, about 50% on average), but I am still interested in moving to something less hacky. In my opinion the discussion is more about how to signal the pix_fmt, and maybe about avoiding compressing bit planes that contain only 0, rather than about having a complex new path for compressing and decompressing. It could also be reused for integers, as sometimes some bits are 0 padding or the content is very dark (so the highest bit is always 0 in a slice). State of my thoughts about a superset of changes for v4:
The rationale is that we don't know the content in advance (it could e.g. be negative or greater than 1 in only one frame at the end), so we don't rule out this possibility at encoder and decoder init, but we permit a speed optimization by limiting the number of bits handled by the range coder. On the decoder side it is only an extra bit shift when the lower bits contain only 0, and nothing for the higher bits. What other optimizations do you see for this topic?
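A rough sketch of the per-slice idea described above, under the assumption that samples are handled as unsigned integer bit patterns. The function names and the (shift, width) pair are hypothetical and do not correspond to any FFV1 bitstream field. The encoder finds which low bits are always 0 and how many bits remain, and the decoder only needs one left shift.

```python
def used_bit_range(samples):
    """Return (shift, width): count of always-zero low bits, then bits left to code."""
    seen = 0
    for s in samples:
        seen |= s                      # a bit is set here if any sample uses it
    if seen == 0:
        return 0, 0                    # slice is all zeros, nothing to code
    shift = (seen & -seen).bit_length() - 1   # index of the lowest used bit
    width = seen.bit_length() - shift         # bits between lowest and highest used
    return shift, width


def pack_slice(samples):
    """Encoder side: drop the always-zero low bits before range coding."""
    shift, width = used_bit_range(samples)
    return shift, width, [s >> shift for s in samples]


def unpack_slice(shift, coded):
    """Decoder side: a single left shift restores the original bit patterns."""
    return [v << shift for v in coded]
```

With the half-precision patterns 0x3C00, 0x3800, 0x0000 and 0x3400 (1.0, 0.5, 0.0 and 0.25), the range is a shift of 10 and a width of 4, so only 4 bits per sample would reach the coder instead of 16.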
I think there are 3 different things here.
we can do 1+2 and treat each independently or look at 3 first and then decide
we have at least 3 things.
All 3 are wrong for floats, in the sense of not being "homomorphic". That is, if you take a few integers and a few floats that are equivalent in some sense, then these operations do not do the same thing to both.
Treating floats as integers when compressing? I doubt one can get any big compression gain that way; at least for the audio case it is bad...
I cannot share the files, but here is a real use case from a RAWcooked user (irrelevant details removed, same FFV1 configuration with v3 and 576 slices):
TLDR, FFV1 compresses this file by 60%! More generally, we see an average compression ratio like that, better than for our 16-bit int content (easy, lots of MSBs at 0... but still good!), which gets ~50% compression. Users really appreciate this compression ratio and prefer it to storing EXR files as is, and I doubt we could do a lot better without a lot of changes in FFV1. The current issue is not the compression ratio but the fact that there is no standard signaling of float.
I think we should switch to files that can be shared.
There's a chance that FFV1 maintenance work this year, and especially development of float support, will be funded. If that's the case I intend to investigate more completely how to optimally handle floats. The variant of simply treating them as integers isn't bad and I suggest we support that too, as it adds 0 complexity, but I agree with Paul that it should be possible to do better than that.
Are these real 32-bit float EXR files, with natural (camera footage) content and with synthetic (Blender-rendered), non-trivial content? EXR just has bad lossless 32-bit float compression, IIRC. If the current or a future coder in FFV1 can make extra reductions on mantissa and exponent bits (with no need for separate coding of the two), that would be a major win.
I wish I had that... I am interested in such files, because it seems very hard to get such content; I developed my hack "blind" and it was enough for my needs. In the meantime, some 16-bit float, non-real use cases, e.g.:
FYI tests are made with this ugly patch for FFmpeg and the lossless compression is confirmed when the added option is used.
It would be great if there is a demonstration that the additional complexity is worth it, I like FFV1 also because of its "low" complexity (very small code size compared to some other lossless formats).
I used as a starting point: https://openexr.com/en/latest/_test_images/index.html Yet I will check whether some of our clients are willing to publicly share examples.
This pull request expands FFV1 to provide lossless compression for additional pixel formats by adding support for floating-point values. It permits lossless encoding and decoding of the floating-point pixel formats (“pix_fmt”) that currently exist in FFmpeg (AV_PIX_FMT_GBRPF32, AV_PIX_FMT_GBRAPF32, AV_PIX_FMT_GRAYF32) as well as their (not yet existing) 16-bit counterparts.
Video formats such as EXR can use floating-point values.
Note about the implementation in FFmpeg: As FFmpeg does not (yet) support 16-bit floating-point RGB pixel formats, I plan to send a patch for ffv1dec using exactly the same decoding method FFmpeg uses for EXR (applying the lossy float-to-integer conversion function from the EXR implementation after FFV1 decoding, no encoding), with a decoding message warning about the lossy conversion. The additional complexity is minimal (a test on colorspace_type in order to apply the float-to-int conversion after decoding).
Note about the YCbCr part: as FFmpeg has AV_PIX_FMT_GRAYF32, I prefer to anticipate the support of such pix_fmt in order to have a coherent specification, by just increasing the colorspace_type value by 2 for each previous colorspace_type value.
Potential optimizations: In practice, for 16-bit content not all bits are used (bit 15 is the sign and so always 0 for non-negative content, and bit 14 is only set for magnitudes of 2.0 or more and so always 0 for content within the nominal 0 to 1 range); however, in theory it is possible to have values that are negative or greater than 1 (see the AllHalfValues.exr description in https://github.com/AcademySoftwareFoundation/openexr-images/tree/master/TestImages ), so we cannot simply omit these bits in the Parameters. Reducing the bit depth requires more complex changes (similar to how we could reduce the Y bit depth to bit_depth instead of bit_depth+1 with a colorspace_type of 1, as only Cb and Cr have a range of bit_depth+1 bits), which would be implemented in version 4. The idea of the implementation in version 3 is to keep the decoder nearly untouched.
This link has some sample files as a proof of concept (the colorspace_type value in the FFV1 bitstream is wrong for both the DPX header and FFV1 bitstream but permits to decode the floating-point numbers from current FFmpeg; FFmpeg patches are hacks for moving the float to int algo from EXR decoder to FFV1 decoder).
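The remark about bits 15 and 14 of 16-bit floats can be checked with a few values. This is a small sketch using Python's struct module, with helper names invented for the example. It shows that both top bits stay 0 for content in the 0 to 1 range, that bit 15 is set for negative values, and that bit 14 is set once the magnitude reaches 2.0.

```python
import struct


def half_bits(value):
    """Return the IEEE 754 half-precision bit pattern of a Python float."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits


def top_two_bits(value):
    """Return (bit 15, bit 14) of the half-precision encoding of value."""
    bits = half_bits(value)
    return (bits >> 15) & 1, (bits >> 14) & 1


for v in (0.0, 0.5, 1.0, 1.5, 2.0, -0.25):
    print(f"{v:6}: 0x{half_bits(v):04X} bits15,14 = {top_two_bits(v)}")
```

Since a single negative or large sample anywhere in the stream sets one of these bits, they cannot be dropped at init time without knowing the whole content in advance.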