Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native video decoder via ffmpeg CLI #7607

Closed
emilk opened this issue Oct 7, 2024 · 2 comments · Fixed by #8007
Closed

Native video decoder via ffmpeg CLI #7607

emilk opened this issue Oct 7, 2024 · 2 comments · Fixed by #8007
Assignees
Labels
enhancement New feature or request 🎞️ video

Comments

@emilk
Copy link
Member

emilk commented Oct 7, 2024

Send samples to ffmpeg over CLI, read back the results, show the user.

A basic demo of this is in this internal repository: https://github.com/rerun-io/video-experiments

The main approach is fairly simple:

 let mut ffmpeg = FfmpegCommand::new()
        .hide_banner()
        // Keep in mind that all arguments that are about the input, need to go before!
        .format("h264")
        .input("-")
        //.args(&["-c", "copy", "out.mp4"]) // For testing.
        .rawvideo() // Output rgb24 on stdout. (todo for later: any format we can read directly on re_renderer would be better!)
        .spawn()
        .expect("faild to spawn ffmpeg");

    let mut stdin = ffmpeg.take_stdin().unwrap();
    std::thread::spawn(move || {
          // Wait for samples to be sent on stdin.
          // then send them to ffmpeg.
          // See `write_sample_to_nalu_stream`
    });

    // On the main thread, run the output instance to completion
    ffmpeg.iter().unwrap().for_each(|e| match e {
        FfmpegEvent::Log(LogLevel::Error, e) => println!("Error: {}", e),
        FfmpegEvent::Progress(p) => println!("Progress: {} / 00:00:15", p.time),
        FfmpegEvent::OutputFrame(frame) => println!(
            "Received frame: time {:?} fmt {:?} size {}x{}",
            frame.timestamp, frame.pix_fmt, frame.width, frame.height
        ),
        evt => println!("Event: {evt:?}"),
    });

The difficult part is that we need to provide ffmpeg with a format that it can stream in on stdin. In this snippet this is done with format("h264"), so ffmpeg expects the .h264 format which is a stream of NAL units in Annex B format (== replace NAL lengths with start headers). Also, at every IDR frame, sequence parameter sets (SPS) and frame parameter sets (FPS) need to be inserted:

fn write_sample_to_nalu_stream(
    avc_box: &re_mp4::Avc1Box,
    nalu_stream: &mut dyn std::io::Write,
    sample: &re_mp4::Sample,
    video_track_data: &[u8],
    state: &mut NaluStreamState,
) -> Result<(), Box<dyn std::error::Error>> {
    let avcc = &avc_box.avcc;

    // Append SPS & PPS NAL unit whenever encountering an IDR frame unless the previous frame was an IDR frame.
    // TODO(andreas): Should we detect this rather from the NALU stream rather than the samples?
    if sample.is_sync && !state.previous_frame_was_idr {
        for sps in (&avcc.sequence_parameter_sets).iter() {
            nalu_stream.write_all(&NAL_START_CODE)?;
            nalu_stream.write_all(&sps.bytes)?;
        }
        for pps in (&avcc.picture_parameter_sets).iter() {
            nalu_stream.write_all(&NAL_START_CODE)?;
            nalu_stream.write_all(&pps.bytes)?;
        }
        state.previous_frame_was_idr = true;
    } else {
        state.previous_frame_was_idr = false;
    }

    // A single sample, may consist of multiple NAL units, each of which need our special treatment.
    // (most of the time it's 1:1, but there might be extra NAL units for info, especially at the start)
    let mut buffer_offset = sample.offset as usize;
    let sample_end = buffer_offset + sample.size as usize;
    while buffer_offset < sample_end {
        // Each NAL unit in mp4 is prefixed with a length prefix.
        // In Annex B this doesn't exist.
        let length_prefix_size = avcc.length_size_minus_one as usize + 1;

        // TODO: improve the error handling here.
        let nal_unit_size = match length_prefix_size {
            4 => u32::from_be_bytes(
                video_track_data[buffer_offset..(buffer_offset + 4)]
                    .try_into()
                    .unwrap(),
            ) as usize,
            2 => u16::from_be_bytes(
                video_track_data[buffer_offset..(buffer_offset + 2)]
                    .try_into()
                    .unwrap(),
            ) as usize,
            1 => video_track_data[buffer_offset] as usize,
            _ => panic!("invalid length prefix size"),
        };
        //println!("nal unit size: {}", nal_unit_size);

        if (sample.size as usize) < nal_unit_size {
            panic!(
                "sample size {} is smaller than nal unit size {nal_unit_size}",
                sample.size
            );
        }

        nalu_stream.write_all(&NAL_START_CODE)?;
        let data_start = buffer_offset + length_prefix_size; // Skip the size.
        let data_end = buffer_offset + nal_unit_size + length_prefix_size;
        let data = &video_track_data[data_start..data_end];

        // Note that we don't have to insert "emulation prevention bytes" since mp4 NALU still use them.
        // (unlike the NAL start code, the preventation bytes are part of the NAL spec!)

        nalu_stream.write_all(data)?;

        buffer_offset = data_end;
    }

    Ok(())
}

Open questions:

  • Does this work with H.265 as well? It uses the same formats overall
  • What about other codecs?
  • Does this approach scale well with other formats?
  • ffmpeg allows arbitrary output formats. Can we be clever about what to pick? E.g. if the decoder internally gives us YUV420 (that's most of the time but not always!), we should pass YUV420 on instead of rgb24 and processes this directly.
@emilk emilk added enhancement New feature or request 👀 needs triage This issue needs to be triaged by the Rerun team 🎞️ video and removed 👀 needs triage This issue needs to be triaged by the Rerun team labels Oct 7, 2024
This was referenced Oct 7, 2024
@Wumpf
Copy link
Member

Wumpf commented Nov 4, 2024

Left after the initial PR:

  • test with broken samples (completely invalid chunks)
  • download ffmpeg for the user or at least tell users what to do when there's no ffmpeg (haven't checked even yet on the error message)
  • add version & ability check
  • consume 420p and friends directly to speed up processing
  • check speed of windows process startup (make sure we don't get hickups on scrubbing)
  • update docs

Wumpf added a commit that referenced this issue Nov 4, 2024
### What

* Most of #7607 
* Replaces #7658 

Tested this with videos containing b frames, both long and short.

<img width="1403" alt="image"
src="https://github.com/user-attachments/assets/12da809b-5aee-4c06-ba30-394d82073a2a">

(note how PTS != DTS, indicating that there's a b-frame _somewhere_ ;-))

Biggest risk overall with the approach taken is that we can't detect
dropped frames properly

Left to do before landing:
* [x] test 4k 60fps video material
* works smoothly in release, not at 60fps in debug builds on my mac
because of rgb->rgba conversions on our side (see todos below)
* [x] test black & white (yuv400)
* [x] works but my video test material gave me:
`re_video::decode::ffmpeg]
/\//Users/andreas/dev/rerun-io/internal-test-assets/video/greyscale_360p_h264.mp4,
codec: H.264 (avc1.64001E) decoder: [swscaler @ 0x3001d8000] [warning]
deprecated pixel format used, make sure you did set range correctly`
every time ffmpeg started, need to fix that to a log_once (if ffmpeg
says it's deprecated then I figure we may report that 🤷, just don't
spam)
* [x] test yuv444/rgb video
* worked, but the test 4.4s test video I used stops working starting at
3.2s, recovers fine though
* Direct play browser: Firefox says the file is corrupt, Safari refuses
to play. Chrome handles it
        * Rerun play browser: Same, but different errors
        * ... letting this issue slide for now 🤷 

What's left to do (beyond general video bugs I logged while working on
this) before calling this "done":
* download ffmpeg for the user or at least tell users what to do when
there's no ffmpeg (haven't checked even yet on the error message)
* add version & ability check for ffmpeg?
* consume 420p and friends directly to speed up processing
* try to provoke frame dropping and see if we can detect & handle that -
one of the shortcomings of the overall approach today is that we don't
track this!

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7962?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7962?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] If have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/7962)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.

---------

Co-authored-by: Emil Ernerfeldt <emil.ernerfeldt@gmail.com>
@emilk emilk added this to the 0.20 - Maps, H.264 milestone Nov 4, 2024
@Wumpf
Copy link
Member

Wumpf commented Nov 5, 2024

process startup isn't much of an issue according to profiler, but wait for shutdown can cause severe hickups:
image

Wumpf added a commit that referenced this issue Nov 5, 2024
### What

* part of #7607 

<img width="1012" alt="image"
src="https://github.com/user-attachments/assets/317d2d07-d544-4743-86ef-c1ae51168b16">


Future work:
* download for you in the viewer to a location we manage
* have location be an application setting

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7991?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7991?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] If have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/7991)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
Wumpf added a commit that referenced this issue Nov 5, 2024
### What

* Part of #7607

(All perf numbers on Windows which is well known for slower process
handling compared to its competitors.)

Makes the ffmpeg listen/write thread shutdown much more rigorous. It
worked fine before, but on Windows in particular I often noticed long
stalls:

![383056190-113e4efa-3376-4b7b-96cf-dfd79fac8e4d](https://github.com/user-attachments/assets/a31186dd-3971-4e30-b0c3-7bf9c9facffd)

This happened mostly due to delays in "noticing" that ffmpeg was shut
down already. The shutdown fixes themselves, make the problem _almost_
go away. But it still happens that the "listen thread" takes a fair
while until it closes:

![image](https://github.com/user-attachments/assets/936372ab-db4c-4c16-adf4-d25e45fe27c6)

Occasionally I also still observed 100ms+ for this operation, especially
when having several decoders open at the same time (coincidence? did
that just make it more likely to get a bad one or is there more to it?).
So I decided to have the thread shut down in the background instead -
this is safe now, since the `on_output` callback gets disconnected prior
to the shutdown.

No profiler picture for this, since there's nothing left to look at ;-)

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7998?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/7998?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] If have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/7998)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
Wumpf added a commit that referenced this issue Nov 6, 2024
### What

* part of #7607

Recognizing ahead of time what format a piece of encoded data is
unfortunately not that easy. To accomplish that I fell into the rabbit
hole of parsing the Sequence Parameter Set (SPS). Which will hopefully
be useful for other things in the future!

Other than that, it turned out that determining the color space & color
range are again different cans of worms, so I'm being now a lot more
explicit with ffmpeg on what we expect.
Which.. is yet another rabbit hole! Pretty much no matter what I tried I
couldn't replicate exact colors some sample videos compared to VLC
(unlike we did with AV1 where we have a lot more solid grasp on all
these things and get satisfying parity!). But also here different video
players seem to do slightly different things, so I decided to not follow
this thread further for the moment as this whole affair already cost me
a lot more time that I wanted to invest originally.

~Partially disabled unfortunately since we need this fix:~
* nathanbabcock/ffmpeg-sidecar#61

<img width="390" alt="image"
src="https://github.com/user-attachments/assets/c340544a-0c32-406e-ba20-2acc630c0238">

(note that this uses 422, the video source material is 420 though)


Fixes slow 4k video handling:

Before:
<img width="512" alt="image"
src="https://github.com/user-attachments/assets/82a0f0a1-03e7-4eac-8a24-a17ba185a8b5">


After:
<img width="548" alt="image"
src="https://github.com/user-attachments/assets/fb8315f4-5a85-4c2e-83bb-7e7e0c8ded33">


### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/8002?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/8002?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] If have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/8002)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
Wumpf added a commit that referenced this issue Nov 6, 2024
…og quirks of older versions (#8005)

### What

* part of #7607

Most replacing randomness with warnings. I haven't deep dived on which
version will work but 5.1 went fine with my test data minus a bit of log
spam which I fixed here.

I considered adding a check for coded availability, but this seems a bit
cumbersome given that any off-the-shelf ffmpeg install comes with h264:
the problem is that while we get a full string of compile options that
ffmpeg was built with, we can't really infer the decoders from this
since any given library may fulfill various roles. Instead we'd need to
parse the output of `ffmpeg -codecs`. We should eventually! But doesn't
seem urgent right now.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested the web demo (if applicable):
* Using examples from latest `main` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/8005?manifest_url=https://app.rerun.io/version/main/examples_manifest.json)
* Using full set of examples from `nightly` build:
[rerun.io/viewer](https://rerun.io/viewer/pr/8005?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json)
* [x] The PR title and labels are set such as to maximize their
usefulness for the next release's CHANGELOG
* [x] If applicable, add a new check to the [release
checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)!
* [x] If have noted any breaking changes to the log API in
`CHANGELOG.md` and the migration guide

- [PR Build Summary](https://build.rerun.io/pr/8005)
- [Recent benchmark results](https://build.rerun.io/graphs/crates.html)
- [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

To run all checks from `main`, comment on the PR with `@rerun-bot
full-check`.
@Wumpf Wumpf closed this as completed in eec4915 Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request 🎞️ video
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants