calling avcodec_send_frame locks up with ffmpeg h264_omx encoder #433

Closed
jasaw opened this issue Jul 28, 2017 · 21 comments

@jasaw
Contributor

jasaw commented Jul 28, 2017

  1. motion version 280141f, ffmpeg release 3.3.2.
  2. compiled from source.
  3. ran as part of MotionEye initially, then used config generated by MotionEye to run motion as standalone.
  4. tested with mmal and v4l2.
  5. ARM (only reproducible on multiple models of Raspberry Pi).
  6. tested both Raspbian & MotionEyeOS.
  7. GPU firmware version: 3202f1b16896029f9da1b074b0912177e8960b52

How to trigger avcodec_send_frame lock up:

  • Make sure h264_omx encoder is used by motion (either modify motion to specifically choose the encoder, or get ffmpeg to always default to h264_omx encoder when H264 format is selected).
  • Set resolution at 1280 x 720. Frame rate makes no difference.
  • Record video into a movie file, mp4 format.
  • Run motion and trigger some motion.
  • At some point, it will lock up when calling avcodec_send_frame, i.e. the call never returns.

Is it locking up because of how motion interfaces with libav? Does anyone have any idea what might be happening?

I've checked the GPU status, and everything was OK.

vcgencmd get_mem <type>
Where type is:
arm: total memory assigned to arm
gpu: total memory assigned to gpu
malloc_total: total memory assigned to gpu malloc heap
malloc: free gpu memory in malloc heap
reloc_total: total memory assigned to gpu relocatable heap
reloc: free gpu memory in relocatable heap

vcgencmd get_throttled
0: under-voltage
1: arm frequency capped
2: currently throttled
16: under-voltage has occurred
17: arm frequency capped has occurred
18: throttling has occurred
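
The bit list above can be turned into a small check. This is a minimal sketch, assuming the tool prints a line of the form throttled=0x...; the names parse_throttled and power_problem_seen are made up for illustration and are not part of motion or the firmware tools.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Bit positions taken from the get_throttled list above. */
enum {
    THROTTLED_UNDER_VOLTAGE_NOW  = 1u << 0,
    THROTTLED_FREQ_CAPPED_NOW    = 1u << 1,
    THROTTLED_THROTTLED_NOW      = 1u << 2,
    THROTTLED_UNDER_VOLTAGE_SEEN = 1u << 16,
    THROTTLED_FREQ_CAPPED_SEEN   = 1u << 17,
    THROTTLED_THROTTLED_SEEN     = 1u << 18
};

/* Parse a line such as "throttled=0x50000" (assumed output format).
 * Returns 0 and stores the flags on success, -1 on malformed input. */
int parse_throttled(const char *line, uint32_t *flags)
{
    const char *eq = strchr(line, '=');
    char *end = NULL;
    unsigned long value;

    if (eq == NULL)
        return -1;

    errno = 0;
    value = strtoul(eq + 1, &end, 16);  /* base 16 accepts the 0x prefix */
    if (end == eq + 1 || errno != 0)
        return -1;

    *flags = (uint32_t)value;
    return 0;
}

/* Non-zero if under-voltage is active now or has occurred since boot. */
int power_problem_seen(uint32_t flags)
{
    return (flags & (THROTTLED_UNDER_VOLTAGE_NOW |
                     THROTTLED_UNDER_VOLTAGE_SEEN)) != 0;
}
```

A value of 0x0 means none of the listed conditions is set, which is the "everything was OK" case.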

Config file from my Raspbian system below.

thread-1.conf

ffmpeg_output_movies on
height 720
stream_quality 5
threshold 28800
quality 85
noise_level 31
ffmpeg_output_debug_movies off
pre_capture 1
noise_tune on
smart_mask_speed 0
stream_maxrate 1
output_pictures off
hue 0
saturation 0
stream_localhost off
ffmpeg_variable_bitrate 75
ffmpeg_video_codec mp4
text_changes off
movie_filename %Y-%m-%d/%H-%M-%S
auto_brightness off
stream_port 8081
rotate 0
brightness 0
lightswitch 0
framerate 20
emulate_motion off
snapshot_filename
despeckle_filter
snapshot_interval 0
stream_auth_method 0
stream_motion off
target_dir /var/lib/motioneye/Camera1
text_double on
post_capture 100
stream_authentication user:
output_debug_pictures off
on_picture_save /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" picture_save %t %f
on_movie_end /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" movie_end %t %f
text_left Camera1
picture_filename
locate_motion_style redbox
locate_motion_mode off
contrast 0
videodevice /dev/video0
max_movie_time 0
on_event_end /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" stop %t
text_right %Y-%m-%d\n%T
on_event_start /usr/local/lib/python2.7/dist-packages/motioneye/scripts/relayevent.sh "/etc/motioneye/motioneye.conf" start %t
event_gap 30
minimum_motion_frames 15
mask_file
width 1280

@Mr-Dave
Member

Mr-Dave commented Jul 28, 2017

You are the first person I have heard of using the omx encoder with ffmpeg. It really isn't part of our support / testing cycle, so I am not sure how to resolve it. To break the deadlock, an ffmpeg interrupt callback function would need to be added, like the one that exists in netcam_rtsp.
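
For reference, a timeout-style interrupt callback could look roughly like the sketch below. The struct and function names are hypothetical and are not taken from motion. The callback has the same shape as the callback member of ffmpeg's AVIOInterruptCB (int (*)(void *)), where a non-zero return asks ffmpeg to abort. As far as I know ffmpeg polls that callback from blocking libavformat I/O, so whether it could break a stall inside the encoder is an untested assumption.

```c
#include <time.h>

/* Hypothetical watchdog state; not part of motion or ffmpeg. */
struct encode_watchdog {
    time_t deadline;  /* absolute time after which the operation should abort */
};

/* Arm the watchdog: allow timeout_sec seconds starting from 'now'. */
void watchdog_arm(struct encode_watchdog *wd, time_t now, int timeout_sec)
{
    wd->deadline = now + timeout_sec;
}

/* Same signature as an AVIOInterruptCB callback.
 * Returns non-zero once the deadline has passed, 0 otherwise. */
int watchdog_interrupt_cb(void *opaque)
{
    const struct encode_watchdog *wd = opaque;

    return time(NULL) > wd->deadline;
}
```

The caller would re-arm the watchdog before each potentially blocking call and pass the struct as the opaque pointer.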

@jasaw
Contributor Author

jasaw commented Aug 1, 2017

I've been trying to debug this for a few days and got nowhere. What I don't understand is that it works flawlessly at 800 x 600 resolution; I've been running it that way for a few weeks. It works at 1920 x 1080 too, although that has had less testing. At 1280 x 720, it locks up.
I ran it with strace -f but did not see anything obvious. vcdbg log didn't show any errors either.
What else can I try?

@jogu
Member

jogu commented Aug 1, 2017

If the problem is that a thread is getting stuck, wait for it to happen, then run gdb -p <pid> (the PID of the running motion process) and enter 'thread apply all backtrace' (I may have misremembered the syntax slightly). Paste the output into a gist/pastebin and share the URL here.

@jasaw
Contributor Author

jasaw commented Aug 2, 2017

The gdb 'thread apply all backtrace' output is here, but gdb hit an internal error.

I did more debugging and found that it's actually stuck in the OMX_EmptyThisBuffer call in ffmpeg's libavcodec/omx.c. sudo vcdbg log assert did not show anything useful.

@jogu
Member

jogu commented Aug 2, 2017

It may well be that you're suffering from an ffmpeg or kernel/GPU driver issue that you will need help from the Raspberry Pi people with. You should also make sure your PSU is sufficient (i.e. an official PSU for that model of Pi with at least the required current capacity).

You could possibly try a debug build of motion to see if gdb is happier with that; if it's not then I suspect your system has a broken gdb or compiler and it's going to be difficult to get more info.

@jasaw
Contributor Author

jasaw commented Aug 3, 2017

I can confirm that it's definitely not a power supply issue: I'm seeing it on different models of Raspberry Pi with different power supplies, including the official one. Also, vcgencmd get_throttled stayed at zero during the entire test, which means it is not a power supply issue.

I'm running latest Raspbian Jessie, and I'm disappointed that it came with a broken gdb.

I'm not sure if I'm hitting an ffmpeg or GPU firmware bug, or motion is not using the ffmpeg API "correctly". I'll debug further, see what else I can find.

@jasaw
Contributor Author

jasaw commented Aug 4, 2017

Running motion with extpipe to ffmpeg h264_omx is stable at various resolutions.
motion --> extpipe --> ffmpeg (h264_omx encoder)

extpipe config:

use_extpipe on
extpipe ffmpeg -y -f rawvideo -pix_fmt yuv420p -video_size %wx%h -framerate %fps -i pipe:0 -c:v h264_omx -profile:v high -b:v 3000000 -f mp4 %f.mp4

Running motion encoding via ffmpeg C API hangs at 1280 x 720 resolution. Exact same encoder configuration (bitrate, profile, etc...) as extpipe version.
motion --> ffmpeg API (h264_omx encoder)

Edit: Another interesting observation is the extpipe version achieves higher frame rate than API version.

@jasaw
Contributor Author

jasaw commented Aug 11, 2017

Removing the input_zerocopy setting from the omx_encode_init function in ffmpeg's libavcodec/omx.c seems to stop the lock-ups.

#if CONFIG_OMX_RPI
    s->input_zerocopy = 1;
#endif

@tosiara
Member

tosiara commented Aug 11, 2017

Did you report that to ffmpeg?

@jasaw
Contributor Author

jasaw commented Aug 11, 2017

Not yet. I haven't figured out exactly whose fault it is, motion's or ffmpeg's. I need to find out who is supposed to manage which buffer, when it can be freed, that sort of thing.

If someone has ffmpeg knowledge, please chime in.

@jasaw
Contributor Author

jasaw commented Aug 14, 2017

It appears that whenever the zerocopy condition in omx.c is satisfied (contiguous planes and stride alignment), the call to OMX_EmptyThisBuffer hangs. 1280 x 720 resolution images happen to meet that condition.

Still don't know why it hangs.
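
To make the condition concrete, here is a simplified sketch of a "contiguous planes with matching strides" test for a YUV420P frame. It is not the code from omx.c; the struct and function names are hypothetical, and stride and slice_height stand for the values the encoder expects on its input port.

```c
#include <stdint.h>

/* Hypothetical stand-in for the parts of a frame that matter here. */
struct yuv420_frame {
    uint8_t *data[3];     /* Y, U, V plane pointers */
    int      linesize[3]; /* bytes per row for each plane */
};

/* Returns 1 if the three planes sit back to back in memory with exactly
 * the strides the encoder expects, so the buffer could be handed over
 * without a copy. Returns 0 otherwise. */
int planes_are_contiguous(const struct yuv420_frame *f,
                          int stride, int slice_height)
{
    int chroma_stride = stride / 2;

    /* Integer addresses avoid out-of-bounds pointer arithmetic when the
     * frame turns out not to be laid out this way. */
    uintptr_t y = (uintptr_t)f->data[0];
    uintptr_t u = y + (uintptr_t)stride * (uintptr_t)slice_height;
    uintptr_t v = u + (uintptr_t)chroma_stride * (uintptr_t)(slice_height / 2);

    return f->linesize[0] == stride &&
           f->linesize[1] == chroma_stride &&
           f->linesize[2] == chroma_stride &&
           (uintptr_t)f->data[1] == u &&
           (uintptr_t)f->data[2] == v;
}
```

A 1280 x 720 frame allocated as one block with row sizes of 1280, 640 and 640 bytes passes this test, which would match the case that hangs.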

@tosiara
Member

tosiara commented Aug 16, 2017

Can you share the exact patch you use to force the h264_omx encoder, so I can try to reproduce it?

@jasaw
Contributor Author

jasaw commented Aug 17, 2017

You'll need to compile ffmpeg with omx-rpi enabled. I added these options to the ffmpeg configure command: --enable-omx --enable-omx-rpi --enable-mmal.

Motion patch to select the h264_omx encoder:

diff --git a/ffmpeg.c b/ffmpeg.c
index 71685a1..07ce41c 100644
--- a/ffmpeg.c
+++ b/ffmpeg.c
@@ -485,7 +485,13 @@ static int ffmpeg_set_codec(struct ffmpeg *ffmpeg){
     char errstr[128];
     int chkrate;
 
-    ffmpeg->codec = avcodec_find_encoder(ffmpeg->oc->oformat->video_codec);
+    ffmpeg->codec = NULL;
+    if (ffmpeg->oc->oformat->video_codec == AV_CODEC_ID_H264)
+        ffmpeg->codec = avcodec_find_encoder_by_name("h264_omx");
+    else if (ffmpeg->oc->oformat->video_codec == AV_CODEC_ID_MPEG4)
+        ffmpeg->codec = avcodec_find_encoder_by_name("mpeg4_omx");
+    if (!ffmpeg->codec)
+        ffmpeg->codec = avcodec_find_encoder(ffmpeg->oc->oformat->video_codec);
     if (!ffmpeg->codec) {
         MOTION_LOG(ERR, TYPE_ENCODER, NO_ERRNO, "Codec %s not found", ffmpeg->codec_name);
         ffmpeg_free_context(ffmpeg);

@jasaw
Contributor Author

jasaw commented Aug 17, 2017

I got some info from 6by9 (one of the Raspberry Pi guys). It explains why ffmpeg has to copy the frame, and therefore avoids the lock-up, at 800 x 600 and 1920 x 1080 resolutions.

800x600 - 600 is not multiple of 16 for nSliceHeight (608 is), so needs a copy.
1920x1080 - 1080 is again not multiple of 16 for nSliceHeight (1088 is), so needs a copy.
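
The arithmetic behind those two lines, as a small sketch. The function names are mine and the multiple-of-16 requirement for nSliceHeight is the only rule modelled here, so treat it as an illustration rather than what ffmpeg actually executes.

```c
/* Round value up to the next multiple of alignment (alignment > 0). */
int align_up(int value, int alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

/* Slice height the encoder would use for a given frame height,
 * assuming it must be a multiple of 16. */
int padded_slice_height(int height)
{
    return align_up(height, 16);
}

/* Non-zero if the frame height needs padding, which forces a copy
 * into a padded buffer instead of the zerocopy path. */
int height_forces_copy(int height)
{
    return padded_slice_height(height) != height;
}
```

With these helpers, 600 pads to 608 and 1080 pads to 1088, so both take the copy path, while 720 is already a multiple of 16 and does not.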

I haven't had time to investigate exactly why it locks up. Maybe ffmpeg gave it a bad pointer in the buffer header.

@tosiara
Member

tosiara commented Aug 17, 2017

Does it only happen on Rpi? Is it reproducible on x86 with USB web cam?

@tosiara tosiara added the mmal label Aug 17, 2017
@tosiara
Member

tosiara commented Aug 17, 2017

Unfortunately, I can't test your patch on OrangePi Zero and ffmpeg compiled with --enable-omx:

[1:ml1] [NTC] [EVT] event_new_video: Source FPS 9
[1:ml1] [ERR] [ENC] ffmpeg_set_codec: Could not open codec Encoder not found
[1:ml1] [ERR] [NET] ffmpeg_open: Failed to allocate codec!
[1:ml1] [ERR] [EVT] event_ffmpeg_newfile: ffopen_open error creating (new) file [/home/motion/01-20170817090721.mp4]

@tosiara
Member

tosiara commented Aug 17, 2017

OK, I also had to add --enable-libx264, which was somehow missing.
Now my motion has been running fine for 10 minutes and recorded a valid mp4, with no lock-up or any other issue.
1280x720, 10 fps, YUV, if that matters.

@jasaw
Contributor Author

jasaw commented Aug 17, 2017

I only see the problem on raspberry Pi.

@tosiara
Member

tosiara commented Aug 17, 2017

Ok, I have updated your initial report that the issue is only reproducible on Raspberry Pi

@jasaw
Contributor Author

jasaw commented Sep 7, 2017

Lockup issue reported as raspberrypi/firmware#851.

@Mr-Dave
Member

Mr-Dave commented Sep 18, 2017

Closing as problem upstream
