-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mmal: Get component port state debug information #377
Comments
Yes, /opt/vc/bin/mmal_vc_diag should do what you want. |
Thanks, this looks very helpful :) |
I believe mmal and openmax are thread safe, so you don't need additional locking. |
Ok, this is what I assumed. I'm currently seeing a deadlock when I do aggressive start/stop loops. See this backtrace: Looking at Thread 6, which is teardown of the decoder module, which just called mmal_component_disable and at Thread 4 which just tries to send a picture to the deinterlace input port it looks like mmal_vc_send_message and mmal_vc_sendwait_message might be blocking each other. Could this be? Maybe @6by9 has some thoughts, too? |
Here is a slight variation of the deadlock: In this case Thread 5 and 4 seem to block each other - Thread 5 being the deinterlace filter calling mmal_port_send_buffer and Thread 4 the vout calling mmal_port_format_commit. This is due to an aspect ratio change happening right before stopping the playback. |
And the next hit, this time the flush on image_fx does not seem to finish as expected. My internal refcounting says that 1 buffer is held by the output port still. mmal_vc_diag mmal-stats seems to confirm this:
Related stacktrace is here: http://pastie.org/private/vwpmjsqqptmvazvj3jfkdw Any thoughts? |
Locking around component create/destroy: it should all be internally safe for different threads to be creating and destroying components simultaneously. I thought it was possible to do the same refcounting with pools as well, but can't see it at the moment (may have been on a different dev branch - I'm sure I've used it. You can call mmal_pool_destroy on a pool before the last buffer is released, and it only actually gets destroyed when that buffer is finally released. Perhaps I'm misremembering). Do remember that port_disable will flush back any buffers owned by the components, so ensure that your callback isn't blocked when you make the call. (Doesn't appear to be an issue from your callstacks) On the VideoCore side, there is a single thread that brokers all MMAL requests from the ARM. A couple of longer running tasks are offloaded to worker threads, but otherwise most things are run sequentially. |
@6by9 Thanks for the feedback. If it would help for analysing things on the GPU I could prepare a SD card image containing the test-case. |
Does it need to be a full SD card image, or just a test app? I'd need to rebuild the firmware anyway to use a debug build, so if it can just be an app on top of a known rpi-update release hash then it would be easier. |
@6by9 I've not tried to get the ramdump stuff going on Pi. Can't think why it wouldn't work. |
@popcornmix - switching to email. |
@6by9 let me try building VLC in recent raspbian and provide it as tarball. Was a bit tricky last time I did... Making up a minimal test case is probably a hard task in this case. |
If it needs to be a full SD card image then I can cope, just that it's easier to have an environment I know I can hack around. |
Build is running on a clean raspbian install - I should be able to provide you with prebuilt binaries +source tomorrow. |
I was able to finally build things properly for raspbian (using a cross toolchain linking against nfs-exported raspbian root actually). If I downgrade to what I have on my systems currently (firmware from Feb 4) things are better but still odd and different to how it behaves on my system. After the first few frames have been decoded the whole system stalls for several seconds - no mmal events occur in that time and then it resumes spontaneously. It keeps running smooth until a mmal component create/destroy (ie deinterlace plugin reset on aspect ratio change) occurs. Then it stalls again. In the hope you might see more than I do I create a package though: Use it like this:
Additionaly I packaged my aspect-ratio change source - you should ideally run this on another PC which has VLC installed (version shouldn't matter much) - it will stream a continuous mpeg-ts with aspect ratio change every second to 239.2.2.2:10101 (feel free to change the address in the script as you like)
You can then play the stream on the pi using:
Be aware it is a mpeg2 encoded stream so you need to have the mpeg2 license activated. The case without deinterlace runs smooth after the initial stall is over. The one with deinterlace stalls over and over again because the deinterlace filter is recreated. And finally some notes for building:
The rootfs needs one fix though: /usr/lib/arm-linux-gnueabihf/libdl.so has to be recreated as relative link in favor of the absolute one it normally is in that rootfs
I used the linaro toolchain from the tools repository for building - change configure.sh to use your paths... And finally just in case you don't experience these odd stalls on start/component creation or find a way to fix it you can use this python script to run the actual loop test which I run to hit the deadlocks I reported last days:
This will just restart the playback every 5 seconds, which stresses the component create/destroy cycle in addition to the permanent recreation of image_fx deinterlace due to the aspect change. The sources for the mmal modules in vlc reside in modules/hw/mmal/ - just in case you're interested in looking into them. I hope this is somehow understandable :) |
If you can identify when the firmware broke things, it would be helpful. Also there were some memory reductions here: |
@popcornmix Hexxeh/rpi-firmware@0c9001b introduces the regression for me. |
Do files play with omxplayer with new firmware? |
@popcornmix omxplayer seems to work well. You can use vlc to see the issue - it seems to appear with mpeg2 codec only. You can use the vlc bundle I uploaded along with the aspect ratio stream. |
Thanks for the detailed reports; I'll look at this as soon as possible. Unfortunately the change to the video decoder firmware at Hexxeh/rpi-firmware@0c9011b is quite large, and we don't have the same ability to regression test it at Pi Towers that used to be possible at Broadcom (we're working on it...). However, if this only affects MPEG-2, we should be able to get to the bottom of it quickly. |
@6by9 Thank you. Regarding mpeg2 only: I wouldn't take this as proven so far - but probable. So it probably makes sense to look for mpeg2 related things first regarding the regression. |
…e link fields See: #377 See: Hexxeh/rpi-firmware#77 kernel: dtoverlay added for ENC26J60 See: raspberrypi/linux#795
…e link fields See: raspberrypi/firmware#377 See: #77 kernel: dtoverlay added for ENC26J60 See: raspberrypi/linux#795
@julianscheel can you try latest rpi-update firmware? Includes a fix from @deborah-c for MPEG-2 decode. |
I just did a quick remote login and it seems the decoder starts up properly again now. Thanks. So we can get back to the deadlocks. |
I found that MPEG-2 decode worked poorly using that build: it complains that the TS demuxer is an invalid module, and playing from a .VOB file had very slow (~10s) startup, and then played jerkily. I'm not sure how VLC works internally -- does it use our A/V sync, or is that done on the ARM side of the world? The regression wasn't MPEG-2 specific, but applied to all non-H.264 streams, and failed to produce images after a resolution change. VLC appears to create the initial image pool differently from omxplayer -- the latter generated a pool the same dimension that the decoder wanted, whereas VLC asked for larger images, causing an immediate resolution change. |
@deborah-c Ah sorry, I forgot to post the note about the dependencies. This is what I installed for doing the raspbian build:
libdvbpsi is what you need for the ts demux to work. These ~10s on statup are what I see on the raspbian build as well. At a first glance it did not look like some a/v sync issue - the a/v sync is handled on the ARM side with VLC, it's using alsa for audio output. Thanks for the init analysis and fix. I wonder a bit why we create a pool with a different size initially, I will take a look into it. Maybe we can avoid it. |
@deborah-c, @6by9 I've prepared a new build of vlc for raspbian which fixes that startup stall.
I also use gdb from jessie, because the wheezy gdb does not properly support the dwarf4 debug symbols generated by the toolchain. Once you have updated the dependencies and VLC you should be able to use the aspect streaming script on another machine and start the testcase with
as described in my older post. Just let this one run for a while (sometimes it runs quite some hours without failing). It will change aspect ratio frequently and restart the stream every 5 seconds. At some point it will just stop. Now you can attach the debugger and should be able to see that one of the mmal components is either waiting for a mmal_port_* call to finish or is waiting for buffers to be returned. The last case I had where it failed I couldn't even check the mmal internal state because mmal_vc_diag mmal-stats did stall immediately as well. |
Is the testcase working for you? |
Sorry, not had any time to look at anything Pi related. Hopefully will over the weekend. |
Ok, this gave me the "oh crap" moment :) @deborah-c Could this match the exception you see in the VC? I've started a testrun with zero_copy and guard in front of input now and a second one using the old approach without zero_copy and local buffers but with the guard included. Let's see how they go. |
My guess would be that what I saw was a use-after-free, in effect; that could be the symptom of passing something twice, but I'm more inclined to suspect either errant shutdown code or accidental parallelism. Annoyingly, I had a run stopped at an assert when I came in today, but the (rather static and motion sensitive) debugger had disconnected; I've had another that did the same, one that just stalled completely with no immediately obvious reason, and my current run has been running for a long, long while... |
From time to time the error hides for a while, I've had that as well... |
My test runs survived the night - I think the double send into image_fx really seems to be the problems root cause... |
@deborah-c Happened to hit a debuggable assert in the meantime? |
I guess we can add a sanity check to ensure the queue linked list remains valid, but yes it sounds like is really a userspace error. Ensuring the pools keep a sane allocated count would also be useful - just had a related thought on how it may have got into that state :-) (Not sure how we detect the double submit as I don't think the component has a listed of already submitted buffers. It'll take some investigation) |
@6by9 I also thought about ways for sanity checks, but doing that efficiently seems not really like a simple task. Actually for the case where the same buffer_header is submitted twice it can be detected by scanning the queue for duplicates on each _put. This is in fact how I nailed the issue down on my side. But this requires a loop over the queue each time which might hurt depending on the queue size. And secondly this wouldn't have helped in the initial error case at all, because there I did that magic data reuse with different buffer headers, so the queue looked just fine. So to detect that, one would have to actually check the data pointer and I really don't know how feasible that is. Maybe adding some comment/documentation stating the opaque buffers must not be send to input ports twice would be a good idea though... |
Unless I'm missing something, the mmal_queue_put calls can trap it when the buffer is returned and released, but ideally we want to do it when the mmal_port_send_buffer call is made. I can't see a useful queue there to check as most stuff just passes through down to the component. No buffer, opaque or otherwise, should be sent to a port twice. As you say, munging buffer->data pointers masked things further. |
This was my sanity check: It only helps if the duplicates are in the same queue at the same time. If the queues are drained quickly, then you may never spot the issue. However this did detect a bug in my kodi/mmal code. I wonder if the MMAL_BUFFER_HEADER_T itself could contain a flag that indicates arm/vc ownership. If you try to submit a buffer that you don't have ownership of (e.g. submitting buffer a second time to video_render) then it fails. |
@popcornmix Your sanity check still adds the duplicate buffer to the queue, so we're then still doomed and potentially stuck within the ARM/VC ownership also has a niggle. There was the intent that you can do mmal_buffer_header_acquire to increment the refcount on a buffer header as a very cheap splitter. The buffer only gets released back to the pool (and then returned to the source component) when refcount hits 0. That means that you should be able to submit the same buffer header to different components simultaneously. I don't think we've ever used that in anger, but it is there. It really needs zero copy to work properly with opaque buffers though (needs to get the correct refcounting/lifetime on the reloc heap object of the source port buffer payload). As I commented earlier, tracking which buffers are submitted to a component isn't easy at the moment. But we could have a MMAL_QUEUE in the core of submitted buffers that can be checked on a send to avoid duplicates. My thought from earlier for image pools and the
= badness. |
Any patches for additional asserts on the gpu side would be welcome. |
Can you use a counter on each buffer header to check for double submission? The port would store the most recently received counter value and complain You would need to do something when it wraps obviously, or just have a 64 It ought to be possible to do entirely in the mmal core code.
|
There's no obligation to submit buffers in the same order they are produced. And what if the client generates buffers to be submitted to say the video encoder? Those won't have been tagged at all. @popcornmix I have an untested patch for vc_pool. I'll ping it to you by email. Addressing the MMAL core/vc_api to do sanity checking needs some more thought. Your patch would be a safety net, but still leaves things in a bad place. |
Perhaps not, but you'll get some very weird effects. It's a pretty niche scenario. I guess you could use this to generate videos where alternate frames are swapped or something, for making peoples' brains explode.
They won't be opaque. |
Slideshow with 2 Image decodes and alternating between the output of the two being fed into video render? If we're failing things on this "error", then a false positive even on a niche use case would be a pain. I don't think there is a need for them to be opaque images to cause issues. As in Julian's case, submit the same buffer twice to VC and you will get two callbacks when they are returned. If you call mmal_buffer_header_release on each instance then he managed to screw over as the linked list in MMAL_QUEUE (duplicate buffer gets stuffed on the end, so buffers from old->next to the end get lost). |
See: raspberrypi/linux#148 kernel: vchiq: Remove inline from suspend/resume Breaks build with gcc 5 kernel: Added optional parameter to set resistance of touch plate(x-plate-ohms) See: raspberrypi/linux#923 firmware: video_decode: Need to clear corrupt state when recovery point is seen firmware: mmal: Plumb in OMX_IndexParamBrcmInterpolateMissingTimestamps firmware: Video_splitter: support RGB888, BGR888, and ARGB888 firmware: vc_pool: block allocated count going negative firmware: vc_pool: fix behaviour if acquiring a released image See: #377 firmware: mmal: Pass dts in place of pts when pts is invalid firmware: video_decode: Use dts from fifo when pts is unknown
See: raspberrypi/linux#148 kernel: vchiq: Remove inline from suspend/resume Breaks build with gcc 5 kernel: Added optional parameter to set resistance of touch plate(x-plate-ohms) See: raspberrypi/linux#923 firmware: video_decode: Need to clear corrupt state when recovery point is seen firmware: mmal: Plumb in OMX_IndexParamBrcmInterpolateMissingTimestamps firmware: Video_splitter: support RGB888, BGR888, and ARGB888 firmware: vc_pool: block allocated count going negative firmware: vc_pool: fix behaviour if acquiring a released image See: raspberrypi/firmware#377 firmware: mmal: Pass dts in place of pts when pts is invalid firmware: video_decode: Use dts from fifo when pts is unknown
Latest firmware has a couple of sanity checks from 6by9 so repeated submission of a buffer should produce an complaint in the assert log and may be less likely to wedge things (although I suspect bad stuff may still happen). |
Yes, it cleans up the image pool handling, but I haven't looked at the MMAL queue linking doing silly things on double buffer release yet. |
@julianscheel has this issue been resolved? |
@Ruffio The actual issue was on the application side and has been resolved. Discussion was ongoing whether it would be wise to add some sanity check at mmal level to detect this kind of bad usage. Maybe @popcornmix, @6by9 or @deborah-c want to add some thoughts? If it appears to be not worth adding a check, let's close the issue. |
Created #707 to cover the sanity checking at the MMAL level, so this can be closed. |
…e link fields See: raspberrypi#377 See: Hexxeh/rpi-firmware#77 kernel: dtoverlay added for ENC26J60 See: raspberrypi/linux#795
firmware: dispmanx: Fix for locking with dispmanx_element_add with stereo object firmware: video_decode: increase the number of userdatas firmware: platform: Enable VCOS_RELEASE_ASSERTS See: raspberrypi#377 (comment) firmware: dispmanx: Fix for dispmanx_snapshot with more than one rotated layer See: raspberrypi#377 kernel: config: Enable ZSMALLOC, ZRAM and PGTABLE_MAPPING See: raspberrypi/linux#894 kernel: config: Enable CONFIG_FB_MODE_HELPERS and CONFIG_FB_UDL See: raspberrypi#141 kernel: bcm2708: Make ioctl logging quieter See: raspberrypi/linux#895 kernel: HiFiBerry Digi: set SPDIF status bits for sample rate See: raspberrypi/linux#899 kernel: dts: overlay: add generic support for ads7846 See: raspberrypi/linux#896
kernel: Add driver for rpi-proto See: raspberrypi/linux#908 kernel: Fix reduced volume issue for users of PCM5122 DAC See: raspberrypi/linux#889 firmware: arm_loader: Refactor freq/freq_min logic and allow h264 freq_min to be increased firmware: arm_loader: Allow non-turbo voltage to be increased by up to two config steps firmware: video codec: refactor userdata release mechanics in categoriser firmware: hvs: experimental: reduce hvs non-panic priority on 2836 firmware: arm_loader: Add force_eeprom_read setting firmware: bootcode: Add bootcode_delay for an early delay See: raspberrypi#401 firmware: dispmanx: Fix stereoscopic flags to invert left/right eyes with multichannel See: http://forum.kodi.tv/showthread.php?tid=211501&pid=1956924#pid1956924 firmware: [deinterlace] Avoid asserts on half rate deinterlace firmware: [audioplus] Limit samplerate/channels to something we expect to be able to support through hdmi firmware: [deinterlace] Fall back to fast algorithm in a cleaner way firmware: MMAL opaque - reduce back below 128 btyes See: raspberrypi#377 (comment) firmware: arm_loader: Avoid double-free when disabling HAT overlay, and always relocate overlay phandles firmware: vc_pool_image: add locking around linked image release
See: raspberrypi/linux#148 kernel: vchiq: Remove inline from suspend/resume Breaks build with gcc 5 kernel: Added optional parameter to set resistance of touch plate(x-plate-ohms) See: raspberrypi/linux#923 firmware: video_decode: Need to clear corrupt state when recovery point is seen firmware: mmal: Plumb in OMX_IndexParamBrcmInterpolateMissingTimestamps firmware: Video_splitter: support RGB888, BGR888, and ARGB888 firmware: vc_pool: block allocated count going negative firmware: vc_pool: fix behaviour if acquiring a released image See: raspberrypi#377 firmware: mmal: Pass dts in place of pts when pts is invalid firmware: video_decode: Use dts from fifo when pts is unknown
Is there a similiar mechanism for mmal to OMX_GetDebugInformation for openmax which enables us to see the state of the active mmal components and their ports?
I am looking for an issue where mmal_port_flush() does not seem to work properly on image_fx component - my internal refcounting says that still one buffer is in use by the input port. It would be quite helpful to have a chance to see the internal port state.
The text was updated successfully, but these errors were encountered: