Demuxer/DecodeSurfaceFromPacket coming back at some point? #55
Hi @rcode6 For now VALI doesn't support demuxing. I understand its usefulness for advanced users, but unfortunately it leaves too many ways to e.g. break the decoder's internal state with a seek to a particular packet, or to overflow vRAM by sending packets with multiple frames in each while receiving frames one by one. I'm not against the whole demuxing idea, I just don't have a clear understanding of how to do it right. BTW, can it be done with something like PyAV? Demuxing isn't computationally expensive and it doesn't have HW support anyway, so any other alternative to VALI may be just as good.
Hi @RomanArzumanyan, You're right that demuxing can be done with other libraries, but my goal is to still have the latter parts of the pipeline all on the GPU: decoding, pixel format conversions & resizing. That's a bit harder to manage. PyAV does demux into packets, but decoding always ends up on the CPU. And NVIDIA's spinoff of VPF, PyNvVideoCodec, doesn't do surface conversions or resizing. I completely respect your decision to keep it simpler for users though! Would love it if you'd consider otherwise, since it's already happening behind the scenes. I think, from reading PyNvVideoCodec's very sparse documentation, that they do support DLPack, so maybe I can use PyNvVideoCodec for demuxing & decoding, then use from_dlpack to jump into VALI for the rest of the work.
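For illustration, a rough sketch of that DLPack hand-off idea, assuming the decoder's GPU frames implement `__dlpack__` / `__dlpack_device__`. Only `torch.from_dlpack` is a confirmed API here; the decoder object and the VALI-side entry point are assumptions, not verified calls.

```python
# Zero-copy DLPack hand-off sketch: any DLPack-capable decoder frame can be
# viewed as a CUDA tensor without a device/host copy. torch is used only as a
# well-known DLPack consumer; the VALI-side import is intentionally omitted
# because its exact name is not confirmed here.
import torch

def reuse_gpu_frame(decoded_frame):
    # Zero-copy view of the decoded frame's device memory as a CUDA tensor.
    tensor = torch.from_dlpack(decoded_frame)
    # The tensor also implements __dlpack__, so it can be re-exported to any
    # other DLPack-aware library while staying on the GPU.
    return tensor
```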
Hi @rcode6 Could you share your demux / decode use case? As far as I understand, the only difference between builtin and standalone modes in […]
Hi @RomanArzumanyan, My project processes multiple live camera stream inputs for a security system, so demuxing and timestamping the feeds quickly and with minimal latency is important; packets are then placed into a queue for slower decoding/processing later. The processing thread pulls packets off the queue, decodes them into frames, and does pixel format conversions and resizing before passing them on for further processing work. All that work is also done entirely on the GPU, without ever downloading to the host. There's a combination of selective image processing, motion detection, and object detection going on there, followed by possibly recording back to disk. This queue will fall behind for periods of time and catch back up later.

The primary gain I get from the demuxing process is that I can keep the unprocessed video as packets on the host, which are basically the compressed video feed. If I were to decode earlier instead of just demuxing, the unprocessed video would end up being stored in the queue as decompressed surfaces in vRAM. Another place I get the same type of space savings: if I want to keep the last 10 seconds of video footage in memory before a recording event, I can also keep them as compressed packets on the host, instead of decompressed on the GPU. For instance, with a 30fps feed, that's 300 frames I can keep compressed to decode later, vs. using up vRAM for 300 uncompressed surfaces in whatever resolution the feeds were in.

So far, I've found the extra load of periodically decoding packets twice to be minimal, while the vRAM savings have been huge. In a nutshell, my demuxing use case just lets me save a lot of vRAM while avoiding having to download/upload between device and host too much.
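To put rough numbers on that vRAM saving, assuming a 1080p NV12 feed and a ~4 Mbps H.264 bitrate (both are illustrative assumptions, not figures from the thread), 10 seconds of buffered video at 30 fps compares roughly like this:

```python
# Back-of-the-envelope comparison: 10 s of 30 fps video kept as decoded
# surfaces in vRAM vs. as compressed packets on the host.
frames = 30 * 10                      # 300 frames, as in the example above
nv12_frame = 1920 * 1080 * 3 // 2     # NV12: 1.5 bytes per pixel (assumed 1080p)
surfaces_bytes = frames * nv12_frame  # ~890 MiB of vRAM
packets_bytes = 4_000_000 // 8 * 10   # ~4.8 MiB at an assumed 4 Mbps bitrate

print(f"decoded surfaces:   {surfaces_bytes / 2**20:,.0f} MiB")
print(f"compressed packets: {packets_bytes / 2**20:,.1f} MiB")
```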
Thanks for the reply @rcode6 As far as I understand, one missing thing is the ability of […]. The rest can be done with e.g. PyAV, which will write demuxed packets to some queue. Am I missing something?
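For reference, a minimal sketch of the PyAV side of that suggestion (demux only, compressed packets buffered in a queue); the stream URL and queue size are placeholders:

```python
# Demux a live stream with PyAV and buffer the compressed packets plus their
# timestamps; decoding can happen later, on the GPU, in another thread.
import queue
import av

packet_queue = queue.Queue(maxsize=1024)

container = av.open("rtsp://camera.example/stream")
video = container.streams.video[0]

for packet in container.demux(video):
    if packet.dts is None:  # flush packet at end of stream, carries no data
        continue
    # Packet supports the buffer protocol, so bytes() copies out the payload.
    packet_queue.put((bytes(packet), packet.pts, packet.time_base))
```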
Hi @RomanArzumanyan, Yes, if there was a way for […]. I think the problem is what an […]. I can't seem to think of an efficient way other than the same library being used for the demux and decode steps, with the […]
Upvoting for PyAV compatibility. Would be fantastic if we could directly decode AVPackets (and retrieve them from the encoder).
Hi @RomanArzumanyan, Would you consider having a way to decode using […]? I realized that I don't need access to the demuxer since the decode functions do provide […]. Also, according to ffmpeg, […]
Hi @rcode6 Honestly I’d like to avoid that at all costs.
I understand the importance of the stand-alone demuxer; however, I'd like not to expose the private ffmpeg-specific AVPacket API as the demuxer API. E.g. how am I going to test if the demuxer produces correct output? Is it going to be annex.b for h264 / h265 / av1, or avcc for h264? What about vp8 / vp9? How am I going to explain what's wrong with someone's demuxer output? Will I need a bitstream analyzer for that? Can I dump an AVPacket to disk and open it with VLC?

So instead of all that, I've added a constructor which accepts any Python object which has […].

You can demux to a pipe with ffmpeg / PyAV / any other tool and then give the pipe (a file handle or your own adapter class - you name it) to […].

What's the difference, you may ask? The answer is quite simple - I don't have to cripple the API. You call […]
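A sketch of that pipe-based workflow: the ffmpeg invocation is standard, but the VALI module name and decoder constructor below are assumptions written from memory, so treat them as pseudocode for "the constructor that accepts a file-like object".

```python
# Demux with ffmpeg to a pipe, then hand the readable pipe (any object with a
# read() method would do) to the decoder instead of a file path or URL.
import subprocess

import python_vali as vali  # module name assumed

# Strip the container, keep the H.264 elementary stream, write it to stdout.
proc = subprocess.Popen(
    ["ffmpeg", "-i", "rtsp://camera.example/stream",
     "-an", "-c:v", "copy", "-f", "h264", "pipe:1"],
    stdout=subprocess.PIPE,
)

# Constructor signature assumed: (input, options, gpu_id).
decoder = vali.PyDecoder(proc.stdout, {}, 0)
```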
Hi @RomanArzumanyan, I completely understand. And thanks for pointing me to that other […]. Would you consider exposing the encoded bitstream data in […]?
Hi @rcode6 I think I should just rewrite the whole […]. The initial "bare annex.b output" design was fine back then, but now it becomes a headache. WRT the […]
Thank you so much for keeping this project going as VALI!
I'm currently using VPF in my project and was working on swapping over to VALI, but it looks like you've removed demuxing into packets & decoding from packets. Is that something you'd consider adding back in at some point? It's currently a large part of my processing pipeline.