Allow Pyav objects to be serialized by pickle #652
I'm not convinced this is a good idea, because there is a lot of internal state we will never be able to pickle. We could likely pickle … It is easy enough to convert a packet to …
I am new to Pyav, so maybe there is something I am missing here... I don't know how to distribute the CPU-consuming functions across servers without instances. To my understanding, the pyspark map function running on each frame of the dataset needs to apply some Pyav functions. Distributed frame encoding requires 1 function and 2 classes:

Distributed frame graph filtering requires 3 functions and 3 classes:

As the Graph can be created in the map function, only the data (Frame class, Stream class) passed to it needs to be picklable. Maybe using … I'll be pleased to spend plenty of time doing tests and research on the subject to enable Pyav to be used in a fully distributed map function.
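One common workaround for this kind of distribution problem is to ship only the frame's raw data plus the metadata needed to rebuild it, rather than the wrapper object itself. A minimal sketch of that pattern follows; the field names (`width`, `height`, `format`) and the helper functions are hypothetical, and a real implementation would rebuild a frame on the worker with something like `av.VideoFrame.from_ndarray`:

```python
import pickle

def frame_to_payload(width, height, fmt, raw_bytes):
    """Pack only the picklable parts of a (hypothetical) frame into a dict."""
    return {"width": width, "height": height, "format": fmt, "data": raw_bytes}

def payload_to_frame(payload):
    """On the worker, a real implementation would reconstruct a frame here
    (e.g. via av.VideoFrame.from_ndarray); this sketch just returns the dict."""
    return payload

# Round-trip through pickle, as pyspark would do when shipping to a worker.
payload = frame_to_payload(640, 480, "rgb24", b"\x00" * 16)
restored = payload_to_frame(pickle.loads(pickle.dumps(payload)))
assert restored["width"] == 640
```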
You can't distribute encoding of most codecs in this way: they depend upon previous, and often future, frames. It may be that they depend upon the previous and future raw frames instead of encoded ones, but I don't think FFmpeg will be easy to trick into doing that. I'd be happy to be proved wrong, though.
I'm going to side with @mikeboers here: PyAV's objects bind deeply into the FFmpeg libraries, with pointers into FFmpeg's data structures. As for distributed encoding, this would have to be something supported by FFmpeg itself; you can't just tack it on from the outside.
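The pure-Python sketch below illustrates why such objects resist pickling. The `Decoder` class is made up; its `threading.Lock` stands in for the raw FFmpeg pointers that PyAV objects hold, and pickle refuses to serialize it in the same way:

```python
import pickle
import threading

class Decoder:
    """Made-up stand-in: an object whose state includes an OS-level resource."""
    def __init__(self):
        self.codec_name = "h264"
        self._lock = threading.Lock()  # stands in for a raw C pointer

# Pickling fails with TypeError, because the lock (like an FFmpeg
# pointer) has no meaningful byte-level representation.
try:
    pickle.dumps(Decoder())
except TypeError as exc:
    print("pickle failed:", exc)
```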
It could be possible, but a quick pickle of the PyAV objects won't do it. |
I really have no plans to do this, so either the poster is planning to work on a PR or this issue should be closed as won't fix. |
When you group the packets per GOP, decoding should be possible, right? I believe I have a working PoC somewhere where I split reading and decoding into totally separate processes, but the extradata needs to be shared. If you like, I can search for the code.
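The per-GOP grouping idea can be sketched as follows: a new group starts at every keyframe, since frames within a GOP only reference each other. `FakePacket` is a stand-in for `av.Packet`, whose real `is_keyframe` attribute would be used instead:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FakePacket:
    """Stand-in for av.Packet, carrying only what the sketch needs."""
    pts: int
    is_keyframe: bool

def group_by_gop(packets: List[FakePacket]) -> List[List[FakePacket]]:
    """Split a packet stream into GOPs, starting a new group at each keyframe."""
    groups: List[List[FakePacket]] = []
    for pkt in packets:
        if pkt.is_keyframe or not groups:
            groups.append([])
        groups[-1].append(pkt)
    return groups

packets = [FakePacket(0, True), FakePacket(1, False), FakePacket(2, False),
           FakePacket(3, True), FakePacket(4, False)]
gops = group_by_gop(packets)
assert len(gops) == 2 and len(gops[0]) == 3
```

Each GOP (plus the shared extradata mentioned above) could then be handed to a separate decoding process.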
@jlaine Having no experience with C bindings, I cannot contribute to any C-side solution, and it seems non-trivial to do clean distributed work with FFmpeg, so you can close this as won't fix. @koenvo Thanks! I would be very happy to have any help on distributing encoding among servers/processes with Pyav. I propose we talk about it in a new issue.
Closing as "won't fix" as suggested by @vtexier |
Overview
Be able to serialize/de-serialize Pyav objects with `pickle`. Currently, `pickle` raises an error, complaining that the `__reduce__` magic function is not implemented.

Desired Behavior
Pyav objects should be able to be serialized/de-serialized with `pickle`.

Example API
To achieve that, it seems that there is a decorator in Cython since 0.26:

`@cython.auto_pickle(True)`
http://blog.behnel.de/posts/whats-new-in-cython-026.html
https://stackoverflow.com/questions/12646436/pickle-cython-class
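In plain Python, the effect this decorator aims for can be approximated by implementing `__reduce__` by hand: return a callable plus the arguments needed to rebuild the object. A minimal sketch, with `Rational` as a made-up stand-in for a simple Pyav-like wrapper:

```python
import pickle

class Rational:
    """Made-up stand-in for a simple wrapper type (e.g. a frame rate)."""
    def __init__(self, num, den):
        self.num = num
        self.den = den

    def __reduce__(self):
        # Tell pickle: "to rebuild me, call Rational(num, den)".
        return (Rational, (self.num, self.den))

r = pickle.loads(pickle.dumps(Rational(30, 1)))
assert (r.num, r.den) == (30, 1)
```

This only works when the object's full state can be expressed as constructor arguments, which is exactly what PyAV's pointer-backed objects lack.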
Additional context
I am trying to use Pyav on Apache Spark, to distribute encoding across servers.
Pyspark (the Python implementation of Spark) uses `pickle` to transfer code object instances to servers. So, making Pyav classes picklable will allow Pyav jobs to be distributed on a farm of servers, with `pyspark` or other similar tools.