✨ [$4.000 BOUNTY 💰] Improve VideoPipeline (lower-overhead, no OpenGL, all pixel-formats, auto-scaling) #1837
Comments
TL;DR: As of today, VisionCamera uses an OpenGL pipeline on Android, which comes with a number of downsides.
Instead, I want to use an approach similar to how it works on iOS - just pass GPU buffers around. If someone has any ideas, please comment here. $4.000 bounty if you can solve this problem. Feel free to share this, e.g. if you know someone who works at Android/Google or someone with Camera2/android.media/OpenGL experience.
I found a temporary solution to the problem: #1874
Basically, I plug an ImageReader and an ImageWriter in-between the Camera and the OpenGL pipeline. The problem here is that this does not support RGB, but at least it works in YUV and PRIVATE. I think most Frame Processors need YUV anyways. As far as I can see, there is no real solution on Android right now.

Solution
I think this can be achieved if the Android OS adds two features:

I think 2. already works on most phones, but for some reason not on every phone. So 1. is the major point; if that works, we can scrap the entire OpenGL pipeline (a shit ton of C++ files). I think I will create a feature request for that in the Android issue tracker, but they probably won't care about that tbh. iOS already supports this since iOS 12.
NEON (SIMD) only works on ARM, and I'm definitely not gonna go down that rabbit hole.
CameraX has a stream sharing feature - this is kinda what I am aiming for. Really cool stuff, no need for me to bring everything into OpenGL. I hope it's as efficient as on iOS :)
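For reference, here's a rough sketch of what binding multiple use cases looks like with CameraX (assuming CameraX 1.3+ and androidx.camera.video for the Recorder; whether stream sharing actually kicks in depends on the device and the exact use-case combination):

```kotlin
import android.content.Context
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.video.Quality
import androidx.camera.video.QualitySelector
import androidx.camera.video.Recorder
import androidx.camera.video.VideoCapture
import androidx.core.content.ContextCompat
import androidx.lifecycle.LifecycleOwner

fun bindCameraUseCases(context: Context, lifecycleOwner: LifecycleOwner, preview: Preview) {
    // Blocking get() just to keep the sketch short.
    val provider = ProcessCameraProvider.getInstance(context).get()

    // Video recording use case (MediaCodec under the hood).
    val recorder = Recorder.Builder()
        .setQualitySelector(QualitySelector.from(Quality.HD))
        .build()
    val videoCapture = VideoCapture.withOutput(recorder)

    // Frame-processing use case; delivers YUV_420_888 ImageProxys on the given executor.
    val analysis = ImageAnalysis.Builder()
        .setOutputImageFormat(ImageAnalysis.OUTPUT_IMAGE_FORMAT_YUV_420_888)
        .build()
    analysis.setAnalyzer(ContextCompat.getMainExecutor(context)) { imageProxy ->
        // A Frame Processor would run here (imageProxy wraps an android.media.Image).
        imageProxy.close()
    }

    // If the device can't serve all of these as separate camera streams,
    // CameraX 1.3+ can fall back to sharing a single stream internally.
    provider.unbindAll()
    provider.bindToLifecycle(
        lifecycleOwner,
        CameraSelector.DEFAULT_BACK_CAMERA,
        preview, videoCapture, analysis
    )
}
```

The interesting part is that CameraX decides internally whether the device can serve all streams directly or whether it has to share one stream between use cases - which is exactly the kind of fallback being discussed here.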
What
$4.000 bounty to anyone who can solve this problem once and for all!!!!
One of VisionCamera's strong suits is its flexibility. You can configure the Camera for photo capture, video recording, a frame processor, or even all at once.
This is roughly how VisionCamera's Camera Session is set up: the Camera renders into three outputs (preview, photo, video), and the video output is the VideoPipeline that handles both recording and Frame Processing.
For the VideoPipeline, we have 6 requirements, including:
- support for the yuv, rgb and native (most efficient platform format) pixel formats
- Frames should ideally be of the android.media.Image type, but if that's not possible we can also use some other form of buffer (ByteBuffer/AHardwareBuffer)
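To make the requirements a bit more concrete, here's a rough shape of what such a pipeline could look like - purely illustrative, not VisionCamera's actual API, and all names are made up:

```kotlin
import android.media.Image

// Hypothetical sketch only - not VisionCamera's real API.
enum class PipelinePixelFormat { YUV, RGB, NATIVE }

interface VideoPipeline {
    // The pixel format the Frame Processor wants Frames in.
    val pixelFormat: PipelinePixelFormat

    // Called for every Camera frame - ideally synchronously and without CPU copies.
    fun onFrame(image: Image)

    // The same frames should simultaneously be encodable to a video file.
    fun startRecording(filePath: String)
    fun stopRecording()
}
```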
iOS
On iOS, this was relatively easy to implement:
The CMSampleBuffer type is amazing - it exposes the GPU-backed buffer to the CPU (IOSurface), but can also fully stay on the GPU without any CPU copies if we don't use MLKit and only record to a file.

For the 6 main requirements:
- The stream stays in the Camera's native pixel format (kCVPixelFormatType_420YpCbCr8BiPlanarFullRange) and doesn't involve any render passes or CPU copies.
- Buffers can come in any pixel format (yuv, rgb or native) and the AVAssetWriter understands all of them.
- The AVAssetWriter will handle scaling automatically if the buffers are a different size than what we originally configured it to.
- The Frame Processor receives the CMSampleBuffer, which is a GPU buffer. We can easily get CPU access on that using AVF APIs.

Android
On Android however, it seems like this approach is simply not possible. There is a new type in API 26 called HardwareBuffer which seems similar to CMSampleBuffer, but it is still not widely exposed in APIs and all of the Camera/android.media APIs expect Surfaces.

Here's a few things I tried:
1. Separate outputs
Use separate Camera outputs, MediaRecorder and ImageReader. This does not work because the Camera only allows 3 outputs, and we already have 3 (preview, photo, video). Also: we would at least get an android.media.Image in the ImageReader's callback.
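For illustration, a sketch of that wiring with Camera2 (API 28+ for SessionConfiguration; all names and sizes are made up, and the MediaRecorder is assumed to already be configured with a Surface video source and prepare()d):

```kotlin
import android.graphics.ImageFormat
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraDevice
import android.hardware.camera2.params.OutputConfiguration
import android.hardware.camera2.params.SessionConfiguration
import android.media.ImageReader
import android.media.MediaRecorder
import android.os.Handler
import android.view.Surface
import java.util.concurrent.Executor

fun createSeparateOutputsSession(
    device: CameraDevice,
    previewSurface: Surface,
    photoSurface: Surface,
    recorder: MediaRecorder,
    executor: Executor,
    handler: Handler,
    callback: CameraCaptureSession.StateCallback,
) {
    // Separate ImageReader just for the Frame Processor.
    val frameReader = ImageReader.newInstance(1920, 1080, ImageFormat.YUV_420_888, 3)
    frameReader.setOnImageAvailableListener({ reader ->
        val image = reader.acquireLatestImage()
        if (image != null) {
            // A Frame Processor would get a real android.media.Image here.
            image.close()
        }
    }, handler)

    val outputs = listOf(
        OutputConfiguration(previewSurface),
        OutputConfiguration(photoSurface),
        OutputConfiguration(recorder.surface),
        OutputConfiguration(frameReader.surface), // 4th stream - usually rejected by the device
    )
    device.createCaptureSession(
        SessionConfiguration(SessionConfiguration.SESSION_REGULAR, outputs, executor, callback)
    )
}
```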
2. ImageReader/ImageWriter
I tried to create an ImageReader/ImageWriter setup that just receives Images, then passes them through to the output Surface (MediaRecorder/MediaCodec).

This feels like the closest to what we have on iOS, and it seems like ImageReader/ImageWriter are really efficient as they are just moving buffers around.

...but I couldn't really get this to work. A really smart guy on StackOverflow said that it is not guaranteed that MediaRecorder/MediaCodec can be fed with Images from an ImageWriter, so sometimes it just silently crashes 🤦‍♂️

Also, I'm not sure what format the MediaRecorder/MediaCodec expects - so maybe this requires an additional conversion step in-between, which is just ridiculous.
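For reference, a minimal sketch of that pass-through idea (this is not the original snippet from the issue; names are made up, and whether the encoder's Surface accepts these Images is exactly the uncertainty mentioned above):

```kotlin
import android.graphics.ImageFormat
import android.media.ImageReader
import android.media.ImageWriter
import android.os.Handler
import android.view.Surface

// Returns the Surface the Camera should render into. Every Image the Camera
// produces is forwarded unchanged into the encoder's input Surface.
fun createPassThroughPipeline(
    width: Int,
    height: Int,
    encoderSurface: Surface, // e.g. MediaRecorder.getSurface() or MediaCodec.createInputSurface()
    handler: Handler,
): Surface {
    val imageReader = ImageReader.newInstance(width, height, ImageFormat.PRIVATE, 3)
    val imageWriter = ImageWriter.newInstance(encoderSurface, 3)

    imageReader.setOnImageAvailableListener({ reader ->
        val image = reader.acquireNextImage() ?: return@setOnImageAvailableListener
        // No render pass, no CPU copy - the buffer is just moved to the other queue.
        // ImageWriter takes ownership of the Image and closes it for us.
        // Whether the encoder accepts Images queued this way is device-dependent.
        imageWriter.queueInputImage(image)
    }, handler)

    return imageReader.surface
}
```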
For the 6 main requirements:
- The MediaRecorder/MediaCodec requires a PRIVATE format. If we feed it RGB/YUV data, it crashes.
- The Frame Processor receives the Image from the ImageReader.
- We get an android.media.Image in the ImageReader's callback.

3. Create a custom OpenGL Pipeline
I created a custom OpenGL pipeline that the Camera will render to, then we do a pass-through render pass to render the Frame to all the outputs.

But this has four major drawbacks, among them:
- It has more overhead than the ImageReader/ImageWriter approach, as we do an implicit RGB conversion and an actual render pass, whereas ImageReader/ImageWriter are just moving Image Buffers around (at least as far as I understood this).
- Frame Processing is no longer synchronous - the ImageReader gets called at a later point, so we could not really use information from the Frame to decide what gets rendered later (e.g. to apply a face filter).

As for the 6 main requirements:
- The Frame Processor is driven by an ImageReader which calls it at some later point, whenever it has an Image available. This could be solved though by rendering to a HardwareBuffer wrapped as a Texture/FBO, which we can then wrap using our Frame type. But then we no longer have an Image.
- We get an android.media.Image in the ImageReader's callback (unless we render to a HardwareBuffer; point 5.)

4. AHardwareBuffer
This is something I couldn't get working yet and I'm not sure if that's possible, but my theory is to receive HardwareBuffers (which should represent GPU memory afaik) instantly, then somehow pass them to a MediaRecorder/MediaCodec for encoding:
..but I have no idea how to get direct low-level access to such buffers from the Camera. Can the Camera only render to Surfaces? These Surface abstractions are really annoying.
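A sketch of the "receive HardwareBuffers" half (assuming API 26+ for this ImageReader overload and API 28+ for Image.getHardwareBuffer(); how to then feed those buffers into a MediaRecorder/MediaCodec without going through a Surface is still the open question):

```kotlin
import android.graphics.ImageFormat
import android.hardware.HardwareBuffer
import android.media.Image
import android.media.ImageReader
import android.os.Handler

// Returns an ImageReader whose Surface the Camera can render into. Each Image
// exposes the underlying AHardwareBuffer (GPU memory) without a CPU copy.
fun createHardwareBufferReader(width: Int, height: Int, handler: Handler): ImageReader {
    val reader = ImageReader.newInstance(
        width, height, ImageFormat.PRIVATE, 3,
        HardwareBuffer.USAGE_GPU_SAMPLED_IMAGE or HardwareBuffer.USAGE_VIDEO_ENCODE
    )
    reader.setOnImageAvailableListener({ r ->
        val image: Image = r.acquireLatestImage() ?: return@setOnImageAvailableListener
        val hardwareBuffer: HardwareBuffer? = image.hardwareBuffer
        // The buffer is only valid while the Image is open - wrap/import it here
        // (e.g. into GL/Vulkan, or a Frame type), then release the Image.
        image.close()
    }, handler)
    return reader
}
```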
5. FFmpeg
This is something I couldn't get working yet and I'm not sure if that's possible, but my theory is to use FFmpeg instead of MediaRecorder/MediaCodec to make the recording step simpler and more flexible:
..but I'm not sure if that would grant me any advantages. And also, I think ffmpeg is just using MediaRecorder/MediaCodec under the hood - so that's something I could build myself.
At this point I'm pretty clueless tbh. Is a synchronous video pipeline simply not possible at all in Android? I'd appreciate any pointers/help here, maybe I'm not aware of some great APIs.
Happy to pay $4.000 to anyone who comes up with a solution for this problem once and for all.