JSI Frame Processors (iOS) #2

Merged
merged 309 commits · May 6, 2021

Conversation


@mrousavy mrousavy commented Feb 20, 2021

Update (3.5.2021)

Frame Processors are ready for iOS 🎉. Here's what actually happened:

  • I had to create a custom AVCaptureVideoDataOutput delegate which uses a custom AVAssetWriter to write the video files (turned out quite complex!), since you cannot use an AVCaptureMovieFileOutput and an AVCaptureVideoDataOutput (for frame processing) delegate at the same time.
  • I had to create a custom AVCaptureAudioDataOutput delegate which uses a custom AVAssetWriter for the same reason as above
  • I had to figure out how Swift <-> (Objective-)C++ communication worked
  • I had to figure out how Reanimated implemented worklets.
  • I have created numerous PRs at the react-native-reanimated repo to fix a few threading issues, increase performance of worklets, and touched almost all files to extract the "workletization API" to make it re-usable for other libraries (VisionCamera) (#1768, #1859, #1860, #1861, #1863, #1883, #1984)
  • I created react-native-multithreading to experiment with the workletization API I extracted from react-native-reanimated in #1861
  • I spawned a separate JS-Runtime which uses the Reanimated workletization API to workletize frame processors and runs them.
  • I added a plugin API which uses Objective-C macros to "inject" functions into the VisionCamera JS-Runtime.
  • I added the possibility to use return values and parameters for the Plugins, which requires conversion between Objective-C types and JS types (jsi::Value)
  • I added Swift support for the Plugin API
  • I tweaked reanimated's babel plugin to allow passing custom globals (= frame processor plugins), PR yet to be merged.
  • I spent a few nights debugging threading issues (EXC_BAD_ACCESS errors), and submitted a PR to reanimated to fix those.
  • I wrote comprehensive documentation for frame processors, the plugin API and a step by step guide on how you can create your own plugins.
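
The plugin mechanism described above can be sketched from the JS side. This is a hypothetical, simulated version in TypeScript — the registry, registration function, and plugin name are all illustrative, not the real API; in the actual implementation an Objective-C macro performs the registration and the JSI layer converts between Objective-C and JS types:

```typescript
// Simulated "global" object of the separate frame-processor runtime.
// In the real library, native code injects functions here via JSI.
type Frame = { width: number; height: number };

const frameProcessorGlobals: Record<string, (frame: Frame, ...args: unknown[]) => unknown> = {};

// Hypothetical registration step, standing in for the Objective-C macro.
function registerPlugin(name: string, fn: (frame: Frame, ...args: unknown[]) => unknown): void {
  frameProcessorGlobals[`__${name}`] = fn;
}

// Example plugin taking a parameter and returning a value (illustrative only):
registerPlugin('scanAspectRatio', (frame, scale) => (frame.width / frame.height) * (scale as number));

const frame: Frame = { width: 1920, height: 1080 };
const result = frameProcessorGlobals['__scanAspectRatio'](frame, 2) as number;
```

The interesting part in the native implementation is the parameter/return-value conversion between Objective-C types and jsi::Value, which this sketch glosses over entirely.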

Right now I have compared RAM and CPU usage between the current main (ae1dde1) and the current frame-processor branch, and noticed that with the custom AVCaptureVideoDataOutput delegate the memory usage has increased from 58 MB to 187 MB (even with all the frame-processor code removed). That's a very high increase in memory, which is currently blocking me from merging this thing. I have tried debugging this to find out where it comes from, but couldn't figure it out (yet). I've posted this to StackOverflow for now: https://stackoverflow.com/q/67370456/5281431 - if anyone has an idea, let me know.

If you want, you can already write frame processor plugins. They're fairly easy to write, since you're usually just wrapping some existing API and forwarding its calls.

As for Android, I'll do that in a separate PR. I'll wait for #1960 and #1879 to be merged first, since those PRs change a lot of reanimated's Android infra.

Original comment:

Frame Processors

Implements "frame processors".

Frame Processors are functions created in JS that can be used to process frames from the Camera.
They will be called for each frame the camera sees, and can analyse the current frame using simple JS code. They can call other JS functions, but those have to be workletized (achieved with Reanimated).

Example use-cases include (but are not limited to):

  • Scan QR codes in realtime
  • Scan and Translate text in realtime
  • Use facial recognition APIs
  • Basically any AI stuff (Tensorflow, OpenCV)
  • Send frames over the network (realtime video chat, FaceTime)
  • Draw over the camera, e.g. QR code box (idea WIP)
  • Draw filters on the camera, e.g. depth-filters, Snapchat dog filter (idea WIP)

Other functions (such as AI stuff) have to be either workletized, or implemented natively using JSI (C++).

In other words, they're just like Reanimated worklets, but on a separate JS-Runtime specifically created for the Camera (it can be thought of as a form of JS multithreading).

const frameProcessor = useFrameProcessor((frame: Frame) => {
  const qrCodes = QRCodeAPI.scanCodes(frame)
  // TODO: draw box around QR code coordinates
}, []);

return <Camera frameProcessor={frameProcessor} />

Notes

  1. Frame processors must be called synchronously from the Camera thread to avoid thread hopping
  2. JS code must be workletized (e.g. external variables must be copied over into the new runtime) using REA
  3. Functions that have a C++ implementation (using JSI) can easily be used without workletization.
  4. You can assign to Reanimated SharedValues (e.g. for updating a QR code's frame which will be displayed), but unlike Reanimated worklets, the frame processor will not be re-run when a SharedValue changes (it's not a mapper). I still have to make sure all thread-checkers are correctly set up for this, see my PR Fix Threading issues (SV access from different Thread) software-mansion/react-native-reanimated#1883.
  5. Other libraries can provide react-native-vision-camera plugins which are basically just functions that operate on a frame. Those can be implemented in JS or C++. (e.g. QR Code detection, AI like Tensorflow or MLKit, etc)

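Note 2 above can be illustrated with a deliberately simplified model of workletization (this is not Reanimated's actual implementation): values captured from the outer scope are copied into the worklet's own runtime at creation time, rather than shared by reference.

```typescript
// Simplified sketch of workletization: captured outer-scope values are
// deep-copied into the worklet's runtime when the worklet is created,
// as if serialized into a separate JS runtime.
function workletize<T, R>(
  fn: (input: T, captured: Record<string, unknown>) => R,
  capturedValues: Record<string, unknown>
): (input: T) => R {
  const copied = JSON.parse(JSON.stringify(capturedValues)); // copy, don't share
  return (input: T) => fn(input, copied);
}

let threshold = { value: 10 };
const worklet = workletize(
  (x: number, captured) => x > (captured.threshold as { value: number }).value,
  { threshold }
);

threshold.value = 100;       // mutating the original after workletization has no effect
const accepted = worklet(50); // still compared against the copied value, 10
```

This copying is exactly why SharedValues are needed for communication back out of a worklet: plain captured values are frozen snapshots.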
Frame

The Frame object should represent a single frame from the camera. I'm not entirely sure yet what properties this frame will have, but ideally I want to support:

  1. pixels. An array of the pixels this camera sees. This should be sized to the Preview View, not some huge format like 4k, because that would mean each frame object occupies something like 25 MB. You can read pixels like an array: frame.pixels[0] -> { r: number, g: number, b: number } - I am not sure how this will work across the platforms, since they don't stream RGB but rather YUV (Android) or some other color space (iOS).
  2. depth. An array of depth information for each pixel. Not sure if I can embed that into the pixels array or have it separately. We'll see.

Base Plugins

Since there won't be any easy-to-use frame processor plugins available at release, I thought of creating base plugins that are always supported, either in this lib or in a separate package/repo.

  1. QR Code detection: Detects QR codes in a Frame.

Maybe I can also use iOS/Android specific APIs for faster execution speed, e.g. Metal or iOS AI tools.
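
A hypothetical call signature for such a QR code base plugin (all names are illustrative; the detector below is a stub standing in for the native implementation):

```typescript
// Hypothetical QR code plugin surface; a real plugin would call into
// native detection APIs rather than the stub below.
interface Bounds { x: number; y: number; width: number; height: number }
interface QRCode { value: string; bounds: Bounds }
interface Frame { width: number; height: number }

// Stub detector: always "finds" one code, to show the usage pattern.
function scanQRCodes(_frame: Frame): QRCode[] {
  return [{ value: 'https://example.com', bounds: { x: 12, y: 34, width: 80, height: 80 } }];
}

// Usage pattern inside a frame processor:
const frame: Frame = { width: 1920, height: 1080 };
const codes = scanQRCodes(frame);
const firstValue = codes.length > 0 ? codes[0].value : undefined;
```
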

Implementation

  1. Create C++/JSI base. Link C++ files to Android/iOS.
  2. Create HostObject for the Camera that "installs" the JSI bindings, in our case something like "setFrameProcessorForCamera(viewTag: number, frameProcessor: FrameProcessor)"
  3. In that function, workletize the frameProcessor parameter on a separate thread using a Reanimated API. See: Multithreading with worklets software-mansion/react-native-reanimated#1561
  4. Once workletized, call into the Swift/Kotlin Camera APIs to pass them the newly created worklet.
  5. The native Camera APIs should then notice that a frame processor is available, install a metadata/image analyzer, and call into that jsi::Function.
  6. Call that JSI function from the Camera.ts view where the result gets memoized.
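
The steps above can be sketched as a simulated JS-side flow. Only setFrameProcessorForCamera comes from step 2 of the list; everything else here (the registry, onNativeFrame) is illustrative stand-in code, not the real JSI binding:

```typescript
// Simulated installation flow: a JSI binding stores the (workletized)
// frame processor per camera view, and the native camera invokes it per frame.
type Frame = { timestamp: number };
type FrameProcessor = (frame: Frame) => void;

const processors = new Map<number, FrameProcessor>();

// Stand-in for the installed JSI binding from step 2.
function setFrameProcessorForCamera(viewTag: number, frameProcessor: FrameProcessor): void {
  processors.set(viewTag, frameProcessor);
}

// Stand-in for the native side (step 5): on each captured frame,
// look up the processor for that view and call it synchronously.
function onNativeFrame(viewTag: number, frame: Frame): void {
  processors.get(viewTag)?.(frame);
}

const seen: number[] = [];
setFrameProcessorForCamera(42, (frame) => seen.push(frame.timestamp));
onNativeFrame(42, { timestamp: 1 });
onNativeFrame(42, { timestamp: 2 });
```

In the real implementation the processor passed in would first be workletized (step 3) so it can run on the camera thread's own runtime.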

Swift <-> C++

Since the library is written in Swift, we need a way to interop with JSI (written in C++).

I've found the following solutions:

  • Write a custom .cpp wrapper, which manually wraps the calls I need in an extern "C" block so I can call it from Swift. I have to be careful with UnsafePointer<T> and deallocating the objects, since that is not done automatically (no ARC)
  • Use scapix bridge for automatic bridge code-generation between Swift and C++.

I really wish Swift had direct C++ interop. This incompatibility is making this whole thing really hard.

Tasks

Following is already done:

  • Copy over needed headers and code from Karol's "multithreading with worklets" PR
  • Create Objective-C++ bindings to JSI
  • Import ObjC++ header and install bindings for JSI from Swift codebase
  • Create new JSI runtime
  • Attach to VideoOutputDelegate in Swift CameraView (actually capture frames)
  • Import NativeReanimatedModule (roadblock: Use dependency with non-modular-headers in library where I don't need the modular headers CocoaPods/CocoaPods#10472)
  • Get NativeReanimatedModule instance to call makeShareable ("workletize") (roadblock: No idea how to get a TurboModule outside of a TurboModule context.)
  • Find a suitable format for the captured output. jsi::HostObject with custom accessors, group bytes in color format?
  • Call JSI function with captured output data (jsi::HostObject)
  • Add a solution for third party libraries to register their frame processor plugins
  • Do the same things for Android

Testing

Since this PR required a few refactors and restructures of the react-native-reanimated library, I have created a PR over there: software-mansion/react-native-reanimated#1790. Because that PR is not merged yet, you have to have those changes locally - either install react-native-reanimated directly from that GitHub branch, or install it normally through npm (2.0.0), download the repo from my PR's branch, and drag the ios and Common files from my PR into your react-native-vision-camera/example/node_modules/react-native-reanimated folder.

Android and .aars

Android is a bit more complicated than iOS because react-native-reanimated is not distributed as source. I will have to play around with CMake to try and get the react-native-reanimated.aar file (that lives on the end-user's machine) embedded into my library; no idea if that's even possible.

Maybe useful links:

@mrousavy mrousavy self-assigned this Feb 20, 2021
@mrousavy mrousavy marked this pull request as draft February 25, 2021 11:54

mrousavy commented Feb 25, 2021

EDIT

Got it. Finally managed to call Swift code from Objective-C++.

Original comment

I'm having trouble calling the Swift function from the Objective-C++ file. :(

Apparently CocoaPods renames the ObjC Generated Interface Header Name from VisionCamera-Swift.h to react_native_vision_camera-Swift.h, and I can't import that without build errors.

Mission: I want to call a Swift function (actually just use the CameraView type) in this line.

If anyone has an idea, please let me know as this is blocking me a lil bit

EDIT: Wtf am I doing wrong?

(screenshots of the Xcode build errors attached)

@davidgovea

this is awesome.

I don't have time to contribute at the moment, but I wanted to drop in here and cheer you on 🙌

@mrousavy

🚀🎉 Got it running successfully:

(demo video attached: Screen.Recording.2021-03-11.at.10.54.51.mov)

@nandorojo

This is exciting. Seems like a good abstraction of worklets in general.

@mrousavy mrousavy mentioned this pull request Mar 29, 2021
@mrousavy

mrousavy commented Mar 31, 2021

My PR at reanimated was finally merged!: software-mansion/react-native-reanimated@815847e 🎉🎉

@@ -213,6 +219,14 @@ export const App: NavigationFunctionComponent = ({ componentId }) => {
console.log('re-rendering camera page without active camera');
}

const frameProcessor = useFrameProcessor((frame) => {
[tsc] <6133> reported by reviewdog 🐶
'frameProcessor' is declared but its value is never read.

@@ -234,6 +248,12 @@ export const App: NavigationFunctionComponent = ({ componentId }) => {
onError={onError}
enableZoomGesture={false}
animatedProps={cameraAnimatedProps}
frameProcessor={(frame) => {
[tsc] <7006> reported by reviewdog 🐶
Parameter 'frame' implicitly has an 'any' type.

@@ -213,6 +217,12 @@ export const App: NavigationFunctionComponent = ({ componentId }) => {
console.log('re-rendering camera page without active camera');
}

const frameProcessor = useFrameProcessor((frame) => {
[tsc] <6133> reported by reviewdog 🐶
'frame' is declared but its value is never read.

@mrousavy mrousavy marked this pull request as ready for review May 2, 2021 12:59
@mrousavy mrousavy changed the title JSI Frame Processors (REA Worklets) JSI Frame Processors (iOS) May 2, 2021
@mrousavy mrousavy merged commit b6a67d5 into main May 6, 2021
@mrousavy mrousavy deleted the frame-processors branch May 6, 2021 12:11
trungthanhnt pushed a commit to trungthanhnt/react-native-vision-camera that referenced this pull request Nov 1, 2023
@MajorTom007 MajorTom007 mentioned this pull request Nov 14, 2023
4 tasks
@cseluzha cseluzha mentioned this pull request Dec 12, 2023
5 tasks
Labels
🍏 ios Issue affects the iOS platform