feat: Add VisionCamera integration (imageFromFrame) #199
Conversation
That's pretty exciting! I solved the memory leak, at least partially, because there seem to be other resources that are not deallocated. There were two issues that I fixed, and a BE to clean up eventual legacy code:
Answering TODOs:
Agree! Additional TODOs:
// ...
import { torch, media, torchvision } from 'react-native-pytorch-core';
import type { Module, Tensor } from 'react-native-pytorch-core';
const T = torchvision.transforms;
const resizeTensor = T.resize([224, 224]);
const normalizeTensor = T.normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]);
// ...
const countRef = useRef(0);
// ...
const frameProcessor = useFrameProcessor((frame) => {
'worklet';
// Increment a counter to see at which frame the frame
// processor stops processing
countRef.current += 1;
console.log(`Frame: ${frame.width}x${frame.height}`);
const imageHighRes = media.imageFromFrame(frame);
const image = imageHighRes.scale(0.25, 0.25);
imageHighRes.release();
const width = image.getWidth();
const height = image.getHeight();
console.log(`Converted ${countRef.current}! ${width}/${height}`);
const blob = media.toBlob(image);
let tensor = torch.fromBlob(blob, [height, width, 3]);
image.release();
blob.release();
// Helper function to release input tensor before returning newly
// constructed tensor. This is for testing purposes, and will need
// to change to an API transparent to the developer.
function applyAndFreeTensor(inputTensor: Tensor, func: (tensor: Tensor) => Tensor): Tensor {
const newTensor = func(inputTensor);
inputTensor.release();
return newTensor;
}
tensor = applyAndFreeTensor(tensor, (tensor) => tensor.permute([2, 0, 1]));
tensor = applyAndFreeTensor(tensor, (tensor) => tensor.div(255));
tensor = applyAndFreeTensor(tensor, (tensor) => {
const centerCrop = T.centerCrop(Math.min(width, height));
return centerCrop(tensor);
});
tensor = applyAndFreeTensor(tensor, resizeTensor);
tensor = applyAndFreeTensor(tensor, normalizeTensor);
tensor = applyAndFreeTensor(tensor, tensor => tensor.unsqueeze(0));
console.log('shape', tensor.shape);
// // How to load "model" (i.e., ModuleHostObject)?
// const output = model.forwardSync<[Tensor], Tensor>(tensor);
// console.log('output', output.shape);
// output.release();
tensor.release();
}, [countRef]);
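One thing the chain above doesn't handle: if any step throws, every intermediate allocated so far leaks, which alone could stall the processor once the native buffer pool is drained. A defensive variant releases the last live intermediate before rethrowing. The sketch below is self-contained and uses a minimal Releasable interface as a stand-in for the release() contract the Image/Blob/Tensor host objects appear to share; it is not the actual react-native-pytorch-core API:

```typescript
// Minimal stand-in for the release() contract shared by the Image,
// Blob, and Tensor host objects in the snippet above (assumption,
// not the real react-native-pytorch-core types).
interface Releasable {
  release(): void;
}

// Run a chain of steps, releasing each intermediate as soon as its
// successor exists, and releasing the last live value if a step
// throws. The final result is NOT released; the caller owns it.
function runPipeline<T extends Releasable>(
  input: T,
  steps: Array<(value: T) => T>,
): T {
  let current = input;
  try {
    for (const step of steps) {
      const next = step(current);
      if (next !== current) {
        current.release(); // previous intermediate is no longer needed
      }
      current = next;
    }
  } catch (err) {
    current.release(); // don't leak the last live intermediate
    throw err;
  }
  return current;
}

// Tiny mock to demonstrate the ownership rules outside React Native.
class MockTensor implements Releasable {
  released = false;
  constructor(public readonly label: string) {}
  release(): void {
    this.released = true;
  }
}

const inputTensor = new MockTensor('input');
const output = runPipeline(inputTensor, [
  (t) => new MockTensor(`${t.label}->permute`),
  (t) => new MockTensor(`${t.label}->div`),
]);
// inputTensor is released; output is still alive and caller-owned.
```

With something like this, the repeated applyAndFreeTensor calls in the frame processor could collapse into a single runPipeline call, and a throw in, say, centerCrop would no longer leak the tensors allocated before it.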
Thanks for your feedback and rapid change in the GC @raedle!
Yes, VisionCamera V3 will require RN 0.71 due to the much simpler buildscript.
Yea, I mean CMSampleBuffer itself is a GPU buffer, so converting that to a
I think this fully relies on an understanding of Worklets. We want to move as much work as possible outside of the Worklet, so loading the model (asynchronous) has to be done outside the Worklet. Also, Worklets have some limitations.
Looking at your code:
Needs to be
Is that sync? If it's sync, it should work. If it's async/awaitable, it's not gonna work and shouldn't be part of the Worklet. Also, in the latest RN Worklets lib we made some fixes to identify HostObjects correctly, so maybe try the latest commit from master instead of the current pinned version :)
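The "load the model outside the Worklet" point can be sketched as follows: await the expensive async load once on the JS thread, cache the resulting host object, and have the synchronous per-frame path only call the already-loaded model. The loadModel/forwardSync names below are illustrative stand-ins, not the actual react-native-pytorch-core API:

```typescript
// Illustrative model type: only a synchronous forward method, which
// is all the frame processor is allowed to touch.
type Model = { forwardSync: (input: number[]) => number[] };

let loadCalls = 0;

// Stand-in for an expensive async native load (assumption, not the
// real react-native-pytorch-core loader).
async function loadModel(path: string): Promise<Model> {
  loadCalls += 1; // track how often we actually hit the "native" load
  return { forwardSync: (input) => input.map((x) => x * 2) };
}

// Cache the model outside the (conceptual) Worklet: load once on the
// JS thread, reuse the host object on every frame afterwards.
let cached: Model | null = null;
async function getModel(): Promise<Model> {
  if (cached === null) {
    cached = await loadModel('model.ptl'); // hypothetical asset name
  }
  return cached;
}

// Synchronous "frame processor" body: never awaits, only calls the
// already-loaded model.
function processFrame(model: Model, frame: number[]): number[] {
  return model.forwardSync(frame);
}
```

In a real component, getModel() would run in an effect or hook before the camera mounts, and only the cached host object would be captured by the useFrameProcessor worklet.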
Summary
Targets VisionCamera V3 (mrousavy/react-native-vision-camera#1466) so you can use PyTorch Core inside a Frame Processor. Example: mrousavy/react-native-vision-camera#1485
This makes it possible to use Models (like face detection, object classification, etc) inside a VisionCamera Frame Processor straight from JS without touching native code at all (no native Frame Processor Plugins!) 🚀
cc @raedle
Still WIP - proof of concept. iOS has a small memory issue and Android isn't tested yet.
WIP - Current TODOs:
Context
in the Android implementation. We're in a static method right now.
Changelog
[CATEGORY] [TYPE] - Message
Test Plan
EDIT: I got it working!
This is the code I used:
This runs ~10 times (the max buffer size), but then stops because the image.release() call doesn't properly release all resources. I have no idea how ref counting works in this repo, so I'd appreciate some pointers here? But in theory it works, so this could be a pretty cool integration :)