feat: Add VisionCamera integration (imageFromFrame) #199
Conversation
That's pretty exciting! I solved the memory leak, at least partially, because there seem to be other resources that are not deallocated. There were two issues that I fixed, and a BE to clean up eventual legacy code:
Answering TODOs:
Agree! Additional TODOs:
// ...
import { torch, media, torchvision } from 'react-native-pytorch-core';
import type { Module, Tensor } from 'react-native-pytorch-core';
const T = torchvision.transforms;
const resizeTensor = T.resize([224, 224]);
const normalizeTensor = T.normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]);
// ...
const countRef = useRef(0);
// ...
const frameProcessor = useFrameProcessor((frame) => {
'worklet';
// Increment a counter to see at which frame the frame
// processor stops processing
countRef.current += 1;
console.log(`Frame: ${frame.width}x${frame.height}`);
const imageHighRes = media.imageFromFrame(frame);
const image = imageHighRes.scale(0.25, 0.25);
imageHighRes.release();
const width = image.getWidth();
const height = image.getHeight();
console.log(`Converted ${countRef.current}! ${width}/${height}`);
const blob = media.toBlob(image);
let tensor = torch.fromBlob(blob, [height, width, 3]);
image.release();
blob.release();
// Helper function to release input tensor before returning newly
// constructed tensor. This is for testing purposes, and will need
// to change to an API transparent to the developer.
function applyAndFreeTensor(inputTensor: Tensor, func: (tensor: Tensor) => Tensor): Tensor {
const newTensor = func(inputTensor);
inputTensor.release();
return newTensor;
}
tensor = applyAndFreeTensor(tensor, (tensor) => tensor.permute([2, 0, 1]));
tensor = applyAndFreeTensor(tensor, (tensor) => tensor.div(255));
tensor = applyAndFreeTensor(tensor, (tensor) => {
const centerCrop = T.centerCrop(Math.min(width, height));
return centerCrop(tensor);
});
tensor = applyAndFreeTensor(tensor, resizeTensor);
tensor = applyAndFreeTensor(tensor, normalizeTensor);
tensor = applyAndFreeTensor(tensor, tensor => tensor.unsqueeze(0));
console.log('shape', tensor.shape);
// // How to load "model" (i.e., ModuleHostObject)?
// const output = model.forwardSync<[Tensor], Tensor>(tensor);
// console.log('output', output.shape);
// output.release();
tensor.release();
}, [countRef]);
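One thing the chain above doesn't handle: if any step throws, every intermediate allocated so far leaks, which alone could stall the processor once the native buffer pool is drained. A defensive variant releases the last live intermediate before rethrowing. The sketch below is self-contained and uses a minimal Releasable interface as a stand-in for the release() contract the Image/Blob/Tensor host objects appear to share; it is not the actual react-native-pytorch-core API:

```typescript
// Minimal stand-in for the release() contract shared by the Image,
// Blob, and Tensor host objects in the snippet above (assumption,
// not the real react-native-pytorch-core types).
interface Releasable {
  release(): void;
}

// Run a chain of steps, releasing each intermediate as soon as its
// successor exists, and releasing the last live value if a step
// throws. The final result is NOT released; the caller owns it.
function runPipeline<T extends Releasable>(
  input: T,
  steps: Array<(value: T) => T>,
): T {
  let current = input;
  try {
    for (const step of steps) {
      const next = step(current);
      if (next !== current) {
        current.release(); // previous intermediate is no longer needed
      }
      current = next;
    }
  } catch (err) {
    current.release(); // don't leak the last live intermediate
    throw err;
  }
  return current;
}

// Tiny mock to demonstrate the ownership rules outside React Native.
class MockTensor implements Releasable {
  released = false;
  constructor(public readonly label: string) {}
  release(): void {
    this.released = true;
  }
}

const inputTensor = new MockTensor('input');
const output = runPipeline(inputTensor, [
  (t) => new MockTensor(`${t.label}->permute`),
  (t) => new MockTensor(`${t.label}->div`),
]);
// inputTensor is released; output is still alive and caller-owned.
```

With something like this, the repeated applyAndFreeTensor calls in the frame processor could collapse into a single runPipeline call, and a throw in, say, centerCrop would no longer leak the tensors allocated before it.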
Thanks for your feedback and rapid change in the GC @raedle!
Yes, VisionCamera V3 will require RN 0.71 due to the much simpler buildscript.
Yea, I mean CMSampleBuffer itself is a GPU buffer, so converting that to a
I think this fully relies on an understanding of Worklets. We want to move as much work as possible outside of the Worklet, so loading the model (asynchronous) has to be done outside the Worklet. Also, Worklets have some limitations.
Looking at your code:
Needs to be
Is that sync? If it's sync, it should work. If it's async/awaitable, it's not gonna work and shouldn't be part of the Worklet. Also, in the latest RN Worklets lib we made some fixes to identify HostObjects correctly, so maybe try the latest commit from master instead of the current pinned version :)
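The "load the model outside the Worklet" point can be sketched as follows: await the expensive async load once on the JS thread, cache the resulting host object, and have the synchronous per-frame path only call the already-loaded model. The loadModel/forwardSync names below are illustrative stand-ins, not the actual react-native-pytorch-core API:

```typescript
// Illustrative model type: only a synchronous forward method, which
// is all the frame processor is allowed to touch.
type Model = { forwardSync: (input: number[]) => number[] };

let loadCalls = 0;

// Stand-in for an expensive async native load (assumption, not the
// real react-native-pytorch-core loader).
async function loadModel(path: string): Promise<Model> {
  loadCalls += 1; // track how often we actually hit the "native" load
  return { forwardSync: (input) => input.map((x) => x * 2) };
}

// Cache the model outside the (conceptual) Worklet: load once on the
// JS thread, reuse the host object on every frame afterwards.
let cached: Model | null = null;
async function getModel(): Promise<Model> {
  if (cached === null) {
    cached = await loadModel('model.ptl'); // hypothetical asset name
  }
  return cached;
}

// Synchronous "frame processor" body: never awaits, only calls the
// already-loaded model.
function processFrame(model: Model, frame: number[]): number[] {
  return model.forwardSync(frame);
}
```

In a real component, getModel() would run in an effect or hook before the camera mounts, and only the cached host object would be captured by the useFrameProcessor worklet.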
Summary
Targets VisionCamera V3 (mrousavy/react-native-vision-camera#1466) so you can use PyTorch Core inside a Frame Processor. Example: mrousavy/react-native-vision-camera#1485
This makes it possible to use Models (like face detection, object classification, etc) inside a VisionCamera Frame Processor straight from JS without touching native code at all (no native Frame Processor Plugins!) 🚀
cc @raedle
Still WIP - proof of concept. iOS has a small memory issue and Android isn't tested yet.
WIP - Current TODOs:
Context
in the Android implementation. We're in a static method right now.
Changelog
[CATEGORY] [TYPE] - Message
Test Plan
EDIT: I got it working!
This is the code I used:
This runs ~10 times (the max buffer size), but then stops because the image.release() call doesn't properly release all resources. I have no idea how ref counting works in this repo, so I'd appreciate some pointers here? But in theory it works, so this could be a pretty cool integration :)