Invisibility Cloaking via Real-time Object Detection in Video

This extra task extends the special-effect, wizard-style invisibility cloaking to use real-time object detection and segmentation (specifically a neural-network-based technique known as Mask R-CNN).

Task Setup

Some quick steps to get you set up for object detection:

  1. In the browser, download and save the code file invisible_objects.py (left click mouse, "Save Page As ...").

    If you are working on a shared account (i.e. as a visitor to Computer Science at Durham University), save this in the directory you created earlier (i.e. yourfirstname-initial/invisible_objects.py or similar) to avoid file conflicts with other users.

    Download Task 5

  2. In the browser, download and save the script file download-model.sh (left click mouse, "Save Page As ...") to the same directory you used in Step (1):

    Download script

  3. Open a command line Terminal as follows:

    Open Terminal

    Only if you saved the files to a directory you created earlier, first enter the following command to change to this directory (replacing yourfirstname-initial with your directory name):

    cd yourfirstname-initial
    

    ... and then enter the following command to download the pre-trained object detection models we are going to use:

    bash ./download-model.sh
    

    After the downloads complete, you should see some final output in the Terminal as follows:

    ....
    Performing MD5 file verification checks ...
    object_detection_classes_coco.txt: OK
    mask_rcnn_inception_v2_coco_2018_01_28.pbtxt: OK
    mask_rcnn_inception_v2_coco_2018_01_28/frozen_inference_graph.pb: OK
    
    

    This means the download was successful and the files were verified correctly (using a mathematical checksum).

  4. Finally, open the downloaded file invisible_objects.py in Visual Studio Code (File menu -> Open File) as follows:

    Open File
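The "MD5 file verification" reported in step 3 can be illustrated with a short Python sketch. This is not the code inside download-model.sh (which is not shown here); the function names `md5_of_file` and `verify` are hypothetical, and the expected checksum would have to come from whoever published the file.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # read in chunks so that large model files need not fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file exists and its MD5 digest matches `expected_md5`."""
    return Path(path).is_file() and md5_of_file(path) == expected_md5.lower()
```

If even a single byte of a downloaded file differs from the original, the digest changes and the check fails, which is why an "OK" next to each file name indicates a complete download.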

Task 5 - Invisibility Cloaking via Object Detection

In this extra exercise, we make use of a deep machine learning based object detection model, specifically a Mask Region-based Convolutional Neural Network (Mask R-CNN). This model has been pre-trained to detect up to 90 different types of object, including people, using thousands of labelled image examples. The Mask R-CNN model takes an input image (e.g. from the camera) and produces object detections in the form of bounding boxes (i.e. regions), object masks showing the pixels in the image that belong to each object, and class (or type) labels saying what each object is.

Mask R-CNN object detection

Here we make use of a pre-trained model to be able to detect object masks within the image. We can then use these R-CNN generated object masks as the foreground mask for our invisibility cloaking approach in place of using the green material.
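As a rough illustration of how an object mask can stand in for the green material, the sketch below swaps every masked pixel for the corresponding pixel of the stored background image. It is a minimal NumPy example under assumed array shapes, not the code used in invisible_objects.py; the function name `cloak` is hypothetical.

```python
import numpy as np


def cloak(frame, background, object_mask):
    """Replace the masked pixels of `frame` with those of `background`.

    frame, background: H x W x 3 uint8 images of identical size.
    object_mask: H x W array, non-zero wherever a detected object is.
    """
    # H x W x 1 boolean mask, so that it broadcasts over the colour channels
    mask = object_mask.astype(bool)[:, :, np.newaxis]
    return np.where(mask, background, frame)
```

Wherever the mask is non-zero the viewer sees the background captured earlier, so the detected object appears to vanish; everywhere else the live camera frame is kept unchanged.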

To try this out:

  • run the downloaded code file invisible_objects.py in Visual Studio Code (click "Run > Run Without Debugging")
  • you should now see 2 image windows displayed - one containing an initial background image, and one containing the mask output of the Mask R-CNN model (with each type of object over-shaded with a different colour)
  • you may need to resize the live image view window with the mouse (you can also turn fullscreen on/off by pressing f)

As before, you can reset the background image by pressing the space key, but now you can also press the i key to turn on invisibility for the detected objects in the scene.

object cloaking

However, you may notice that the program runs slowly and sluggishly. This is because all of the computation required by the Mask R-CNN model is being carried out on the CPU (Central Processing Unit, the main PC processor). To overcome this, we can switch to the GPU (Graphics Processing Unit), which can perform the arithmetic operations of the neural network model much more efficiently. To do this, enter the command opencv.init into the command terminal at the bottom of the Visual Studio Code window as follows: run opencv.init

On the Durham Linux system, this re-initialises the Python software environment to use the GPU (i.e. OpenCV with CUDA enabled). You should now be able to re-run the code file and get a high refresh rate, reset the background (pressing the space key) and toggle invisibility on/off with ease (pressing the i key) to see things like this:

object cloaking
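To put a number on the CPU versus GPU difference, a simple timing loop can be wrapped around whatever function processes one frame. The sketch below uses only the Python standard library; `process_frame` is a hypothetical stand-in for the Mask R-CNN processing step and is not a function from invisible_objects.py.

```python
import time


def frames_per_second(process_frame, frame, repeats=20):
    """Call process_frame(frame) `repeats` times; return the average frames per second."""
    start = time.perf_counter()
    for _ in range(repeats):
        process_frame(frame)
    elapsed = time.perf_counter() - start
    # guard against a zero-length interval for extremely fast functions
    return repeats / elapsed if elapsed > 0 else float("inf")
```

Running the same measurement before and after entering opencv.init would give two frame rates that can be compared directly.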

If the object invisibility masking is not perfect, this is due to the simplicity of the Mask R-CNN object masks - you could try fixing this by increasing the number of morphological operations used to clean up the foreground mask in the code (hint: see code lines 161 - 168).
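To see what an extra morphological operation does to a mask, here is a pure-NumPy stand-in for a binary dilation with a cross-shaped 3x3 neighbourhood. It is only an illustration and is not the code at the lines mentioned above; with OpenCV the same effect would normally come from cv2.dilate or cv2.morphologyEx, and the function name `dilate` is hypothetical.

```python
import numpy as np


def dilate(mask, iterations=1):
    """Grow the non-zero region of a 2D mask by one pixel (up, down, left, right) per iteration."""
    out = mask.astype(bool)
    for _ in range(iterations):
        p = np.pad(out, 1, mode="constant", constant_values=False)
        # a pixel becomes foreground if it, or any of its 4 neighbours, is foreground
        out = (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
               | p[1:-1, :-2] | p[1:-1, 2:])
    return out
```

Each additional iteration extends the mask by one more pixel around the detected object, which helps hide thin outlines that the original mask missed, at the cost of also hiding a little more of the surrounding scene.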

How does this work?

This approach uses a trained object detection and segmentation model known as a Mask Region-based Convolutional Neural Network, which uses a series of convolutional processing layers to extract intermediate feature representations of the objects in the scene.

By training this neural network model on many thousands of labelled image examples of different types of objects, we can set the weights of these convolutional operations so that the extracted feature representations allow the network to recognise different types of objects by label (i.e. "What is it?" - [person, car, dog, cat, cow, ...]) and to localise them within the scene (i.e. show us either a mask region or a bounding box).

If you want to see all the different types of objects that the model we are using is trained on, open the file object_detection_classes_coco.txt in Visual Studio Code.
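The same list can also be read from Python. The sketch below assumes the file holds one class label per line (an assumption about its layout, which you can confirm by opening it); the function name `load_class_names` is hypothetical.

```python
from pathlib import Path


def load_class_names(path="object_detection_classes_coco.txt"):
    """Return the class labels as a list, assuming one label per line."""
    text = Path(path).read_text(encoding="utf-8")
    # skip blank lines and strip surrounding whitespace
    return [line.strip() for line in text.splitlines() if line.strip()]
```

For example, `names = load_class_names()` followed by `print(len(names), names[:5])` would show how many labels there are and the first few of them.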

The current version of the code file we are using (invisible_objects.py) is set up just to display the object masks so that we can use them for our invisibility task. If you want to see the complete set of object masks, bounding boxes, labels and confidence levels for all objects, download and run this mask-rcnn.py code file. (N.B. to get it to run on the GPU by default, edit line 68; to make it run even more efficiently, additionally download the camera_steam.py threaded camera capture module.)

full Mask R-CNN

Some other things to try ...

If time allows, you may want to try editing the invisible_objects.py code file to play with the following features:

  • changing the confidence threshold for object detection (What is the effect? - hint: see code line 116)
  • selectively making only certain objects invisible (hint: see code lines 138-142)
  • adding a timing loop to compare CPU vs. GPU execution times (hint: see simple example here)
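The first two ideas both come down to filtering the list of detections before any mask is used. The sketch below shows one way this could look; it assumes the detections arrive as an array of shape (1, 1, N, 7) whose rows are [batch_id, class_id, confidence, left, top, right, bottom], which is the layout used by the OpenCV Mask R-CNN sample but should be checked against the actual code. The function name `select_detections` and its parameters are hypothetical.

```python
import numpy as np


def select_detections(boxes, class_names, threshold=0.5, wanted=None):
    """Return (index, label, confidence) for each detection to be cloaked.

    boxes: array of shape (1, 1, N, 7); each row is assumed to be
           [batch_id, class_id, confidence, left, top, right, bottom].
    wanted: optional set of labels to keep (None keeps every class).
    """
    selected = []
    for i, row in enumerate(boxes[0, 0]):
        class_id, confidence = int(row[1]), float(row[2])
        if confidence < threshold:
            continue  # too uncertain: ignore this detection
        label = class_names[class_id]
        if wanted is not None and label not in wanted:
            continue  # not one of the object types we want to hide
        selected.append((i, label, confidence))
    return selected
```

Raising `threshold` keeps only detections the model is confident about (fewer false masks, but more missed objects); lowering it does the opposite. Passing, say, `wanted={"person"}` would cloak people while leaving every other detected object visible.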

Additional Info

Instructor Notes: tested with OpenCV 4.6.x (08/2022) on Durham University LDS (Debian Linux) + OpenSuSE Linux Tumbleweed. The use of the opencv.init command to activate a version of OpenCV built against CUDA is specific to the setup on the DU LDS system; the general alternative is to build OpenCV from source with WITH_CUDA and OPENCV_DNN_CUDA enabled. For any configuration, this can be tested via this version.py script.
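As a quick local check (a minimal sketch, not the version.py script referred to above), the following reports the installed OpenCV version and how many CUDA-capable devices OpenCV can see. A count of 0 usually indicates a build without CUDA support. The function name `opencv_cuda_summary` is hypothetical.

```python
def opencv_cuda_summary():
    """Return a small dict describing the OpenCV build, or None if cv2 cannot be imported."""
    try:
        import cv2
    except ImportError:
        return None
    cuda = getattr(cv2, "cuda", None)
    # getCudaEnabledDeviceCount() returns 0 when OpenCV was built without CUDA
    devices = cuda.getCudaEnabledDeviceCount() if cuda is not None else 0
    return {"version": cv2.__version__, "cuda_devices": devices}


if __name__ == "__main__":
    print(opencv_cuda_summary())
```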

Developer Notes: to add - additional versions with slider controls for object confidence. If using this code in anger beyond a beginner-level lab demo, consider using the camera_steam.py module for threaded camera capture.

Acknowledgements: based in part on a prior code example from the OpenCV Library.