Abnormal Event Detection

A project to detect and classify abnormal video clips (fighting and falling down) using various neural network architectures (Long-term Recurrent Convolutional Network, 3-Dimensional Convolutional Network).

Data were collected from multiple sources.

Each video clip is split into multiple subclips of 30 frames spanning 5 seconds, and each 5-second subclip was annotated manually.
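
For illustration, a minimal sketch of such a split (the sampling scheme is an assumption; 30 frames over 5 seconds implies roughly 6 fps):

```python
import cv2

def split_into_subclips(video_path, frames_per_clip=30, clip_seconds=5):
    """Sample frames so each subclip holds 30 frames spanning 5 seconds
    (an assumed ~6 fps sampling; the original preprocessing may differ)."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30  # fall back if fps is unreadable
    step = max(int(round(fps * clip_seconds / frames_per_clip)), 1)
    subclips, frames, idx = [], [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
            if len(frames) == frames_per_clip:
                subclips.append(frames)
                frames = []
        idx += 1
    cap.release()
    return subclips
```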

2 Types of Models

  • To be updated

6 Types of Feature Representations

For each frame in the video, 6 different feature representations were generated:

| No. | Name | Description | Sample |
|-----|------|-------------|--------|
| 1 | Raw | 1-channel representation: grayscale version of the original image. | chute01_cam2_subclip007_frame008_raw |
| 2 | Heatmap (HM)* | 1-channel representation: grayscale version of OpenPose's raw output, where high-intensity areas correspond to a high probability of a human body part. | chute01_cam2_subclip007_frame008_hm |
| 3 | Keypoint (KP)* | 1-channel representation: grayscale skeleton representation of the detected humans (processed from the thresholded OpenPose heatmap). | chute01_cam2_subclip007_frame008_kp |
| 4 | Raw, Heatmap, Backsub (RHB)** | 3-channel representation: Raw (1), Heatmap (2), and a background-subtracted frame whose positive regions (non-zero in RGB colour space) mark new motion relative to the previous 5 frames. | chute01_cam2_subclip007_frame008_rhb |
| 5 | Heatmap, Keypoint, Backsub (HKB)** | 3-channel representation: Heatmap (2), Keypoint (3), background-subtracted frame. | chute01_cam2_subclip007_frame008_hkb |
| 6 | Heatmap, Heatmap, Backsub (HHB)** | 3-channel representation: Heatmap (2), Heatmap (2), background-subtracted frame. | chute01_cam2_subclip007_frame008_hhb |

Note *: Extracted using the OpenPose application.

Note **: Background subtraction with the past 5 frames as history, using OpenCV's Gaussian Mixture-based Background/Foreground Segmentation algorithm (MOG2).
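
As a hedged sketch, the same kind of mask can be produced with OpenCV's MOG2 subtractor (the parameter values and input file name here are assumptions, not the repo's exact settings):

```python
import cv2

# Background subtraction with a 5-frame history using OpenCV's
# Gaussian Mixture-based subtractor (MOG2).
subtractor = cv2.createBackgroundSubtractorMOG2(history=5, detectShadows=False)

cap = cv2.VideoCapture("chute01_cam2.avi")  # hypothetical input clip
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Non-zero pixels mark new motion relative to the recent history.
    fg_mask = subtractor.apply(frame)
cap.release()
```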

Modelling Environment

  • OS : Windows 10
  • GPU : NVIDIA RTX2070
  • RAM : 16 GB
  • Package Manager : Anaconda 4.8.0

How to Use

Download these files:

  1. Processed Data with 6 different representations
  2. Annotations
  3. Extracted Features (optional: you can run extract_features.py to obtain the same set of features)
  4. Pretrained Weights for C3D
  5. The scripts and yaml from this repository

Organise them into this structure:

[Screenshot of the expected folder structure]

Create the conda environment

Using anaconda prompt, change directory to your project_folder from the previous step.

Create the environment (first change the prefix at the bottom of the yml file to point at your own Anaconda directory):

conda env create -f tf_gpu_115.yml
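
The prefix is usually the last line of the yml file; a hypothetical example (your Anaconda path will differ):

```yaml
prefix: C:\Users\<username>\Anaconda3\envs\tf_gpu_115
```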

Activate the environment:

conda activate tf_gpu_115

Feature Extraction

Extract features from the assault-fall-data folder.

Run python extract_features.py. New folders (c3d, mobilenet, resnet50v2) will be created, containing all the extracted features.

Note: Only run this if you have not downloaded the Extracted Features from the previous section.
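
For orientation, a minimal sketch of per-frame CNN feature extraction (input shape, preprocessing and pooling are assumptions; extract_features.py may differ):

```python
import numpy as np
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.applications.mobilenet import preprocess_input

# Frozen ImageNet backbone with global average pooling as the feature head.
model = MobileNet(weights="imagenet", include_top=False, pooling="avg")

def extract_clip_features(frames):
    """frames: (30, 224, 224, 3) array holding one 30-frame subclip."""
    x = preprocess_input(frames.astype(np.float32))
    return model.predict(x)  # -> one (1024,) feature vector per frame
```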

Model Training

Train the selected model on the selected feature representation (image type), running N-fold cross-validation M times.

Run the script model_training.py with these flags (available options in {}):

  --model {mobilenet,resnet50v2,c3d,all}
                        Select model you want to build.
  --folds {1,2,3,4,5,6,7,8,9,10}
                        Specify N-fold validations.
  --runs {1,2,3,4,5,6,7,8,9,10}
                        Specify N runs. During each run, N-fold CV is done.
  --imgtype {raw,hm,kp,hhb,rhb,hkb,all}
                        Specify feature representation.

For example, to train all models on all image types with 5-fold cross-validation repeated 10 times, run:

python model_training.py --model all --folds 5 --runs 10 --imgtype all
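
The runs/folds scheme works like this toy sketch (a stand-in classifier on random data, only to illustrate that one model is trained per fold in each run):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

X, y = np.random.rand(100, 8), np.random.randint(0, 2, 100)
runs, folds = 10, 5
for run in range(1, runs + 1):
    # Each run reshuffles the data before splitting it into folds.
    kf = KFold(n_splits=folds, shuffle=True, random_state=run)
    for fold, (tr, te) in enumerate(kf.split(X), start=1):
        clf = LogisticRegression().fit(X[tr], y[tr])
        print(f"run{run} fold{fold} acc={clf.score(X[te], y[te]):.3f}")
```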

A training results folder will be created for each model, with subfolders for each image type. The training metrics for each model (one model is created per fold in each run) are saved as an image in the respective subfolder, which looks like this:

mobilenet_hhb_run1_fold1_metrics

A prediction csv file is also generated, which consists of the test predictions of every model trained:

| is_fight | is_fall | raw_run1_fold1_fight | raw_run1_fold1_fall | raw_run1_fold2_fight | raw_run1_fold2_fall |
|----------|---------|----------------------|---------------------|----------------------|---------------------|
| 0 | 1 | 0.1323 | 0.6754 | 0.2351 | 0.1231 |
| 1 | 0 | 0.3245 | 0.1234 | 0.7231 | 0.3275 |
| ... | ... | ... | ... | ... | ... |
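
A hedged sketch of scoring one column of that file with pandas (the csv file name is an assumption):

```python
import pandas as pd

preds = pd.read_csv("predictions.csv")  # hypothetical file name

# Binarise one fold's fight scores at 0.5 and compare to the labels.
hits = (preds["raw_run1_fold1_fight"] > 0.5).astype(int) == preds["is_fight"]
print(f"raw run1 fold1 fight accuracy: {hits.mean():.3f}")
```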

Model Results Analysis

See the analysis notebook here.
