Utilizing Generative Models to Address Imbalanced Data Classification in the Context of Credit Card Fraud Detection

Developed as part of my dissertation submitted to the University of Manchester for the degree of "M.Sc. Business Analytics: Operations Research and Risk Analysis" in the Faculty of Humanities.

Dissertation Overview

This study explored the effectiveness of data augmentation using generative models to address class imbalance in credit card datasets.

The two generative models tested were a Generative Adversarial Network (GAN) and a Variational Autoencoder (VAE). Both were compared with two traditional oversampling techniques: the Synthetic Minority Oversampling Technique (SMOTE) and the Adaptive Synthetic Sampling Approach for Imbalanced Learning (ADASYN).
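To make the baseline idea concrete, here is a minimal NumPy sketch of the interpolation step behind SMOTE: each synthetic point lies on the line segment between a minority sample and one of its nearest minority neighbours. It is an illustration only, not the imblearn implementation used in the notebooks, and the function name `smote_like_oversample` and its parameters are hypothetical.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Hypothetical helper for illustration; not part of this repository.
    Builds a full pairwise distance matrix, so it only suits small minority sets.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Squared Euclidean distance between every pair of minority samples.
    dist = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its k nearest neighbours.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Move a random fraction of the way from the base towards the neighbour.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Toy minority class: 20 samples with 3 features.
X_minority = np.random.default_rng(42).normal(size=(20, 3))
synthetic = smote_like_oversample(X_minority, n_new=50)
```

ADASYN follows the same interpolation idea but generates more points around minority samples that are harder to learn, whereas the GAN and VAE learn a model of the minority class and sample from it.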

Installation and Setup

Code and Resources Used

Editor Used: Visual Studio
Python Version: Python 3.10.12

Python Packages Used

General Purpose: copy, collections
Data Manipulation: pandas, numpy
Data Visualization: seaborn, matplotlib
Machine Learning: scikit-learn, tensorflow, keras
Sampling: imbalanced-learn (imported as imblearn)

Code Structure

visualisations.ipynb: contains initial data exploration, including statistical summary table, correlation matrix, distribution graphs and boxplots.
CV.py: helper functions for implementing cross validation, and printing results.
GAN.py: GAN functions for training the model, generating synthetic samples, and concatenating with training data.
VAE.py: VAE functions for training the model, generating synthetic samples, and concatenating with training data.
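classifiers.ipynb: training and evaluation of logistic regression (LR), random forest (RF), k-nearest neighbours (KNN) and XGBoost (XGB) classifiers on the original data distribution and on data augmented with SMOTE, ADASYN, VAE, and GAN.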
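The way these pieces could fit together is sketched below: synthetic samples are added to the training fold only, so each test fold keeps the original class distribution. This is a sketch under stated assumptions rather than the contents of CV.py. The names `cross_validate_with_augmentation` and `duplicate_minority` are hypothetical, the F1 score is an illustrative metric choice, and simple random duplication stands in for the SMOTE, ADASYN, VAE, or GAN augmenters.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold


def duplicate_minority(X, y, seed=0):
    """Stand-in augmenter (hypothetical): duplicate class-1 rows until balanced."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    n_extra = int((y == 0).sum()) - minority.size
    if n_extra <= 0:
        return X, y
    extra = rng.choice(minority, size=n_extra, replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])


def cross_validate_with_augmentation(X, y, augment, n_splits=5, seed=0):
    """Stratified k-fold CV where only the training fold is augmented."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        # Augment the training fold; the test fold stays untouched.
        X_train, y_train = augment(X[train_idx], y[train_idx])
        model = LogisticRegression(max_iter=1000)
        model.fit(X_train, y_train)
        y_pred = model.predict(X[test_idx])
        scores.append(f1_score(y[test_idx], y_pred, zero_division=0))
    return np.array(scores)


# Toy imbalanced data: 200 majority rows and 20 minority rows.
rng = np.random.default_rng(1)
X_toy = np.vstack([rng.normal(0, 1, (200, 4)), rng.normal(3, 1, (20, 4))])
y_toy = np.concatenate([np.zeros(200, dtype=int), np.ones(20, dtype=int)])
fold_scores = cross_validate_with_augmentation(X_toy, y_toy, duplicate_minority)
```

Passing the augmenter as a function keeps the loop identical across methods, so any technique that returns an augmented `(X, y)` pair can be swapped in.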

Data Source

The dataset comes from the Machine Learning Group at ULB (Université Libre de Bruxelles) and contains 284,807 credit card transactions made by European cardholders over two days in September 2013.

