Machine Learning for Fraud Detection in E-commerce Transactions

Overview

This project investigates how machine learning can improve fraud detection in e-commerce transactions. Using the IEEE-CIS Fraud Detection dataset provided by Vesta, we explore feature engineering, transaction distance prediction, and clustering analysis to identify fraudulent activity.

Problem Statement

The increasing sophistication of financial fraud poses significant challenges to businesses and consumers. Traditional rule-based fraud detection systems often struggle to keep pace with evolving fraudulent tactics. This project aims to develop more robust and accurate fraud detection models using machine learning.

Methodology

This project addresses three key research questions:

RQ1: Feature Engineering and Selection

Objective: Improve fraud detection accuracy by identifying and engineering the most predictive features.
Techniques: Recursive Feature Elimination (RFE), Feature Importance from Gradient Boosting, Principal Component Analysis (PCA).
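
The three techniques listed for RQ1 can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the project's notebook code: the logistic-regression estimator inside RFE, the choice of 8 features, and the 95% variance threshold are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the transaction table (not the Vesta data).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_scaled = StandardScaler().fit_transform(X)

# 1) RFE: refit the estimator and drop the weakest feature until 8 remain.
#    The estimator and the target count are illustrative assumptions.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)
rfe.fit(X_scaled, y)
rfe_selected = np.flatnonzero(rfe.support_)

# 2) Gradient boosting: rank features by the fitted model's importances.
gb = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)
top_by_importance = np.argsort(gb.feature_importances_)[::-1][:8]

# 3) PCA: keep as many components as needed to explain 95% of the variance.
pca = PCA(n_components=0.95, svd_solver="full")
X_pca = pca.fit_transform(X_scaled)

print("RFE kept columns:", rfe_selected)
print("Top columns by importance:", top_by_importance)
print("PCA components kept:", X_pca.shape[1])
```

RFE and importance ranking keep original columns, which stay interpretable, while PCA produces new combined components; comparing downstream model accuracy on each representation is one way to choose between them.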

RQ2: Predicting Transaction Distances

Objective: Develop models to predict transaction distances and identify geographic anomalies indicative of fraud.
Techniques: Linear Regression, XGBoost.
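
A rough sketch of the RQ2 comparison is shown below. Several things here are assumptions for illustration: the data is synthetic, the feature names `amount` and `hour` are hypothetical, scikit-learn's `GradientBoostingRegressor` stands in for XGBoost so the example needs only scikit-learn (`xgboost.XGBRegressor` follows the same fit/predict pattern), and flagging the largest prediction errors as anomalies is one possible reading of "geographic anomalies", not necessarily the project's rule.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Hypothetical features: transaction amount and hour of day.
amount = rng.uniform(1, 500, n)
hour = rng.integers(0, 24, n)

# Synthetic target with a nonlinear night-time jump in distance.
distance = 5 + 0.02 * amount + 30 * (hour < 6) + rng.normal(0, 2, n)

X = np.column_stack([amount, hour])
X_train, X_test, y_train, y_test = train_test_split(
    X, distance, test_size=0.25, random_state=0
)

# Fit both model families and compare mean absolute error on held-out data.
models = {
    "linear": LinearRegression(),
    "boosted": GradientBoostingRegressor(random_state=0),  # stand-in for XGBoost
}
mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, model.predict(X_test))

# Assumed anomaly rule: flag the 1% of test rows with the largest absolute error.
residuals = np.abs(y_test - models["boosted"].predict(X_test))
threshold = np.quantile(residuals, 0.99)
flagged = np.flatnonzero(residuals > threshold)

print(mae)
print("Flagged test rows:", flagged)
```

On this synthetic target the boosted trees capture the step change that a straight line cannot, which is the kind of gap that would motivate preferring a tree ensemble for distance prediction.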

RQ3: Clustering for Coordinated Fraud Detection

Objective: Utilize clustering techniques to uncover groups of transactions potentially associated with coordinated fraud.
Techniques: K-Means Clustering, HDBSCAN, Hierarchical Clustering.
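
One way to compare clustering methods for RQ3 is sketched below on synthetic blobs. The silhouette score as the separation measure and the choice of three clusters are assumptions for the example; the source does not say how cluster quality was judged. HDBSCAN is left out to keep the sketch short; recent scikit-learn releases (1.3 and later) provide `sklearn.cluster.HDBSCAN`, which could be added to the same loop, keeping in mind that it labels noise points as -1.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for scaled transaction features (not the Vesta data).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)
X = StandardScaler().fit_transform(X)

# Candidate methods; n_clusters=3 is an illustrative assumption.
candidates = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "hierarchical": AgglomerativeClustering(n_clusters=3),
}

labels = {}
scores = {}
for name, model in candidates.items():
    labels[name] = model.fit_predict(X)
    # Silhouette ranges from -1 to 1; higher means better-separated clusters.
    scores[name] = silhouette_score(X, labels[name])

best = max(scores, key=scores.get)
print(scores)
print("Best separated:", best)
```

In a fraud setting, the resulting cluster labels could then be joined back to the transactions to check whether any cluster has an unusually high fraud rate.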

Results

Feature Engineering: PCA significantly enhanced model accuracy, highlighting its effectiveness in capturing relevant data structures.
Distance Prediction: XGBoost models demonstrated promising results in predicting transaction distances, aiding in the identification of high-risk transactions.
Clustering Analysis: K-Means Clustering provided the most interpretable and well-separated clusters, potentially revealing patterns of coordinated fraud.

Data

Source: "IEEE-CIS Fraud Detection" dataset from Kaggle, provided by Vesta.
Size: Over 140,000 transactions with 434 features (transaction details, card information, addresses, Vesta-engineered features).

Contact

Cem Kazan - kzncem@gmail.com