This project focuses on predicting customer churn in an e-commerce setting using machine learning techniques.

MariaDimopoulou/Churn-Prediction-Customer-Segmentation-in-E-Commerce

E-Commerce Customer Churn Analysis and Prediction

This project predicts customer churn for an online retail company using machine learning. The dataset, sourced from Kaggle, was examined through extensive exploratory data analysis (EDA), preprocessed, and then modelled with classifiers including Random Forest, XGBoost, and Logistic Regression. Clustering techniques such as K-Means and DBSCAN were also applied to identify customer segments.

Project Overview

  1. Introduction: The goal is to predict customer churn and perform customer segmentation to tailor promotional strategies.
  2. Exploratory Data Analysis (EDA): Analyzing data shape, types, correlations, imbalances, and missing values.
  3. Data Preprocessing: Handling missing values and outliers, encoding categorical variables, and balancing the imbalanced churn classes.
  4. Classification Methods: Training Random Forest, XGBoost, and Logistic Regression models, each with and without balanced data.
  5. Clustering Methods: Utilizing K-Means, DBSCAN, and Hierarchical clustering techniques.
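
The EDA and preprocessing steps listed above could look roughly like the sketch below. It is a minimal illustration rather than the project's actual code: the toy DataFrame, the column names (`tenure`, `order_count`, `payment_mode`, `churn`), the percentile clipping, and the median imputation are all assumptions chosen for the example, not details taken from the Kaggle dataset or the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; column names are illustrative, not the Kaggle schema.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "tenure": rng.integers(0, 36, n).astype(float),
    "order_count": rng.integers(1, 20, n).astype(float),
    "payment_mode": rng.choice(["card", "wallet", "cash"], n),
    "churn": rng.choice([0, 1], n, p=[0.85, 0.15]),
})
# Inject some missing values so the imputation step has work to do.
df.loc[rng.choice(n, 20, replace=False), "tenure"] = np.nan

# Basic EDA checks: shape, missing values per column, class imbalance.
print(df.shape)
print(df.isna().sum())
print(df["churn"].value_counts(normalize=True))

numeric_cols = ["tenure", "order_count"]
categorical_cols = ["payment_mode"]

# Assumed outlier handling: clip numeric columns to the 1st/99th percentiles.
for col in numeric_cols:
    low, high = df[col].quantile([0.01, 0.99])
    df[col] = df[col].clip(low, high)

# Impute and scale numeric columns, one-hot encode categorical columns.
preprocess = ColumnTransformer(
    [
        ("num", Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
        ]), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0.0,  # always return a dense array
)

X = preprocess.fit_transform(df.drop(columns="churn"))
y = df["churn"].to_numpy()
print(X.shape)
```

Wrapping the steps in a `ColumnTransformer` keeps imputation and encoding fitted on the training data only when the same object is later placed inside a model pipeline.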

Classification Results

  • Random Forest: Achieved high accuracy, precision, and AUC-ROC; slightly lower recall.
  • XGBoost: Outperformed other classifiers in most metrics.
  • Logistic Regression: Showed comparatively lower scores.
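
A comparison of this kind can be sketched as follows. This is an illustration under stated assumptions, not the project's evaluation code: the data is synthetic (`make_classification`), the balancing method is assumed to be `class_weight="balanced"` because the source does not say which technique was used, and XGBoost is left out so the example depends only on scikit-learn. If the `xgboost` package is installed, an `XGBClassifier` could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the churn data (about 15% positives).
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.85, 0.15], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each model appears with and without class weighting (assumed balancing method).
models = {
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
    "rf_balanced": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
    "logreg": LogisticRegression(max_iter=1000),
    "logreg_balanced": LogisticRegression(max_iter=1000, class_weight="balanced"),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the churn class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "auc_roc": roc_auc_score(y_test, proba),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

On imbalanced churn data, recall and AUC-ROC are usually more informative than accuracy, since a model that never predicts churn can still score a high accuracy.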

Clustering Results

  • K-Means with t-SNE: Produced the most accurate clusters of the three methods.
  • DBSCAN: Demonstrated less accurate clustering.
  • Hierarchical Clustering: Used for visualizing dendrogram structure.
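
The three clustering approaches above can be tried side by side with a sketch like the one below. Several things here are assumptions: the data is synthetic (`make_blobs`), K-Means is run on the 2-D t-SNE embedding (the source does not say whether t-SNE was used for clustering or only for visualization), the `eps` and `min_samples` values are arbitrary, and the silhouette score is used as a stand-in quality measure because the source does not state how cluster quality was judged.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer feature matrix.
X, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# t-SNE embedding to 2-D, then K-Means on the embedding (assumed workflow).
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_scaled)
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)
kmeans_silhouette = silhouette_score(embedding, kmeans_labels)
print("K-Means silhouette:", round(float(kmeans_silhouette), 3))

# DBSCAN on the scaled features; label -1 marks noise points.
dbscan_labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X_scaled)
core_mask = dbscan_labels != -1
n_dbscan_clusters = len(set(dbscan_labels[core_mask]))
if n_dbscan_clusters >= 2:
    # The silhouette score needs at least two clusters; noise is excluded.
    dbscan_silhouette = silhouette_score(X_scaled[core_mask], dbscan_labels[core_mask])
    print("DBSCAN silhouette:", round(float(dbscan_silhouette), 3))
else:
    print("DBSCAN found fewer than two clusters with these parameters")

# Hierarchical clustering: Ward linkage matrix and dendrogram structure.
Z = linkage(X_scaled, method="ward")
tree = dendrogram(Z, no_plot=True)  # set no_plot=False with matplotlib to draw it
print("Dendrogram leaves:", len(tree["leaves"]))
```

DBSCAN results are sensitive to `eps` and `min_samples`, so a poor score with one setting does not by itself show that the method fits the data badly.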

Acknowledgments

  • Dataset source: Kaggle.
  • Libraries used: pandas, scikit-learn, xgboost, seaborn, and others.
