Customer Segmentation

Third project in Udacity's data science nanodegree (unsupervised learning)

Installation

This project uses Python 3 and is designed to be completed in Jupyter Notebook. The Anaconda distribution is highly recommended for installing Python, since it includes all the necessary Python libraries as well as Jupyter Notebook. The project is expected to use the following libraries:

NumPy
pandas
scikit-learn (imported as sklearn)
Matplotlib (for data visualization)
Seaborn (for data visualization)
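
A quick way to confirm that the environment has these libraries is a small import check. This is only a sketch: the helper name check_libraries is hypothetical and not part of the project, and the list holds the usual import names for the libraries above.

```python
import importlib

# Import names for the libraries listed above (scikit-learn imports as "sklearn").
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn"]


def check_libraries(names=REQUIRED):
    """Map each import name to its version string, or None if it is not installed."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in check_libraries().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any library reported as not installed can be added through Anaconda or pip before opening the notebook.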

Project Overview

This is a real-life project with data provided by Udacity's Bertelsmann partners AZ Direct and Arvato Finance Solution. The data concerns a company that performs mail-order sales in Germany. The company's main question is which facets of the population are most likely to purchase its products, so that a mailout campaign can target them. In this project I use unsupervised learning techniques to organize the general population into clusters, then use those clusters to see which of them make up the company's main user base. Before the machine learning methods can be applied, the data must be assessed and cleaned to convert it into a usable form.
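
The workflow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: the data is synthetic (the real dataset is not included here), the function names fit_segments and cluster_shares are hypothetical, and the choices of median imputation, two principal components, and three clusters are placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_segments(general, n_components=2, n_clusters=3, seed=0):
    """Fit impute -> scale -> PCA -> KMeans on the general population."""
    pipeline = make_pipeline(
        SimpleImputer(strategy="median"),  # fill missing values
        StandardScaler(),                  # put features on a common scale
        PCA(n_components=n_components),    # reduce dimensionality
        KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
    )
    pipeline.fit(general)
    return pipeline


def cluster_shares(pipeline, data, n_clusters):
    """Fraction of rows in `data` assigned to each cluster."""
    labels = pipeline.predict(data)
    counts = np.bincount(labels, minlength=n_clusters)
    return counts / counts.sum()


# Synthetic stand-in data: three groups, with customers drawn mostly from one.
rng = np.random.default_rng(0)
centers = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [6.0, 6.0, 6.0, 6.0],
    [-6.0, 6.0, -6.0, 6.0],
])
general = np.vstack([rng.normal(c, 1.0, size=(200, 4)) for c in centers])
customers = np.vstack([
    rng.normal(centers[1], 1.0, size=(160, 4)),
    rng.normal(centers[0], 1.0, size=(40, 4)),
])
general[rng.random(general.shape) < 0.02] = np.nan  # simulate missing entries

model = fit_segments(general, n_clusters=3)
general_share = cluster_shares(model, general, 3)
customer_share = cluster_shares(model, customers, 3)

# Positive values mark clusters that are over-represented among customers.
difference = customer_share - general_share
for k, d in enumerate(difference):
    print(f"cluster {k}: general={general_share[k]:.2f} "
          f"customers={customer_share[k]:.2f} diff={d:+.2f}")
```

Fitting every step on the general population and only predicting on the customer data keeps the two groups on the same scale and in the same cluster space, so their shares can be compared directly.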

Project Motivation

Unsupervised learning is key to organizing large and complex datasets. Unlike supervised learning, it has no objective output classes or values, but it can still be important for converting data into a form that a supervised learning task can use. Dimensionality reduction techniques can help surface the main signals and associations in the data, giving supervised learning techniques a more focused set of features to work with. Clustering techniques are useful for understanding how the data points themselves are organized, and the resulting clusters can serve as a feature in a supervised learning task. This project offers hands-on experience with a real-life task that uses these techniques, focusing on the unsupervised work that goes into understanding a dataset. In addition, the dataset requires a number of assessment and cleaning steps before machine learning methods can be applied. In workplace contexts, data scientists frequently need to work with data that is untidy or needs preprocessing before standard algorithms and models can be applied.

File Description

Identify_Customer_Segments.ipynb is the Jupyter notebook that contains the project work.
Identify_Customer_Segments.html is the same notebook saved as an HTML file.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Rawan-Alharbi/Cusomer-Segmentation
