GitHub - abdulelahsm/SparkifyChurn: Here I attempt to learn to manipulate large and realistic datasets with Spark to engineer relevant features for predicting churn.

predicting_churn_with_spark

In this project I use Spark to manipulate large, realistic datasets and to engineer relevant features for predicting customer churn.

I used Spark MLlib to build machine learning models on large datasets, at a scale beyond what non-distributed tools such as scikit-learn can handle on a single machine.

Predicting churn is a common and challenging problem that data scientists and analysts regularly encounter in any customer-facing business. The ability to manipulate large datasets efficiently with Spark is also one of the most in-demand skills in the data field.

Files
README.md
Sparkify.html
Sparkify.ipynb
