Churn Prediction with PySpark

This Jupyter notebook walks through a simple example of churn prediction with Apache Spark.

It uses the Python API (PySpark) to perform basic analysis of the Orange Telco Churn Data, build decision tree models with MLlib, and construct a model selection pipeline with the ML package.