Introduction to machine learning with scikit-learn

This repo contains IPython notebooks from my scikit-learn video series, as seen on Kaggle's blog.

Want to learn even more about scikit-learn? I teach an online course, Machine Learning with Text in Python.

Entire series

- Read the blog posts (Kaggle's blog)
- Watch the entire series (YouTube playlist)
- View the IPython Notebooks (nbviewer)
- Run the IPython Notebooks online (binder)

Individual videos

What is machine learning, and how does it work? (video, notebook, blog post)
- What is machine learning?
- What are the two main categories of machine learning?
- What are some examples of machine learning?
- How does machine learning "work"?
Setting up Python for machine learning: scikit-learn and IPython Notebook (video, notebook, blog post)
- What are the benefits and drawbacks of scikit-learn?
- How do I install scikit-learn?
- How do I use the IPython Notebook?
- What are some good resources for learning Python?
Getting started in scikit-learn with the famous iris dataset (video, notebook, blog post)
- What is the famous iris dataset, and how does it relate to machine learning?
- How do we load the iris dataset into scikit-learn?
- How do we describe a dataset using machine learning terminology?
- What are scikit-learn's four key requirements for working with data?
Training a machine learning model with scikit-learn (video, notebook, blog post)
- What is the K-nearest neighbors classification model?
- What are the four steps for model training and prediction in scikit-learn?
- How can I apply this pattern to other machine learning models?
Comparing machine learning models in scikit-learn (video, notebook, blog post)
- How do I choose which model to use for my supervised learning task?
- How do I choose the best tuning parameters for that model?
- How do I estimate the likely performance of my model on out-of-sample data?
Data science pipeline: pandas, seaborn, scikit-learn (video, notebook, blog post)
- How do I use the pandas library to read data into Python?
- How do I use the seaborn library to visualize data?
- What is linear regression, and how does it work?
- How do I train and interpret a linear regression model in scikit-learn?
- What are some evaluation metrics for regression problems?
- How do I choose which features to include in my model?
Cross-validation for parameter tuning, model selection, and feature selection (video, notebook, blog post)
- What is the drawback of using the train/test split procedure for model evaluation?
- How does K-fold cross-validation overcome this limitation?
- How can cross-validation be used for selecting tuning parameters, choosing between models, and selecting features?
- What are some possible improvements to cross-validation?
Efficiently searching for optimal tuning parameters (video, notebook, blog post)
- How can K-fold cross-validation be used to search for an optimal tuning parameter?
- How can this process be made more efficient?
- How do you search for multiple tuning parameters at once?
- What do you do with those tuning parameters before making real predictions?
- How can the computational expense of this process be reduced?
Evaluating a classification model (video, notebook, blog post)
- What is the purpose of model evaluation, and what are some common evaluation procedures?
- What is classification accuracy used for, and what are its limitations?
- How does a confusion matrix describe the performance of a classifier?
- What metrics can be computed from a confusion matrix?
- How can you adjust classifier performance by changing the classification threshold?
- What is the purpose of an ROC curve?
- How does Area Under the Curve (AUC) differ from classification accuracy?
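
For a quick taste of what the videos cover, here is a minimal sketch of the train-and-predict workflow on the iris dataset. This README does not spell out the "four steps for model training and prediction", so the sketch assumes they are the usual scikit-learn sequence of import, instantiate, fit, and predict. The values `n_neighbors=5`, `test_size=0.4`, `random_state=4`, and `cv=10` are illustrative assumptions, not settings taken from the notebooks.

```python
# Step 1: import the model class (plus helpers for data and evaluation).
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris data: X is the feature matrix, y is the response vector.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out part of the data to estimate out-of-sample performance.
# test_size and random_state are assumed values for illustration.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=4
)

# Step 2: instantiate the estimator (n_neighbors=5 is an assumed value).
knn = KNeighborsClassifier(n_neighbors=5)

# Step 3: fit the model to the training data.
knn.fit(X_train, y_train)

# Step 4: predict the response for observations the model has not seen.
y_pred = knn.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)

# K-fold cross-validation repeats the evaluation over several splits,
# which gives a less noisy estimate than a single train/test split.
cv_scores = cross_val_score(
    KNeighborsClassifier(n_neighbors=5), X, y, cv=10, scoring="accuracy"
)
mean_cv_accuracy = cv_scores.mean()

print(f"train/test accuracy: {test_accuracy:.3f}")
print(f"10-fold CV accuracy: {mean_cv_accuracy:.3f}")
```

The same pattern should carry over to other scikit-learn estimators: swap `KNeighborsClassifier` for another model class and keep the `fit` and `predict` calls unchanged.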

