In this video we walk through a series of data science tasks to create a dataset on Disney movies and then analyze it, using Python, BeautifulSoup, requests, and several other libraries along the way.
To access all of the files, I recommend you fork this repo and then clone it locally. Instructions on how to do this can be found here: https://help.github.com/en/github/getting-started-with-github/fork-a-repo
The other option is to click the green "Clone or download" button and then click "Download ZIP". You should then extract all of the files to the location where you want to edit your code.
Installing Jupyter Notebook: https://jupyter.readthedocs.io/en/latest/install.html
This repo goes along with my video "Solving real world data science tasks with Python BeautifulSoup!"
In this video we scrape Wikipedia pages to create a dataset on Disney movies.
The video is formatted with tasks for you to solve on your own throughout. For the best learning experience, pause the video at each task, try it yourself, and then resume when you want to see how I would solve it.
We cover a wide range of Python & data science topics in this video (a short sketch combining several of them follows this list). They include:
- Web scraping with BeautifulSoup
- Cleaning data
- Testing code with pytest
- Pattern matching with regular expressions (re library)
- Working with dates (datetime library)
- Saving & loading data with the pickle library
- Accessing data from an API with the requests library
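As a taste of how these pieces fit together, here is a minimal sketch (not the actual notebook code) that fetches one Disney movie's Wikipedia page with requests, pulls fields out of its infobox with BeautifulSoup, cleans the budget with a regular expression, parses the release date with datetime, and pickles the result. The example URL, field names, and cleaning rules are illustrative assumptions; dataset-creation.ipynb has the real steps.

```python
import pickle
import re
from datetime import datetime

import requests
from bs4 import BeautifulSoup

# Example page used purely for illustration; the notebook scrapes the
# full list of Disney films from Wikipedia.
URL = "https://en.wikipedia.org/wiki/Toy_Story_3"

# Fetch the page and parse the infobox table on the right-hand side.
response = requests.get(URL)
soup = BeautifulSoup(response.content, "html.parser")
info_box = soup.find(class_="infobox vevent")

# Turn each infobox row into a key/value pair.
movie = {}
for row in info_box.find_all("tr"):
    header = row.find("th")
    value = row.find("td")
    if header and value:
        movie[header.get_text(" ", strip=True)] = value.get_text(" ", strip=True)

# Clean the budget with a regular expression (e.g. "$200 million" -> 200000000.0).
budget_text = movie.get("Budget", "")
match = re.search(r"\$([\d,.]+)\s*(million|billion)?", budget_text)
if match:
    amount = float(match.group(1).replace(",", ""))
    scale = {"million": 1e6, "billion": 1e9}.get(match.group(2), 1)
    movie["Budget (float)"] = amount * scale

# Parse the release date with datetime (the exact format varies between pages).
date_text = movie.get("Release date", movie.get("Release dates", ""))
date_text = date_text.split("(")[0].strip()
try:
    movie["Release date (datetime)"] = datetime.strptime(date_text, "%B %d, %Y")
except ValueError:
    movie["Release date (datetime)"] = None

# Save the cleaned record with pickle so it can be reloaded later.
with open("movie_checkpoint.pickle", "wb") as f:
    pickle.dump(movie, f)
```

From there, the cleaning logic can be factored into small functions and checked with pytest (e.g. asserting that "$200 million" converts to 200000000.0).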
To see the steps to create the dataset, check out dataset-creation.ipynb
In a future video we will analyze the dataset in dataset-analysis.ipynb
- If you want to jump into a specific task, feel free to use the dataset checkpoints.
- To load these files, you can look at the functions found in this file (a sketch of what such helpers might look like follows this list).
- If you just want to do analysis on the final dataset, check out this folder.
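The checkpoint helpers themselves are not reproduced here, but below is a minimal sketch of what pickle-based save/load functions might look like. The function and file names are hypothetical assumptions, so refer to the repo for the actual implementations.

```python
import pickle

# Hypothetical checkpoint helpers; the real function names and paths
# live in the repo's notebook/helper file.
def save_data(name, data):
    """Pickle `data` to <name>.pickle so a task can be resumed later."""
    with open(f"{name}.pickle", "wb") as f:
        pickle.dump(data, f)

def load_data(name):
    """Load a previously saved checkpoint."""
    with open(f"{name}.pickle", "rb") as f:
        return pickle.load(f)

# Example usage: reload the movie data at a given checkpoint and continue from there.
# movies = load_data("disney_movie_data_cleaned")
```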