Behind the Spotlight: Data Analysis on 25k IMDb Movies Dataset

Overview

This repository contains the code for my midterm project in CP102 - Computer Programming 2 at Manuel S. Enverga University Foundation. The goal of this project is to create a Python program that performs data wrangling, exploratory data analysis (EDA), and data visualization on a dataset. For this project, I chose to analyze the 25k IMDb Movies Dataset, which is an open-source dataset available on Kaggle.

Dataset

The data used for the analysis and visualization was sourced from Kaggle and is stored in the repository as 25k IMDb movie Dataset.csv. It includes columns such as Movie Title, Run Time, Rating, User Rating, Genres, Overview, Plot Keyword, Director, Top 5 Casts, Writer, Year, and Path, which hold a mix of data types including strings, integers, floats, and lists. The raw dataset contained some impurities, such as typographical errors and mixed values within columns, which required data wrangling to address.
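As a rough illustration of the kind of wrangling this involves, here is a minimal sketch that uses only the Python standard library. It is not taken from the notebook: the sample rows, the specific impurities (a non-numeric rating, a stray character in the year), and the helper names to_float, to_year, split_list, and clean_rows are all assumptions made for the example, and the real file's formatting may differ.

```python
import csv
import io

# Hypothetical sample rows; the real CSV's exact formatting may differ.
SAMPLE_CSV = """Movie Title,Rating,Genres,Year
Example Film A,7.5,"Action, Drama",2001
Example Film B,no-rating,"Comedy",1999
Example Film C,8.1,"Drama, Romance",-2010
"""


def to_float(value):
    """Return value as a float, or None when it is not numeric."""
    try:
        return float(value.strip())
    except ValueError:
        return None


def to_year(value):
    """Keep only the digits and accept the result if it is four digits long."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return int(digits) if len(digits) == 4 else None


def split_list(value):
    """Turn a comma-separated string into a list of trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def clean_rows(text):
    """Parse CSV text and normalize the mixed-value columns."""
    reader = csv.DictReader(io.StringIO(text))
    cleaned = []
    for row in reader:
        cleaned.append({
            "title": row["Movie Title"].strip(),
            "rating": to_float(row["Rating"]),
            "genres": split_list(row["Genres"]),
            "year": to_year(row["Year"]),
        })
    return cleaned


rows = clean_rows(SAMPLE_CSV)
for row in rows:
    print(row)
```

Values that cannot be parsed become None rather than being dropped, so the decision about how to treat missing data is left to the later analysis steps.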

Code

The code for this project is in the Echevaria_Movies_Analysis.ipynb Jupyter Notebook, which covers data wrangling, exploratory data analysis, and data visualization. Markdown cells throughout the notebook explain the purpose of each code block.
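To give a sense of what an exploratory step might look like, the sketch below computes the average rating per genre from records that have already been cleaned. It is an assumption-laden example rather than an excerpt from the notebook: the records are hypothetical, the function name mean_rating_by_genre is made up for illustration, and it sticks to the standard library instead of whatever libraries the notebook actually uses.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, already-cleaned records (not real rows from the dataset).
movies = [
    {"title": "Example Film A", "rating": 7.5, "genres": ["Action", "Drama"]},
    {"title": "Example Film B", "rating": None, "genres": ["Comedy"]},
    {"title": "Example Film C", "rating": 8.1, "genres": ["Drama", "Romance"]},
]


def mean_rating_by_genre(records):
    """Average the rating of every genre, skipping records without a rating."""
    ratings = defaultdict(list)
    for record in records:
        if record["rating"] is None:
            continue
        # A movie with several genres contributes its rating to each of them.
        for genre in record["genres"]:
            ratings[genre].append(record["rating"])
    return {genre: round(mean(values), 2) for genre, values in ratings.items()}


summary = mean_rating_by_genre(movies)

# Print genres from highest to lowest average rating.
for genre, avg in sorted(summary.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{genre:<10} {avg:.2f}")
```

A table like this is a natural input for a bar chart, which is one way the visualization stage could build on the exploratory results.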

Results

The results of this analysis are presented in the Echevaria_Movies_Analysis.ipynb notebook. The analysis includes a summary of the dataset, exploratory data analysis, and visualizations of various aspects of the data. The insights gained from this analysis are discussed in the Midterm Portfolio.
