SilverScreenAnalytics is an exploratory analysis of a movie dataset. The aim was to uncover patterns, trends, and relationships across aspects of the movies such as budget, revenue, genre, popularity, and audience ratings.
The dataset contained the following columns:
- budget: The production budget of the movie
- genres: Genre(s) associated with the movie
- homepage: Official homepage URL
- id: Unique identifier for the movie
- keywords: Key themes or tags associated with the movie
- original_language: Language in which the movie was originally released
- original_title: Original title of the movie
- overview: Brief summary of the movie's plot
- popularity: Popularity score based on various factors
- production_companies: Companies involved in the movie's production
- production_countries: Countries where the movie was produced
- release_date: Official release date
- revenue: Total revenue generated by the movie
- runtime: Movie duration in minutes
- spoken_languages: Languages spoken in the movie
- status: Release status (e.g., Released, Post-Production)
- tagline: Promotional tagline for the movie
- title: Movie title
- vote_average: Average audience rating
- vote_count: Number of votes received

The data was cleaned before analysis:
- Removed redundant rows and columns.
- Handled missing data appropriately.
- Changed data types for consistency and analysis.
- Flattened JSON columns for easier exploration and analysis.
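
The cleaning steps above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the project's actual code: the sample rows, the helper names (`flatten_names`, `to_int`, `to_date`, `clean`), the JSON layout of `genres` (a list of objects with a `name` key), the `YYYY-MM-DD` date format, and the choice to treat rows with a repeated `id` as redundant are all assumptions made for illustration.

```python
import json
from datetime import datetime

# Hypothetical raw rows as they might come out of a CSV reader (all strings).
# The JSON layout of `genres` is an assumption about the dataset.
RAW_ROWS = [
    {"id": "1", "title": "Alpha", "budget": "1000", "revenue": "5000",
     "runtime": "120", "release_date": "2010-05-01",
     "genres": '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'},
    # Exact repeat of the row above (a redundant row).
    {"id": "1", "title": "Alpha", "budget": "1000", "revenue": "5000",
     "runtime": "120", "release_date": "2010-05-01",
     "genres": '[{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"}]'},
    # Row with missing values.
    {"id": "2", "title": "Beta", "budget": "200", "revenue": "",
     "runtime": "", "release_date": "", "genres": "[]"},
]


def flatten_names(raw_json):
    """Turn a JSON list of {"name": ...} objects into a plain list of names."""
    if not raw_json:
        return []
    return [item["name"] for item in json.loads(raw_json)]


def to_int(value):
    """Convert a string to int, mapping empty or missing values to None."""
    return int(value) if value not in ("", None) else None


def to_date(value):
    """Parse an assumed YYYY-MM-DD string into a date, or None if missing."""
    return datetime.strptime(value, "%Y-%m-%d").date() if value else None


def clean(rows):
    """Drop rows with a repeated id, convert types, and flatten JSON columns."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen_ids:
            continue  # assumption: a repeated id marks a redundant row
        seen_ids.add(row["id"])
        # Only the columns used in this sketch are kept.
        cleaned.append({
            "id": int(row["id"]),
            "title": row["title"],
            "budget": to_int(row["budget"]),
            "revenue": to_int(row["revenue"]),
            "runtime": to_int(row["runtime"]),
            "release_date": to_date(row["release_date"]),
            "genres": flatten_names(row["genres"]),
        })
    return cleaned


movies = clean(RAW_ROWS)
```

Missing values are kept as `None` here rather than filled in, so later steps can decide whether to skip or impute them; the source does not say which approach the project took.
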

The exploratory analysis covered the following areas:
- Action Genre Exploration: Focused on movies within the action genre, analyzing patterns specific to this category.
- Budget and Profitability:
  - Identified the top 5 least and most expensive movies.
  - Analyzed the top 5 most profitable movies.
- Popularity and Ratings:
  - Conducted popularity analysis to identify trends.
  - Explored movies with a vote_average above 7 to identify highly rated films.
- Genre Frequency: Investigated the frequency of movies in each genre.
- Comparative Analysis: Compared profits, revenue, and budgets for the top 5 most expensive and most profitable movies.
- Relationships Between Variables: Explored correlations and relationships between popularity, vote average, and profit.
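
Several of the analyses listed above reduce to a few small operations on cleaned rows. The sketch below illustrates them with the standard library only. It is not the project's code: the sample values are made up, the helper names (`with_profit`, `top_n`, `pearson`) are hypothetical, and profit is assumed to mean `revenue - budget`, which the source does not define.

```python
from collections import Counter
from math import sqrt

# Hypothetical cleaned rows; every value here is made up for illustration.
MOVIES = [
    {"title": "A", "budget": 100, "revenue": 500, "popularity": 80.0,
     "vote_average": 7.5, "genres": ["Action", "Adventure"]},
    {"title": "B", "budget": 200, "revenue": 300, "popularity": 40.0,
     "vote_average": 6.0, "genres": ["Drama"]},
    {"title": "C", "budget": 50, "revenue": 40, "popularity": 10.0,
     "vote_average": 5.5, "genres": ["Action"]},
    {"title": "D", "budget": 300, "revenue": 900, "popularity": 95.0,
     "vote_average": 8.1, "genres": ["Action", "Drama"]},
]


def with_profit(movies):
    """Add a profit field, assuming profit = revenue - budget."""
    return [{**m, "profit": m["revenue"] - m["budget"]} for m in movies]


def top_n(movies, key, n=5, largest=True):
    """Return up to n movies ranked by the given numeric field."""
    return sorted(movies, key=lambda m: m[key], reverse=largest)[:n]


def pearson(xs, ys):
    """Pearson correlation coefficient; undefined if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


movies = with_profit(MOVIES)

# Budget and profitability rankings.
most_profitable = top_n(movies, "profit")
most_expensive = top_n(movies, "budget")
least_expensive = top_n(movies, "budget", largest=False)

# Ratings filter and action-genre subset.
highly_rated = [m for m in movies if m["vote_average"] > 7]
action_movies = [m for m in movies if "Action" in m["genres"]]

# Genre frequency: a movie with several genres counts once per genre.
genre_counts = Counter(g for m in movies for g in m["genres"])

# Relationship between popularity and profit.
popularity_vs_profit = pearson(
    [m["popularity"] for m in movies],
    [m["profit"] for m in movies],
)
```

Counting a multi-genre movie once per genre is one possible convention; counting only a primary genre would give different frequencies.
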
The exploration highlighted interesting trends and patterns, including how budget and profitability relate to popularity and audience ratings, as well as the distribution of movies across genres. These findings serve as a foundation for further analysis and hypothesis generation.