YouTube Data Collection and Analysis with Python

A Python-based project for collecting and analyzing YouTube data using the YouTube Data API. This project demonstrates how to fetch YouTube video metadata, preprocess the data, and perform exploratory data analysis (EDA) to gain insights into video trends, performance, and content engagement.


Features

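  • Data Collection: Fetch video data such as titles, descriptions, view counts, like counts, comment counts, and more using the YouTube Data API. Since December 2021 the API returns dislike counts only to a video's owner, so they are not available for arbitrary videos.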
  • Data Preprocessing: Handle missing values, clean data, and transform it for analysis.
  • Exploratory Data Analysis (EDA):
    • Analyze video performance metrics.
    • Visualize trends such as popular categories, top-performing videos, and engagement rates.
    • Detect correlations between features.
  • Flexible Framework: Modular code structure for easy extension and maintenance.
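As a rough illustration of the preprocessing step, the sketch below flattens one item from a YouTube Data API v3 `videos.list` response into a single row. The response field names (`snippet`, `statistics`, `contentDetails`, `viewCount`, and so on) come from the API; the function names and output column names are hypothetical and may not match `data.py`. It uses only the standard library, so the resulting dicts can be passed to Pandas or written to CSV afterwards.

```python
from typing import Any, Optional


def _to_int(value: Any) -> Optional[int]:
    """Convert an API count (delivered as a string) to int; None if absent or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def flatten_video_item(item: dict) -> dict:
    """Turn one `videos.list` item into a flat row (hypothetical column names)."""
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})
    details = item.get("contentDetails", {})
    return {
        "video_id": item.get("id"),
        "title": (snippet.get("title") or "").strip(),
        "description": (snippet.get("description") or "").strip(),
        "published_at": snippet.get("publishedAt"),
        "category_id": snippet.get("categoryId"),
        # Tags are omitted by the API when a video has none.
        "tags": snippet.get("tags", []),
        "duration": details.get("duration"),
        # Counts can be missing, e.g. when likes are hidden or comments are disabled.
        "view_count": _to_int(stats.get("viewCount")),
        "like_count": _to_int(stats.get("likeCount")),
        "comment_count": _to_int(stats.get("commentCount")),
    }
```

Missing counts are kept as `None` rather than coerced to zero, so that a later step can decide whether to drop or impute them.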

Technologies Used

  • Python: Core programming language.
  • Pandas: For data manipulation and analysis.
  • Matplotlib & Seaborn: For data visualization.
  • YouTube Data API: For accessing YouTube video metadata.

Installation

  1. Clone the repository:

    git clone https://github.com/sentryxgith/YouTube-Data-Collection-and-Analysis-with-Python.git
    cd YouTube-Data-Collection-and-Analysis-with-Python
  2. Obtain your YouTube Data API key:

    • Visit Google Cloud Console.
    • Create a new project and enable the YouTube Data API v3.
    • Generate an API key and replace the placeholder in the script.
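Once a key exists, a request for trending videos might look like the sketch below. It is a minimal example built on the standard library, assuming the `videos.list` endpoint with `chart=mostPopular`; the README does not say which endpoint or client library `main.py` actually uses, and the function names here are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_trending_url(api_key: str, region_code: str = "US",
                       max_results: int = 50, page_token: str = "") -> str:
    """Build a videos.list URL for the most popular videos in a region."""
    params = {
        "part": "snippet,contentDetails,statistics",
        "chart": "mostPopular",
        "regionCode": region_code,
        "maxResults": max_results,  # the API accepts values from 1 to 50
        "key": api_key,
    }
    if page_token:
        # nextPageToken from the previous response, used to request the next page
        params["pageToken"] = page_token
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_trending_page(api_key: str, region_code: str = "US",
                        page_token: str = "") -> dict:
    """Fetch one page of results and return the decoded JSON body."""
    url = build_trending_url(api_key, region_code, page_token=page_token)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

Calling `fetch_trending_page(API_KEY)` performs a real network request and consumes API quota, so the URL builder is kept separate to make it easy to inspect the request first.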

Usage

  1. Set up the API key: Open the script that requires the API key and replace the placeholder value with your own key:

    API_KEY = "YOUR_API_KEY_HERE"
  2. Run the script:

    • Use main.py to fetch YouTube data.
    • Perform analysis using the Jupyter Notebooks provided.
  3. Explore the results:

    • View processed data in CSV format.
    • Open the EDA notebook to visualize insights.
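To explore the processed CSV outside the notebooks, something like the following could serve as a starting point. The column names (`view_count`, `like_count`, `comment_count`, `title`) and the file name in the comment are assumptions, since the README does not document the CSV layout; adjust them to match the actual output.

```python
import csv
import io

# Hypothetical column names; the real CSV header may differ.
NUMERIC_COLUMNS = ("view_count", "like_count", "comment_count")


def load_rows(file_obj):
    """Read CSV rows as dicts, converting count columns to int (None when blank)."""
    rows = []
    for row in csv.DictReader(file_obj):
        for column in NUMERIC_COLUMNS:
            raw = (row.get(column) or "").strip()
            row[column] = int(raw) if raw.isdigit() else None
        rows.append(row)
    return rows


def top_videos(rows, metric="view_count", n=5):
    """Return the n rows with the highest value of `metric`, skipping missing values."""
    ranked = [row for row in rows if row.get(metric) is not None]
    return sorted(ranked, key=lambda row: row[metric], reverse=True)[:n]


# In practice the rows would come from a file, for example:
#   with open("trending_videos.csv", newline="", encoding="utf-8") as f:
#       rows = load_rows(f)
sample = io.StringIO(
    "title,view_count,like_count,comment_count\n"
    "A,100,5,1\n"
    "B,300,,2\n"
    "C,200,7,\n"
)
rows = load_rows(sample)
print([row["title"] for row in top_videos(rows, n=2)])  # prints ['B', 'C']
```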

Project Structure

YouTube-Data-Collection-and-Analysis-with-Python/
├── main.py                 # Script for data fetching
├── data.py                 # Script for data preprocessing
├── distribution.py         # Script for analyzing data by distribution
├── category.py             # Script for analyzing data by categories
├── duration.py             # Script for analyzing data by duration
├── tags.py                 # Script for analyzing data by tags
├── publish hour.py         # Script for analyzing data by publish hour
└── README.md               # Project documentation

Examples of Analysis

  • Category Popularity: Identify which categories have the most trending videos.
  • Engagement Metrics: Compare view, like, and comment counts across videos.
  • Time Trends: Understand how upload times affect video popularity.
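The category and time-trend analyses above can be sketched in a few lines. This is an illustrative standard-library version, not the code in `category.py` or `publish hour.py`; the row keys (`category_id`, `published_at`, `view_count`) are hypothetical, and timestamps are assumed to be in the UTC format the API returns, such as `2024-01-15T14:30:00Z`.

```python
from collections import Counter, defaultdict
from datetime import datetime


def videos_per_category(rows):
    """Count videos per category, most common first."""
    return Counter(row["category_id"] for row in rows).most_common()


def publish_hour(published_at: str) -> int:
    """Extract the UTC hour (0-23) from an API timestamp."""
    # Replace the trailing "Z" so that fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(published_at.replace("Z", "+00:00")).hour


def mean_views_by_publish_hour(rows):
    """Average view count for each publish hour that occurs in the data."""
    totals = defaultdict(lambda: [0, 0])  # hour -> [sum of views, number of videos]
    for row in rows:
        hour = publish_hour(row["published_at"])
        totals[hour][0] += row["view_count"]
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}
```

Hours are in UTC here; converting to a viewer's local time zone before grouping is a design choice that depends on the audience being studied.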

Charts

  • View, Like and Comment Count Distribution
  • Correlation Matrix of Engagement Metrics
  • Number of Trending Videos by Category
  • Average View, Like and Comment Count by Category
  • Video Length vs View Count
  • Average View, Like and Comment Count by Duration Range
  • Number of Tags vs View Count
  • Distribution of Videos by Publish Hour
  • Publish Hour vs View Count
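Two of the quantities behind these charts are easy to get wrong, so a small sketch may help. The API reports video length as an ISO 8601 duration string (for example `PT1H2M3S`), which has to be converted to seconds before it can be plotted or bucketed, and the correlation matrix is built from pairwise Pearson coefficients. The bucket edges and function names below are assumptions for illustration, not the ones used in `duration.py`.

```python
import math
import re

# Matches durations such as "PT45S", "PT1H2M3S", or "P1DT2H".
_DURATION_RE = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def duration_to_seconds(duration: str) -> int:
    """Convert an ISO 8601 duration from contentDetails.duration to seconds."""
    match = _DURATION_RE.match(duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def duration_range(seconds: int) -> str:
    """Assign a duration bucket (hypothetical edges at 5, 10, and 20 minutes)."""
    if seconds < 300:
        return "0-5 min"
    if seconds < 600:
        return "5-10 min"
    if seconds < 1200:
        return "10-20 min"
    return "20+ min"


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

With Pandas, `DataFrame.corr()` computes the same coefficient for every pair of numeric columns at once; the manual version is shown only to make the calculation explicit.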

Contributing

Contributions are welcome! Please fork the repository, create a branch, and submit a pull request.


License

This project is licensed under the MIT License.


Acknowledgments

  • The YouTube Data API team for providing an excellent API.
  • The open-source Python community for their amazing libraries.

Happy analyzing! 🎉


