In this project, I implemented a straightforward ETL (Extract, Transform, Load) pipeline. I used Python with the Selenium library to scrape player statistics from the Premier League website, cleaned and reshaped the scraped data with Pandas and NumPy, and loaded the result into a Google BigQuery table.
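To illustrate the transform step, here is a minimal sketch, assuming the scraper returns each table row as a list of strings. The `transform` function, the column names, and the sample rows are hypothetical placeholders, not the project's actual code or real Premier League data.

```python
import numpy as np
import pandas as pd


def transform(raw_rows):
    """Turn scraped rows (lists of strings) into a typed, cleaned DataFrame.

    Assumed row layout (hypothetical): [rank, player, club, goals].
    """
    df = pd.DataFrame(raw_rows, columns=["rank", "player", "club", "goals"])

    # Scraped text often carries stray whitespace around names.
    for col in ["player", "club"]:
        df[col] = df[col].str.strip()

    # Treat an empty club cell as a missing value.
    df.loc[df["club"] == "", "club"] = np.nan

    # Convert "1." -> 1 and "1,024" -> 1024; unparseable cells become NaN.
    df["rank"] = pd.to_numeric(df["rank"].str.rstrip("."), errors="coerce")
    df["goals"] = pd.to_numeric(
        df["goals"].str.replace(",", "", regex=False), errors="coerce"
    )

    # Drop rows without a player name or a usable goal count.
    df = df.dropna(subset=["player", "goals"])
    df = df.astype({"goals": "int64"})
    return df.reset_index(drop=True)


if __name__ == "__main__":
    # Placeholder rows standing in for scraped data.
    sample = [
        ["1.", " Player A ", "Club X", "1,024"],
        ["2.", "Player B", "", "17"],
        ["3.", "Player C", "Club Z", ""],
    ]
    print(transform(sample))
```

A DataFrame cleaned this way could then be passed to the load step, for example through the `google-cloud-bigquery` client's `load_table_from_dataframe` method, assuming credentials and a target table are already configured.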
For the next version of this project, I plan to enhance the ETL pipeline by utilizing Apache Airflow and the BigQueryOperator to automate and streamline the data extraction, transformation, and loading processes.
Below you can find the results of this ETL pipeline.
LINKS: https://www.premierleague.com/stats/top/players/goals