
Explore Formula 1 data with F1 API Analytics. Extract, clean, predict and visualize data. Final dashboards available in the linked PDF.


lucia-corsan/F1-Analytics


Formula One Analytics

Jupyter notebooks that automate data retrieval from the F1 API and feed interactive Looker dashboards.

  • Course: Web Analytics
  • Final grade: 9.2/10

Contents

  • code.ipynb: Jupyter Notebook with data extraction, cleaning, and visualizations.
  • Dashboard Compilation.pdf: PDF with the final dashboards.

Note

The Dashboard Compilation.pdf contains a non-interactive version of the dashboards. The interactive version is available here.

Tools and covered topics

  • Jupyter Notebook
  • Google Looker Studio
  • Python libraries: numpy, pandas, scikit-learn (sklearn), matplotlib, seaborn
  • Time series manipulation and forecasting: AutoRegressive Integrated Moving Average (ARIMA)
  • ML models compared by feature importance: Multilayer Perceptrons (MLPs), Gradient Boosting, Decision Trees (DTs), Random Forest, Linear Regression
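For reference, an ARIMA(p, d, q) model can be written in its standard textbook form as below, where L is the lag operator, the φ terms are autoregressive coefficients, the θ terms are moving-average coefficients, c is a constant and ε is white noise. The orders p, d and q used in the notebook are not stated in this README, so they are left symbolic here.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i L^{i}\right) (1 - L)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j L^{j}\right) \varepsilon_t
```

Here d is the number of times the series is differenced before the autoregressive and moving-average parts are fitted.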

Objective

The primary goal is to work with the Ergast Formula One API (see the documentation here), and to perform data analytics and compare machine learning techniques on the extracted dataset.
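As a rough illustration of the extraction step, the sketch below flattens an Ergast-style race results payload into rows that could be loaded into a table. It is not taken from code.ipynb: the helper names (`ergast_results_url`, `flatten_results`) and the sample payload with placeholder driver and constructor ids are assumptions made for this example. The sample only mimics the nesting of an Ergast results response, and no network request is made.

```python
import json
from typing import Dict, List


def ergast_results_url(season: int, round_number: int) -> str:
    """Build a results URL following the Ergast path convention (assumed here)."""
    return f"http://ergast.com/api/f1/{season}/{round_number}/results.json"


def flatten_results(payload: dict) -> List[Dict[str, object]]:
    """Turn the nested MRData -> RaceTable -> Races -> Results tree into flat rows."""
    rows = []
    for race in payload["MRData"]["RaceTable"]["Races"]:
        for result in race.get("Results", []):
            rows.append(
                {
                    "season": int(race["season"]),
                    "round": int(race["round"]),
                    "race": race["raceName"],
                    "driver": result["Driver"]["driverId"],
                    "constructor": result["Constructor"]["constructorId"],
                    # Ergast-style payloads encode numbers as strings.
                    "position": int(result["position"]),
                    "points": float(result["points"]),
                }
            )
    return rows


# Illustrative payload with placeholder ids, not real race data.
SAMPLE = json.loads(
    """
    {"MRData": {"RaceTable": {"Races": [
      {"season": "2001", "round": "1", "raceName": "Example Grand Prix",
       "Results": [
         {"position": "1", "points": "25",
          "Driver": {"driverId": "driver_a"},
          "Constructor": {"constructorId": "team_x"}},
         {"position": "2", "points": "18",
          "Driver": {"driverId": "driver_b"},
          "Constructor": {"constructorId": "team_y"}}
       ]}
    ]}}}
    """
)

rows = flatten_results(SAMPLE)
```

Each row is a plain dictionary, so a list of them can be passed to `pandas.DataFrame` for cleaning and visualization.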

Key Features

  • Champion Prediction: Predict the champion of the next season.
  • Driver Performance Analysis: Explore how drivers perform across different races and seasons.
  • Race Strategy Optimization: Analyze race data to optimize pit stop strategies and track performance.
  • Team Comparison: Compare the performance of different F1 teams over time.
  • Data Visualization: Visualize complex F1 data in an intuitive Looker dashboard.

Expected Results

The final result should accurately represent the statistics and the outputs of the ML models for the Formula One drivers, cars and teams in a Looker dashboard.

The ML models whose feature importances are compared are:

  • Random Forest
  • Linear Regression
  • MLP
  • Gradient Boosting
  • Decision Trees
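The models listed above expose importance in different ways (tree ensembles have built-in scores, linear models have coefficients, MLPs have neither), so one model-agnostic way to put them on a common scale is permutation importance: shuffle one feature at a time and measure how much the prediction error grows. The sketch below is a plain-Python illustration of that idea, not the method used in code.ipynb. The feature names, the toy data and the two stand-in "models" are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[Sequence[float]], float],
    X: List[List[float]],
    y: List[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Copy every row and replace only feature j with its shuffled value.
            X_perm = []
            for row, value in zip(X, column):
                new_row = list(row)
                new_row[j] = value
                X_perm.append(new_row)
            permuted = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical features: [grid_position, pit_stops, noise]. Toy values only.
X = [
    [1.0, 2.0, 0.3], [2.0, 1.0, 0.9], [3.0, 3.0, 0.1], [4.0, 2.0, 0.7],
    [5.0, 1.0, 0.5], [6.0, 3.0, 0.2], [7.0, 2.0, 0.8], [8.0, 1.0, 0.4],
]
y = [0.8 * grid + 0.5 * pits for grid, pits, _ in X]


def full_model(row: Sequence[float]) -> float:
    """Stand-in for a fitted model that uses grid position and pit stops."""
    return 0.8 * row[0] + 0.5 * row[1]


def grid_only_model(row: Sequence[float]) -> float:
    """Stand-in for a weaker model that looks only at grid position."""
    return row[0]


full_importances = permutation_importance(full_model, X, y)
grid_importances = permutation_importance(grid_only_model, X, y)
```

Any fitted model can be plugged in through its prediction function, which is what makes the scores comparable across model families. A feature the model never reads gets an importance of exactly zero.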

Some examples of analytics metrics performed are:

  • Clustering of drivers throughout F1 history
  • Lap times at each circuit throughout history
  • Pit stops and results per circuit and year
  • Evolution of the champions' points per year

Evaluation Criteria

The project will be evaluated based on the following criteria:

  1. Code Execution: Your code must run without errors.
  2. Scalability: Your implementation should be scalable, tested with datasets ten times larger on a cluster with 80 execution workers.
  3. Documentation: Adequate code documentation is required.
  4. Complexity of the Machine Learning techniques: Advanced techniques must be applied.

Contributing

Contributions are welcome! Please fork the repository and submit a pull request with your changes.
