Air travel has become an essential part of modern life, connecting people across the globe. Flight cancellations, however, disrupt plans and inconvenience passengers, affecting both travelers and airlines, so the ability to anticipate and mitigate them is valuable. Using machine learning, we aimed to build a predictive model that determines whether a flight will be canceled. We hope this information can help airlines and travelers plan around cancellations and make travel smoother.
The dataset used for our prediction model (Combined_Flights_2022.csv) contains flight records from 2022, which we use to predict whether a flight will be canceled. The dataset is available on Kaggle at the following link: https://www.kaggle.com/datasets/robikscube/flight-delay-dataset-20182022?select=Combined_Flights_2022.csv.
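As a rough illustration of a first look at such a dataset, the sketch below counts how many flights are labeled as canceled. It is not our actual preprocessing code: the column name `Cancelled`, the text encoding of the label, and the helper `cancellation_rate` are assumptions made for the example, and the sample rows are made up rather than taken from the real file.

```python
import csv
import io


def cancellation_rate(rows, label_column="Cancelled"):
    """Return (n_flights, n_cancelled, fraction_cancelled) for an iterable of row dicts."""
    total = 0
    cancelled = 0
    for row in rows:
        total += 1
        # Assumption: the label is stored as text such as "True"/"False" or "1"/"0".
        if row[label_column].strip().lower() in {"true", "1", "1.0"}:
            cancelled += 1
    fraction = cancelled / total if total else 0.0
    return total, cancelled, fraction


# Tiny made-up sample standing in for the real CSV file.
sample = io.StringIO(
    "FlightDate,Airline,Cancelled\n"
    "2022-01-01,Airline A,False\n"
    "2022-01-01,Airline B,True\n"
    "2022-01-02,Airline A,False\n"
    "2022-01-02,Airline C,False\n"
)
sample_stats = cancellation_rate(csv.DictReader(sample))
print(sample_stats)
```

To run the same count on the downloaded file, the in-memory sample could be replaced with a file opened via `open("Combined_Flights_2022.csv", newline="")`, assuming the label column carries the name used here.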
We started by building three baseline models: Simple Baseline, Latent Factor, and Logistic Regression. Guided by the results of these baselines, we implemented Decision Tree and Random Forest as our final models, which achieved higher precision and recall than the baselines and therefore a higher F1 score.
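To make the link between precision, recall, and the F1 score concrete, here is a minimal sketch that computes all three from binary labels, where 1 means "canceled". The function name `precision_recall_f1` and the example labels are hypothetical and do not come from our experiments.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = canceled)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged cancellations
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cancellations

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels: 3 canceled flights out of 8.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Because F1 is the harmonic mean of the two, it rises only when precision and recall are both reasonably high, and a model that never predicts a cancellation scores 0 on all three.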