Forecasting Bus Passenger Occupancy using Time Series Analysis

Anton Drasbæk Schiønning (@drasbaek) & Mina Almasi (@MinaAlmasi)
Data Science, Prediction, and Forecasting (F24)
Aarhus University, Cognitive Science MSc.

🚌 About

This repository contains scripts for developing a pipeline to forecast passenger occupancy at various bus stops on Midttrafik's route 1A in Aarhus. We trained several NeuralProphet models (via grid search), a SARIMA model, and three baselines. The main analysis focused on the Nørreport bus stop.
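To illustrate what a baseline means in this setting, here is a minimal sketch of a seasonal naive forecast, which repeats the most recent full season of observations. This README does not name the three baselines, so the choice of method, the function names, and the toy numbers are all assumptions for illustration and not the repository's actual code.

```python
from typing import Sequence


def seasonal_naive_forecast(
    history: Sequence[float], season_length: int, horizon: int
) -> list[float]:
    """Forecast by repeating the last observed season (hypothetical helper)."""
    if season_length <= 0:
        raise ValueError("season_length must be positive")
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = list(history[-season_length:])
    # Cycle through the last season until the horizon is filled.
    return [last_season[step % season_length] for step in range(horizon)]


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and forecast values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Toy occupancy counts with a season of 3 time steps (made-up numbers).
    observed = [10, 25, 15, 12, 27, 14]
    forecast = seasonal_naive_forecast(observed, season_length=3, horizon=3)
    print(forecast)
    print(mean_absolute_error([11, 30, 14], forecast))
```

A more complex model is only worth its training cost if it scores better than a simple reference like this on the same test period.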

To run the pipeline, see the Setup and Usage sections. Note that the initial preprocessing of the 1A data cannot be reproduced because the raw file is not shareable. However, data for the five processed bus stops is available in the data folder, allowing the rest of the pipeline to be run.

Project Overview

The repository is structured as follows:

Folder/File	Description
`data/`	Contains five bus stops from 1A (raw and aggregated).
`raw_data/`	Empty folder where the raw data can be placed for the initial processing to run.
`plots/`	Plots used in the paper and appendix.
`results/`	Evaluation results and forecasts for the main analysis.
`src/`	Python code related to the project.

For a more detailed overview of the Python code, see src/README.md.

💻 Technical Requirements

Grid search and model training were run on Ubuntu 22.04.3 with Python 3.10.12 (UCloud, Coder Python 1.87.2). Other analysis work, such as plotting, was done locally on a 13-inch MacBook Pro (2020, 2 GHz Intel i5, 16 GB of RAM). Python's venv module needs to be installed for the code to run as intended.

The code should also work on Python 3.12, although this cannot be guaranteed for all parts of the pipeline.
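The requirements above can be checked before running the setup. The sketch below is not part of the repository: the function name and messages are hypothetical, and the minimum version of 3.10 is taken from the Python version mentioned above. It also checks for `ensurepip`, since on some Linux distributions the `venv` module is present but cannot create a working environment until a separate package (such as `python3-venv` on Ubuntu) is installed.

```python
import importlib.util
import sys


def check_environment(min_version: tuple[int, int] = (3, 10)) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_version:
        required = ".".join(str(part) for part in min_version)
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python {required}+ is required, found {found}")
    # venv creates the environment; ensurepip bootstraps pip inside it.
    for module in ("venv", "ensurepip"):
        if importlib.util.find_spec(module) is None:
            problems.append(f"standard library module '{module}' is not available")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print(f"WARNING: {issue}")
    if not issues:
        print("Environment looks ready for setup.")
```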

Please also note that the advanced models were computationally intensive and were run on a 64 machine on UCloud.
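Part of the cost comes from the grid search itself: the number of models to train is the product of the number of values tried for each hyperparameter. The sketch below shows this with a generic exhaustive search. The hyperparameter names and values are illustrative assumptions, not the grid used in this project, and `score_fn` is a hypothetical stand-in for training a model and returning its validation error.

```python
from itertools import product
from typing import Callable, Iterator


def iter_grid(grid: dict[str, list]) -> Iterator[dict]:
    """Yield one parameter dict per combination of the grid's values."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


def grid_search(grid: dict[str, list], score_fn: Callable[[dict], float]):
    """Return the (params, score) pair with the lowest score."""
    best_params, best_score = None, float("inf")
    for params in iter_grid(grid):
        score = score_fn(params)  # in practice: train a model and evaluate it
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid, for illustration only.
example_grid = {
    "learning_rate": [0.01, 0.1, 1.0],
    "n_lags": [0, 24, 48],
    "epochs": [50, 100],
}

# 3 * 3 * 2 combinations, i.e. 18 separate training runs.
n_combinations = sum(1 for _ in iter_grid(example_grid))

if __name__ == "__main__":
    print(f"{n_combinations} models would need to be trained")
```

Adding one more hyperparameter with three candidate values would triple the number of training runs, which is why a search like this is typically run on a larger machine.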

🛠️ Setup

Prior to running the code, run the command below to create a virtual environment (env) and install the necessary packages within it:

bash setup.sh

🚀 Usage

To run any script in the src folder, specify the script's path in the terminal (with the env active):

# activate env
source env/bin/activate

# run script
python src/neural-prophet/test_prophet.py

# quit env 
deactivate

See also src/README.md for an overview of the scripts. Note that most files in process_data cannot be run because the raw data is not included in the Git repository.

🌟 Acknowledgements

This work was only possible thanks to our data provider, Midttrafik.
