- Authors: Elaine Chu, Lukman Lateef, Dhruv Garg, Eugene You & Shawn Xiao Hu
This data analysis project predicts demand for rental bikes in the metropolitan city of Seoul.
Rental bikes have been introduced in many cities to make urban travel more convenient. Making bikes available and accessible to the public at the right time reduces waiting time, so keeping the city supplied with a stable number of rental bikes is a major concern. The crucial part is predicting the bike count required at each hour.
The data set used in this project contains the count of public bicycles rented per hour in the Seoul Bike Sharing System, with corresponding weather data and holiday information. It was created by Sathishkumar V E, Jangwoo Park, and Yongyun Cho ("Using data mining techniques for bike sharing demand prediction in Metropolitan city", Computer Communications). It was sourced from the UCI Machine Learning Repository (Dua and Graff 2017) and can be found here.
The comprehensive report and the analysis of the Seoul Bike Share Prediction can be found here.
- Docker
- Jupyter Lab (Version 4.2.4 or Higher)
- Conda Lock (Version 2.5.7 or Higher)
- Conda (Version 24.9.1 or Higher)
If you are using Windows or Mac, make sure Docker Desktop is running.
- Clone this GitHub repository.
- Navigate to the root of this project on your computer using the command line and enter the following command:
docker compose up
- In the terminal, look for a URL that starts with http://127.0.0.1:8888/lab?token= (for an example, see the highlighted text in the terminal below). Copy and paste that URL into your browser.
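When the startup log is long, the URL can be hard to spot. The following is a minimal sketch rather than part of this project: the helper name `find_lab_url` is hypothetical, and it assumes the log contains a URL with the prefix shown above followed by an alphanumeric token.

```shell
# Hypothetical helper (not part of this repository): read log text from
# standard input and print the first Jupyter Lab URL found in it.
find_lab_url() {
    # -o prints only the matching part of each line; in a basic regular
    # expression the "?" is a literal character, and "\." matches a dot.
    grep -o 'http://127\.0\.0\.1:8888/lab?token=[A-Za-z0-9]*' | head -n 1
}

# Example usage from a second terminal opened in the project root,
# while the container is still running:
#   docker compose logs | find_lab_url
```

The helper only filters text, so it can be tried on any saved copy of the log as well.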
- Navigate to the root of this project on your computer using the command line and enter the following command to reset the project to a clean state (i.e., remove all files generated by previous runs of the analysis):
make clean
- To run the analysis in its entirety, enter the following command in the terminal in the project root:
make all
- You can also run the individual steps of the analysis. To download the data, preprocess it, run the EDA, fit the model, or evaluate it separately, use:
make dats
make preprocess
make eda
make fit
make evaluate
- To remove only the generated data, figures, tables, models, or reports, run:
make clean-dats
make clean-figs
make clean-tables
make clean-models
make clean-reports
- To shut down the container and clean up the resources, press Ctrl+C in the terminal where you launched the container, and then enter the following command:
docker compose rm
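The individual stage targets listed above (dats, preprocess, eda, fit, evaluate) can also be chained when only part of the pipeline needs to be rerun. The following is a rough sketch under stated assumptions: the function `run_stages` and the `MAKE_CMD` variable are hypothetical and not part of this project, and the stages are assumed to run in the order given on the command line. Whether the Makefile already encodes dependencies between these targets is not stated here.

```shell
# Hypothetical helper (not part of this repository): run the given make
# targets one after another and stop at the first one that fails.
run_stages() {
    for target in "$@"; do
        echo "==> $target"
        # MAKE_CMD is an assumed override, useful for dry runs; the
        # default is plain "make".
        "${MAKE_CMD:-make}" "$target" || return 1
    done
}

# Example usage from the project root inside the container:
#   run_stages dats preprocess eda
```

Stopping at the first failure keeps a later stage from running on missing or stale inputs.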
- Add the dependency to the environment.yml file on a new branch.
- Run the following command to update the conda-linux-64.lock file:
conda-lock -k explicit --file environments/environment.yml -p linux-64
- Re-build the Docker image locally to ensure it builds and runs properly.
- Push the changes to GitHub. A new Docker image will be built and pushed to Docker Hub automatically. It will be tagged with the SHA for the commit that changed the file.
- Update the docker-compose.yml file on your branch to use the new container image (make sure to update the tag specifically).
- Send a pull request to merge the changes into the main branch.
The Seoul Bike Share Predictor software code contained in this project is licensed under the MIT License. See the license file here for more information. The project report is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. See the license file for more information. For proper referencing, when re-using any part of this code and/or report, please include the link to this webpage.
Dua, Dheeru, and Casey Graff. 2017. “UCI Machine Learning Repository.” University of California, Irvine, School of Information; Computer Sciences. (https://archive.ics.uci.edu/).
Sathishkumar V E, Jangwoo Park, Yongyun Cho, "Using data mining techniques for bike sharing demand prediction in Metropolitan city", Computer Communications, vol. 153, pp. 353-366, 2020.
Sathishkumar V E, Yongyun Cho, "A rule-based model for Seoul Bike sharing demand prediction using Weather data", European Journal of Remote Sensing, vol. 52, no. 1, pp. 166-183, 2020.