The goal of this REST API is to predict the number of bikes or slots available for a Bicing station at a specific time.
By analysing data from different providers and building a prediction model for each Bicing station, it can advise customers on the best time to pick up or return a bike at a station.
Getting Started • Features • Built With • Development • Machine Learning • CI and Deployment
To install and run the API you need Docker Compose, and that's all. Please follow the official documentation to install it in your environment.
Clone the project and run the default installation:
git clone https://github.com/lechatquidanse/bicing-prediction-api.git && cd bicing-prediction-api && make install
Your Docker containers should now be built and running.
Multiple features are available through two user interfaces: a REST API and command-line commands.
The Makefile contains useful commands for development purposes.
Code and folder structure follow Domain Driven Design (DDD).
Here is a good article on Domain Driven Design naming and folder structure, with a short explanation and example, even though the technology used is PHP.
src
\
|\ Application `Contains the Use Cases and the Processes of the domain system, commands, handlers and data providers`
|
|\ Domain `The system business logic layer (Models, Exceptions...)`
|
|\ Infrastructure `It's the implementation of the system outside the model, e.g. Persistence, Query, etc.`
|
|\ UserInterface `It contains all the interfaces allowed for a user of the API (Cli, Rest, etc)`
For each station, a prediction model is created using machine learning algorithms. It allows us to forecast the number of bikes or slots available at a specific time.
Here are some very helpful resources that I encourage you to read to understand machine learning and forecasting algorithms:
- Machine Learning basics tutorial
- France Prediction for bike availability at bike sharing stations
- Forecasting Valencia’s bike share system
- Understand Time Series Prediction
- Time-Series-ARIMA-XGBOOST-RNN Implementation
To create a prediction model, our machine needs data. For now, the data used comes from the Bicing Statistics API project.
This data is a sequence of observations (availabilities for each station) taken sequentially in time. So, in order to provide predictions, we chose to use a time series technique.
This technique predicts future events by analyzing past trends, on the assumption that future trends will remain similar to historical ones.
The algorithm is implemented with a regression-tree-based XGBoost model.
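One common way to feed a time series to a tree-based regressor is to turn it into rows of lagged observations. The sketch below illustrates that idea only; the function name `make_lag_features`, the lag count, and the sample values are assumptions for illustration, not the project's actual feature engineering.

```python
from typing import List, Tuple


def make_lag_features(series: List[int], n_lags: int) -> Tuple[List[List[int]], List[int]]:
    """Turn a time-ordered series into (features, target) pairs.

    Each feature row holds the n_lags observations that precede its target.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features: List[List[int]] = []
    targets: List[int] = []
    for i in range(n_lags, len(series)):
        features.append(series[i - n_lags:i])  # the previous n_lags values
        targets.append(series[i])              # the value to predict
    return features, targets


# Hypothetical bike counts for one station, sampled at regular intervals.
bikes_available = [12, 10, 7, 5, 6, 9, 13, 15]
X, y = make_lag_features(bikes_available, n_lags=3)
```

Rows built this way could then be passed to a regressor such as `xgboost.XGBRegressor` through its `fit(X, y)` method, assuming the xgboost package is installed.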
Here is the result for station 33 - C/PONTEVEDRA / JUDICI:
- In blue, the data provided to train the prediction model
- In orange, the data provided to validate the prediction model
- In green, the data forecasted by the prediction model
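The training and validation ranges described above imply a split that respects time order, so the model is never validated on data older than what it was trained on. Here is a minimal sketch of such a split, with a simple error measure. The helper names, the 80/20 ratio, the sample values, and the naive "repeat the last value" forecast are all assumptions for illustration; they stand in for the project's real model and settings.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float], train_ratio: float = 0.8) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series into an older training part and a newer validation part."""
    if not 0 < train_ratio < 1:
        raise ValueError("train_ratio must be between 0 and 1")
    cut = int(len(series) * train_ratio)
    return list(series[:cut]), list(series[cut:])


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and forecasted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical availability observations, oldest first.
observations = [12, 10, 7, 5, 6, 9, 13, 15, 14, 11]
train, validation = chronological_split(observations, train_ratio=0.8)

# Placeholder forecast: repeat the last training value (a naive baseline).
forecast = [train[-1]] * len(validation)
error = mean_absolute_error(validation, forecast)
```

Comparing a trained model's error against a naive baseline like this one is a quick way to check that the model has learned something useful.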
The algorithm implementation is a first version, so it is very naive and could be improved with many different approaches. Adding weather, station geolocation, or holiday calendar datasets to train the model could make predictions more accurate.
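As an illustration of the holiday calendar idea, the sketch below derives a few calendar features from an observation timestamp. It is only a possible starting point: the `calendar_features` function, the feature names, and the two-date `HOLIDAYS` set are hypothetical and would need to be replaced by a real calendar dataset.

```python
from datetime import date, datetime
from typing import Dict, Set

# Hypothetical holiday set, for illustration only.
HOLIDAYS: Set[date] = {date(2019, 1, 1), date(2019, 12, 25)}


def calendar_features(timestamp: datetime, holidays: Set[date] = HOLIDAYS) -> Dict[str, int]:
    """Derive simple calendar features from the time of an observation."""
    weekday = timestamp.weekday()  # Monday is 0, Sunday is 6
    return {
        "hour": timestamp.hour,
        "weekday": weekday,
        "is_weekend": int(weekday >= 5),
        "is_holiday": int(timestamp.date() in holidays),
    }


features = calendar_features(datetime(2019, 12, 25, 8, 30))
```

Features like these could be appended to each training row alongside the past availability values.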
CI and deployment can be handled through GitLab and Docker thanks to .gitlab-ci.yml, which contains 3 different stages.
Environment 'test' is triggered when a 'feature/*' branch is pushed to the repository. It then installs the project and launches the QA tools.
Environment 'build' is triggered when a 'release/*' branch is pushed to the repository. It then installs the project, launches the QA tools, and builds and pushes a Docker image to a registry if no error occurred.
This manual action pulls the image built by the previous step and updates the specific container.
Stéphane EL MANOUNI · Linkedin