This repository contains the code and report for project 2 of the 2021 Machine Learning course (CS-433) at EPFL. The purpose of the project is to classify snowflakes into hydrometeor classes and to estimate their riming degree.
The project was carried out by team Ma_BG, with members:
- Marie-Alix Gillyboeuf: [@GILLYBOEUF]
- Baptiste Hernette: [@Bapitou]
- Gaspard Villa: [@gaspardvilla]
The data for both hydrometeor classification and riming degree are stored on Google Drive. It is not necessary to download them manually, as the file ‘dataloader.py’ does it directly.
The project has been developed and tested with python3.6.
The libraries required for running the models and training are numpy, pandas and sklearn.
The library used for visualization is matplotlib.
The predictions for the test datasets are generated by running run.py.
- models.py: the implementation of the 4 machine learning models to train, with their hyperparameters to tune.
- run.py: the results obtained with the models for both hydrometeor classification and estimation of the riming degree.
- tutorial.py: a small run showing how the cross-validation for hyperparameter tuning was implemented.
- dataloader.py: loads the data from Google Drive and extracts the classes to estimate.
- Dataprocess.py: processes the data by selecting the columns we need and standardizing them.
- helpers.py: useful tools for data preprocessing and for loading or saving the different models.
- cross_validalidation.py: uses cross-validation to find the best hyperparameters, based on the test accuracy of the different models.
- plots.py: functions that give us the tools to visualize the data.
- ML_Snowflakes.pdf: a 4-page report of the complete solution.
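
As a rough illustration of the standardization and cross-validation steps described for Dataprocess.py and cross_validalidation.py, here is a minimal, dependency-free sketch. It is not the code from this repository: the function names (standardize, knn_predict, cross_val_accuracy), the synthetic data, and the choice of a k-nearest-neighbours classifier are all assumptions made for the example, and the 4 models in models.py may be entirely different. In practice, sklearn's StandardScaler and GridSearchCV cover the same ground.

```python
import random
import statistics
from collections import Counter


def standardize(train, other):
    """Scale both sets using the column means/stds of the training set only."""
    cols = list(zip(*train))
    means = [statistics.mean(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(other)


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    order = sorted(
        range(len(X_train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X_train[i], x)),
    )
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, n_folds=5, seed=0):
    """Mean held-out accuracy of k-NN over n_folds shuffled folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        X_tr, X_va = standardize([X[i] for i in train_idx], [X[i] for i in fold])
        y_tr = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(X_tr, y_tr, x, k) == y[i] for x, i in zip(X_va, fold)
        )
        scores.append(correct / len(fold))
    return statistics.mean(scores)


# Synthetic two-class data (hypothetical stand-in for the snowflake features).
rng = random.Random(42)
X = [[rng.gauss(c * 3.0, 1.0), rng.gauss(c * 3.0, 1.0)]
     for c in (0, 1) for _ in range(50)]
y = [c for c in (0, 1) for _ in range(50)]

# Try each candidate hyperparameter and keep the one with the best mean accuracy.
results = {k: cross_val_accuracy(X, y, k) for k in (1, 3, 5, 7)}
best_k = max(results, key=results.get)
print("mean accuracy per k:", results)
print("selected k:", best_k)
```

The scaling statistics are computed on the training folds only and then applied to the held-out fold, so no information from the validation data leaks into the preprocessing.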