Nordeus Data Science Challenge Submission

Project Overview

This project is a submission for the Nordeus JobFair Data Science Challenge, dedicated to predicting league ranks for clubs in Top Eleven. The aim is to use machine learning models to forecast the league position of each club at the end of a season. Presumably, the goal is to create a balanced and competitive experience for players.

Environment

The project was realised in Google Colab to make use of its resources, because the computing performance of the local machine was limited.

Dataset

The project utilizes two datasets:

jobfair_train.csv - Contains features like user activity, player statistics, and the target variable league_rank.
jobfair_test.csv - Similar to the training dataset but without the target variable, for model prediction.
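
As a rough illustration of how the two files relate, the sketch below reads a training and a test CSV with pandas and separates the target from the features. Only the file names and the league_rank column come from this README; the helper name load_split and the columns club_id and feature_a are hypothetical placeholders, and tiny in-memory CSVs stand in for the real files.

```python
import io

import pandas as pd

TARGET = "league_rank"  # target column named in the dataset description


def load_split(train_source, test_source, target=TARGET):
    """Read both CSVs and separate the target from the training features."""
    train = pd.read_csv(train_source)
    test = pd.read_csv(test_source)
    if target not in train.columns:
        raise KeyError(f"training data has no '{target}' column")
    X_train = train.drop(columns=[target])
    y_train = train[target]
    return X_train, y_train, test


# In-memory stand-ins with hypothetical columns; with the real data one would
# pass "jobfair_train.csv" and "jobfair_test.csv" instead.
train_csv = io.StringIO("club_id,feature_a,league_rank\n1,0.5,3\n2,0.9,1\n")
test_csv = io.StringIO("club_id,feature_a\n3,0.7\n")

X_train, y_train, X_test = load_split(train_csv, test_csv)
print(X_train.shape, y_train.tolist(), X_test.shape)
```

The test frame keeps the same feature columns as the training frame, so the same preprocessing can be applied to both.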

Features

Features include user engagement metrics, player quality indicators, and other relevant game activity data.

Machine Learning Models

We explore several models:

RandomForestClassifier
XGBClassifier (XGBoost)
LGBMClassifier
DecisionTreeClassifier
StackingClassifier (with Logistic Regression as the final estimator), achieving the best MAE of 2.59
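
The stacking setup can be sketched roughly as follows. This is not the notebook's actual configuration: it uses synthetic data, only two scikit-learn base estimators (the XGBoost and LightGBM models are left out to keep the example dependency-free), and arbitrary hyperparameters. It only shows how a StackingClassifier with a Logistic Regression final estimator is assembled and how MAE is computed on predicted rank labels.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real features; labels 1..4 play the role of league_rank.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, n_classes=4, random_state=0
)
y = y + 1

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base estimators feed their cross-validated predictions to the final estimator.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)
stack.fit(X_train, y_train)

# MAE treats the predicted class labels as numeric ranks.
val_pred = stack.predict(X_val)
mae = mean_absolute_error(y_val, val_pred)
print(f"Validation MAE: {mae:.2f}")
```

Because the ranks are ordered, MAE penalises a prediction that is five places off more than one that is a single place off, which plain accuracy would not do.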

Setup and Installation

Make sure Python is installed on your system. Dependencies include:

pandas
scikit-learn
xgboost
lightgbm
matplotlib

Install these using pip:

pip install pandas scikit-learn xgboost lightgbm matplotlib

Usage

To run the models and evaluate their performance, follow these steps:

Load the datasets jobfair_train.csv and jobfair_test.csv.
Preprocess the data as per the preprocessing steps outlined in the code.
Train the machine learning models using the preprocessed training data.
Evaluate the models using cross-validation techniques.
Use the trained models to make predictions on the preprocessed test data.
Analyze the results, and adjust the models or preprocessing steps as needed.
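
The steps above can be sketched as a small scikit-learn workflow. The preprocessing here (median imputation) and the model settings are assumptions for illustration, not the steps used in the notebook, and random arrays stand in for the preprocessed training and test data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-ins for the train and test feature matrices.
X_train = rng.normal(size=(200, 6))
y_train = np.repeat(np.arange(1, 5), 50)            # hypothetical ranks 1..4
X_train[:, 0] += y_train                            # give one feature some signal
X_train[rng.random(X_train.shape) < 0.05] = np.nan  # sprinkle in missing values
X_test = rng.normal(size=(20, 6))

# Preprocessing and model in one pipeline, so each CV fold is imputed separately.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(n_estimators=50, random_state=0),
)

# Cross-validated MAE; scikit-learn reports it negated, so flip the sign.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    model, X_train, y_train, cv=cv, scoring="neg_mean_absolute_error"
)
cv_mae = -scores.mean()
print(f"Cross-validated MAE: {cv_mae:.2f}")

# Refit on all training data, then predict ranks for the test set.
model.fit(X_train, y_train)
test_pred = model.predict(X_test)
```

Keeping the imputer inside the pipeline avoids leaking statistics from the validation folds into training.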

Future Improvements

There are several areas where this project can be further enhanced:

Further Data Exploration: Applying more thorough data exploration techniques (e.g. different kinds of plots) to help avoid a large bias.
Advanced Modeling Techniques: Experimenting with more sophisticated machine learning algorithms or deep learning models.
Feature Engineering: Exploring additional features or transformations that could improve model performance.
Hyperparameter Tuning: More extensive tuning of model parameters to optimize performance.
Data Augmentation: Increasing the dataset size or variety, possibly by incorporating additional relevant data sources.
Model Interpretability: Implementing tools and techniques for better understanding and interpreting the model's decisions.
Deployment Strategy: Developing a plan for deploying the model in a real-world environment, ensuring scalability and maintainability.
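
As one possible starting point for the hyperparameter tuning item, the sketch below runs a small grid search scored by MAE. The data is synthetic and the parameter grid is an arbitrary example, not a recommendation derived from this project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; labels 1..3 play the role of league ranks.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)
y = y + 1

# Illustrative grid: 3 x 3 = 9 candidate settings.
param_grid = {
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 20],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",  # higher is better, so MAE is negated
    cv=3,
)
search.fit(X, y)

best_mae = -search.best_score_
print(search.best_params_, f"MAE: {best_mae:.2f}")
```

The same pattern applies to the other models by swapping the estimator and the grid.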

Repository Structure

README.md: This file, providing an overview and instructions.
NordeusChallenge.ipynb: Contains code for the challenge.
league_rank_predictions.csv: Includes all the predictions.

License

This project was realised as part of the JobFair 2023 Nordeus challenge.

stefisha/NordeusChallenge-2023
