
5GROWTH is funded by the European Union’s Research and Innovation Programme Horizon 2020 under Grant Agreement no. 856709

Call: H2020-ICT-2019. Topic: ICT-19-2019. Type of action: RIA. Duration: 30 Months. Start date: 1/6/2019

5GROWTH-AIMLP

This repository contains the code for the AI/ML platform developed in the 5Growth EU project.

The AI/ML Platform, a novel component of the 5Growth architecture, is a centralized and optimized environment to train and host AI/ML models.

Whenever an entity of the 5Gr stack needs a trained AI/ML model, it can query the AI/ML Platform to retrieve it. If the model has not been trained yet, a training job is triggered and, as soon as it completes, a link to download the trained model is made available to the entity.

Requirements

The AI/ML Platform works in conjunction with an Apache Hadoop cluster. The required projects are:

  • Apache HDFS
  • Apache YARN
  • Apache Spark (developed and tested with version 2.4)
  • BigDL (for deep neural networks models)

To install BigDL, follow the instructions in the project documentation. It is suggested to build the requirements archive using 'conda pack'. The following commands can be used as an example:

#!/bin/sh
# Create and activate a dedicated conda environment for BigDL
# (assumes conda has been initialized for this shell)
conda create -y -n bigdl python=3.6
conda activate bigdl
# Install the requirements as provided by BigDL
conda install -y -c conda-forge --file bigdl/bin/requirements.txt
# Pack the environment into an archive that can be distributed to the cluster
conda pack -o environment.tar.gz
# Leave and remove the temporary environment
conda deactivate
conda remove -y --name bigdl --all
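
The resulting environment.tar.gz is meant to be shipped to the cluster together with the Spark job (in this project presumably via the provided rest-spark-submit.sh). The sketch below only illustrates the general conda-pack plus spark-submit pattern on YARN; training_job.py and the paths are placeholders:

#!/bin/sh
# Illustrative pattern only: ship the packed conda environment to YARN
# and run the job with the Python interpreter contained in it.
PYSPARK_PYTHON=./environment/bin/python \
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --archives environment.tar.gz#environment \
  training_job.py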

The application is written in Python, so a Python interpreter (version > 3.7) is required to run it. The following packages are required and can be installed using pip (an example command is shown after the list):

  • flask
  • marshmallow
  • marshmallow-sqlalchemy
  • sqlalchemy
  • flask-sqlalchemy
  • flask-marshmallow
  • flask-login
  • pyarrow
  • pyspark (version 2.4)
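
All of the above can be installed in one step; a minimal sketch, assuming pip targets the same Python 3 interpreter that will run the platform:

pip3 install flask marshmallow marshmallow-sqlalchemy sqlalchemy \
    flask-sqlalchemy flask-marshmallow flask-login pyarrow "pyspark==2.4.*"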

Installation

Copy the project folder to a machine that has access to the cluster, using a user with the proper rights, i.e. one that can execute spark-submit and access HDFS.
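
For example, the folder can be copied over SSH; the host name and destination path below are only placeholders:

# Copy the platform code to the machine that will run it (placeholder host/path)
scp -r aimlp/ user@cluster-frontend:/opt/aimlp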

Before the platform can be used, the config.py and rest-spark-submit.sh files must be customized according to the cluster configuration and the available compute resources. The provided files, which contain the configuration used to develop and test the platform, can be used as examples.

Usage

Run the command python3 main.py. The default server port is 5000.
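
For a long-running instance, the server can be started in the background; a minimal sketch (the log file name is just an example):

# Start the REST server in the background and keep it running after logout
nohup python3 main.py > aimlp.log 2>&1 &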

Before the application can be used for the first time, the sqlite3 database must be generated. To do that, call GET /reset?forced=1. To clean the database without deleting the registered users, use GET /reset.
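
For example, with curl against a locally running instance (assuming the default port of 5000):

# First start: create the sqlite3 database from scratch
curl "http://localhost:5000/reset?forced=1"
# Subsequent cleanups: reset the database but keep the registered users
curl http://localhost:5000/reset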