Violent threat detection application

An application for detecting violent threats in online discussions and forums. This project includes:

Training workflow for a multinomial Naive Bayes classifier
Model test validation and tracking
Prediction and monitoring APIs + Swagger documentation
Simple web user interface
Logging and monitoring dashboard

Documentation is available here. This repository was created using the MLOps Platform Skeleton here.

Overview

The full setup consists of three steps:

Training - A training script trains a model on the Threat dataset with sklearn. Training is orchestrated by Prefect, and the model's metrics and artifacts (the actual models) are uploaded to mlflow.
Serving - The model is pulled from mlflow and FastAPI delivers the prediction. A Streamlit app serves as the user interface.
Monitoring - Metrics about API usage and performance are collected by Prometheus and shown in a Grafana dashboard.
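To make the training step more concrete, here is a minimal pure-Python sketch of how a multinomial Naive Bayes text classifier scores a message. It is not the project's training script, which uses sklearn. The whitespace tokenizer, the function names fit and classify, and the toy sentences are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict


def fit(texts, labels, alpha=1.0):
    """Estimate log-priors and Laplace-smoothed word log-likelihoods per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}

    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_counts.items():
        model["log_prior"][label] = math.log(n_docs / len(labels))
        # Denominator: all word occurrences in the class plus smoothing mass.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            word: math.log((word_counts[label][word] + alpha) / total)
            for word in vocab
        }
    return model


def classify(model, text):
    """Return the class with the highest log-score; unknown words are ignored."""
    words = [w for w in text.lower().split() if w in model["vocab"]]
    scores = {
        label: prior + sum(model["log_likelihood"][label][w] for w in words)
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Toy data, invented for this sketch (not taken from the Threat dataset).
toy_texts = [
    "i will hurt you",
    "i will find you and hurt you",
    "nice weather today",
    "have a nice day",
]
toy_labels = ["threat", "threat", "safe", "safe"]
toy_model = fit(toy_texts, toy_labels)
```

With this toy model, classify(toy_model, "hurt you") picks the "threat" class and classify(toy_model, "nice day") picks "safe", because each class favors the words it saw during fitting.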

The individual services are packaged as Docker containers and set up with docker compose.

How to use

Prerequisite: Install Docker (Windows: Docker Desktop)

Download repository from GitHub

git clone https://github.com/dpleus/mlops.git

Start docker compose (from project folder)

docker compose up

Access individual services

Prefect http://localhost:4200
mlflow http://localhost:5000
FastAPI (to test) http://localhost:8086/docs
Streamlit UI http://localhost:8501
Grafana Dashboard http://localhost:3000 Login: admin/admin
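After starting the stack, a quick way to see which services answer is a small standard-library script. The sketch below uses the URLs from the list above. The helper names is_up and report are illustrative and not part of the repository.

```python
from urllib.request import urlopen

# URLs copied from the service list above.
SERVICES = {
    "Prefect": "http://localhost:4200",
    "mlflow": "http://localhost:5000",
    "FastAPI docs": "http://localhost:8086/docs",
    "Streamlit UI": "http://localhost:8501",
    "Grafana": "http://localhost:3000",
}


def is_up(url, timeout=3.0, opener=urlopen):
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts and HTTP error responses.
        return False


def report(services=None, check=is_up):
    """Map every service name to 'up' or 'down'."""
    services = SERVICES if services is None else services
    return {name: ("up" if check(url) else "down") for name, url in services.items()}
```

Calling report() once the containers are running returns a dictionary such as {"Prefect": "up", ...}. A service can take a few seconds to become reachable after docker compose up, so a "down" right after startup is not necessarily an error.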

Create example model

Run the deployment in the Prefect UI, deploy the model artifacts in mlflow, and tag the model with "production" in mlflow.

Note: The UI will only work if there is one "production" model in mlflow.

Services

1) Docker and docker compose

docker-compose.yaml contains the definitions for all services. For every service it specifies the Docker image (either through build, if based on a Dockerfile, or through image, if a remote image). It also opens the relevant ports within your "docker compose network" so that the services can communicate with each other. Additionally, a common volume for all containers that use mlflow is created and mounted at /mlruns. For Prometheus/Grafana, a few configuration files are also mounted.

To initialize all services, run docker compose up from the project folder.

2+3) Training script and prefect

The training script and prefect (for orchestration) are packaged into one service.

The training script is placed under training/model_training.py.

The train function is wrapped in a Prefect flow decorator. It also uses mlflow autolog.

Prefect is an orchestration tool and can therefore be used to schedule, monitor and organize jobs.

Based on the training script, a prefect deployment file train-deployment.yaml is generated using the following command:

prefect deployment build training/model_training.py:train

The Dockerfile ultimately glues these components together. It

Creates folders
Installs requirements.txt
Sets the PREFECT_API_URL and MLFLOW_TRACKING_URI*
Starts the server, pushes the deployment and starts an agent**

*With Docker you can refer to the host machine using host.docker.internal and to the other services by their docker compose name, e.g. http://mlflow:5000

**In this project the Prefect server and the agent (which executes the scripts) run in one container.
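As an illustration of the footnotes above, the sketch below shows how a script inside a container might resolve the two URLs from environment variables. The fallback values are assumptions: http://mlflow:5000 follows the compose service name mentioned above, while the Prefect address assumes the server listens on port 4200 in the same container. The helper resolve_urls is hypothetical and not part of the repository.

```python
import os

# Fallbacks are assumptions for illustration, not values read from the Dockerfile.
DEFAULTS = {
    "PREFECT_API_URL": "http://localhost:4200/api",
    "MLFLOW_TRACKING_URI": "http://mlflow:5000",
}


def resolve_urls(env=None):
    """Return both URLs, preferring values that are set in the environment."""
    env = os.environ if env is None else env
    return {name: env.get(name, fallback) for name, fallback in DEFAULTS.items()}
```

Passing a plain dictionary instead of os.environ makes the lookup easy to try out, for example resolve_urls({"MLFLOW_TRACKING_URI": "http://host.docker.internal:5000"}).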

4) FastAPI

FastAPI is a framework for building high-performance APIs. In this project I implemented a /predict endpoint. When that endpoint is queried, it downloads the production model from mlflow and returns the prediction. Additionally, prometheus_fastapi_instrumentator collects request metrics and exposes them so that Prometheus can scrape them.

Please note: Currently the script fetches the first model that is in production. It does not raise an error if there is no such model or if there are multiple.
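One way to make the silent failure described above explicit is to check the number of production models before serving. The sketch works on plain dictionaries with hypothetical keys (name, version, stage) rather than on mlflow objects, and select_production is an illustrative helper, not a function from this project.

```python
def select_production(versions):
    """Return the only version marked as production, or raise ValueError.

    `versions` is a list of dicts with the hypothetical keys
    'name', 'version' and 'stage'.
    """
    production = [v for v in versions if v["stage"].lower() == "production"]
    if len(production) != 1:
        raise ValueError(
            f"expected exactly one production model, found {len(production)}"
        )
    return production[0]


# Example records, invented for this sketch.
example_versions = [
    {"name": "threat-nb", "version": 1, "stage": "None"},
    {"name": "threat-nb", "version": 2, "stage": "Production"},
]
chosen = select_production(example_versions)
```

An API handler could catch the ValueError and return an error response instead of serving an arbitrary model.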

5+6) Prometheus/Grafana

Prometheus is an open-source monitoring system. Grafana is a dashboarding platform. In short, Prometheus collects the data, while Grafana puts a dashboard on top. For this project, I used the provided images and just added a few configuration files:

monitoring/prometheus.yml - Contains configuration to connect Prometheus to FastAPI

monitoring/datasource.yml - Grafana: Datasource configuration

monitoring/dashboard.json - Grafana: Dashboard

This part was heavily inspired by https://github.com/Kludex/fastapi-prometheus-grafana
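For readers new to Prometheus: it collects data by periodically scraping a plain-text endpoint that the monitored service exposes. The sketch below renders one counter in that text exposition format. It is an illustration only and comes from neither repository. The metric name predict_requests_total and the helper render_counter are hypothetical.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to the counter value.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_text = ",".join(f'{key}="{val}"' for key, val in labels)
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric, invented for this sketch.
exposition = render_counter(
    "predict_requests_total",
    "Total calls to /predict.",
    {(("status", "200"),): 3},
)
```

Grafana then queries Prometheus for such series and plots them in the dashboard.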

7) Streamlit

Streamlit is a Python library to rapidly build UIs. The app is very simple and only passes input to the API to retrieve results.
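The same round trip can be sketched without Streamlit, using only the standard library. The port 8086 and the /predict path come from the sections above. The POST method, the JSON field name "text", and the helper names are assumptions, since the request schema is not described here.

```python
import json
from urllib.request import Request, urlopen

# Port from the service list, path from the FastAPI section.
API_URL = "http://localhost:8086/predict"


def build_request(text, url=API_URL):
    """Build a JSON POST request; the 'text' field name is an assumption."""
    body = json.dumps({"text": text}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=body, headers=headers, method="POST")


def predict(text, opener=urlopen):
    """Send the request and return the decoded JSON response."""
    with opener(build_request(text), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

The actual field names can be looked up in the Swagger page at http://localhost:8086/docs once the stack is running.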

Limitations

Multiple host machines: Kubernetes

This project is meant to be deployed on a single host machine. In practice, you might want to use Kubernetes to deploy it on multiple instances to gain more isolation and scalability. Kompose could be an option to convert your docker compose file to Kubernetes yaml.

Storage on cloud

All artifacts, logs, etc. are saved locally/on docker volumes. In practice, you would save them to the cloud.

Advanced Security

Security is not covered here: authentication, SSL encryption, API authentication, and so on. A good example using nginx: Example

References

Threat Corpus Dataset

Hammer, H. L., Riegler, M. A., Øvrelid, L., & Velldal, E. (2019). "THREAT: A Large Annotated Corpus for Detection of Violent Threats". 7th IEEE International Workshop on Content-Based Multimedia Indexing.

Wester, A. L., Øvrelid, L., Velldal, E., & Hammer, H. L. (2016). "Threat detection in online discussions". Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis.
