Contributors: Gabriel Zhang, Kate Yee, Xenia Vrettakou & Zoe Li
Addressing mental health concerns has become increasingly important amid the persisting isolation of the post-pandemic period. Our goal for this project was to create a cloud-based web app that enables early detection of emotional distress so that students can proactively be offered counselling services.
Our architecture can be easily split into three processes:
1. Training Data Augmentation - When an image training dataset is uploaded to the S3 Raw bucket, a Lambda function is triggered to ingest the dataset, create an augmented image dataset, and upload the result to the S3 Refined bucket (a minimal handler sketch follows this list).
2. Model Training - The model training pipeline is containerized, stored in ECR, and can be deployed as an ECS task that ingests an augmented dataset from the S3 Refined bucket; trains, evaluates, and scores a model; and then uploads the pickled model and related artifacts to the S3 Model Storage bucket.
3. Inference Web Application - The web application is deployed as a containerized Streamlit app running as an ECS service. A user uploads a raw image to the web app, which the service sends to a Lambda function for preprocessing. The preprocessed image is then returned to the ECS service, which pulls a model from the S3 Model Storage bucket, predicts an emotion from the given image, and returns the inferred emotion to be displayed to the user on the web app.
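As a concrete illustration of process 1, below is a minimal sketch of an S3-triggered augmentation Lambda. The bucket name, the `augment_images` helper, and the FER-style CSV layout (an `emotion` label column plus space-separated 48x48 grayscale `pixels`) are assumptions for illustration; the actual logic lives in `preprocessing_lambda/main.py`.

```python
import io

import boto3
import numpy as np
import pandas as pd

s3 = boto3.client("s3")
REFINED_BUCKET = "emotion-refined"  # assumed name of the S3 Refined bucket


def augment_images(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative augmentation: append a horizontally flipped copy of each image."""
    flipped = df.copy()
    flipped["pixels"] = flipped["pixels"].apply(
        lambda p: " ".join(
            map(str, np.array(p.split(), dtype=np.uint8).reshape(48, 48)[:, ::-1].ravel())
        )
    )
    return pd.concat([df, flipped], ignore_index=True)


def lambda_handler(event, context):
    # The S3 "ObjectCreated" trigger passes the uploaded object's location in the event.
    record = event["Records"][0]["s3"]
    raw_bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Read the raw training CSV, augment it, and write the result to the Refined bucket.
    obj = s3.get_object(Bucket=raw_bucket, Key=key)
    raw_df = pd.read_csv(io.BytesIO(obj["Body"].read()))
    augmented = augment_images(raw_df)

    buffer = io.StringIO()
    augmented.to_csv(buffer, index=False)
    s3.put_object(Bucket=REFINED_BUCKET, Key=f"augmented_{key}", Body=buffer.getvalue().encode())

    return {"statusCode": 200, "body": f"augmented_{key} written to {REFINED_BUCKET}"}
```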
Link to our cost estimation for this architecture.
To train our model, we found a facial emotion recognition dataset on Kaggle.
This repository contains all the code necessary to deploy the architecture described above on AWS. The contents of the repository are detailed below:
app/
: The directory containing the script and resources for running the Streamlit web application.

dockerfiles/
: The directory containing the Dockerfiles for building the web app.

images/
: The directory containing images referenced in this README.

pipeline/
: The directory containing the model training pipeline scripts and associated resources:
  - config/ : The directory containing the model training pipeline configuration files.
  - logs/ : The directory containing the model training pipeline log files.
  - src/ : The directory containing the model training pipeline Python module scripts and the main.py script (a minimal sketch of this entry point follows the listing).
  - tests/ : The directory containing the model training pipeline unit tests.
  - Dockerfile : The Dockerfile for building the model training pipeline image.
  - requirements.txt : The model training pipeline package requirements.

preprocessing_lambda/
: The directory containing the script used to augment training data.

preprocessing_lambda_inference/
: The directory containing the script used to preprocess user-uploaded images for inference.

.gitignore
: The file detailing untracked files Git should ignore.

.pylintrc
: The file containing the pylint configuration for this repository.

README.md
: The README you're reading right now.
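For orientation, here is a minimal sketch of what the pipeline's main.py entry point could look like under the architecture above. The bucket names, object keys, and the scikit-learn baseline classifier are assumptions for illustration, not the actual implementation in pipeline/src/.

```python
import io
import pickle

import boto3
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

s3 = boto3.client("s3")
REFINED_BUCKET = "emotion-refined"      # assumed name of the S3 Refined bucket
MODEL_BUCKET = "emotion-model-storage"  # assumed name of the S3 Model Storage bucket


def main(dataset_key: str = "augmented_train.csv") -> None:
    # Ingest the augmented dataset produced by the preprocessing Lambda.
    obj = s3.get_object(Bucket=REFINED_BUCKET, Key=dataset_key)
    df = pd.read_csv(io.BytesIO(obj["Body"].read()))

    # Assumes a FER-style layout: `emotion` labels and space-separated `pixels`.
    X = np.stack(df["pixels"].apply(lambda p: np.array(p.split(), dtype=np.float32)).to_list())
    y = df["emotion"].to_numpy()
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Train and score a simple baseline classifier (a stand-in for the real model).
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    score = accuracy_score(y_test, model.predict(X_test))

    # Upload the pickled model and its score to the Model Storage bucket.
    s3.put_object(Bucket=MODEL_BUCKET, Key="model.pkl", Body=pickle.dumps(model))
    s3.put_object(Bucket=MODEL_BUCKET, Key="metrics.txt", Body=f"accuracy: {score:.3f}".encode())


if __name__ == "__main__":
    main()
```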
To run the pipeline unit tests, navigate to the pipeline/ directory and run the following command:
python -m unittest discover -s tests -p '*test*.py'
The steps for deploying the architecture on AWS are as follows:

- Create a Lambda function with S3 permissions on its execution role, the preprocessing_lambda/main.py script, a trigger on created objects in the S3 Raw bucket, 3004 MB of memory, and 2048 MB of ephemeral storage.
- Add three layers to the function to enable the pandas, numpy, and pillow libraries using these ARNs.
- Upload a training data CSV to the S3 Raw bucket and wait for the augmented data CSV to appear in the S3 Refined bucket.
- Create a new ECR repository and use its push commands from the pipeline/ directory to build and push the model training pipeline image to ECR.
- Create a new ECS cluster to run tasks.
- Create a new task definition to run the model training pipeline image from ECR using Fargate.
- Deploy the task in the ECS cluster and wait for the pipeline to run. During that process, the trained model and associated artifacts will be uploaded to the S3 Model Storage bucket.
- Create an AWS Lambda function to preprocess user-uploaded images for model inference.
- Make the web app load the model dynamically from the S3 Model Storage bucket and invoke the preprocessing Lambda function with boto3 whenever a new image is received (a minimal sketch follows this list).
- Build and push the Docker image of the web app in the app directory to ECR.
- Create a new ECS cluster for app hosting and model inference.
- Define a task and deploy it as a service with the necessary permissions, port mappings, and networking security group.
- Access the app from its public IP and upload facial images to get emotion predictions.
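The web-app steps above can be sketched as a minimal Streamlit script that loads the model from S3 and invokes the preprocessing Lambda with boto3. The bucket name, model key, Lambda function name, and the JSON payload shape are assumptions for illustration; the real interface is defined by the code in app/ and preprocessing_lambda_inference/.

```python
import json
import pickle

import boto3
import numpy as np
import streamlit as st

MODEL_BUCKET = "emotion-model-storage"         # assumed name of the S3 Model Storage bucket
PREPROCESS_LAMBDA = "inference-preprocessing"  # assumed name of the inference preprocessing Lambda

s3 = boto3.client("s3")
lam = boto3.client("lambda")


@st.cache_resource
def load_model():
    """Pull the pickled model from the Model Storage bucket (cached per container)."""
    obj = s3.get_object(Bucket=MODEL_BUCKET, Key="model.pkl")
    return pickle.loads(obj["Body"].read())


st.title("Student Emotion Detection")
uploaded = st.file_uploader("Upload a facial image", type=["png", "jpg", "jpeg"])

if uploaded is not None:
    # Send the raw image bytes to the preprocessing Lambda; assume it returns a
    # JSON body holding the flattened, normalized pixel values.
    response = lam.invoke(
        FunctionName=PREPROCESS_LAMBDA,
        Payload=json.dumps({"image": uploaded.getvalue().hex()}),
    )
    pixels = json.loads(response["Payload"].read())["pixels"]

    # Run inference with the model pulled from S3 and display the result.
    model = load_model()
    emotion = model.predict(np.array(pixels, dtype=np.float32).reshape(1, -1))[0]
    st.success(f"Predicted emotion: {emotion}")
```

Caching the model with st.cache_resource means the S3 download happens once per running container rather than on every uploaded image.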