# Real-time Data Processing with Go, Elastic Beanstalk, and Redshift

## Project Overview

This project demonstrates a real-time data processing pipeline built around a Go application. The application reads a dataset from Amazon S3, processes it, and writes the results to Amazon Redshift; it is deployed with AWS Elastic Beanstalk.
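
At its core, the pipeline fetches the CSV object from S3, parses it, and loads the results into Redshift. Below is a minimal sketch of the read path, assuming the AWS SDK for Go v1 and the environment variables described later in this readme; the real application may structure this differently:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

// fetchCSV streams the dataset object from S3 and parses it into records.
func fetchCSV() ([][]string, error) {
	sess, err := session.NewSession(&aws.Config{
		Region: aws.String(os.Getenv("REGION")),
	})
	if err != nil {
		return nil, err
	}
	out, err := s3.New(sess).GetObject(&s3.GetObjectInput{
		Bucket: aws.String(os.Getenv("BUCKET")),
		Key:    aws.String(os.Getenv("KEY")),
	})
	if err != nil {
		return nil, err
	}
	defer out.Body.Close()
	return csv.NewReader(out.Body).ReadAll()
}

func main() {
	records, err := fetchCSV()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("read %d rows from S3\n", len(records))
}
```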

## Objective

- Create a real-time data processing pipeline with a Go application.
- Deploy the application using Elastic Beanstalk.
- Store processed data in Amazon Redshift.
- Verify data processing and storage using SQL queries in Redshift.

## Prerequisites

Ensure you have the following installed and configured on your local machine:

- Go
- Docker and Docker Compose
- An AWS account with access credentials

## AWS Setup

1. Download the Online Retail dataset.
2. Create an S3 bucket.
3. Upload the dataset to S3.
   - Upload the Online Retail CSV file to the S3 bucket you created. Instructions can be found here.
4. Create a Redshift cluster.
5. Create a new database within the Redshift cluster. (A sketch of the target table setup follows below.)
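
Because Redshift speaks the PostgreSQL wire protocol, the target table inside the new database can be created from Go with `database/sql` and the `lib/pq` driver. The sketch below is illustrative only: the table name `online_retail` and its columns are assumptions based on the standard Online Retail dataset layout, not taken from this project's code.

```go
package main

import (
	"database/sql"
	"log"
	"os"

	_ "github.com/lib/pq" // Redshift is compatible with the Postgres driver
)

func main() {
	db, err := sql.Open("postgres", os.Getenv("REDSHIFT_CONN_STRING"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Hypothetical schema based on the Online Retail dataset's columns.
	_, err = db.Exec(`CREATE TABLE IF NOT EXISTS online_retail (
		invoice_no   VARCHAR(16),
		stock_code   VARCHAR(16),
		description  VARCHAR(256),
		quantity     INTEGER,
		invoice_date TIMESTAMP,
		unit_price   DECIMAL(10,2),
		customer_id  VARCHAR(16),
		country      VARCHAR(64)
	)`)
	if err != nil {
		log.Fatal(err)
	}
	log.Println("table online_retail is ready")
}
```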

## .env Variables

```env
# To read from S3:
REGION=
BUCKET=
KEY=     # name of the .csv file in S3

# To push data to Redshift:
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
REDSHIFT_CONN_STRING=
```
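
How the application loads these variables is an implementation detail; one common approach, sketched below, is `github.com/joho/godotenv` (an assumption, not necessarily what this project uses) plus `os.Getenv`:

```go
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/joho/godotenv" // assumption: a common .env loader
)

func main() {
	// Load the .env file into the process environment.
	if err := godotenv.Load(); err != nil {
		log.Fatal("error loading .env file")
	}
	fmt.Printf("reading s3://%s/%s in %s\n",
		os.Getenv("BUCKET"), os.Getenv("KEY"), os.Getenv("REGION"))
}
```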

## Run with Docker

```sh
docker-compose build
docker-compose up

# To print the processed data
# (the output appears in the Docker container's console, not in the shell where curl is run):
curl "http://localhost:8080?action=print"

# To insert the processed data into Redshift:
curl "http://localhost:8080?action=insert"
```

## Run without Docker

```sh
go mod tidy

# To print the processed data:
go run main.go -action=print

# To insert the processed data into Redshift:
go run main.go -action=insert
```
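
Run directly, the same two actions are selected with the `-action` flag instead of a query parameter. A minimal, purely illustrative sketch of that flag handling (not this project's actual `main.go`):

```go
package main

import (
	"flag"
	"fmt"
)

func main() {
	// The -action flag selects between printing and inserting.
	action := flag.String("action", "print", "print | insert")
	flag.Parse()

	switch *action {
	case "print":
		fmt.Println("would print the processed data")
	case "insert":
		fmt.Println("would insert the processed data into Redshift")
	default:
		fmt.Println("unknown action:", *action)
	}
}
```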