
MartinKalema/mysql-kafka-s3-redshift-data-pipeline


Export data from a MySQL database to Redshift using Kafka

Data flow diagram

Problem Statement: I need to build an ETL pipeline that loads MySQL database records into Redshift using Kafka.

MySQL database

Redshift data warehouse

Approach

  1. Read data from MySQL and publish it to a Kafka topic, then write the records from that topic to an S3 bucket (see mysql-kafka-s3).

  2. Read the data from the S3 bucket and load it into Redshift (see s3-redshift).
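The two steps above can be sketched in plain Python. This is only a minimal illustration, not the code in this repository: it assumes rows arrive as dicts, that messages are JSON encoded, and that S3 objects hold newline-delimited JSON. The function names (`to_message`, `batch_messages`, `copy_statement`) and every table, bucket, and role value are hypothetical. The Kafka producer/consumer and S3 upload calls are deliberately left out so the sketch stays independent of any client library.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def to_message(row: dict) -> bytes:
    """Serialize one MySQL row (as a dict) into a JSON-encoded message value."""
    def default(value):
        # MySQL drivers commonly return these types, which json cannot encode directly.
        if isinstance(value, (date, datetime)):
            return value.isoformat()
        if isinstance(value, Decimal):
            return str(value)
        raise TypeError(f"unsupported type: {type(value).__name__}")

    return json.dumps(row, default=default, sort_keys=True).encode("utf-8")


def batch_messages(messages, batch_size):
    """Group consumed messages into newline-delimited JSON bodies, one per S3 object."""
    batch = []
    for message in messages:
        batch.append(message)
        if len(batch) == batch_size:
            yield b"\n".join(batch) + b"\n"
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield b"\n".join(batch) + b"\n"


def copy_statement(table, bucket, prefix, iam_role):
    """Build a Redshift COPY command that loads JSON objects stored under an S3 prefix."""
    return (
        f"COPY {table} FROM 's3://{bucket}/{prefix}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS JSON 'auto';"
    )
```

In a real pipeline, each value returned by `to_message` would be sent to the Kafka topic, each body yielded by `batch_messages` would be uploaded as one S3 object, and the string from `copy_statement` would be executed against Redshift. Batching keeps the number of S3 objects small, which suits COPY better than loading one tiny file per record.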

Launch the entire server setup

docker-compose up

Load the sample data into the MySQL database

docker exec -i mysql sh -c 'exec mysql -uroot -p"$MYSQL_ROOT_PASSWORD"' < "./database-dump/mysqlsampledatabase.sql"

I will design a star schema so that the OLTP data shown above can be exported to an OLAP model.
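As a rough illustration of what the OLTP-to-OLAP mapping could look like, the sketch below splits one denormalized order line into three dimension rows and one fact row. The column names, the `to_star` function, and the choice of dimensions (customer, product, date) are all assumptions made for the example. The tables in the sample dump and the star schema actually used in this project may differ.

```python
from datetime import date
from decimal import Decimal


def to_star(order_line: dict) -> dict:
    """Split one denormalized OLTP order line into dimension rows and a fact row.

    All column names here are hypothetical and only illustrate the shape of a star schema.
    """
    order_date = order_line["orderDate"]
    dim_customer = {
        "customer_key": order_line["customerNumber"],
        "customer_name": order_line["customerName"],
        "country": order_line["country"],
    }
    dim_product = {
        "product_key": order_line["productCode"],
        "product_name": order_line["productName"],
    }
    dim_date = {
        # Integer key in YYYYMMDD form, a common convention for date dimensions.
        "date_key": int(order_date.strftime("%Y%m%d")),
        "year": order_date.year,
        "month": order_date.month,
        "day": order_date.day,
    }
    fact_sales = {
        "order_number": order_line["orderNumber"],
        "customer_key": dim_customer["customer_key"],
        "product_key": dim_product["product_key"],
        "date_key": dim_date["date_key"],
        "quantity": order_line["quantityOrdered"],
        # The measure is derived once at load time so that queries can simply sum it.
        "amount": order_line["quantityOrdered"] * order_line["priceEach"],
    }
    return {
        "dim_customer": dim_customer,
        "dim_product": dim_product,
        "dim_date": dim_date,
        "fact_sales": fact_sales,
    }
```

The fact row carries only keys and numeric measures, while descriptive attributes live in the dimension rows. That separation is what lets analytical queries aggregate the fact table and join to the dimensions only for filtering and grouping.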

  1. Redshift setup

  2. Kafka setup

  3. MySQL-Kafka-S3 project description

  4. S3-Redshift project description