This project builds a scalable and robust data warehouse tech stack that supports an AI service for a client. The data used in this project is sensor data in CSV format; in Data (ucdavis.edu) you can find the same sensor data in parquet and/or CSV form. The ELT pipeline was developed by migrating an existing pipeline built with MySQL, DBT, and Airflow (for task orchestration): the MySQL data warehouse was replaced with Postgres, and the Redash dashboard with Superset.
- Project: Migrate to Postgres and Superset from MySQL and Redash
- Table of Contents
  - Project Structure
  - ELT Pipeline
  - License
airflow_postgres
|__dags
|___postgres
|____create_station_Summary.sql # exported queries
|____insert_station_summary.sql # exported queries
|___src
|____mysql_converter.py # script that converts MySQL queries into Postgres syntax
|____redash_export.py # exports Redash queries
|___migrate.py # ELT pipeline (DAG) for migrating data to Postgres
|___redash_dag.py # ELT pipeline (DAG) for exporting queries to Superset
DBT
|___models
|____merged_station.sql # SQL model for transforming and merging tables
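The `src/mysql_converter.py` script rewrites MySQL queries into Postgres syntax. Its actual rules are not shown in this README, so the sketch below is a hypothetical illustration of the kind of translation involved (backtick identifiers, `AUTO_INCREMENT`, and a few type names); it is not the real script:

```python
import re

# Assumed rewrite rules: a small, non-exhaustive sample of MySQL-isms.
REWRITES = [
    (re.compile(r"`([^`]+)`"), r'"\1"'),  # backtick identifiers -> double quotes
    (re.compile(r"\bAUTO_INCREMENT\b", re.IGNORECASE), "GENERATED ALWAYS AS IDENTITY"),
    (re.compile(r"\bDATETIME\b", re.IGNORECASE), "TIMESTAMP"),
    (re.compile(r"\bTINYINT\(1\)", re.IGNORECASE), "BOOLEAN"),
]

def mysql_to_postgres(sql: str) -> str:
    """Apply each rewrite rule in turn to a MySQL statement."""
    for pattern, replacement in REWRITES:
        sql = pattern.sub(replacement, sql)
    return sql
```

A real converter would need a proper SQL parser for edge cases (string literals containing backticks, vendor-specific functions), but a rule table like this covers the common DDL differences.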
ELT pipeline builder
Get table names
- Reads the table names from the MySQL database
Migrate
- Migrates SQL statements and data from MySQL to Postgres
Extract and export
- Extracts Redash queries and exports them to Superset
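The first two steps above can be sketched with pandas and SQLAlchemy. This is a hedged sketch, not the actual `migrate.py`: the function names and connection URLs are assumptions, and the real DAG also handles the Redash-to-Superset export:

```python
import pandas as pd
from sqlalchemy import create_engine, inspect

def get_table_names(engine):
    """Step 1: read the table names from the source (MySQL) database."""
    return inspect(engine).get_table_names()

def migrate(table, source, target):
    """Step 2: copy one table from the source warehouse to the target."""
    df = pd.read_sql_table(table, source)
    df.to_sql(table, target, if_exists="replace", index=False)

def run_migration(mysql_url, postgres_url):
    """Run the MySQL -> Postgres migration; both URLs are hypothetical."""
    source = create_engine(mysql_url)      # e.g. "mysql+pymysql://user:pass@host/db"
    target = create_engine(postgres_url)   # e.g. "postgresql+psycopg2://user:pass@host/db"
    for table in get_table_names(source):
        migrate(table, source, target)
```

Because SQLAlchemy URLs abstract the backend, the same two functions work against any pair of databases, which is what makes the MySQL-to-Postgres swap a configuration change rather than a rewrite.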