
matbragan/data-lake-solution

data-lake-solution

Architecture

Infra

All the resources needed to create a Kubernetes cluster, as well as the platform for the data environment.

For environment deployment:

  1. Kubernetes
  2. Platform

App

An application that generates JSON or Parquet files and places them in the landing zone folder of the Data Lake, in this case backed by MinIO (S3-compatible storage).

  1. Data Gen DataStores
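As a rough illustration of what such a generator does, the sketch below writes synthetic records as newline-delimited JSON into a landing-zone directory. It is not the repository's code: the record schema, the `orders` dataset name, and the functions `generate_records` and `write_to_landing` are all hypothetical, and a local directory stands in for the MinIO bucket.

```python
import json
import random
import uuid
from datetime import datetime, timezone
from pathlib import Path


def generate_records(count, seed=None):
    """Build a list of synthetic order records (hypothetical schema)."""
    rng = random.Random(seed)
    return [
        {
            "order_id": str(uuid.UUID(int=rng.getrandbits(128))),
            "customer_id": rng.randint(1, 1000),
            "amount": round(rng.uniform(5.0, 500.0), 2),
            "created_at": datetime.now(timezone.utc).isoformat(),
        }
        for _ in range(count)
    ]


def write_to_landing(records, landing_dir, dataset="orders"):
    """Write records as newline-delimited JSON under <landing_dir>/<dataset>/.

    Assumption: a local directory stands in for the landing zone bucket.
    """
    target_dir = Path(landing_dir) / dataset
    target_dir.mkdir(parents=True, exist_ok=True)
    # A timestamped file name keeps successive batches from overwriting each other.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target = target_dir / f"{dataset}_{stamp}.json"
    with target.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return target
```

Calling `write_to_landing(generate_records(100), "landing")` would produce one file under `landing/orders/`. In a real deployment the file would presumably be uploaded to a MinIO bucket through an S3-compatible client, and Parquet output would need an additional library such as pyarrow.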

Data

A data pipeline built with Trino, dbt-Core and Apache Airflow, forming a complete end-to-end data environment.

To build the data environment:

  1. Trino
  2. dbt-Core
  3. Airflow
  4. DAG
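To make the idea of a DAG concrete, the sketch below models a pipeline as tasks with dependencies and derives a valid execution order using Python's standard `graphlib` module. The task names and their dependencies are assumptions chosen for illustration, not the repository's actual DAG, and no Airflow API is involved.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
PIPELINE = {
    "generate_files": set(),
    "expose_landing_zone_in_trino": {"generate_files"},
    "run_dbt_models": {"expose_landing_zone_in_trino"},
    "run_dbt_tests": {"run_dbt_models"},
}


def execution_order(dependencies):
    """Return the tasks in an order where every dependency runs first."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(PIPELINE), start=1):
        print(step, task)
```

An orchestrator such as Airflow resolves task dependencies in a similar spirit, while also handling scheduling, retries and monitoring.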

About

A solution to create a Data Lake
