Sample data lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino, and a Hive Metastore; useful for local testing.
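In such a stack, Trino exposes the Iceberg tables over plain SQL while MinIO holds the underlying Parquet files and the Hive Metastore tracks table metadata. Below is a minimal sketch of talking to that setup with the trino Python client. The coordinator address (localhost:8080), the catalog name (iceberg), the schema (default), and the table name are all assumptions about how the containers are wired up, not details confirmed by the repository.

```python
# Minimal sketch: create and query an Iceberg table through Trino.
# Assumes the Trino coordinator is published on localhost:8080 and an
# Iceberg catalog named "iceberg" is configured against the Hive
# Metastore -- both depend on the Docker compose configuration.
import trino

conn = trino.dbapi.connect(
    host="localhost",
    port=8080,
    user="demo",       # Trino accepts any user name when auth is disabled
    catalog="iceberg",
    schema="default",
)
cur = conn.cursor()

# Create an Iceberg table backed by Parquet files stored in MinIO.
cur.execute(
    "CREATE TABLE IF NOT EXISTS events ("
    "  id BIGINT, name VARCHAR, ts TIMESTAMP"
    ") WITH (format = 'PARQUET')"
)

cur.execute(
    "INSERT INTO events VALUES (1, 'login', TIMESTAMP '2023-09-02 12:00:00')"
)
cur.execute("SELECT count(*) FROM events")
print(cur.fetchall())  # e.g. [(1,)]
```

Because Iceberg keeps table metadata independent of the query engine, the same files in MinIO remain readable by Spark or Flink if one of those engines is added later.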
BigData Pipeline is a local testing environment for experimenting with storage solutions (relational databases, HDFS), query engines (Trino), schedulers (Airflow), and ETL/ELT tools (dbt). It supports MySQL, Hadoop, Hive, Kudu, and more.
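In an environment like this, Airflow typically orchestrates the extract and transform steps. The sketch below shows a minimal two-task DAG that stages data and then runs dbt; the DAG name, script paths, and project directory are hypothetical placeholders, not paths from the repository.

```python
# Minimal sketch of an Airflow 2.x DAG chaining an extract step and a
# dbt run. All file paths and the DAG id are illustrative assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="bigdata_pipeline_demo",   # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Stage raw data into the lake; the script path is a placeholder.
    extract = BashOperator(
        task_id="extract_to_hdfs",
        bash_command="python /opt/pipeline/extract.py",
    )

    # Transform staged data with dbt; the project dir is an assumption.
    transform = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/pipeline/dbt",
    )

    # Run the transform only after the extract succeeds.
    extract >> transform
```

Keeping the transform logic in dbt rather than in the DAG itself means the SQL models stay testable with `dbt test` outside of Airflow.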
Analysis, extraction, transformation, and infrastructure for our data.