Pinned
-
data_pipeline
Public repository for pulling data from different sources using Apache Spark, written in Scala and Python. Apache Airflow schedules the tasks on Google Cloud Dataproc.
Python
-
elt-pipeline
Public repository for an ELT pipeline. Data is moved from a PostgreSQL database to Amazon S3, and from S3 to Amazon Redshift.
Python
-
realtime-pipeline
Public repository for a real-time analytics pipeline using ClickHouse, Kafka, Spark, and Superset.
Python