The purpose of this repository is to give Data Engineers the chance to complete an end-to-end Data Engineering project. Full instructions will be given on the desired architecture and the steps to take to complete each project.
The expectation for these projects is that you will do everything yourself: Bash scripts, Dockerfiles, READMEs, code, and so on.
Nothing is done for you, which forces you not to rely on others or skip
things you might not be familiar with. Growth comes with struggle.
Similar to how work is handed down on a Data Team, some of the instructions will be specific, some will be ambiguous, and the solution is generally up to you.
These projects will test a Data Engineer's abilities across multiple techs and concepts, including but not limited to:
- Docker
- Bash
- Python
- Airflow
- Async
- Data Modeling
- Postgres
- Delta Lake
- PySpark
- Parquet/CSV
- BytesIO
- Lazy Evaluation
- SQL
- Analytics
- Dashboards
- AWS Cloud
Good Data Engineers are well-rounded: they can work across multiple techs and concepts, interpret both clear and unclear directions, and design architecture to support the requirements.
In this first Data Engineering project, the goal is to set up a Data Platform
on which you can visually build a data pipeline that downloads
raw TSV data, processes it, deposits the results into
a Lake House, and then displays a Dashboard of the results.
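As a rough illustration of the shape such a pipeline might take, here is a minimal sketch using Airflow's TaskFlow API (Airflow 2.x) with PySpark and Delta Lake. The source URL, file paths, and task names are hypothetical placeholders, not part of the project spec; your own architecture, tooling choices, and error handling are up to you.

```python
# Minimal sketch of a TSV -> Lake House pipeline; all URLs and paths are placeholders.
import pendulum
import requests
from airflow.decorators import dag, task


@dag(schedule=None, start_date=pendulum.datetime(2024, 1, 1, tz="UTC"), catchup=False)
def tsv_to_lakehouse():
    @task
    def download_tsv() -> str:
        # Pull the raw TSV file and stage it on local disk.
        url = "https://example.com/data.tsv"  # hypothetical source
        resp = requests.get(url, timeout=60)
        resp.raise_for_status()
        path = "/tmp/raw_data.tsv"
        with open(path, "wb") as f:
            f.write(resp.content)
        return path

    @task
    def process_and_load(path: str) -> None:
        # Read the TSV with PySpark and write it to a Delta table.
        # Assumes the delta-spark package is installed.
        from delta import configure_spark_with_delta_pip
        from pyspark.sql import SparkSession

        builder = (
            SparkSession.builder.appName("tsv_to_lakehouse")
            .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
            .config(
                "spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog",
            )
        )
        spark = configure_spark_with_delta_pip(builder).getOrCreate()
        df = spark.read.csv(path, sep="\t", header=True, inferSchema=True)
        df.write.format("delta").mode("overwrite").save("/tmp/lakehouse/raw")  # placeholder path

    process_and_load(download_tsv())


tsv_to_lakehouse()
```

Note how lazy evaluation shows up here: Spark does not actually read or transform anything until the `write` action triggers the job, which is one of the concepts from the list above.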
This project tests your ability to understand high-level requirements and turn them
into technical details without much guidance.
It also tests your ability to work across the entire Data Engineering stack, from `bash`
to `Python` and `Docker`, as well as various other tools.