This is a learn-as-you-go project aimed at learning data engineering technologies and practices. The overarching aim is to tap into a source - the Ethereum blockchain network - and deliver the data to a consumer - Tableau or an equivalent front-end. The project is meant to be expansive: deploying the most optimal solution is not necessarily the top priority, but learning how to do so definitely is. The project timeline will proceed in phases, and the project structure will remain fluid.
Ethereum Blockchain > Web3.eth.py > MongoDB / Hadoop > Kafka > Spark / Flink > Hadoop / Doris > Tableau / Plotly Dash / ML-Pytorch-Streamlit
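As a sketch of the first hop (Ethereum Blockchain > Web3.eth.py > MongoDB), the snippet below flattens a web3 block into a Mongo-friendly document. The node endpoint URL, the `flatten_block` helper, and the `ethereum.blocks` collection are illustrative assumptions, not part of the project yet:

```python
def flatten_block(block) -> dict:
    """Reduce a web3 block (an AttributeDict) to a flat, Mongo-friendly document."""
    block_hash = block["hash"]
    return {
        "_id": block["number"],  # block number doubles as the document key
        "hash": block_hash.hex() if hasattr(block_hash, "hex") else block_hash,
        "timestamp": block["timestamp"],
        "tx_count": len(block["transactions"]),
        "gas_used": block["gasUsed"],
    }

if __name__ == "__main__":
    # Assumes the web3 package and a reachable node; replace the URL with your provider.
    from web3 import Web3

    w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))
    doc = flatten_block(w3.eth.get_block("latest"))
    # Hypothetical MongoDB sink (assumes pymongo and a local MongoDB):
    # from pymongo import MongoClient
    # MongoClient()["ethereum"]["blocks"].replace_one({"_id": doc["_id"]}, doc, upsert=True)
    print(doc)
```

Keeping the flattening step as a pure function makes it testable without a node or a database, which helps in the early phases.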
Setup of goals, basic project structure, timeline, virtual environments and version control.
Get the entire pipeline running, in whatever form, as quickly as possible.
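For a quick end-to-end run, each stage can start as a thin stub. A minimal sketch of the Kafka hop, assuming the kafka-python package and a local broker; the `eth-blocks` topic name is an assumption:

```python
import json


def to_kafka_message(doc: dict) -> bytes:
    """Serialize a block document to the JSON bytes a Kafka producer sends."""
    return json.dumps(doc, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    # Assumes kafka-python and a broker on localhost:9092.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("eth-blocks", to_kafka_message({"_id": 1, "gas_used": 21000}))
    producer.flush()
```

Sorting the keys keeps the serialized form deterministic, which simplifies later deduplication and testing.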
Begin refactoring towards OOP and modular programming. Build infrastructure for health checks, security, exception handling and logging where not already implemented.
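One reusable piece of that exception-and-logging infrastructure could be a retry decorator that logs each failure of a flaky pipeline step before re-raising; this is a minimal sketch, with the `with_retries` name and defaults chosen for illustration:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")


def with_retries(attempts: int = 3, delay: float = 1.0):
    """Retry a flaky pipeline step, logging each failure before giving up."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    log.warning("%s failed (attempt %d/%d): %s",
                                fn.__name__, attempt, attempts, exc)
                    if attempt == attempts:
                        raise  # out of retries: surface the original error
                    time.sleep(delay)
        return wrapper
    return decorator
```

A step such as a node fetch or a Mongo write can then be wrapped with `@with_retries(attempts=5, delay=2.0)` without changing its body.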
Assessment of 'Data Product' requirements; design and planning of the data transformations required at each point of the pipeline to deliver the required 'Data Product' to the consumer.
Exploration of other data sources and of new data products built from them, e.g. X/TikTok sentiment data sources feeding a sentiment-analysis front-end.
Patches and red-teaming of the data pipeline's integrity. Cost-benefit analysis of version upgrades.