jbgithub22/ethereum_alchemy_pipeline

Ethereum Transaction Data Pipeline Project

Introduction

This is a learn-as-you-go project for learning data engineering technologies and practices. The overarching aim is to tap into a source, the Ethereum blockchain network, and deliver the data to a consumer, such as Tableau or an equivalent front end. The project is meant to be expansive: deploying the most optimal solution is not the top priority, but learning how to is. The project timeline is organized in phases, and the project structure is fluid.

Project Diagram

Ethereum Blockchain > Web3.eth.py > MongoDB / Hadoop > Kafka > Spark / Flink > Hadoop / Doris > Tableau / Plotly Dash / ML-Pytorch-Streamlit
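As a rough illustration of the first hop in the diagram (blockchain source to document store), here is a minimal sketch. It assumes a block arrives as a dictionary carrying a `transactions` list; the `flatten_block` helper, the output field names, and the sample values are all hypothetical and not taken from this repository. No network call is made.

```python
from typing import Any

WEI_PER_ETHER = 10**18  # 1 ether = 10^18 wei


def flatten_block(block: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one block (hypothetical shape) into flat, JSON-friendly records."""
    records = []
    for tx in block.get("transactions", []):
        records.append({
            "block_number": block["number"],
            "block_timestamp": block["timestamp"],
            "tx_hash": tx["hash"],
            "sender": tx["from"],
            # "to" is absent or None for contract-creation transactions
            "recipient": tx.get("to"),
            # Wei amounts can exceed a signed 64-bit integer, so keep a string copy
            "value_wei": str(tx["value"]),
            "value_ether": tx["value"] / WEI_PER_ETHER,
        })
    return records


# Made-up sample data, standing in for what a client library might return
sample_block = {
    "number": 1,
    "timestamp": 1700000000,
    "transactions": [
        {"hash": "0xaaa", "from": "0x111", "to": "0x222",
         "value": 1500000000000000000},
        {"hash": "0xbbb", "from": "0x333", "to": None, "value": 0},
    ],
}

records = flatten_block(sample_block)
for record in records:
    print(record)
```

Records flattened this way could be inserted into a document store or published to a message queue in later stages; which of those comes first is a design choice the diagram leaves open.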

Timeline & Phases

Phase 1 - Project Setup

Setup of goals, basic project structure, timeline, virtual environments and version control.

Phase 2 - Duct Tape Phase (Current Phase)

Get the entire pipeline running in whatever form as quickly as possible.

Phase 3 - Stabilizing Phase

Start refactoring toward OOP and modular programming. Build infrastructure for health checks, cybersecurity, exception handling and logging, if not already implemented.
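The exception handling and logging mentioned for this phase could start from something like the sketch below. It uses only the Python standard library; the `run_stage` wrapper and the `parse_value` example stage are hypothetical names introduced for illustration, not part of the project.

```python
import logging
from typing import Any, Callable

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("pipeline")


def run_stage(name: str, func: Callable[[Any], Any], payload: Any) -> tuple[bool, Any]:
    """Run one pipeline stage, logging the outcome instead of crashing."""
    logger.info("stage %s started", name)
    try:
        result = func(payload)
    except Exception:
        # logger.exception records the message plus the full traceback
        logger.exception("stage %s failed", name)
        return False, None
    logger.info("stage %s finished", name)
    return True, result


def parse_value(raw: str) -> int:
    """Example stage: parse a decimal string into an integer."""
    return int(raw)


ok_result = run_stage("parse", parse_value, "42")
bad_result = run_stage("parse", parse_value, "not-a-number")
print(ok_result, bad_result)
```

Returning a success flag alongside the result lets a caller decide whether to retry, skip, or halt, which is one possible basis for the health checks named above.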

Phase 4 - Data Product Design Start

Assess the 'Data Product' requirements, then design and plan the data transformations required at each point of the pipeline to deliver that 'Data Product' to the consumer.

Phase 5 - Expansion of Data Product Offerings

Exploration of other data sources and of new data products built from them, e.g. X/TikTok sentiment data sources feeding a sentiment analysis front-end.

Phase 6 - Maintenance Phase

Patches and red-teaming of the data pipeline's integrity. Cost-benefit analysis of version upgrades.
