Phylake1337/uber-data-pipeline-end2end

Uber Data Analytics | Data Engineering GCP Project

Introduction

The goal of this project is to perform data analytics on Uber data using various tools and technologies, including GCP Storage, Python, Compute Instance, Mage Data Pipeline Tool, BigQuery, and Looker Studio.

Implementation Details

  1. Lay out the data pipeline architecture.

  2. Host the Uber data (CSV file) on staging storage (Google Cloud Storage).

  3. Set up a Google Compute Engine instance (VM) with Python and Mage to handle the ETL process.

  4. Model the data into fact and dimension tables (star schema).

  5. Write Python scripts on Mage to:
  • Extract the data from Google Cloud Storage.
  • Transform, filter, and split the data into multiple tables.
  • Load the data into the BigQuery schema.

  6. Create a new analytics table to feed the Looker Studio dashboard.

  7. Set up a Looker Studio dashboard to visualize the data in different charts.
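The transform step above, splitting the flat trip CSV into star-schema tables, can be sketched with pandas. This is a minimal illustration, not the repo's actual Mage blocks; the column names are assumed from the TLC data dictionary, and the GCS extract / BigQuery load are omitted:

```python
import pandas as pd

# Sample rows shaped like the TLC trip records (illustrative values only).
df = pd.DataFrame({
    "tpep_pickup_datetime": pd.to_datetime(["2016-03-01 00:00:00", "2016-03-01 00:05:00"]),
    "tpep_dropoff_datetime": pd.to_datetime(["2016-03-01 00:07:55", "2016-03-01 00:15:00"]),
    "passenger_count": [1, 2],
    "trip_distance": [2.5, 3.1],
    "payment_type": [1, 2],
    "fare_amount": [9.0, 12.5],
})

def transform(df: pd.DataFrame) -> dict:
    """Split the flat trip table into one fact table plus dimension tables."""
    df = df.drop_duplicates().reset_index(drop=True)
    df["trip_id"] = df.index  # surrogate key shared by fact and dimensions

    # Datetime dimension derived from the pickup/dropoff timestamps.
    datetime_dim = df[["trip_id", "tpep_pickup_datetime", "tpep_dropoff_datetime"]].copy()
    datetime_dim["pickup_hour"] = datetime_dim["tpep_pickup_datetime"].dt.hour
    datetime_dim["pickup_weekday"] = datetime_dim["tpep_pickup_datetime"].dt.weekday

    # Payment-type dimension with human-readable labels (TLC code mapping).
    payment_names = {1: "Credit card", 2: "Cash", 3: "No charge", 4: "Dispute"}
    payment_dim = df[["trip_id", "payment_type"]].copy()
    payment_dim["payment_type_name"] = payment_dim["payment_type"].map(payment_names)

    # Fact table keeps the measures plus the shared key.
    fact = df[["trip_id", "passenger_count", "trip_distance", "fare_amount"]]
    return {"fact_table": fact, "datetime_dim": datetime_dim, "payment_dim": payment_dim}

tables = transform(df)
```

In a Mage pipeline this function body would live in a `@transformer` block, with the GCS read and BigQuery write handled by separate loader/exporter blocks.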

Technology Used

  • Programming Language - Python

Google Cloud Platform

  1. Google Cloud Storage
  2. Compute Engine (VM instance)
  3. BigQuery
  4. Looker Studio

Dataset Used

TLC Trip Record Data. Yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

More info about the dataset can be found here:

  1. Website
  2. Data Dictionary
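Step 6 of the implementation (building an analytics table for the Looker Studio dashboard) amounts to joining the fact table back to its dimensions and aggregating. A minimal pandas sketch, with hypothetical table contents and the BigQuery I/O omitted:

```python
import pandas as pd

# Hypothetical fact and dimension tables as produced by the transform step.
fact_table = pd.DataFrame({
    "trip_id": [0, 1, 2],
    "fare_amount": [9.0, 12.5, 7.0],
    "trip_distance": [2.5, 3.1, 1.2],
})
payment_dim = pd.DataFrame({
    "trip_id": [0, 1, 2],
    "payment_type_name": ["Credit card", "Cash", "Credit card"],
})

# Analytics table: average fare and distance per payment type,
# ready to be written to BigQuery and read by Looker Studio.
analytics = (
    fact_table.merge(payment_dim, on="trip_id")
    .groupby("payment_type_name", as_index=False)
    .agg(avg_fare=("fare_amount", "mean"),
         avg_distance=("trip_distance", "mean"))
)
```

In the actual pipeline the same join/aggregate would typically be expressed as a SQL `CREATE TABLE ... AS SELECT` in BigQuery rather than in pandas.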
