
👷🌇 Set up and build a big data processing pipeline with Apache Spark, 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), Terraform to provision the infrastructure, and Airflow to automate workflows 🥊


👷 Spark-Processing-AWS

In this project, I set up and build a big data processing pipeline using Apache Spark integrated with several AWS services (S3, EMR, EC2, VPC, IAM, and Redshift), using Terraform to provision the infrastructure and Airflow to automate the workflows.
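The Spark stage of such a pipeline can be sketched as a job that reads raw CSV data from S3 and writes aggregated, partitioned Parquet back out. This is a minimal illustration, not the repository's actual job: the bucket names, dataset, and columns below are placeholders.

```python
from datetime import date

# Hypothetical S3 locations -- placeholders, not paths from this repository.
RAW_BUCKET = "my-raw-data-bucket"
CURATED_BUCKET = "my-curated-data-bucket"


def curated_path(bucket: str, run_date: date) -> str:
    """Build a date-partitioned S3 output path for a given run date."""
    return f"s3://{bucket}/sales/year={run_date.year}/month={run_date.month:02d}/"


def run_job(run_date: date) -> None:
    """Read raw CSV from S3, aggregate, and write Parquet.

    Assumes PySpark is available (e.g. on an EMR cluster), so the import
    is deferred until the job actually runs.
    """
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("sales-aggregation").getOrCreate()
    df = spark.read.option("header", True).csv(f"s3://{RAW_BUCKET}/sales/")
    totals = (
        df.withColumn("amount", F.col("amount").cast("double"))
          .groupBy("product_id")
          .agg(F.sum("amount").alias("total_amount"))
    )
    totals.write.mode("overwrite").parquet(curated_path(CURATED_BUCKET, run_date))
    spark.stop()
```

On EMR, a script like this would typically be uploaded to S3 and submitted as a step with `spark-submit`, so that reads and writes go straight between the cluster and S3.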

🔦 About Project

📦 Technologies

  • S3
  • EMR
  • EC2
  • Airflow
  • Redshift
  • Terraform
  • Spark
  • VPC
  • IAM
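On the infrastructure side, Terraform can declare the AWS resources the pipeline needs. The fragment below is only a sketch of that idea (region, resource names, and bucket names are made up, and a real setup would also cover EMR, VPC, IAM, and Redshift):

```hcl
# Hypothetical Terraform fragment -- names and region are placeholders.
provider "aws" {
  region = "us-east-1"
}

# Landing bucket for raw input data.
resource "aws_s3_bucket" "raw_data" {
  bucket = "my-raw-data-bucket"
}

# Output bucket for curated Parquet data.
resource "aws_s3_bucket" "curated_data" {
  bucket = "my-curated-data-bucket"
}
```

Running `terraform init` and `terraform apply` against a configuration like this creates the buckets, and the same approach extends to the EMR cluster, networking, and IAM roles.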

🦄 Features

👩🏽‍🍳 The Process
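A process like this is typically automated with an Airflow DAG that triggers the Spark job on a schedule. The sketch below assumes Apache Airflow 2.x is installed; the DAG id, schedule, and `spark-submit` command are illustrative assumptions, not taken from this repository.

```python
from datetime import datetime

# Hypothetical identifiers -- placeholders, not taken from this repository.
DAG_ID = "spark_processing_pipeline"
SPARK_SUBMIT = (
    "spark-submit --deploy-mode cluster "
    "s3://my-code-bucket/jobs/sales_job.py"
)


def build_dag():
    """Construct the DAG (requires Apache Airflow, so imports are deferred)."""
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id=DAG_ID,
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        BashOperator(task_id="run_spark_job", bash_command=SPARK_SUBMIT)
    return dag

# In a real Airflow deployment, the DAG file would call build_dag() at module
# scope, since the scheduler discovers module-level DAG objects.
```

In practice a production DAG would add steps around this one, for example creating and terminating the EMR cluster and loading the results into Redshift.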

📚 What I Learned

💭 How can it be improved?

🚦 Running the Project
