Streaming pipeline using AWS MSK and AWS EMR with Spark, retrieving the data from Twitter Streams API

escobarana/twitter_msk_emr

Real-time Twitter Analytics for Eurovision 2023

A streaming pipeline that uses Amazon MSK and Amazon EMR with Spark to process data retrieved from the Twitter Streams API.

Important Notes (the pipeline assumes the following are in place):

  • Amazon S3 bucket (holds Spark resources and output);
  • Amazon MSK cluster (using IAM Access Control);
  • A container on Amazon EKS, or an EC2 instance, with the Kafka APIs installed and able to connect to Amazon MSK;
  • Connectivity between the Amazon EKS cluster or EC2 instance and the Amazon MSK cluster;
  • The Amazon MSK configuration must set auto.create.topics.enable=true; this setting is false by default.
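
The notes above can be sketched as concrete settings. This is a minimal illustration, not the repository's actual code: the names `MSK_SERVER_PROPERTIES`, `MSK_IAM_CLIENT_SETTINGS`, `spark_kafka_options`, and `client_properties`, as well as the broker address and the `tweets` topic, are hypothetical. The client settings assume the `aws-msk-iam-auth` library is used for IAM Access Control.

```python
"""Sketch of the settings implied by the notes above (all names are hypothetical)."""

# Broker-side property for a custom Amazon MSK configuration.
MSK_SERVER_PROPERTIES = "auto.create.topics.enable=true\n"

# Kafka client settings for IAM Access Control,
# assuming the aws-msk-iam-auth library is on the classpath.
MSK_IAM_CLIENT_SETTINGS = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
    "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
    "sasl.client.callback.handler.class":
        "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
}


def spark_kafka_options(bootstrap_servers: str, topic: str) -> dict:
    """Build options for Spark's Kafka source.

    Spark passes Kafka client settings through when they carry a 'kafka.' prefix.
    """
    options = {f"kafka.{key}": value for key, value in MSK_IAM_CLIENT_SETTINGS.items()}
    options["kafka.bootstrap.servers"] = bootstrap_servers
    options["subscribe"] = topic
    return options


def client_properties() -> str:
    """Render the same settings as a client.properties file for the Kafka CLI tools."""
    return "".join(f"{key}={value}\n" for key, value in MSK_IAM_CLIENT_SETTINGS.items())


if __name__ == "__main__":
    # Placeholder broker address and topic; replace with real values.
    opts = spark_kafka_options("b-1.example.kafka.us-east-1.amazonaws.com:9098", "tweets")
    for name, value in sorted(opts.items()):
        print(f"{name} = {value}")
```

In a PySpark job on Amazon EMR, a dictionary like this could be passed as `spark.readStream.format("kafka").options(**opts).load()`, assuming the Spark Kafka connector and the `aws-msk-iam-auth` JAR are available to the job. The output of `client_properties()` could likewise be saved as a `client.properties` file on the EKS container or EC2 instance that runs the Kafka command-line tools.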
