In this tutorial, you learn how to deploy an Apache Spark streaming application on Cloud Dataproc and process messages from Cloud Pub/Sub in near real time. The system you build generates thousands of random tweets, identifies trending hashtags over a sliding window, saves the results in Cloud Datastore, and displays them on a web page.

Please refer to the related article for all the steps to follow in this tutorial: [INSERT LINK WHEN PUBLISHED]

Contents of this repository:

- `http_function`: JavaScript code for the HTTP function deployed on Cloud Functions.
- `spark`: Scala code for the Apache Spark streaming application.
- `tweet-generator`: Python code for the randomized tweet generator.
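
To give a feel for the "trending hashtags over a sliding window" step mentioned in the introduction, here is a minimal sketch in plain Python. It is not the code in the `spark` directory and does not use Spark at all; the class name `SlidingHashtagCounter`, its methods, and the 60-second window are hypothetical and chosen only for illustration. It also assumes that tweets arrive with non-decreasing timestamps.

```python
import re
from collections import Counter, deque

# Hypothetical pattern: a hashtag is '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


class SlidingHashtagCounter:
    """Counts hashtags seen during the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, hashtag) pairs, oldest first
        self.counts = Counter()  # hashtag -> occurrences inside the window

    def add_tweet(self, timestamp, text):
        """Record every hashtag in a tweet that arrived at `timestamp`."""
        for tag in HASHTAG_PATTERN.findall(text.lower()):
            self.events.append((timestamp, tag))
            self.counts[tag] += 1

    def _evict(self, now):
        """Drop events at or before `now - window_seconds`."""
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]

    def top(self, now, n=10):
        """Return the `n` most frequent hashtags in the window ending at `now`."""
        self._evict(now)
        return self.counts.most_common(n)


counter = SlidingHashtagCounter(window_seconds=60)
counter.add_tweet(0, "Learning #Spark on #Dataproc")
counter.add_tweet(30, "More #spark streaming")
counter.add_tweet(70, "#pubsub feeds #spark")
trending = counter.top(now=70, n=2)
print(trending)  # [('spark', 2), ('pubsub', 1)]
```

By the time of the query at second 70, the tweet from second 0 has slid out of the 60-second window, so `#dataproc` no longer appears and `#spark` is counted twice rather than three times. A streaming engine such as Spark applies the same idea across a cluster, which is why the tutorial uses it rather than a single-process counter like this one.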
