gcp-dataproc

Here are 15 public repositories matching this topic...

E-commerce GCP streaming pipeline: Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery, and Tableau. GCP batch pipeline: Cloud Storage, Dataproc, PySpark, Cloud Spanner, and Tableau.

  • Updated Mar 9, 2022
  • Python

Leveraging NYC Open Data, this repository contains Databricks notebooks for analyzing motor vehicle collisions. We perform EDA, spatial clustering, and predictive modeling on collision, vehicle, and person datasets to understand accident trends and predict potential risks.

  • Updated Feb 5, 2025
  • Jupyter Notebook
