spark-cluster

Here are 7 public repositories matching this topic...

minhky2185 / healthcare_data_pipeline

An end-to-end data pipeline that builds a data lake and supports reporting, using Apache Spark.

visualization mysql data big-data spark apache-spark analytics postgresql s3 data-engineering data-lake powerbi emr-cluster spark-cluster data-engineering-pipeline healthcare-data rds-mysql rds-postgres

Updated Jan 31, 2023
Python

vaibhavmagon / Spark-Python-MovieReviews

Script that finds similarities between movies in a MovieLens data set using Python and Spark clustering.

python spark movie dataset recommendation-system easy-to-use movielens-dataset spark-cluster

Updated Sep 30, 2020
Python

longNguyen010203 / Spark-Processing-AWS

👷🌇 Set up and build a big data processing pipeline with Apache Spark and 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and Airflow integration to automate workflows 🥊

aws apache-spark terraform aws-s3 iam pyspark cloud-computing aws-ec2 redshift data-pipeline aws-services apache-airflow emr-cluster spark-cluster spark-master spark-worker

Updated Jul 12, 2024
Python

karamolegkos / Diastema

This is my contribution to the Diastema project.

api kubernetes spark kubernetes-api openstack-heat spark-cluster spark-on-kubernetes microstack diastema openstack-heat-api

Updated Sep 2, 2022
Python

ayseirmak / DistributedFraudDetection

In this study, we propose using a distributed storage and computation system to track money transfers instantly. In particular, we keep the transaction history in a distributed file system as a graph data structure, and we try to detect illegal activities by using Graph Neural Networks (GNNs) in a distributed manner.

apache-spark python3 keras-tensorflow graph-convolutional-networks spark-cluster bigdl-orca

Updated Jan 30, 2024
Python

Turnipdo / Spark-Standalone-Cluster-Setup

To facilitate the initial setup of Apache Spark, this repository provides a beginner-friendly, step-by-step guide on setting up a master node and two worker nodes.

python spark spark-cluster

Updated Jun 10, 2024
Python

euiyounghwang / spark_job_interface_service

spark_job_interface_service

spark spark-jobs spark-cluster fastapi

Updated Oct 21, 2024
Python
