Trying out a best-case Apache Spark working environment for robust data pipelines
Updated Apr 1, 2023 - Python
This project analyzes questions on askubuntu.com to find the most commonly asked-about topics, in order to better understand which areas of Ubuntu may need more attention for bug fixing and which features might be good to add in future releases. To do this, I analyzed public data from askubuntu.com using Azure HDInsight…
Stock market machine learning
Demonstrating Spark Structured Streaming using the Twitter API, Apache Spark, and Apache Kafka.
Data lake project for the Sparkify music platform. Written with PySpark and run on an EMR cluster on AWS.
Analysing the data scientist growth rate using data from the Naukri website
PySpark streaming example with a Flask dashboard
Querying Snowflake from Spark in 4 different ways
Spark code used day to day for data manipulation, from ingestion through refinement.
Repository for the Alura course "Spark: apresentando a ferramenta".