Interactive and Reactive Data Science using Scala and Spark.
Updated May 16, 2023 - JavaScript
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Use SQL to build ELT pipelines on a data lakehouse.
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
Shale-Reservoir-DNN and Drilling-Rare-Events-Graph
Website for AstroLab
This project uses Apache Spark to explore the popular New York City Current Job Postings Kaggle dataset.
This project is an interactive Flask web application that displays a map of New York City and lets users query it, along with a recommendation algorithm that matches suppliers to restaurants. The application uses a combination of Python, HTML, CSS, and JavaScript. The data is stored using Apache Spark and MongoDB.
This project showcases the use of NoSQL technologies and the Elastic Stack for comprehensive data processing and visualization.
System for stock prediction, analysis and investment.
λtrace - Performance optimization tool for AWS Lambda functions
Airavat is a metric interceptor and a job watchdog for Spark applications. It also features an interactive UI that shows all running Spark applications, jobs, and SQL queries along with their metrics.
Online blog website
Alternative UI for Spark Web UI that scrapes the Spark Web APIs
Scraped and analyzed Twitter data using Spark. Ran Spark queries on millions of tweets and trained models for sentiment analysis.
Created by Matei Zaharia
Released May 26, 2014