big-data
Here are 253 public repositories matching this topic...
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Updated Jul 7, 2024 - Scala
High performance data store solution
Updated Jul 6, 2024 - Scala
Low-code tool for automating actions on real-time data | Stream processing for the users.
Updated Jul 6, 2024 - Scala
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
Updated Jul 5, 2024 - Scala
Simple and Distributed Machine Learning
Updated Jul 4, 2024 - Scala
An open protocol for secure data sharing
Updated Jul 3, 2024 - Scala
Resilient data pipeline framework running on Apache Spark
Updated Jun 27, 2024 - Scala
Sparkling Water provides H2O functionality inside a Spark cluster
Updated Jun 26, 2024 - Scala
A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
Updated Jun 14, 2024 - Scala
Introduction to Spark batch processing.
Updated May 27, 2024 - Scala
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Updated Jul 6, 2024 - Scala
A Spark interface for io.scif and other libraries to take advantage of the ImageJ2 ecosystem.
Updated Apr 26, 2024 - Scala