apache-hadoop-framework

Here are 5 public repositories matching this topic...

divithraju / divith-aju-Hadoop-Pyspark-pipeline

This project demonstrates the creation of a scalable data processing pipeline for handling and analyzing log data from a hypothetical e-commerce platform. Leveraging Hadoop and PySpark, the pipeline is designed to process large volumes of log files, providing meaningful insights into user behavior, system performance, and sales metrics.

client documentation data database apache-spark pipeline bigdata project python3 pyspark hdfs software-engineering ecommerce-platform dataengineering datapreprocessing apache-hadoop-framework project-repository dataingestionframework

Updated Aug 17, 2024
Python

sauravgore / Predictive-Statistics-of-Imbalanced-Rainfall-Data

Star

Preprocessing imbalanced rainfall data using Apache Hadoop framework and give predictive statistics of rainfall data using R

analysis estimation forecast rainfall india preprocessing predictive-analytics agricultural rainfall-data apache-hadoop-framework rstudio-analysis apache-hadoop-preprocessing

Updated Oct 16, 2018
R

DuarteDomingues / Large-Scale-Data-Computation-Word-Count-project

Star

Large-scale data computing word count project

word-count apache-hadoop-framework big-data-processing data-computing

Updated Feb 21, 2023
Java

13caroline / imdb-datasets

Star

Managing large data sets projects (Data Science)

java dockerfile map-reduce avro-schema parquet-files spark-sql hive-metastore imdb-dataset apache-hadoop-framework

Updated Jun 29, 2021
Java

bayudwiyansatria / library-java-apache-hadoop

Star

Apache Hadoop. Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Originally designed for co…

library libraries java-library apache-hadoop apache-hadoop-framework bayudwiyansatria apache-hadoop-library

Updated Oct 7, 2021
Java

Improve this page

Add a description, image, and links to the apache-hadoop-framework topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the apache-hadoop-framework topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

apache-hadoop-framework

Here are 5 public repositories matching this topic...

divithraju / divith-aju-Hadoop-Pyspark-pipeline

sauravgore / Predictive-Statistics-of-Imbalanced-Rainfall-Data

DuarteDomingues / Large-Scale-Data-Computation-Word-Count-project

13caroline / imdb-datasets

bayudwiyansatria / library-java-apache-hadoop

Improve this page

Add this topic to your repo