Commit
1 parent 1c04c1d, commit a6c7904
Showing 34 changed files with 52 additions and 55 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,15 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
title: "Kafka Topics and Operations" | ||
summary: "This article is about how to operate on Kafka topics, their management, and configure important parameters" | ||
description: "This article is about how to operate on Kafka topics, their management, and configure important parameters" | ||
categories: ["Kafka","Data Engineering","Deployment"] | ||
tags: ["tutorial", "kafka", "topics", "configuration", "kubernetes", "docker"] | ||
date: 2024-06-03 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
# Kafka Topics and Operations | ||
|
||
This article is about how to operate on Kafka topics, their management, and configure important parameters |
Binary file modified: content/posts/20240604-kafka-python-operations/background.png (BIN, -309 KB, 21%)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Kafka Python Operations" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-04 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Kafka Python Operations | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,15 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Deploy Spark Cluster" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
categories: ["Docker","Spark","Data Engineering"] | ||
tags: ["tutorial", "spark", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-05 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
# Deploy Spark Cluster | ||
|
||
In this article, we will be deploying Spark Cluster on local, docker env, and Kubernetes |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Use PySpark for Data Clean up" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-06 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Use PySpark for Data Clean up | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
In this article, we will be cleaning up a dirty data by using PySpark | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Spark DataFrame Operations" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-07 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Spark DataFrame Operations | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
In this article, we will be practicing Spark DataFrame operations | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Submitting Spark Application" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-08 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Submitting Spark Application | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
In this article, we will be submitting Spark application to the Spark cluster we previously deployed | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Optimizing Spark Applications" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-09 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Optimizing Spark Applications | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,15 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Processing Complex Nested JSON File with Spark" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-10 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
# Processing Complex Nested JSON File with Spark | ||
|
||
In this article, we will be processing complex nested JSON file with Apache Spark |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Spark Streaming Hands On from/to Kafka" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-11 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Spark Streaming Hands On from/to Kafka | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
In this article, we will be developing a Spark Streaming application which will read data from Kafka, process, and write back to Kafka | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Airflow Introduction Pipeline" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-12 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Airflow Introduction Pipeline | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
In this article, we will be deploying Apache Airflow, and create a sample pipeline which fetches data from a webserver and write into MinIO bucket. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Elasticsearch Indexing and Kibana Dashboard with PySpark" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-13 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Elasticsearch Indexing and Kibana Dashboard with PySpark | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
In this article, we will be sinking data to ElasticSearch by PySpark and create a dashboard on Kibana | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,16 @@ | ||
--- | ||
title: "Hive Setup and Operations" | ||
title: "Change Data Capture (CDC) Pipeline Implementation" | ||
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources" | ||
categories: ["Docker","Hadoop","Data Engineering"] | ||
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"] | ||
date: 2024-06-01 | ||
date: 2024-06-14 | ||
draft: false | ||
showauthor: false | ||
authors: | ||
- nunocoracao | ||
--- | ||
# Hive Deployment and Operations | ||
# Change Data Capture (CDC) Pipeline Implementation | ||
|
||
In this article, we will be deploying Hive services on Hadoop cluster | ||
In this article, we will be implementing a pipeline with PostgreSQL, Debezium CDC, Kafka, MinIO and the Spark. | ||
|