Templates of blogs prepared
nacisimsek committed Jun 8, 2024
1 parent 1c04c1d commit a6c7904
Showing 34 changed files with 52 additions and 55 deletions.
17 changes: 8 additions & 9 deletions content/posts/20240603-kafka-topics/index.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,15 @@
---
title: "Hive Setup and Operations"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
title: "Kafka Topics and Operations"
summary: "This article is about how to operate on Kafka topics, their management, and configure important parameters"
description: "This article is about how to operate on Kafka topics, their management, and configure important parameters"
categories: ["Kafka","Data Engineering","Deployment"]
tags: ["tutorial", "kafka", "topics", "configuration", "kubernetes", "docker"]
date: 2024-06-03
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations

In this article, we will be deploying Hive services on Hadoop cluster
# Kafka Topics and Operations

This article is about how to operate on Kafka topics, their management, and configure important parameters
Binary file modified content/posts/20240604-kafka-python-operations/background.png
Binary file modified content/posts/20240604-kafka-python-operations/featured.png
6 changes: 3 additions & 3 deletions content/posts/20240604-kafka-python-operations/index.md
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Kafka Python Operations"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-04
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Kafka Python Operations

In this article, we will be deploying Hive services on Hadoop cluster

Binary file modified content/posts/20240605-spark-deploy/background.png
Binary file modified content/posts/20240605-spark-deploy/featured.png
13 changes: 6 additions & 7 deletions content/posts/20240605-spark-deploy/index.md
@@ -1,16 +1,15 @@
---
title: "Hive Setup and Operations"
title: "Deploy Spark Cluster"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
categories: ["Docker","Spark","Data Engineering"]
tags: ["tutorial", "spark", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-05
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations

In this article, we will be deploying Hive services on Hadoop cluster
# Deploy Spark Cluster

In this article, we will be deploying Spark Cluster on local, docker env, and Kubernetes
Binary file modified content/posts/20240606-spark-cleanup-data/background.png
Binary file modified content/posts/20240606-spark-cleanup-data/featured.png
8 changes: 4 additions & 4 deletions content/posts/20240606-spark-cleanup-data/index.md
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Use PySpark for Data Clean up"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-06
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Use PySpark for Data Clean up

In this article, we will be deploying Hive services on Hadoop cluster
In this article, we will be cleaning up a dirty data by using PySpark

Binary file modified content/posts/20240607-spark-dataframe/background.png
Binary file modified content/posts/20240607-spark-dataframe/featured.png
8 changes: 4 additions & 4 deletions content/posts/20240607-spark-dataframe/index.md
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Spark DataFrame Operations"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-07
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Spark DataFrame Operations

In this article, we will be deploying Hive services on Hadoop cluster
In this article, we will be practicing Spark DataFrame operations

Binary file modified content/posts/20240608-spark-submit/background.png
Binary file modified content/posts/20240608-spark-submit/featured.png
8 changes: 4 additions & 4 deletions content/posts/20240608-spark-submit/index.md
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Submitting Spark Application"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-08
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Submitting Spark Application

In this article, we will be deploying Hive services on Hadoop cluster
In this article, we will be submitting Spark application to the Spark cluster we previously deployed

Binary file modified content/posts/20240609-spark-optimization/background.png
Binary file modified content/posts/20240609-spark-optimization/featured.png
6 changes: 3 additions & 3 deletions content/posts/20240609-spark-optimization/index.md
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Optimizing Spark Applications"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-09
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Optimizing Spark Applications

In this article, we will be deploying Hive services on Hadoop cluster

Binary file modified content/posts/20240610-spark-json-process/background.png
Binary file modified content/posts/20240610-spark-json-process/featured.png
9 changes: 4 additions & 5 deletions content/posts/20240610-spark-json-process/index.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,15 @@
---
title: "Hive Setup and Operations"
title: "Processing Complex Nested JSON File with Spark"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-10
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations

In this article, we will be deploying Hive services on Hadoop cluster
# Processing Complex Nested JSON File with Spark

In this article, we will be processing complex nested JSON file with Apache Spark
Binary file modified content/posts/20240611-spark-streaming/background.png
Binary file modified content/posts/20240611-spark-streaming/featured.png
8 changes: 4 additions & 4 deletions content/posts/20240611-spark-streaming/index.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Spark Streaming Hands On from/to Kafka"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-11
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Spark Streaming Hands On from/to Kafka

In this article, we will be deploying Hive services on Hadoop cluster
In this article, we will be developing a Spark Streaming application which will read data from Kafka, process, and write back to Kafka

Binary file modified content/posts/20240612-airflow-nginx-minio/background.png
Binary file modified content/posts/20240612-airflow-nginx-minio/featured.png
8 changes: 4 additions & 4 deletions content/posts/20240612-airflow-nginx-minio/index.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Airflow Introduction Pipeline"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-12
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Airflow Introduction Pipeline

In this article, we will be deploying Hive services on Hadoop cluster
In this article, we will be deploying Apache Airflow, and create a sample pipeline which fetches data from a webserver and write into MinIO bucket.

Binary file modified content/posts/20240613-elasticsearch-kibana/background.png
Binary file modified content/posts/20240613-elasticsearch-kibana/featured.png
8 changes: 4 additions & 4 deletions content/posts/20240613-elasticsearch-kibana/index.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Elasticsearch Indexing and Kibana Dashboard with PySpark"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-13
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Elasticsearch Indexing and Kibana Dashboard with PySpark

In this article, we will be deploying Hive services on Hadoop cluster
In this article, we will be sinking data to ElasticSearch by PySpark and create a dashboard on Kibana

Binary file modified content/posts/20240614-debezium-cdc-flink/background.png
Binary file modified content/posts/20240614-debezium-cdc-flink/featured.png
8 changes: 4 additions & 4 deletions content/posts/20240614-debezium-cdc-flink/index.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,16 @@
---
title: "Hive Setup and Operations"
title: "Change Data Capture (CDC) Pipeline Implementation"
summary: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
description: "This article is about how to deploy Hive services on Hadoop Cluster, which components it has, how the data is stored and managed in Hive, how the calculation is done via MapReduce, and how Yarn manage the resources"
categories: ["Docker","Hadoop","Data Engineering"]
tags: ["tutorial", "hdfs", "hive", "mapreduce", "postgres", "catalog"]
date: 2024-06-01
date: 2024-06-14
draft: false
showauthor: false
authors:
- nunocoracao
---
# Hive Deployment and Operations
# Change Data Capture (CDC) Pipeline Implementation

In this article, we will be deploying Hive services on Hadoop cluster
In this article, we will be implementing a pipeline with PostgreSQL, Debezium CDC, Kafka, MinIO and the Spark.
