It simply writes data to Kafka through Spark and reads it back.
This repository is partially based on the following tutorials:
To run this code, you will need:
- sbt
- Java 8+
- Docker
- on Windows:
  - set up winutils.exe and hadoop.dll, as described here.
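The repository's own docker-compose.yml defines the zookeeper and kafka services used below. Purely for orientation, a minimal sketch of a compatible compose file might look like this; the image tags and listener settings are assumptions, not the repository's actual file:

```yaml
# Sketch only: images, tags and listener settings are assumptions chosen to
# match the service names and addresses used in the commands below.
version: "2"
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.5.1
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:5.5.1
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
```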
docker-compose up -d # start Kafka and ZooKeeper
docker-compose ps # check that the expected services are running
docker-compose logs zookeeper | grep -i binding # check the ZooKeeper logs
docker-compose logs kafka | grep -i started # check the Kafka logs
# create a new topic
docker-compose exec kafka kafka-topics --create --topic meu-topico-legal --partitions 1 --replication-factor 1 --if-not-exists --zookeeper zookeeper:2181
# check that the topic exists
docker-compose exec kafka kafka-topics --describe --topic meu-topico-legal --zookeeper zookeeper:2181
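The topic can also be created programmatically instead of through the CLI. Below is a minimal sketch using Kafka's AdminClient; it assumes org.apache.kafka:kafka-clients is on the classpath and that a broker is reachable from the host at localhost:9092 (adjust to your compose setup):

```scala
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig, NewTopic}

object CreateTopic extends App {
  // Assumption: a broker reachable from this process at localhost:9092.
  val props = new Properties()
  props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")

  val admin = AdminClient.create(props)
  try {
    // Same topic, partition count and replication factor as the CLI command above.
    // Unlike the CLI's --if-not-exists, this fails if the topic already exists.
    val topic = new NewTopic("meu-topico-legal", 1, 1.toShort)
    admin.createTopics(Collections.singleton(topic)).all().get()
    println(s"Topics now present: ${admin.listTopics().names().get()}")
  } finally admin.close()
}
```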
# produce 100 messages
docker-compose exec kafka bash -c "seq 100 | kafka-console-producer --request-required-acks 1 --broker-list kafka:9092 --topic meu-topico-legal && echo 'Produced 100 messages.'"
# consume the 100 messages from the beginning of the topic
docker-compose exec kafka kafka-console-consumer --bootstrap-server kafka:9092 --topic meu-topico-legal --from-beginning --max-messages 100
# run the Spark application that writes to Kafka and reads it back
sbt run
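The actual entry point lives in this repository's sources; as a rough sketch of the write-then-read round trip described above, a batch job using Spark's built-in Kafka source and sink (the spark-sql-kafka-0-10 module) could look like this. The object name and option values here are illustrative assumptions, not the repository's code:

```scala
import org.apache.spark.sql.SparkSession

object KafkaRoundTrip extends App {
  val spark = SparkSession.builder()
    .appName("spark-kafka-example")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  // Write 100 rows to the topic; the Kafka sink expects a "value" column.
  // Assumption: the broker is reachable from the host at localhost:9092.
  (1 to 100).map(_.toString).toDF("value")
    .write
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("topic", "meu-topico-legal")
    .save()

  // Read everything back as a batch and print a sample.
  spark.read
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "meu-topico-legal")
    .option("startingOffsets", "earliest")
    .load()
    .selectExpr("CAST(value AS STRING)")
    .show(10, truncate = false)

  spark.stop()
}
```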