JohnRTurner/riviandatagen

Rivian Datagen Demo

The application generates test data and publishes it to Kafka.

This project was built with Python 3 and Docker.

Loading the Data to Kafka

  1. Set up a Kafka server as needed; see Kafka Quick Setup.
  2. Obtain an application server; an AWS instance works.
  3. In AWS, add the application server to the inbound rules of the Kafka server's security group.
  4. Install Docker: sudo apt install docker.io nmon -y
  5. Add the user to the docker group: sudo usermod -a -G docker ubuntu
  6. Log out and back in so the user gains access to Docker.
  7. Clone the application code from GitHub: git clone https://github.com/JohnRTurner/riviandatagen.git
  8. Build the Docker image: docker build riviandatagen -t riviandatagen
  9. Run the image: docker run -d --name riviandatagen -e KAFKA_SERVER=localhost:29092 -e BATCH_SIZE=1000 -e KAFKA_TOPIC=test -e PROC_COUNT=8 -t riviandatagen
  10. View the logs: docker logs -f riviandatagen
  11. Proceed to loading the data; see SingleStore Setup.
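
| Option | Description |
|---|---|
| BATCH_SIZE | Batch size |
| KAFKA_TOPIC | Kafka topic name; the topic will be created |
| PROC_COUNT | Number of processes to run concurrently |
| KAFKA_SERVER | Kafka server address, for example localhost:29092 |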
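
The four options above reach the container as environment variables. As a rough illustration of how an entry point could read them, here is a minimal sketch. The names `Settings` and `load_settings` are hypothetical and are not taken from main.py, and the default values are an assumption borrowed from the example docker run command.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the four documented options."""
    kafka_server: str
    kafka_topic: str
    batch_size: int
    proc_count: int


def load_settings(env=None):
    """Read the documented options from an environment-style mapping.

    Defaults are assumptions copied from the example docker run command.
    """
    env = os.environ if env is None else env
    settings = Settings(
        kafka_server=env.get("KAFKA_SERVER", "localhost:29092"),
        kafka_topic=env.get("KAFKA_TOPIC", "test"),
        batch_size=int(env.get("BATCH_SIZE", "1000")),
        proc_count=int(env.get("PROC_COUNT", "8")),
    )
    # Reject values that cannot describe a real batch or process count.
    if settings.batch_size < 1 or settings.proc_count < 1:
        raise ValueError("BATCH_SIZE and PROC_COUNT must be positive integers")
    return settings


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of `os.environ` makes the parsing easy to check without starting a container.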

Kafka Data Load Code Description

The code can be viewed on GitHub.

| Filename | Description |
|---|---|
| main.py | Main module; takes parameters and runs the generator |
| datagenerators.py | Creates data and sends it to Kafka |
| kafka.py | Wrapper for Kafka calls |
| README.md | This file |
| Dockerfile | File used to build the Docker image |
| .dockerignore | Files to exclude from the Docker image build |
| requirements.txt | Python library requirements |
| kafkasetup/README.md | Instructions to set up Kafka in Docker |
| kafkasetup/docker-compose.yml | Sample docker-compose.yml |
| singlestoresetup/README.md | Instructions to set up SingleStore with Pipelines |
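
To make the split between generating data and wrapping Kafka calls concrete, here is a minimal sketch of a batch generator and a send helper. It is not the code in datagenerators.py or kafka.py: the record fields, `make_record`, `make_batch`, `send_batch`, and `RecordingProducer` are all hypothetical. The helper assumes a producer object with `send(topic, value)` and `flush()` methods, as kafka-python's `KafkaProducer` provides; which client library the repository uses is not stated here.

```python
import json
import random
import time


def make_record(rng):
    """Build one fake record; the field names are invented for illustration."""
    return {
        "vehicle_id": rng.randrange(1_000_000),
        "speed": round(rng.uniform(0, 120), 1),
        "ts": time.time(),
    }


def make_batch(batch_size, rng):
    """Return batch_size records, each encoded as UTF-8 JSON bytes."""
    return [json.dumps(make_record(rng)).encode("utf-8") for _ in range(batch_size)]


def send_batch(producer, topic, batch):
    """Send every payload in the batch, then flush once; return the count sent."""
    for payload in batch:
        producer.send(topic, payload)
    producer.flush()
    return len(batch)


class RecordingProducer:
    """Stand-in producer so the sketch runs without a Kafka broker."""

    def __init__(self):
        self.messages = []
        self.flushes = 0

    def send(self, topic, value):
        self.messages.append((topic, value))

    def flush(self):
        self.flushes += 1


if __name__ == "__main__":
    rng = random.Random(0)
    producer = RecordingProducer()
    sent = send_batch(producer, "test", make_batch(1000, rng))
    print(f"sent {sent} messages in {producer.flushes} flush")
```

Injecting the producer keeps the generator independent of any one Kafka client, so the stand-in can be swapped for a real producer connected to KAFKA_SERVER.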
