legendary-goggles is a repository configured around a message broker, intended to simulate microservices under high traffic. It tries to mimic the behavior of chatbots with its message-reply format. It is built in Python, with RabbitMQ as the message broker, FastAPI as the web framework for building the APIs, FastStream to connect the services to the queues and interact with event streams, and Docker for scalability. Oh, and SQLite as our database.
to run, first we need to clone this repository.
git clone https://github.com/fadhilrp/legendary-goggles.git
second, we need to build.
docker compose build
third, we need to compose.
docker compose up
our microservice is up and running!
sending a message and getting back a reply is the backbone of our system. here's how you do it.
localhost:8080/docs
after successfully serving the microservice without errors, this endpoint should be available for you to access. here you could easily select the endpoints you want to use. for our case, select /prompt and try entering your message as a string in the request body. if successful, a reply to your message should pop up. if the service says it doesn't know how to answer your message, try sending a message from the dataset inside components (e.g. "Where do you see yourself in 5 years?").
you could import the postman collection that is included inside this repository. you could then select the one that says http://localhost:8000/prompt
and bombs away.
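if you'd rather script it than click around, here's a minimal sketch using the requests library (it assumes /prompt accepts a JSON body with a single message field and that the api listens on port 8000; check /docs or the postman collection for the exact schema):

import requests

# hypothetical payload shape; confirm the exact field name in /docs
payload = {"message": "Where do you see yourself in 5 years?"}

response = requests.post("http://localhost:8000/prompt", json=payload)
response.raise_for_status()

# the reply matched from the dataset (response shape may differ)
print(response.json())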
to simulate high traffic you should run simu.py
inside src/api
the script would then try to send all the available prompts that are inside our dataset.
python3 simu.py
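the actual load script is src/api/simu.py; the sketch below only illustrates the idea, assuming the CSV has a prompt column and /prompt takes a message field, so treat it as a rough outline rather than the real implementation:

import csv
import requests

# read every prompt from the dataset (column name assumed to be "prompt")
with open("components/prompt_engineering_dataset.csv", newline="", encoding="utf-8") as f:
    prompts = [row["prompt"] for row in csv.DictReader(f)]

# fire them at the API one after another to generate traffic
for prompt in prompts:
    r = requests.post("http://localhost:8000/prompt", json={"message": prompt})
    print(r.status_code, prompt[:40])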
to get the database, we first take a look inside our lovely container. to do that we could run,
docker exec -it <container id> /bin/bash
then, back on the host, copy the database file out by running,
docker cp <container id>:/legendary-goggles/database.db .
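once the file is on your machine, you can poke around in it with Python's built-in sqlite3 module (no schema assumptions here; the table names are discovered at runtime, since the real models live in models.py):

import sqlite3

conn = sqlite3.connect("database.db")

# list the tables that were auto-generated on start
tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
print(tables)

conn.close()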
to add a server instance, we could add another service entry inside the docker-compose.yml:
rpc_server2:
  build:
    context: .
    dockerfile: Dockerfile.server
  environment:
    - RABBITMQ_HOST=rabbitmq
    - SERVER_NAME=server2
    - QUEUE_NAME=rpc_queue2
  depends_on:
    rabbitmq:
      condition: service_healthy
    rpc_server1:
      condition: service_started
increment the numbers (e.g. 2 to 3) and it should be good to go.
to check the health of our service, selecting the /health endpoint should give the health of our current connection with RabbitMQ.
you could select the one that says http://localhost:8000/health
and bombs away.
- to see all logs: selecting the GET /logs/ endpoint should show all of our current logs.
- to see a specific log: selecting the GET /logs/{log_id} endpoint should show the log you are currently looking for.
- to delete a specific log: selecting the DELETE /logs/{log_id} endpoint deletes the log you want to delete.
same endpoints apply to the functions.
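outside of the docs page or postman, the same health and log endpoints can also be hit from a short script; a minimal sketch with requests (log id 1 is just an example and may not exist yet):

import requests

BASE = "http://localhost:8000"

# health of the current connection with RabbitMQ
print(requests.get(f"{BASE}/health").json())

# all logs, one specific log, then deleting that same log
print(requests.get(f"{BASE}/logs/").json())
print(requests.get(f"{BASE}/logs/1").json())
print(requests.delete(f"{BASE}/logs/1").status_code)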
thought process:
- since there are only four days left, despite its performance cutbacks, python's general-purpose nature helps speed up development
- again on the general-purpose theme, RabbitMQ should do the job when one part of the system simply needs to notify another part to start working on a task.
- get a dataset's worth of prompts so it would simulate the real-world scenario of chatbots.
explanation:
- establish a simple pub/sub pipeline
- read the dataset, take the prompt as the input
- send all prompts to the subscriber through the publisher
- make sure the subscriber gets all of the messages (a pub/sub sketch follows below)
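the real publisher and subscriber live in src/api; the snippet below is only a minimal sketch of the pub/sub idea using pika, with a hypothetical prompts queue (run it inside the compose network, or swap the host for localhost):

import pika

# connect to the broker (host name matches the docker-compose service)
connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="prompts")

# publisher side: push every prompt onto the queue
for prompt in ["Where do you see yourself in 5 years?", "Tell me about yourself."]:
    channel.basic_publish(exchange="", routing_key="prompts", body=prompt)

# subscriber side: drain the queue and confirm every message arrived
while True:
    method, properties, body = channel.basic_get(queue="prompts", auto_ack=True)
    if method is None:
        break
    print("received:", body.decode())

connection.close()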
thought process:
- the subscriber should then reply after receiving the message
- one way to do that is through an RPC server, because it has a reply_to property
- use the responses from the dataset as the replies to the received messages
explanation:
- convert both pub/sub scripts into an RPC client and server.
- use the dataset to match the received message with the answer.
- make a placeholder to receive the answer.
- make sure the answer is received.
- add a condition for when there is no matching answer in the dataset for the message (an RPC sketch follows below)
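the project's actual implementation lives in rpc_server.py and rpc_client.py; the sketch below only shows the server side of the reply_to / correlation_id pattern with pika, with a tiny stand-in dict instead of the real dataset matching:

import pika

# stand-in for the dataset lookup done in rpc_server.py
ANSWERS = {"Where do you see yourself in 5 years?": "placeholder answer from the dataset"}
FALLBACK = "I don't know how to answer that."

def on_request(ch, method, properties, body):
    prompt = body.decode()
    reply = ANSWERS.get(prompt, FALLBACK)
    # publish the answer to the queue named in reply_to,
    # tagged with the caller's correlation_id so the client can match it
    ch.basic_publish(
        exchange="",
        routing_key=properties.reply_to,
        properties=pika.BasicProperties(correlation_id=properties.correlation_id),
        body=reply,
    )
    ch.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="rpc_queue")
channel.basic_consume(queue="rpc_queue", on_message_callback=on_request)
channel.start_consuming()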
thought process:
- right now it can simulate the basics, but there are bound to be edge cases where things go wrong
- generative AI should be good at accelerating the integration of error handling and logging, so let's use it
explanation:
- go through every bit of the scripts and identify possible errors
- add error handling that writes logs for the identified errors
example output: the output is the result of this function down here
from fastapi import HTTPException

# Log and SessionDep come from the database module (src/api/database/)
@app.get("/logs/{log_id}")
def read_log(log_id: int, session: SessionDep) -> Log:
    # look up the log row by its primary key; 404 if it does not exist
    log = session.get(Log, log_id)
    if not log:
        raise HTTPException(status_code=404, detail="Log not found")
    return log
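for context, the Log model and SessionDep used above live in src/api/database/models.py. a rough sketch of what they might look like, assuming the project follows the standard SQLModel + FastAPI pattern (the real fields will differ):

from typing import Annotated, Optional

from fastapi import Depends
from sqlmodel import Field, Session, SQLModel, create_engine

# hypothetical model; the actual columns are defined in models.py
class Log(SQLModel, table=True):
    id: Optional[int] = Field(default=None, primary_key=True)
    message: str
    level: str = "INFO"

# the SQLite file is auto-generated on start
engine = create_engine("sqlite:///database.db")
SQLModel.metadata.create_all(engine)

def get_session():
    with Session(engine) as session:
        yield session

# the dependency injected into endpoints such as read_log above
SessionDep = Annotated[Session, Depends(get_session)]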
thought process:
- health checks should be based on the connection with RabbitMQ, because the thing that's always alive is the message broker, and it's crucial to our services
- there are plenty of libraries out there that provide placeholders for health checks, which is another way to accelerate development. but we should build it ourselves next time if given the chance; it could potentially be better.
explanation:
- import fastapi_healthz
- use its placeholder checks to see the status of our services (a hand-rolled sketch follows below)
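a sketch of the hand-rolled version mentioned above, checking the RabbitMQ connection directly with pika instead of relying on fastapi_healthz (the endpoint shape and the rabbitmq host name are assumptions, not the project's actual code):

import pika
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health() -> dict:
    # try to open and close a connection to the broker; healthy if it works
    try:
        connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
        connection.close()
        return {"status": "healthy", "rabbitmq": "connected"}
    except pika.exceptions.AMQPConnectionError:
        return {"status": "unhealthy", "rabbitmq": "unreachable"}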
thought process:
- as someone who would assess other people's work, it would be tedious if I needed to install a lot of things into my local environment.
- as someone who is being assessed, I would want to streamline that.
- before this I was experimenting with pure RPC messaging; I should add the norm of HTTP endpoints. I'd use FastAPI for quick deployments (Netflix also uses FastAPI)
- connecting FastAPI to RabbitMQ becomes easy with FastStream, a trending integration library that helps accelerate integration between services
- dockerize the application for easy deployment. docker allows us to easily create multiple identical instances of our application (containers) and distribute them across different servers, so the portability of this microservice makes it easy to deploy and maintain.
- for future use and reference, the logs should be saved into a proper database: SQLite.
explanation:
- FastStream provides a router that plugs almost instantly into the fastapi app (a wiring sketch follows after these lists)
- make dockerfiles for the client and the server
- support the dockerfiles with a docker-compose.yml, because rabbitmq should start first, before the server (second) and the fastapi client (third)
- integrate SQLite in the client code, auto-generating a database on start, so accessing logs is also a possibility
- dockerfile troubleshooting
- docker-compose generation
- error handling generation
- combining legacy files into one or several files
- automating some of the markdown
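a minimal sketch of the FastStream-to-FastAPI wiring mentioned above, based on FastStream's documented RabbitRouter plugin (the queue name, the broker URL, and the handler are assumptions; the project's real wiring lives in api_router.py):

from fastapi import FastAPI
from faststream.rabbit.fastapi import RabbitRouter

# the router owns the RabbitMQ connection and exposes it to FastAPI
router = RabbitRouter("amqp://guest:guest@rabbitmq:5672/")

@router.subscriber("rpc_queue")
async def handle_prompt(message: str) -> None:
    # hypothetical handler; the real one matches the prompt against the dataset
    print("got prompt:", message)

app = FastAPI()
app.include_router(router)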
A big thank you to Antrixsh Gupta on kaggle for uploading the Prompt Engineering and Responses Dataset. This dataset is used for simulating the responses to the messages that are inserted into the queues.
legendary-goggles/
├── components/
│ └── prompt_engineering_dataset.csv # Dataset for the RPC servers
├── img/ # Directory for project images/assets
├── src/
│ └── api/ # Main API module
│ ├── database/ # Database related code
│ │ ├── __init__.py # Database module initialization
│ │ ├── models.py # Database models/schemas
│ │ └── database.db # SQLite database file
│ ├── __init__.py # API module initialization
│ ├── api_router.py # FastAPI routes and endpoints
│ ├── rpc_client.py # RabbitMQ RPC client implementation
│ ├── rpc_server.py # RabbitMQ RPC server implementation
│ └── simu.py # Simulation/testing utilities
├── .gitignore # Git ignore file
├── docker-compose.yml # Docker services orchestration
├── Dockerfile.client # Docker build for FastAPI client
├── Dockerfile.server # Docker build for RPC servers
├── legendary-goggles.postman_collection.json # Postman API collection
├── README.md # Project documentation
└── requirements.txt # Python dependencies