Ever wanted to display never-before-seen art, on demand, using AI? Press a button, speak a "prompt" for the AI artist, and see the new art!
By now everyone has seen AI-generated art. There have been many amazing works in this field, perhaps most notably DALL·E 2 by OpenAI. In my opinion, though, the best way to view art is not on a computer screen but in a frame on the wall.
This project uses a local server to host the art-generation AI and automatic speech recognition capabilities. The ePaper frame acts as a client to the server, requesting new art on demand.
The server runs on an NVIDIA GPU (e.g. a Jetson or another discrete GPU), and the ePaper frame "client" runs on a Raspberry Pi.
The frame has four buttons and a microphone. The four buttons have the following functions:
- Request a new generation of art with the same prompt previously used (and currently displayed on the ePaper frame).
- Request a new generation of art with a new prompt created from the pre-built prompts (see prompts.txt).
- Request a new generation of art with a new prompt created from the microphone. After the button is pressed, the microphone will record for 3 seconds.
- Enable/disable automatic art generation (based on previously used prompt or pre-built prompts).
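The button logic can be sketched as a small dispatch function. This is an illustrative sketch, not the actual client/ script: the action names, payload shapes, and prompt strings are assumptions. On the Pi, each branch would be wired to a GPIO button callback (e.g. via the gpiozero library).

```python
import random

# Stand-ins for the entries in prompts.txt (illustrative only).
PREBUILT_PROMPTS = ["a watercolor of a lighthouse", "an oil painting of a fox"]

class FrameState:
    """Tracks the last prompt used and the auto-generation toggle."""
    def __init__(self):
        self.last_prompt = None
        self.auto_generate = False

def on_button(state, button):
    """Return the request the client would send for a given button press."""
    if button == 1:  # regenerate with the same prompt as before
        return {"action": "generate", "prompt": state.last_prompt}
    if button == 2:  # new prompt drawn from the pre-built prompts
        state.last_prompt = random.choice(PREBUILT_PROMPTS)
        return {"action": "generate", "prompt": state.last_prompt}
    if button == 3:  # record 3 s of audio; the server transcribes it via ASR
        return {"action": "generate_from_speech", "seconds": 3}
    if button == 4:  # toggle automatic art generation
        state.auto_generate = not state.auto_generate
        return {"action": "auto", "enabled": state.auto_generate}
    raise ValueError(f"unknown button {button}")
```

The key design point is that the frame itself stays dumb: every button press just produces a request for the server, which does all the heavy lifting.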
The display I used was an Inky Impression 5.7" ePaper frame. It is connected to a Raspberry Pi 1 B+, but any other Raspberry Pi should work.
The client/ directory contains the single Python script used to control the frame and connect to the server.
The server runs two Docker containers, orchestrated by a Docker Compose file. The two containers are:
- triton-inference-server: uses NVIDIA's Triton Inference Server to host the art-generation AI model and the automatic speech recognition (ASR) model.
  - The ASR model is a wav2vec 2.0 Large model converted to ONNX format for inference.
  - The art-generation model is a DALL·E-mini variant called min-dalle (massive shoutout to Brett Kuprel for this incredible PyTorch port).
- art-generator-api: a FastAPI server that acts as a clean endpoint for the client to request new art.
The server/ directory contains the code for the server.
Set up the server with the script setup_server.sh:

```shell
cd server/
bash setup_server.sh
```

Then run the server with run_server.sh:

```shell
cd server/
bash run_server.sh
```
Set up the client with the script setup_client.sh:

```shell
cd client/
bash setup_client.sh
```

Then run the client with run_client.sh, passing the server's IP address:

```shell
cd client/
bash run_client.sh <ip_address>  # the IP address of the server
```
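That IP address is all the client needs to build its HTTP requests to the art-generator-api container. A minimal sketch of what the client does with it (the port, endpoint path, and payload shape are assumptions, not the actual client/ code):

```python
import json
import urllib.request

def build_generate_request(server_ip, prompt, port=8000):
    """Build the POST request the client would send to the server.

    The /generate path and JSON payload are illustrative; check the
    actual server/ code for the real endpoint.
    """
    url = f"http://{server_ip}:{port}/generate"
    data = json.dumps({"prompt": prompt}).encode()
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

def fetch_art(server_ip, prompt):
    # Blocks until the server has finished generating, then returns the
    # raw image bytes ready to be pushed to the ePaper display.
    with urllib.request.urlopen(build_generate_request(server_ip, prompt)) as resp:
        return resp.read()
```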
If you want to have this hanging on a wall like I do, you can connect the Raspberry Pi to a cellphone battery pack.
See the following for notes on reducing power consumption on Raspberry Pis:
- https://www.cnx-software.com/2021/12/09/raspberry-pi-zero-2-w-power-consumption/
- https://blues.io/blog/tips-tricks-optimizing-raspberry-pi-power/
The min-dalle model and the ASR model take around 8 GB and 4 GB of GPU memory, respectively, so ensure you have at least 12 GB of GPU memory. If your GPU does not have enough memory, consider running only the min-dalle model for generating art.
Art generation takes about 10 seconds on an NVIDIA Jetson AGX Orin and about 7 seconds on an NVIDIA RTX 2070.
One thing to note is that ePaper is not a perfect display. The particular panel I chose has only 7 colors, which can make some images look a bit odd.
Another note is the refresh time: the panel I chose takes about 30 seconds to fully refresh its display. Here is a highly sped-up sample: