QueryCap - A fashionable headset AI companion that can see what you see and answer questions about what's in front of you. For VTHacks 2023.
See more details about the project, including a video, on this Devpost link
- Raspberry Pi
  - Runs `Pi/pi_controller.py`
- Computer/server with at least 64GB of RAM for the model
  - We used an AWS EC2 server
  - Runs `model.py`
- Computer/server on the same WiFi network as the Raspberry Pi for the server
  - Runs `server.py`
- The Raspberry Pi must be on the same WiFi network as the computer running `server.py`
- Plug a USB webcam and a microphone into the Raspberry Pi. Our webcam had a built-in microphone
- Connect a button to Raspberry Pi GPIO pin 15 (BCM numbering) and Ground, and configure pin 15 to use an internal pull-up resistor
- On the Raspberry Pi, install the libraries in `Pi/requirements.txt` using `pip install -r Pi/requirements.txt` in a Python virtual environment
- On the computers/servers running the model and the server, install the libraries in `requirements.txt` using `pip install -r requirements.txt` in a Python virtual environment
- Networking setup for HTTP requests between devices:
  - On the Pi, in `Pi/pi_controller.py`, set the `URL` variable to `http://<server IP address>:5000/send_image`, where `<server IP address>` is the IP address of the computer running `server.py` on the same WiFi network
  - On the computer running `server.py`, set up SSH port tunneling from port 5001 to port 5000 of the computer running `model.py` in another terminal: SSH into the computer running `model.py` as usual, but pass the flag `-L 5001:<model server IP address>:5000`, where `<model server IP address>` is the IP address of the computer running `model.py`
    - In our case, we set this up because we were using an AWS server
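
The two addresses from the networking steps can be sketched in Python as follows. This is only an illustration: the helper names `send_image_url` and `tunnel_command`, the user name, and the example IP addresses are all hypothetical. The project itself only needs the `URL` variable set in `Pi/pi_controller.py` and the `-L` flag passed to `ssh`.

```python
from typing import List


def send_image_url(server_ip: str, port: int = 5000) -> str:
    """Build the value for the URL variable in Pi/pi_controller.py."""
    return f"http://{server_ip}:{port}/send_image"


def tunnel_command(model_ip: str, user: str,
                   local_port: int = 5001, remote_port: int = 5000) -> List[str]:
    """Build an ssh command that forwards local_port on this machine
    to remote_port on the computer running model.py."""
    forward = f"{local_port}:{model_ip}:{remote_port}"
    return ["ssh", "-L", forward, f"{user}@{model_ip}"]


if __name__ == "__main__":
    # Example addresses only (assumptions); substitute your own.
    print(send_image_url("192.168.1.20"))
    print(" ".join(tunnel_command("203.0.113.10", user="ubuntu")))
```

With the tunnel open, requests sent to port 5001 on the computer running `server.py` reach port 5000 on the computer running `model.py`.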
- For the computer running `server.py`, run `flask --app server.py run --host 0.0.0.0`
- For the computer running `model.py`, run `flask --app model.py run`
- For the Raspberry Pi, run `python Pi/pi_controller.py`
- Now, whenever you want to ask a question, hold down the button connected to the Raspberry Pi and speak your question
  - A picture is taken when the button is pressed, and your audio is recorded for as long as the button is held
  - Then, the computer running `server.py` will play the answer through its speaker. Voila!
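
The press-and-hold behaviour can be sketched roughly as below. This is a minimal sketch, not the code in `Pi/pi_controller.py`: `read_pin`, `take_picture`, and `record_chunk` are hypothetical callables standing in for the GPIO read, the webcam capture, and the microphone capture. Because pin 15 uses a pull-up resistor and the button connects it to Ground, the pin reads low (0) while the button is held.

```python
import time
from typing import Any, Callable, List, Tuple

PRESSED = 0  # with a pull-up, a pressed button pulls the pin to Ground (low)


def wait_and_capture(read_pin: Callable[[], int],
                     take_picture: Callable[[], Any],
                     record_chunk: Callable[[], bytes],
                     poll_seconds: float = 0.01) -> Tuple[Any, List[bytes]]:
    """Wait for a button press, take one picture, then record audio
    chunks until the button is released."""
    # Idle until the pin goes low (button pressed).
    while read_pin() != PRESSED:
        time.sleep(poll_seconds)

    image = take_picture()  # one picture at the moment of the press

    chunks: List[bytes] = []
    # Keep recording for as long as the button stays held.
    while read_pin() == PRESSED:
        chunks.append(record_chunk())
    return image, chunks


if __name__ == "__main__":
    # Simulated pin levels: released, released, held (x3), released.
    levels = iter([1, 1, 0, 0, 0, 1])
    image, audio = wait_and_capture(lambda: next(levels),
                                    lambda: "image",
                                    lambda: b"audio",
                                    poll_seconds=0)
    print(image, len(audio))
    # On a real Pi, assuming the RPi.GPIO library is used, read_pin could be
    # `lambda: GPIO.input(15)` after `GPIO.setmode(GPIO.BCM)` and
    # `GPIO.setup(15, GPIO.IN, pull_up_down=GPIO.PUD_UP)`.
```

Passing the hardware access in as callables keeps the loop testable on a machine without GPIO pins, a webcam, or a microphone.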