Designing an adaptive room that captures the collective consciousness from internal states.
For demo purposes, as a workflow showcase, multimodal language models (MMLMs) are accessed via API requests.
Features shown (see the sketch after this list):
- Video capturing with `opencv`
- Image analysis with `gemini-1.5-flash`, which delivers a description for the image generator
- Image generation with `dall-e-2`
- Image display with `PIL`
- Binaural beats generation with `sounddevice`
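The actual wiring lives in `hack-demo.py`; the following is only a minimal sketch of how these pieces could be chained, assuming the `google-generativeai` and `openai` Python clients, a webcam at index 0, placeholder prompt and file names, and API keys already present in the environment:

```python
# Minimal sketch of the capture -> analyze -> generate -> display -> sound loop.
# Assumptions: google-generativeai and openai clients, webcam at index 0,
# OPENAI_API_KEY and GOOGLE_API_KEY already set in the environment.
import os
import urllib.request

import cv2
import numpy as np
import sounddevice as sd
import google.generativeai as genai
from openai import OpenAI
from PIL import Image

def capture_frame(path="frame.jpg"):
    """Grab one frame from the default webcam with OpenCV."""
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Could not read from the webcam")
    cv2.imwrite(path, frame)
    return path

def describe_frame(path):
    """Ask gemini-1.5-flash for a description to feed the image generator."""
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content(
        [Image.open(path), "Describe this scene as a prompt for an image generator."]
    )
    return response.text

def generate_image(prompt, path="generated.png"):
    """Generate an image with dall-e-2 and download the result."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.images.generate(model="dall-e-2", prompt=prompt, size="512x512", n=1)
    urllib.request.urlretrieve(result.data[0].url, path)
    return path

def play_binaural_beat(base_hz=200.0, beat_hz=10.0, seconds=5.0, sr=44100):
    """Play a binaural beat: slightly detuned sine waves on the left/right channels."""
    t = np.linspace(0.0, seconds, int(sr * seconds), endpoint=False)
    left = np.sin(2 * np.pi * base_hz * t)
    right = np.sin(2 * np.pi * (base_hz + beat_hz) * t)
    stereo = (0.2 * np.column_stack([left, right])).astype(np.float32)
    sd.play(stereo, sr)
    sd.wait()

if __name__ == "__main__":
    frame = capture_frame()
    description = describe_frame(frame)
    image = generate_image(description)
    Image.open(image).show()   # the demo displays images via PIL
    play_binaural_beat()
```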
Recorded demos (no audio captured):
Screencast.from.2024-10-27.14-56-43.webm
Screencast.from.2024-10-27.14-52-53.webm
Test system: Laptop with GTX 1650, on Ubuntu 24.04.1 with Python 3.12
- Install Tkinter from apt: `sudo apt-get install python3-tk`
- Install PortAudio from apt: `sudo apt-get install portaudio19-dev`
- Clone the repository
- Create a virtual environment: `python -m venv env`
- Activate the virtual environment: `source env/bin/activate`
- Install libraries: `pip install -r requirements.txt`
- Create a `.env` file with the API keys (`OPENAI_API_KEY`, `GOOGLE_API_KEY`); see the example below
- Run: `python3 hack-demo.py`
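A minimal `.env` might look like the following (placeholder values; the demo is assumed to read the keys from the environment):

```
OPENAI_API_KEY=your-openai-key
GOOGLE_API_KEY=your-google-key
```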