Skip to content

ryanmckim/CalHacks2023

Repository files navigation

TuneAI: Video/Image to Audio

Generative AI technology for audio content creation.

Going from Video to Audio, an example process:

Visual Input: Video background party

One Snapshot:

Captions with Blip: ['purple light shining on a crowd of people at a concert', 'purple light shining on a crowd of people at a concert', 'crowd of people at a concert with their hands in the air']

Prompt generation with OpenAI: EDM, Energetic, Pop with pulsating beats, synths, and euphoric crowd samples.

Audio generation with MusicGen: 10s music

Other Video Sources

Australia vs USA | Women's Beach Volleyball Gold Medal Match | Tokyo Replays Video background party Fly Me To The Moon - Stringspace Jazz Band Eating at Grand Central Oyster Bar NYC. Tourist Trap? or Classic Restaurant?