
Creating Latent Representations of Synthesizer Patches using Variational Autoencoders

We are pleased to present our work "Creating Latent Representations of Synthesizer Patches using Variational Autoencoders" at the 4th Annual International Symposium on the Internet of Sounds in Pisa, Italy.

This work introduces a method for generating synthesizer patches using a VAE with an extremely small latent dimensionality. We constrain our VAE to a two-dimensional latent space, since two dimensions couple well with human spatial experience and typical commodity user interfaces (touch screens, computer mice, etc.). After successfully training the VAE, we introduce two new Latent Representations based on properties of the latent space and the original dataset of synth patches.
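Because the latent space is only two-dimensional, a pointer position on screen maps directly onto a latent vector. As a minimal sketch (the latent bounds, widget size, and function name below are illustrative assumptions, not part of this codebase):

```python
import numpy as np

# Hypothetical latent-space bounds observed after training (an assumption).
Z_MIN, Z_MAX = np.array([-3.0, -3.0]), np.array([3.0, 3.0])

def ui_to_latent(x_px, y_px, width, height):
    """Map a mouse/touch position on a widget to a 2D latent vector."""
    u = np.array([x_px / width, 1.0 - y_px / height])  # normalize, flip y-axis
    return Z_MIN + u * (Z_MAX - Z_MIN)

# e.g. a click at (400, 150) on an 800x600 canvas:
z = ui_to_latent(400, 150, 800, 600)  # -> array([0. , 1.5])
```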

In this work, we use amSynth as a test bed, with plans to incorporate similar pipelines for other synthesizers.

Poster

[Poster image: LatentRepsPoster]

VAE Architecture

Our VAE architecture can be found in GUI/VAE.py. A figure depicting the structure is shown below.
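The exact model lives in GUI/VAE.py; the following is only a minimal PyTorch sketch of a VAE constrained to a two-dimensional latent space. The patch-vector length, hidden width, and activations are illustrative assumptions, not our actual hyperparameters.

```python
import torch
import torch.nn as nn

class PatchVAE(nn.Module):
    """Sketch of a VAE with a 2D latent space; layer sizes are illustrative."""

    def __init__(self, n_params=24, hidden=128, latent_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_params, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.fc_mu = nn.Linear(hidden, latent_dim)      # posterior mean head
        self.fc_logvar = nn.Linear(hidden, latent_dim)  # posterior log-variance head
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_params), nn.Sigmoid(),  # patch params scaled to [0, 1]
        )

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)  # z = mu + sigma * eps

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    """Standard ELBO: reconstruction term plus KL divergence to N(0, I)."""
    recon_loss = nn.functional.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl
```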

Latent Representations

Once the VAE is fully trained, we generate two Latent Representations based on attributes of its latent space.

Latent Coordinates Representation

In this representation, each patch used to train the VAE is passed through the encoder again, and the resulting 2D vector is plotted. Each data point in the 2D space therefore represents an existing patch, while the white space in between represents new patches waiting to be discovered.

[Figure: latent_coords]
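A minimal sketch of producing such a plot, assuming a trained model shaped like the PatchVAE sketch above; the `patches` tensor here is a random stand-in for the real training set:

```python
import matplotlib.pyplot as plt
import torch

model = PatchVAE()             # from the sketch above; untrained here
patches = torch.rand(500, 24)  # stand-in for the (N, n_params) training set

model.eval()
with torch.no_grad():
    mu = model.fc_mu(model.encoder(patches))  # posterior means as 2D coordinates

plt.scatter(mu[:, 0], mu[:, 1], s=8)
plt.xlabel("z1")
plt.ylabel("z2")
plt.title("Latent Coordinates Representation")
plt.show()
```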

Timbral Representation

In this representation, we sample the entire latent space on a 50x50 grid. For each sampled latent vector, we decode it to generate a new synthesizer patch, load that patch into amSynth, record a 4-second audio clip of the resulting sound, and analyze that clip using the AudioCommons Timbral Analysis Toolkit. We then color the entire latent space with a perceptually uniform colormap based on the timbral values, as shown below.

[Figure: depth_ls]
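A sketch of that pipeline under stated assumptions: `render_patch` is a hypothetical helper (decode z, load the patch into amSynth, record ~4 s of audio, return a WAV path), the latent bounds are made up, and we assume the `timbral_models` package from the AudioCommons project exposes `timbral_depth(path)`:

```python
import numpy as np
import matplotlib.pyplot as plt
import timbral_models  # AudioCommons Timbral Models package

def render_patch(z):
    """Hypothetical helper: decode z with the trained VAE, load the patch
    into amSynth, record ~4 s of audio, and return the WAV file path."""
    ...

grid = np.linspace(-3.0, 3.0, 50)  # 50x50 grid; bounds are an assumption
depth = np.zeros((50, 50))
for i, z1 in enumerate(grid):
    for j, z2 in enumerate(grid):
        wav = render_patch(np.array([z1, z2]))
        depth[j, i] = timbral_models.timbral_depth(wav)  # one timbral attribute

# Encode the timbral values with a perceptually uniform colormap.
plt.imshow(depth, origin="lower", cmap="viridis",
           extent=[grid[0], grid[-1], grid[0], grid[-1]])
plt.colorbar(label="timbral depth")
plt.xlabel("z1")
plt.ylabel("z2")
plt.show()
```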

Demo

Demo video coming soon!

A GUI for exploring these latent representations is available in the GUI/ directory of this repo. It uses tkinter, libmapper, and a fork of amSynth to operate. Further instructions are provided in that directory.

