
Creating Latent Representations of Synthesizer Patches using Variational Autoencoders

We are pleased to present our work "Creating Latent Representations of Synthesizer Patches using Variational Autoencoders" at the 4th Annual International Symposium on the Internet of Sounds in Pisa, Italy.

This work introduces a method for generating synthesizer patches using a VAE with an extremely small latent dimensionality. We constrain our VAE to a two-dimensional latent space, since two dimensions couple well with human spatial experience and typical commodity user interfaces (touch screens, computer mice, etc.). After successfully training the VAE, we introduce two new Latent Representations based on properties of the latent space and the original dataset of synth patches.
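Because the latent space is only two-dimensional, a pointer position on screen maps directly onto a latent vector. As a minimal sketch (the latent bounds, widget size, and function name below are illustrative assumptions, not part of this codebase):

```python
import numpy as np

# Hypothetical latent-space bounds observed after training (an assumption).
Z_MIN, Z_MAX = np.array([-3.0, -3.0]), np.array([3.0, 3.0])

def ui_to_latent(x_px, y_px, width, height):
    """Map a mouse/touch position on a widget to a 2D latent vector."""
    u = np.array([x_px / width, 1.0 - y_px / height])  # normalize, flip y-axis
    return Z_MIN + u * (Z_MAX - Z_MIN)

# e.g. a click at (400, 150) on an 800x600 canvas:
z = ui_to_latent(400, 150, 800, 600)  # -> array([0. , 1.5])
```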

In this work, we use amSynth as a test bed, with plans to incorporate similar pipelines for other synthesizers.

Poster

[Poster image: LatentRepsPoster]

VAE Architecture

Our VAE architecture can be found in GUI/VAE.py. A figure depicting the structure is shown below.
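The exact model lives in GUI/VAE.py; the following is only a minimal PyTorch sketch of a VAE constrained to a two-dimensional latent space. The patch-vector length, hidden width, and activations are illustrative assumptions, not our actual hyperparameters.

```python
import torch
import torch.nn as nn

class PatchVAE(nn.Module):
    """Sketch of a VAE with a 2D latent space; layer sizes are illustrative."""

    def __init__(self, n_params=24, hidden=128, latent_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_params, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.fc_mu = nn.Linear(hidden, latent_dim)      # posterior mean head
        self.fc_logvar = nn.Linear(hidden, latent_dim)  # posterior log-variance head
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_params), nn.Sigmoid(),  # patch params scaled to [0, 1]
        )

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)  # z = mu + sigma * eps

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    """Standard ELBO: reconstruction term plus KL divergence to N(0, I)."""
    recon_loss = nn.functional.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl
```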

Latent Representations

Once the VAE is fully trained, we generate two Latent Representations based on attributes of its latent space.

Latent Coordinates Representation

In this representation, each patch used to train the VAE is passed through the encoder again, and the resulting 2D vector is plotted. Each data point in the 2D space therefore represents an existing patch, while the white space in between represents new patches waiting to be discovered.

[Figure: latent_coords]
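A minimal sketch of producing such a plot, assuming a trained model shaped like the PatchVAE sketch above; the `patches` tensor here is a random stand-in for the real training set:

```python
import matplotlib.pyplot as plt
import torch

model = PatchVAE()             # from the sketch above; untrained here
patches = torch.rand(500, 24)  # stand-in for the (N, n_params) training set

model.eval()
with torch.no_grad():
    mu = model.fc_mu(model.encoder(patches))  # posterior means as 2D coordinates

plt.scatter(mu[:, 0], mu[:, 1], s=8)
plt.xlabel("z1")
plt.ylabel("z2")
plt.title("Latent Coordinates Representation")
plt.show()
```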

Timbral Representation

In this representation, we sample the entire latent space on a 50x50 grid. For each sampled latent vector, we decode it to generate a new synthesizer patch, load that patch into amSynth, record a 4-second audio clip of the resulting sound, and analyze that clip using the AudioCommons Timbral Analysis Toolkit. We then color the entire latent space with a perceptually uniform colormap based on the timbral values, as shown below.

[Figure: depth_ls]
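A sketch of that pipeline under stated assumptions: `render_patch` is a hypothetical helper (decode z, load the patch into amSynth, record ~4 s of audio, return a WAV path), the latent bounds are made up, and we assume the `timbral_models` package from the AudioCommons project exposes `timbral_depth(path)`:

```python
import numpy as np
import matplotlib.pyplot as plt
import timbral_models  # AudioCommons Timbral Models package

def render_patch(z):
    """Hypothetical helper: decode z with the trained VAE, load the patch
    into amSynth, record ~4 s of audio, and return the WAV file path."""
    ...

grid = np.linspace(-3.0, 3.0, 50)  # 50x50 grid; bounds are an assumption
depth = np.zeros((50, 50))
for i, z1 in enumerate(grid):
    for j, z2 in enumerate(grid):
        wav = render_patch(np.array([z1, z2]))
        depth[j, i] = timbral_models.timbral_depth(wav)  # one timbral attribute

# Encode the timbral values with a perceptually uniform colormap.
plt.imshow(depth, origin="lower", cmap="viridis",
           extent=[grid[0], grid[-1], grid[0], grid[-1]])
plt.colorbar(label="timbral depth")
plt.xlabel("z1")
plt.ylabel("z2")
plt.show()
```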

Demo

Demo video coming soon!

A GUI for exploring these latent representations is available in the GUI/ directory of this repo. It uses tkinter, libmapper, and a fork of amSynth to operate. Further instructions are provided in that directory.

