Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation
This is an official PyTorch implementation of Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation (IROS 2022). In this paper, we present an automatic gesture generation model that uses a vector-quantized variational autoencoder structure as well as training techniques to learn a rigorous representation of gesture sequences. We then translate input text into a discrete sequence of associated gesture chunks in the learned gesture space. Subjective and objective evaluations confirm the success of our approach in terms of appropriateness, human-likeness, and diversity. We also introduce new objective metrics using the quantized gesture representation.
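The quantization step at the heart of a vector-quantized autoencoder can be illustrated with a minimal sketch: each continuous latent vector is replaced by the index of its nearest codebook entry, which yields the discrete sequence of gesture chunks. This is not the repository's implementation. The `quantize` function, the 2-D codebook, and the latent values are hypothetical, chosen only for illustration, and the sketch uses plain Python rather than PyTorch so that it runs without dependencies.

```python
import math


def quantize(latents, codebook):
    """Return, for each latent vector, the index of its nearest codebook entry."""
    indices = []
    for z in latents:
        # Euclidean distance from this latent vector to every codebook entry
        distances = [math.dist(z, code) for code in codebook]
        indices.append(distances.index(min(distances)))
    return indices


# Hypothetical codebook with three 2-D entries (illustrative values only)
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]

# Hypothetical encoder outputs for four consecutive gesture chunks
latents = [(0.1, -0.2), (0.9, 1.2), (-0.8, 1.7), (0.4, 0.3)]

indices = quantize(latents, codebook)        # discrete gesture tokens
quantized = [codebook[i] for i in indices]   # vectors a decoder would receive
print(indices)
```

In a trained model the codebook is learned jointly with the encoder and decoder, and a text-to-gesture model would predict index sequences like `indices` rather than raw poses.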
TODO
This code is distributed under an MIT LICENSE.
Note that our code uses datasets, including Trinity and Talking With Hands (TWH), each of which has its own license that must also be followed.
Please feel free to contact us (pjomeyaz@sfu.ca) with any questions or concerns.