
CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

ECCV 2024

ToDos

  • Release motion editing demo.

  • Release fine-grained keyword data.

  • Release training/evaluation code.

Installation

To get started, clone this project, then set up the required dependencies using the following commands:

conda env create -f environment.yml
conda activate como
bash dataset/prepare/download_glove.sh
bash dataset/prepare/download_extractor.sh

The code was tested on Ubuntu 22.04.4 LTS.
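
As a quick sanity check that the environment resolved correctly, you can run the minimal sketch below; it only assumes that environment.yml installs PyTorch and NumPy, which the training scripts below require:

import torch  # provided by environment.yml
import numpy as np

print("torch:", torch.__version__)
print("numpy:", np.__version__)
print("CUDA available:", torch.cuda.is_available())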

Data

Motion-Language Data

For the HumanML3D and KIT-ML datasets, please find the instructions for downloading and preprocessing [here].

The resulting file directory should look like this:

./dataset/[dataset_name]/
├── new_joint_vecs/
├── new_joints/
├── texts/
├── Mean.npy 
├── Std.npy 
├── train.txt
├── val.txt
├── test.txt
├── train_val.txt
└── all.txt
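
For reference, Mean.npy and Std.npy hold the feature-wise statistics used to normalize the motion features in new_joint_vecs/. A minimal sketch of loading and normalizing one sequence ([dataset_name] and [file_id] are placeholders, as above):

import numpy as np

root = "./dataset/[dataset_name]"
mean = np.load(f"{root}/Mean.npy")  # per-feature mean
std = np.load(f"{root}/Std.npy")    # per-feature standard deviation

# Load one motion feature sequence and z-normalize it, as data
# loaders for HumanML3D/KIT-ML conventionally do before training.
motion = np.load(f"{root}/new_joint_vecs/[file_id].npy")
motion = (motion - mean) / std
print(motion.shape)  # (num_frames, feature_dim)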

Fine-grained Descriptions

We prompt GPT-4 to obtain fine-grained keywords that describe the motion of different body parts. The collected keywords and corresponding CLIP embeddings can be downloaded using the following commands:

bash dataset/prepare/download_keywords.sh

The keywords and keyword embeddings will be stored in the ./keywords and ./keyword_embeddings sub-folders, respectively, under each dataset directory ./dataset/[dataset_name]/. The training/evaluation code loads the keyword embeddings directly. The original keyword text is stored in dictionaries and can be read as follows:

import numpy as np

text = np.load("./dataset/[dataset_name]/keywords/[file_id].npy", allow_pickle=True).item()
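
The exact key layout of these dictionaries is not documented here; a quick way to inspect one after loading it with the snippet above:

# Print each entry of the keyword dictionary; whether keys correspond
# to body parts or something else is an assumption worth verifying.
for key, value in text.items():
    print(key, "->", value)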

Pose Codes

We adapt [PoseScript] to parse poses into pose codes. Running the following command stores the parsed codes in the ./codes sub-folder of each dataset directory ./dataset/[dataset_name]/:

bash dataset/prepare/parse_motion.sh

Although our framework obtains pose codes through heuristic skeleton parsing, it is also possible to train an encoder module that encodes motion sequences into pose code sequences, using the parsed pose codes as latent supervision. We include the checkpoint and training details for this encoder in the sections below.
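
If you want to inspect the parsed codes directly, here is a minimal sketch, assuming parse_motion.sh writes one .npy file per motion mirroring the keywords layout (verify the naming against the script's output):

import numpy as np

# Hypothetical path following the ./codes sub-folder convention above.
codes = np.load("./dataset/[dataset_name]/codes/[file_id].npy", allow_pickle=True)
print(type(codes), getattr(codes, "shape", None))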

Pre-trained Models

The pretrained model checkpoints will be stored in the ./pretrained folder:

bash dataset/prepare/download_model.sh
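
To confirm a download succeeded, you can open a checkpoint with torch.load (a sketch; the top-level keys inside the checkpoint files are not documented here):

import torch

# Load the decoder checkpoint on CPU just to inspect its contents.
ckpt = torch.load("./pretrained/t2m/Dec/model.pth", map_location="cpu")
print(list(ckpt.keys()))  # layout unverified; e.g., weights and training state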

Training

Motion Decoder

python train_dec.py \
--batch-size 256 \
--lr 1e-4 \
--total-iter 300000 \
--lr-scheduler 200000 \
--nb-code 392 \
--down-t 2 \
--depth 3 \
--dilation-growth-rate 3 \
--out-dir output \
--dataname t2m \
--vq-act relu \
--loss-vel 0.5 \
--recons-loss l1_smooth \
--exp-name Dec \
--output-emb-width 392

[Optional] Motion Encoder

python train_enc.py \
--batch-size 256 \
--lr 1e-4 \
--total-iter 300000 \
--lr-scheduler 200000 \
--nb-code 392 \
--down-t 2 \
--depth 3 \
--dilation-growth-rate 3 \
--out-dir output \
--dataname t2m \
--vq-act relu \
--loss-vel 0.5 \
--recons-loss l1_smooth \
--exp-name Enc \
--output-emb-width 392 \
--resume-pth ./pretrained/t2m/Dec/model.pth

Motion Generator

python train_t2m.py \
--exp-name Trans \
--batch-size 64 \
--num-layers 9 \
--nb-code 392 \
--n-head-gpt 16 \
--block-size 62 \
--ff-rate 4 \
--out-dir output \
--total-iter 300000 \
--lr-scheduler 150000 \
--lr 0.0001 \
--dataname t2m \
--down-t 2 \
--depth 3 \
--eval-iter 10000 \
--pkeep 0.5 \
--dilation-growth-rate 3 \
--output-emb-width 392 \
--resume-pth ./pretrained/t2m/Dec/model.pth 

Evaluation

Motion Decoder

python eval_dec.py \
--batch-size 256 \
--lr 2e-4 \
--total-iter 300000 \
--lr-scheduler 200000 \
--nb-code 392 \
--down-t 2 \
--depth 3 \
--dilation-growth-rate 3 \
--out-dir output \
--dataname t2m \
--vq-act relu \
--loss-vel 0.5 \
--recons-loss l1_smooth \
--exp-name TEST_Dec \
--resume-pth ./pretrained/t2m/Dec/model.pth \
--output-emb-width 392

Motion Generator

python eval_t2m.py  \
--exp-name TEST_Trans \
--batch-size 256 \
--num-layers 9 \
--embed-dim-gpt 1024 \
--nb-code 392 \
--n-head-gpt 16 \
--block-size 62 \
--ff-rate 4 \
--drop-out-rate 0.1 \
--resume-pth ./pretrained/t2m/Dec/model.pth \
--vq-name VQVAE \
--out-dir output \
--total-iter 300000 \
--lr-scheduler 150000 \
--lr 0.0001 \
--dataname t2m \
--down-t 2 \
--depth 3 \
--eval-iter 10000 \
--pkeep 0.5 \
--dilation-growth-rate 3 \
--vq-act relu \
--output-emb-width 392 \
--resume-trans ./pretrained/t2m/Trans/model.pth

BibTeX

If you find our work helpful or use our code, please consider citing:

@misc{huang2024como,
      title={CoMo: Controllable Motion Generation through Language Guided Pose Code Editing}, 
      author={Yiming Huang and Weilin Wan and Yue Yang and Chris Callison-Burch and Mark Yatskar and Lingjie Liu},
      year={2024},
      eprint={2403.13900},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

We would like to thank the following projects, whose amazing work our code builds on:

text-to-motion, MDM, T2M-GPT, PoseScript
