MotionClone

This repository is the official implementation of MotionClone. It is a training-free framework that enables motion cloning from a reference video for controllable video generation, without cumbersome video inversion processes.

Abstract

Motion-based controllable video generation offers the potential for creating captivating visual content. Existing methods typically necessitate model training to encode particular motion cues or incorporate fine-tuning to inject certain motion patterns, resulting in limited flexibility and generalization. In this work, we propose MotionClone, a training-free framework that enables motion cloning from reference videos for versatile motion-controlled video generation, including text-to-video and image-to-video. Based on the observation that the dominant components in temporal-attention maps drive motion synthesis, while the rest mainly capture noisy or very subtle motions, MotionClone utilizes sparse temporal attention weights as motion representations for motion guidance, facilitating diverse motion transfer across varying scenarios. Meanwhile, MotionClone allows for the direct extraction of motion representation through a single denoising step, bypassing the cumbersome inversion processes and thus promoting both efficiency and flexibility. Extensive experiments demonstrate that MotionClone exhibits proficiency in both global camera motion and local object motion, with notable superiority in terms of motion fidelity, textual alignment, and temporal consistency.

MotionClone: Training-Free Motion Cloning for Controllable Video Generation
Pengyang Ling*, Jiazi Bu*, Pan Zhang^†, Xiaoyi Dong, Yuhang Zang, Tong Wu, Huaian Chen, Jiaqi Wang, Yi Jin^†
(*Equal Contribution)(^†Corresponding Author)

Demo

MotionClone_demo_compressed.mp4

🖋 News

The latest version of our paper (v4) is available on arXiv! (10.08)
The latest version of our paper (v3) is available on arXiv! (7.2)
Code released! (6.29)

🏗️ Todo

We have released the latest version of MotionClone, which performs motion transfer without video inversion and supports image-to-video and sketch-to-video.
Release the MotionClone code (We have released the first version of our code and will continue to optimize it. We welcome any questions or issues you may have and will address them promptly.)
Release paper

📚 Gallery

More results are available on the Project Page.

🚀 Method Overview

Feature visualization

Pipeline

MotionClone utilizes sparse temporal attention weights as motion representations for motion guidance, facilitating diverse motion transfer across varying scenarios. Meanwhile, MotionClone allows for the direct extraction of motion representation through a single denoising step, bypassing the cumbersome inversion processes and thus promoting both efficiency and flexibility.
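
The idea of keeping only the dominant temporal attention weights can be illustrated with a small sketch. This is not the repository's implementation: it assumes a single f x f temporal attention map stored as nested lists, a hypothetical top-k rule (`k=1`) for choosing the dominant entries, and a hypothetical squared-distance measure between the reference and generated maps. The function names `topk_mask` and `masked_motion_distance` are made up for illustration.

```python
from typing import List

Matrix = List[List[float]]  # one f x f temporal attention map (assumed layout)


def topk_mask(attn: Matrix, k: int = 1) -> List[List[int]]:
    """Return a 0/1 mask marking the k largest weights in every row."""
    mask = []
    for row in attn:
        # Indices of the k largest entries in this row; ties keep the earlier index.
        keep = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        mask.append([1 if j in keep else 0 for j in range(len(row))])
    return mask


def masked_motion_distance(ref: Matrix, gen: Matrix, k: int = 1) -> float:
    """Squared distance between two maps, counted only where the reference mask is 1."""
    mask = topk_mask(ref, k)  # the sparse mask comes from the reference video only
    return sum(
        m * (r - g) ** 2
        for mask_row, ref_row, gen_row in zip(mask, ref, gen)
        for m, r, g in zip(mask_row, ref_row, gen_row)
    )


# Toy 3-frame example: illustrative numbers, not real attention weights.
reference = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
generated = [[0.5, 0.3, 0.2], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
distance = masked_motion_distance(reference, generated)
```

In this sketch the smaller, masked-out weights never contribute to the distance, which mirrors the observation that they mainly capture noisy or very subtle motion.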

🔧 Installations (python==3.11.3 recommended)

Setup repository and conda environment

git clone https://github.com/Bujiazi/MotionClone.git
cd MotionClone

conda env create -f environment.yaml
conda activate motionclone

🔑 Pretrained Model Preparations

Download Stable Diffusion V1.5

git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/

After downloading Stable Diffusion, make sure its files are saved in models/StableDiffusion.

Prepare Community Models

Manually download the community .safetensors models from RealisticVision V5.1 and save them to models/DreamBooth_LoRA.

Prepare AnimateDiff Motion Modules

Manually download the AnimateDiff modules from AnimateDiff. We recommend v3_adapter_sd_v15.ckpt and v3_sd15_mm.ckpt.ckpt. Save the modules to models/Motion_Module.

Prepare SparseCtrl for image-to-video and sketch-to-video

Manually download "v3_sd15_sparsectrl_rgb.ckpt" and "v3_sd15_sparsectrl_scribble.ckpt" from AnimateDiff. Save the modules to models/SparseCtrl.
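
Before running inference, it may help to confirm that the four model folders named above exist and are not empty. The following is a minimal sketch, not part of the repository: the helper name `missing_model_dirs` is hypothetical, and it only checks folder presence, not specific checkpoint file names.

```python
from pathlib import Path

# Directories named in the preparation steps above.
EXPECTED_DIRS = [
    "models/StableDiffusion",
    "models/DreamBooth_LoRA",
    "models/Motion_Module",
    "models/SparseCtrl",
]


def missing_model_dirs(repo_root: str = ".") -> list[str]:
    """Return the expected model directories that are absent or empty."""
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        folder = root / rel
        # A folder counts as missing if it does not exist or contains nothing.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    for rel in missing_model_dirs():
        print(f"missing or empty: {rel}")
```

Run it from the repository root; no output means every folder contains at least one entry.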

🎈 Quick Start

Perform text-to-video generation with customized camera motion

python t2v_video_sample.py --inference_config "configs/t2v_camera.yaml" --examples "configs/t2v_camera.jsonl"

Perform text-to-video generation with customized object motion

python t2v_video_sample.py --inference_config "configs/t2v_object.yaml" --examples "configs/t2v_object.jsonl"

Combine motion cloning with sketch-to-video

python i2v_video_sample.py --inference_config "configs/i2v_sketch.yaml" --examples "configs/i2v_sketch.jsonl"

Combine motion cloning with image-to-video

python i2v_video_sample.py --inference_config "configs/i2v_rgb.yaml" --examples "configs/i2v_rgb.jsonl"
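
The four commands above follow one pattern, so they can be assembled from a small table. The sketch below is a convenience wrapper and not part of the repository: `MODES`, `build_command`, and `run_all` are hypothetical names, while the script names, flags, and config paths are copied from the commands above. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Mode name -> sampling script, taken from the Quick Start commands.
MODES = {
    "t2v_camera": "t2v_video_sample.py",
    "t2v_object": "t2v_video_sample.py",
    "i2v_sketch": "i2v_video_sample.py",
    "i2v_rgb": "i2v_video_sample.py",
}


def build_command(mode: str) -> list[str]:
    """Build the argument list for one Quick Start mode."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        sys.executable,  # the interpreter of the active environment
        MODES[mode],
        "--inference_config", f"configs/{mode}.yaml",
        "--examples", f"configs/{mode}.jsonl",
    ]


def run_all(dry_run: bool = True) -> None:
    """Print every command and, unless dry_run is set, execute it."""
    for mode in MODES:
        cmd = build_command(mode)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing run


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call `run_all(dry_run=False)` from the repository root, with the conda environment active, to actually launch the four runs in sequence.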

📎 Citation

If you find this work helpful, please cite the following paper:

@article{ling2024motionclone,
  title={MotionClone: Training-Free Motion Cloning for Controllable Video Generation},
  author={Ling, Pengyang and Bu, Jiazi and Zhang, Pan and Dong, Xiaoyi and Zang, Yuhang and Wu, Tong and Chen, Huaian and Wang, Jiaqi and Jin, Yi},
  journal={arXiv preprint arXiv:2406.05338},
  year={2024}
}

📣 Disclaimer

This is the official code of MotionClone. The copyrights of the demo images and audio belong to community users. Feel free to contact us if you would like them removed.

💞 Acknowledgements

The code is built upon the repositories below; we thank all the contributors for open-sourcing their work.
