Authors: Jingxian Lu*, Wenke Xia*, Dong Wang‡, Zhigang Wang, Bin Zhao, Di Hu‡, Xuelong Li
Accepted By: 2024 Conference on Robot Learning (CoRL)
Resources: [Project Page], [arXiv]
If you have any questions, please open an issue or send an email to jingxianlu1122@gmail.com.
This is the PyTorch code for our paper: KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance.
In this work, we propose the hybrid Key-state guided Online Imitation (KOI) learning method, which estimates precise task-aware rewards for efficient online exploration by decomposing the target task into the objectives of *what to do* and the mechanisms of *how to do* it.
As shown above, we first utilize the rich world knowledge of visual-language models to extract semantic key states from the expert trajectory, clarifying the objectives of *what to do*. Within the intervals between semantic key states, optical flow is employed to identify essential motion key states that capture the dynamic transition to the subsequent semantic key state, indicating *how to do* the target task. By integrating both types of key states, we adjust the importance weights of expert trajectory states in OT-based reward estimation to enable efficient online imitation learning.
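For intuition, here is a minimal sketch (not the repository's actual training code) of how hybrid key states could re-weight the expert marginal in an OT-based reward. The function name `ot_keystate_reward`, the cosine-distance cost, and the `key_weight` hyperparameter are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: per-step OT rewards where expert key states get a larger marginal weight.
import numpy as np

def ot_keystate_reward(agent_feats, expert_feats, key_idx, key_weight=2.0,
                       eps=0.05, n_iters=100):
    """agent_feats: (n, d); expert_feats: (m, d); key_idx: hybrid key-state indexes."""
    # Cosine-distance cost between agent states (rows) and expert states (cols).
    a_norm = agent_feats / np.linalg.norm(agent_feats, axis=1, keepdims=True)
    e_norm = expert_feats / np.linalg.norm(expert_feats, axis=1, keepdims=True)
    cost = 1.0 - a_norm @ e_norm.T

    # Uniform agent marginal; expert marginal boosted at key states, then renormalized.
    a = np.full(len(agent_feats), 1.0 / len(agent_feats))
    b = np.ones(len(expert_feats))
    b[np.asarray(key_idx)] *= key_weight
    b /= b.sum()

    # Entropic OT via plain Sinkhorn iterations.
    K = np.exp(-cost / eps)
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v + 1e-8)
        v = b / (K.T @ u + 1e-8)
    plan = u[:, None] * K * v[None, :]

    # Reward for each agent step: negative transported cost to the expert trajectory.
    return -(plan * cost).sum(axis=1)
```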
Download the expert demonstrations and weights from [link].
The link contains all expert demonstrations used in our paper.
Please set the `path/to/dir` portion of the `root_dir` path variable in `KOI/cfgs/metaworld_config.yaml` and `KOI/cfgs/libero_config.yaml` to the path of this repository.
Then, extract the downloaded files and place the `expert_demos` and `weights` folders in `${root_dir}/Keystate_Online_Imitation`.
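As a quick sanity check of the layout above (not part of the repo), the snippet below reads `root_dir` from each config and verifies that the `expert_demos` and `weights` folders are in place. It assumes `root_dir` is a top-level key in the YAML files, which may differ from the actual config structure.

```python
# Hypothetical sanity check for the expected directory layout.
import os
import yaml

for cfg_path in ("KOI/cfgs/metaworld_config.yaml", "KOI/cfgs/libero_config.yaml"):
    with open(cfg_path) as f:
        root_dir = yaml.safe_load(f)["root_dir"]  # assumed top-level key
    for folder in ("expert_demos", "weights"):
        path = os.path.join(root_dir, "Keystate_Online_Imitation", folder)
        print(path, "OK" if os.path.isdir(path) else "MISSING")
```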
This code is tested on Ubuntu 18.04 with PyTorch 1.12.1+cu113.
Install the requirements:
pip install -r requirements.txt
If you have problems installing the LIBERO environment library, please refer to its official documentation.
For a fair comparison with ROT, we conduct experiments in the Meta-World suite they provided, which modifies the simulation for pixel input. Please follow their instructions to set up the Gym-Robotics and Meta-World libraries.
Meta-World
The offline imitation models are provided, or they can be trained using:
python train_metaworld.py agent=bc_metaworld suite/metaworld_task=bin load_bc=false exp_name=bc
To run key-state guided online imitation:
python train_metaworld.py agent=koi_metaworld suite/metaworld_task=bin keyidx=[50,160] exp_name=koi
The "keyidx" in this command indicates the indexes of semantic key-states of "bin-picking" task. As demonstrated in our paper, they can be extracted by VLMs like this example, or assigned by users manually.
LIBERO
Similarly, to run offline imitation in the LIBERO suite:
python train_bc_libero.py agent=bc_libero suite/libero_task=plate num_demos=50 load_bc=false exp_name=bc
For online imitation:
python train_libero.py agent=koi_libero suite/libero_task=plate keyidx=[40,80] exp_name=koi
@article{lu2024koi,
title={KOI: Accelerating Online Imitation Learning via Hybrid Key-state Guidance},
author={Lu, Jingxian and Xia, Wenke and Wang, Dong and Wang, Zhigang and Zhao, Bin and Hu, Di and Li, Xuelong},
journal={arXiv preprint arXiv:2408.02912},
year={2024}
}