This repository contains the PyTorch implementation of the ICRA'24 paper "PROGrasp: Pragmatic Human-Robot Communication for Object Grasping".
Demo video: `demo.mp4`
The source code is based on PyTorch v1.9.1+, CUDA 11+, and cuDNN 7+. Anaconda/Miniconda is the recommended way to set up this codebase:
- Install the Anaconda or Miniconda distribution (Python 3.7+) from the official downloads page.
- Clone this repository and create an environment:
```bash
git clone https://github.com/gicheonkang/gst-visdial
conda create -n prograsp python=3.7.16 -y

# activate the environment and install all dependencies
conda activate prograsp
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
```
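To quickly verify the installation, you can check that PyTorch was built with CUDA support (a minimal sanity check; it assumes a CUDA-capable GPU and driver are present):

```bash
# prints the torch version and whether a GPU is visible, e.g. "1.9.1+cu111 True"
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```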
If you have trouble with the installation above, please consult the OFA repository, which contains detailed installation guidance.
Download the preprocessed and raw data by simply running the following script:
```bash
chmod +x scripts/download_data.sh
./scripts/download_data.sh
```
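If you want to see where the script fetches and stores the data before running it, you can inspect the download commands it contains (a simple grep; it assumes the script uses common tools such as `wget`/`curl`/`tar`):

```bash
# list download and extraction commands, with line numbers, from the data script
grep -nE "wget|curl|unzip|tar" scripts/download_data.sh
```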
Run the following commands if you want to train the visual grounding module.
```bash
chmod +x OFA/run_scripts/prograsp/train_progrounding.sh
./OFA/run_scripts/prograsp/train_progrounding.sh
```
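The training script is a standard shell launcher, so you can restrict it to specific GPUs with the usual CUDA environment variable (an illustrative invocation; the script itself may also set GPU-related options internally):

```bash
# run training on GPUs 0 and 1 only (hypothetical choice of devices)
CUDA_VISIBLE_DEVICES=0,1 ./OFA/run_scripts/prograsp/train_progrounding.sh
```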
If you want to see the data loaders for each module, please see `OFA/data/mm_data/`. The evaluation code lives in `OFA/utils/eval_utils.py`.
Please download the checkpoints below.
| Model | Link |
|---|---|
| Visual Grounding | Download |
| Question Generation | Download |
| Answer Interpretation | Download |
We provide evaluation / inference code for interactive object discovery. Please check the following Jupyter notebook: `OFA/prograsp_eval.ipynb`.
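For reference, the sketch below shows one way a downloaded checkpoint could be loaded with fairseq utilities, which OFA builds on. The checkpoint path, filename, and `arg_overrides` values are placeholders rather than the repository's required settings; the notebook above remains the source of truth for the full inference pipeline.

```python
# Minimal sketch of checkpoint loading (assumptions noted in comments),
# not the notebook's actual pipeline.
from fairseq import checkpoint_utils

ckpt_path = "checkpoints/visual_grounding.pt"  # hypothetical path and filename
models, cfg, task = checkpoint_utils.load_model_ensemble_and_task(
    [ckpt_path],
    arg_overrides={"bpe_dir": "OFA/utils/BPE"},  # assumption: OFA-style BPE directory
)
model = models[0].eval()  # put the model in inference mode
```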
If you use this code or preprocessed data in your research, please consider citing:
```bibtex
@article{kang2023prograsp,
  title={PROGrasp: Pragmatic Human-Robot Communication for Object Grasping},
  author={Kang, Gi-Cheon and Kim, Junghyun and Kim, Jaein and Zhang, Byoung-Tak},
  journal={arXiv preprint arXiv:2309.07759},
  year={2023}
}
```
Our implementation builds on the OFA codebase. Thanks!
MIT License