ZePHyR is a zero-shot 6D object pose estimation pipeline. Its core is a learned scoring function that compares the sensor observation with a sparse rendering of the object under each candidate pose hypothesis. We use PointNet++ as the network architecture and train and test on the YCB-V and LM-O datasets.
[ArXiv] [Project Page] [Video] [BibTex]
First, check out this repo with:
git clone --recurse-submodules git@github.com:r-pad/zephyr.git
- We recommend building the environment and installing all required packages using Anaconda.
conda env create -n zephyr --file zephyr_env.yml
conda activate zephyr
- Install the required packages for compiling the C++ module
sudo apt-get install build-essential cmake libopencv-dev python-numpy
- Compile the C++ library for the Python bindings inside the conda virtual environment
mkdir build
cd build
cmake .. -DPYTHON_EXECUTABLE=$(python -c "import sys; print(sys.executable)") -DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") -DPYTHON_LIBRARY=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
make && make install
- Install the current python package
cd .. # move to the root folder of this repo
pip install -e .
Download the pre-processed training and testing data (ycbv_preprocessed.zip, lmo_preprocessed.zip and ppf_hypos.zip) from this Google Drive link and unzip them in the python/zephyr/data folder. The unzipped data takes around 66GB of storage in total.
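The extraction step can also be scripted. The following is a minimal sketch, assuming the three archives have already been downloaded by hand; `extract_archives` is a hypothetical helper (not part of this repo), and both directory arguments are placeholders to adapt.

```python
import zipfile
from pathlib import Path

# Archive names as listed in this README.
ARCHIVES = ["ycbv_preprocessed.zip", "lmo_preprocessed.zip", "ppf_hypos.zip"]


def extract_archives(download_dir, data_dir, names=ARCHIVES):
    """Extract each archive found in download_dir into data_dir.

    Returns the names of the archives that were not found, so the caller
    can tell which downloads are still missing.
    """
    download_dir, data_dir = Path(download_dir), Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    missing = []
    for name in names:
        archive = download_dir / name
        if not archive.is_file():
            missing.append(name)
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
    return missing
```

From the repo root, a call such as `extract_archives("/path/to/downloads", "python/zephyr/data")` would unpack whatever is present and report the rest.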
The following commands need to be run in the python/zephyr/ folder.
cd python/zephyr/
To use the network, an example is provided in notebooks/TestExample.ipynb. In the example, a datapoint is loaded from the LM-O dataset provided by the BOP Challenge. The pose hypotheses are provided by the PPF algorithm (extracted from ppf_hypos.zip). Despite the complex data-loading code, only the following data about the observation and the model point cloud is needed to run the network:
- img: RGB image, np.ndarray of size (H, W, 3) in np.uint8
- depth: depth map, np.ndarray of size (H, W) in np.float, in meters
- cam_K: camera intrinsic matrix, np.ndarray of size (3, 3) in np.float
- model_colors: colors of the model point cloud, np.ndarray of size (N, 3) in float, scaled to [0, 1]
- model_points: xyz coordinates of the model point cloud, np.ndarray of size (N, 3) in float, in meters
- model_normals: normal vectors of the model point cloud, np.ndarray of size (N, 3) in float, each L2-normalized
- pose_hypos: pose hypotheses in the camera frame, np.ndarray of size (K, 4, 4) in float
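To make the expected layout concrete, here is a minimal sketch that builds synthetic placeholder arrays with these shapes and dtypes, checks them, and projects the model points into the image under one hypothesis. `check_inputs` and `project_model` are hypothetical helpers written for this illustration (they are not part of the zephyr package), the intrinsics and sizes are made up, and a standard pinhole camera model is assumed.

```python
import numpy as np


def check_inputs(img, depth, cam_K, model_colors, model_points,
                 model_normals, pose_hypos):
    """Raise AssertionError if an array deviates from the layout listed above."""
    H, W = depth.shape
    assert img.shape == (H, W, 3) and img.dtype == np.uint8
    assert np.issubdtype(depth.dtype, np.floating)
    assert cam_K.shape == (3, 3)
    N = model_points.shape[0]
    for arr in (model_colors, model_points, model_normals):
        assert arr.shape == (N, 3)
    assert model_colors.min() >= 0.0 and model_colors.max() <= 1.0
    assert np.allclose(np.linalg.norm(model_normals, axis=1), 1.0, atol=1e-4)
    assert pose_hypos.ndim == 3 and pose_hypos.shape[1:] == (4, 4)


def project_model(model_points, pose, cam_K):
    """Transform (N, 3) model points by a 4x4 pose and project them to pixels.

    Returns (uv, z): pixel coordinates of shape (N, 2) and camera-frame depth (N,).
    """
    R, t = pose[:3, :3], pose[:3, 3]
    pts_cam = model_points @ R.T + t   # model frame -> camera frame
    proj = pts_cam @ cam_K.T           # pinhole projection, homogeneous pixels
    uv = proj[:, :2] / proj[:, 2:3]
    return uv, pts_cam[:, 2]


# Synthetic placeholder data, only to illustrate shapes and dtypes.
rng = np.random.default_rng(0)
H, W, N, K = 480, 640, 100, 5
img = np.zeros((H, W, 3), dtype=np.uint8)
depth = np.ones((H, W), dtype=np.float64)                     # meters
cam_K = np.array([[600.0, 0.0, 320.0],
                  [0.0, 600.0, 240.0],
                  [0.0, 0.0, 1.0]])                           # made-up intrinsics
model_points = rng.uniform(-0.05, 0.05, size=(N, 3))          # meters
model_colors = rng.uniform(0.0, 1.0, size=(N, 3))
normals = rng.normal(size=(N, 3))
model_normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
pose_hypos = np.tile(np.eye(4), (K, 1, 1))
pose_hypos[:, 2, 3] = 0.5   # every hypothesis sits 0.5 m in front of the camera

check_inputs(img, depth, cam_K, model_colors, model_points,
             model_normals, pose_hypos)
uv, z = project_model(model_points, pose_hypos[0], cam_K)
```

Running a check like this on your own data before calling the network can catch unit mistakes (for example millimeters instead of meters) early.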
The PPF algorithm we used is the surface matching function implemented in the MVTec HALCON software. Recent versions of HALCON ship with a Python interface. I wrote a simple wrapper which calls create_surface_model() and find_surface_model() to get the pose hypotheses. See notebooks/TestExample.ipynb for how to use it.
The wrapper requires HALCON 21.05 to be installed. HALCON is commercial software, but it provides free licenses for students.
If you don't have access to HALCON, sets of pre-estimated pose hypotheses are provided in the pre-processed dataset.
Download the pretrained PyTorch model checkpoints from this Google Drive link and unzip them in the python/zephyr/ckpts/ folder. We provide 3 checkpoints: two trained on YCB-V objects with odd IDs (final_ycbv.ckpt) and even IDs (final_ycbv_valodd.ckpt) respectively, and one trained on LM objects that are not in the LM-O dataset (final_lmo.ckpt).
Test on the YCB-V dataset using the model trained on objects with odd IDs
python test.py \
--model_name pn2 \
--dataset_root ./data/ycb/matches_data_test/ \
--dataset_name ycbv \
--dataset HSVD_diff_uv_norm \
--no_valid_proj --no_valid_depth \
--loss_cutoff log \
--exp_name final \
--resume_path ./ckpts/final_ycbv.ckpt
Test on the YCB-V dataset using the model trained on objects with even IDs
python test.py \
--model_name pn2 \
--dataset_root ./data/ycb/matches_data_test/ \
--dataset_name ycbv \
--dataset HSVD_diff_uv_norm \
--no_valid_proj --no_valid_depth \
--loss_cutoff log \
--exp_name final \
--resume_path ./ckpts/final_ycbv_valodd.ckpt
Test on the LM-O dataset
python test.py \
--model_name pn2 \
--dataset_root ./data/lmo/matches_data_test/ \
--dataset_name lmo \
--dataset HSVD_diff_uv_norm \
--no_valid_proj --no_valid_depth \
--loss_cutoff log \
--exp_name final \
--resume_path ./ckpts/final_lmo.ckpt
The testing results will be stored in test_logs, and the results in BOP Challenge format will be in test_logs/bop_results. Please refer to bop_toolkit for converting the results to the BOP Average Recall scores used in the BOP Challenge.
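For a quick look at a results file before running the full evaluation, a small parser can help. The sketch below assumes the standard BOP results CSV layout (columns scene_id, im_id, obj_id, score, R, t, time, with R as nine space-separated values in row-major order and t as three space-separated values); `parse_bop_results` is a hypothetical helper, and the sample row is made up rather than real ZePHyR output.

```python
import csv
import io

FIELDS = ["scene_id", "im_id", "obj_id", "score", "R", "t", "time"]


def parse_bop_results(stream):
    """Parse BOP-style result rows from a text stream into a list of dicts.

    Assumes the column order in FIELDS. A header row, if present, is skipped.
    """
    results = []
    for row in csv.reader(stream):
        if not row or row[0].strip() == "scene_id":   # blank line or header
            continue
        rec = dict(zip(FIELDS, (cell.strip() for cell in row)))
        r = [float(v) for v in rec["R"].split()]
        results.append({
            "scene_id": int(rec["scene_id"]),
            "im_id": int(rec["im_id"]),
            "obj_id": int(rec["obj_id"]),
            "score": float(rec["score"]),
            "R": [r[0:3], r[3:6], r[6:9]],            # 3x3 rotation, row-major
            "t": [float(v) for v in rec["t"].split()],
            "time": float(rec["time"]),
        })
    return results


# Made-up sample content, for illustration only.
sample = io.StringIO(
    "scene_id,im_id,obj_id,score,R,t,time\n"
    "2,3,1,0.9,1 0 0 0 1 0 0 0 1,10.0 20.0 500.0,0.5\n"
)
for rec in parse_bop_results(sample):
    print(rec["scene_id"], rec["obj_id"], rec["score"], rec["t"])
```

To read an actual file, pass an open file handle (for example `open(path, newline="")`) instead of the `io.StringIO` sample.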
These commands will train the network on the real-world images in the YCB-Video training set.
On object Set 1 (objects with odd IDs)
python train.py \
--model_name pn2 \
--dataset_root ./data/ycb/matches_data_train/ \
--dataset_name ycbv \
--dataset HSVD_diff_uv_norm \
--no_valid_proj --no_valid_depth \
--loss_cutoff log \
--exp_name final
On object Set 2 (objects with even IDs)
python train.py \
--model_name pn2 \
--dataset_root ./data/ycb/matches_data_train/ \
--dataset_name ycbv \
--dataset HSVD_diff_uv_norm \
--no_valid_proj --no_valid_depth \
--loss_cutoff log \
--val_obj odd \
--exp_name final_valodd
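The two object sets used above can be written out explicitly. This is a small sketch, assuming the BOP numbering of YCB-V with 21 objects and IDs 1 to 21; `split_by_parity` is a hypothetical helper and does not mirror how train.py selects objects internally.

```python
def split_by_parity(obj_ids):
    """Split object IDs into (odd, even) lists, preserving input order."""
    odd = [i for i in obj_ids if i % 2 == 1]
    even = [i for i in obj_ids if i % 2 == 0]
    return odd, even


YCBV_IDS = list(range(1, 22))        # assumed BOP numbering: IDs 1..21
set1, set2 = split_by_parity(YCBV_IDS)
print("Set 1 (odd IDs):", set1)
print("Set 2 (even IDs):", set2)
```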
This command will train the network on the synthetic images provided by BlenderProc4BOP. We take lm_train_pbr.zip as the training set, but the network is only supervised on objects that are in Linemod but not in Linemod-Occluded (i.e. the IDs of the training objects are 2 3 4 7 13 14 15).
python train.py \
--model_name pn2 \
--dataset_root ./data/lmo/matches_data_train/ \
--dataset_name lmo \
--dataset HSVD_diff_uv_norm \
--no_valid_proj --no_valid_depth \
--loss_cutoff log \
--exp_name final
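The list of Linemod training IDs given above follows from a set difference. The sketch below assumes the BOP numbering, in which Linemod has objects 1 to 15 and Linemod-Occluded uses objects 1, 5, 6, 8, 9, 10, 11 and 12; the variable names are illustrative only.

```python
LM_IDS = set(range(1, 16))               # Linemod objects (assumed BOP IDs 1..15)
LMO_IDS = {1, 5, 6, 8, 9, 10, 11, 12}    # Linemod-Occluded objects (assumed BOP IDs)

# Objects that appear in Linemod but never in Linemod-Occluded.
train_ids = sorted(LM_IDS - LMO_IDS)
print(train_ids)  # [2, 3, 4, 7, 13, 14, 15]
```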
If you find this codebase useful in your research, please consider citing:
@inproceedings{okorn2021zephyr,
title={Zephyr: Zero-shot pose hypothesis rating},
author={Okorn, Brian and Gu, Qiao and Hebert, Martial and Held, David},
booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
pages={14141--14148},
year={2021},
organization={IEEE}
}
- We used the PPF implementation provided in the MVTec HALCON software for pose hypothesis generation. It is commercial software but provides free licenses for students.
- We used bop_toolkit for data loading and results evaluation.