Chen-Hsuan Lin, Oliver Wang, Bryan C. Russell, Eli Shechtman, Vladimir G. Kim, Matthew Fisher, and Simon Lucey
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Project page: https://chenhsuanlin.bitbucket.io/photometric-mesh-optim
Paper: https://chenhsuanlin.bitbucket.io/photometric-mesh-optim/paper.pdf
arXiv preprint: https://arxiv.org/abs/1903.08642
We provide PyTorch code for the following experiments:
- ShapeNet+SUN360 sequences
- (coming soon!) Real-world videos
This code is developed with Python 3 (`python3`). PyTorch 1.0+ is required.
(If you wish to run using PyTorch 0.4, please switch to the `pytorch-0.4` branch.)
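To verify which PyTorch version is active (a quick sanity check, not part of the original instructions):

```bash
# Should print 1.0.0 or later for the main branch.
python3 -c "import torch; print(torch.__version__)"
```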
First, create a Python virtual environment by running
virtualenv -p python3 PMO
Before installing dependencies and/or running the code, activate the virtual environment by running
source PMO/bin/activate
The dependencies can be installed by running (within the virtual environment)
pip3 install --upgrade -r requirements.txt
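Putting the three commands above together, a typical one-time setup from the repository root looks like:

```bash
# Create the virtual environment, activate it, and install the dependencies.
virtualenv -p python3 PMO
source PMO/bin/activate
pip3 install --upgrade -r requirements.txt
```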
The test sequences are composited from the held-out test sets of ShapeNet and SUN360.
To download the dataset (64GB), run the script file `download_sequences.sh` under the directory `data`.
After downloading, run `tar -zxf sequences.tar.gz` under the directory `data`. The files will be extracted to a directory `sequences`.
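For reference, the whole step can be scripted as below; this sketch assumes the download script is invoked with `sh` from inside `data`:

```bash
cd data
sh download_sequences.sh    # 64GB download
tar -zxf sequences.tar.gz   # extracts to data/sequences/
cd ..
```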
We render images from ShapeNet at higher quality (224×224 resolution).
To download the dataset (33GB), run the script file `download_rendering.sh` under the directory `data`.
After downloading, run `tar -xf rendering.tar`. The files will be extracted to a directory `rendering`.
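Analogously to the sequences, a sketch of this step (assuming the script is invoked with `sh` and the archive is extracted under `data` like the other datasets):

```bash
cd data
sh download_rendering.sh   # 33GB download
tar -xf rendering.tar      # extracts to rendering/
cd ..
```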
Please follow the instructions in the AtlasNet repository to download the ground-truth point clouds.
The directory `customShapeNet` should be placed under the directory `data`.
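For example, assuming `customShapeNet` was downloaded into the repository root, moving it into place is simply:

```bash
mv customShapeNet data/   # ground-truth point clouds from AtlasNet
```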
The cropped background images from SUN360 (92GB) can be downloaded by running the script file `download_background.sh` under the directory `data`.
After downloading, run `tar -xf background.tar`. The files will be extracted to a directory `background`.
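As with the other datasets, a sketch of the full step (assuming the script is invoked with `sh` from inside `data`):

```bash
cd data
sh download_background.sh   # 92GB download
tar -xf background.tar      # extracts to background/
cd ..
```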
The pretrained models (626MB each) can be downloaded by running the commands
wget https://cmu.box.com/shared/static/oryysitkhn2eldgb90qkr3lh7j469sj1.npz # airplane
wget https://cmu.box.com/shared/static/jgif23ytibtektwwcji8wiv0jbubzs08.npz # car
wget https://cmu.box.com/shared/static/zakir5pi9xma4l3d5c2g74i8r0lggp36.npz # chair
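A sketch for downloading all three checkpoints into a `pretrained` directory: the car filename `02958343_atl25.npz` matches what the demo below loads, while the airplane and chair filenames are assumptions following the same `<synset>_atl25.npz` pattern:

```bash
mkdir -p pretrained
wget -O pretrained/02691156_atl25.npz https://cmu.box.com/shared/static/oryysitkhn2eldgb90qkr3lh7j469sj1.npz  # airplane (filename assumed)
wget -O pretrained/02958343_atl25.npz https://cmu.box.com/shared/static/jgif23ytibtektwwcji8wiv0jbubzs08.npz  # car (name used by the demo)
wget -O pretrained/03001627_atl25.npz https://cmu.box.com/shared/static/zakir5pi9xma4l3d5c2g74i8r0lggp36.npz  # chair (filename assumed)
```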
The `meshrender` library can be compiled by running `python3 setup.py install` under the directory `meshrender`.
The `chamfer` library can be compiled by running `python3 setup.py install` under the directory `chamfer`.
The `chamfer` source code is taken from the AtlasNet repository.
When compiling the CUDA code, you may need to modify `CUDA_PATH` accordingly.
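Both extensions can be built in one go from the repository root (inside the virtual environment); edit `CUDA_PATH` first if the build cannot locate your CUDA toolkit:

```bash
# Build the meshrender and chamfer CUDA extensions.
cd meshrender && python3 setup.py install && cd ..
cd chamfer    && python3 setup.py install && cd ..
```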
To try a demo of the photometric mesh optimization, download our pretrained model for cars.
Then run (setting the `model` variable to the downloaded checkpoint)
model=pretrained/02958343_atl25.npz
python3 main.py --load=${model} --code=5e-2 --scale=2e-2 --lr-pmo=3e-3 --noise=0.1 --video
This will create the following output files:
- the optimized object mesh (saved into the directory `optimized_mesh`),
- the input video sequence with the overlaid 3D mesh (saved to `video`), and
- (coming soon!) a 3D mesh model (in `.obj` format) with textures estimated from the input RGB sequence.
The flags `--log-tb` and `--log-vis` toggle visualization of the optimization process.
More optional arguments can be found by running `python3 main.py --help`.
To pretrain AtlasNet with our new dataset (high-resolution ShapeNet renderings + SUN360 cropped backgrounds), run the following commands (taking the airplane category as an example)
cat=02691156
python3 main_pretrain.py --category=${cat} --name=${cat}_pretrain \
--imagenet-enc --pretrained-dec=pretrained/ae_atlasnet_25.pth
By default, we initialize the encoder with an ImageNet-pretrained ResNet-18 and the decoder with the pretrained AtlasNet (please refer to the AtlasNet repository to download their pretrained models).
More optional arguments can be found by running `python3 main_pretrain.py --help`.
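To pretrain several categories in sequence, the example above generalizes to a simple loop; the car (02958343) and chair (03001627) synset IDs are standard ShapeNet categories and are assumptions here, chosen to match the pretrained models above:

```bash
# Pretrain AtlasNet for airplane, car, and chair in turn.
for cat in 02691156 02958343 03001627; do
  python3 main_pretrain.py --category=${cat} --name=${cat}_pretrain \
    --imagenet-enc --pretrained-dec=pretrained/ae_atlasnet_25.pth
done
```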
We've included code to visualize the training progress over TensorBoard(X). To execute, run
tensorboard --logdir=summary/GROUP --port=6006
where `GROUP` is specified in the pretraining arguments.
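For example, with a hypothetical group name `0`:

```bash
tensorboard --logdir=summary/0 --port=6006   # then open http://localhost:6006
```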
For pretraining, we provide three types of data visualization:
- SCALARS: training and test loss curves over epochs
- IMAGES: sample input images
If you find our code useful for your research, please cite
@inproceedings{lin2019photometric,
  title={Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction},
  author={Lin, Chen-Hsuan and Wang, Oliver and Russell, Bryan C and Shechtman, Eli and Kim, Vladimir G and Fisher, Matthew and Lucey, Simon},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition ({CVPR})},
  year={2019}
}
Please contact me (chlin@cmu.edu) if you have any questions!