
Perspective Transformer Nets

Introduction

This is the TensorFlow implementation of the NIPS 2016 paper "Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision".

Re-implemented by Xinchen Yan, Arkanath Pathak, Jasmine Hsu, Honglak Lee

Reference: Original implementation in Torch

How to run this code

This implementation can be run locally or distributed across multiple machines/tasks. When running in a distributed fashion, you will need to set the task number flag for each task. Please refer to the original paper for parameter explanations and training details.

Installation

TensorFlow
- This code requires the latest open-source TensorFlow, which you will need to build manually. The TensorFlow documentation provides the steps required for that.
Bazel
- Follow the instructions here.
- Alternatively, download Bazel from https://github.com/bazelbuild/bazel/releases for your system configuration.
- Check the Bazel version with this command: bazel version
matplotlib
- Follow the instructions here.
- You can use a package repository like pip.
scikit-image
- Follow the instructions here.
- You can use a package repository like pip.
PIL
- Install from here.

Dataset

This code requires the dataset to be in tfrecords format with the following features:

image
- Flattened list of images (float representations), one for each viewpoint.
mask
- Flattened list of image masks (float representations), one for each viewpoint.
vox
- Flattened list of voxels (float representations) for the object.
- This is needed for using vox loss and for prediction comparison.
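
As a rough illustration of what "flattened" means for these features, the sketch below reshapes a flat list of floats back into per-view images. The view count and image dimensions are made-up values for the example, not the dataset's actual shapes, and `unflatten_views` is a hypothetical helper that is not part of this repository. It also assumes a row-major (views, height, width, channels) layout, which this README does not state.

```python
def unflatten_views(flat, num_views, height, width, channels):
    """Split a flat list of floats into num_views nested lists of shape
    height x width x channels, assuming row-major order (an assumption)."""
    per_view = height * width * channels
    if len(flat) != num_views * per_view:
        raise ValueError(
            "expected %d values, got %d" % (num_views * per_view, len(flat)))
    views = []
    for v in range(num_views):
        # Values belonging to view v.
        chunk = flat[v * per_view:(v + 1) * per_view]
        image = [
            [chunk[(r * width + c) * channels:(r * width + c + 1) * channels]
             for c in range(width)]
            for r in range(height)
        ]
        views.append(image)
    return views


# Toy example: 2 views of 2x2 single-channel "images" (hypothetical sizes).
flat_image = [float(i) for i in range(8)]
views = unflatten_views(flat_image, num_views=2, height=2, width=2, channels=1)
```

The same idea would apply to the mask and vox features, with their own dimensions.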

You can download the ShapeNet Dataset in tfrecords format from here^*.

^* Disclaimer: This data is hosted personally by Arkanath Pathak for non-commercial research purposes. Please cite the ShapeNet paper in your works when using ShapeNet for non-commercial research purposes.

Pretraining: pretrain_rotator.py for each RNN step

$ bazel run -c opt :pretrain_rotator -- --step_size={} --init_model={}

Set init_model to the checkpoint path of the model trained in the previous step. You'll also need to set the inp_dir flag to the directory where your data resides.
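
One way to keep the per-step chaining straight is to generate the commands from a list, as in the minimal sketch below. It only builds argument lists and does not run anything. The step sizes and checkpoint paths are placeholders, `pretrain_commands` is a hypothetical helper, and omitting init_model on the first step is an assumption rather than documented behavior.

```python
def pretrain_commands(step_sizes, inp_dir, checkpoints):
    """Build one `bazel run` argument list per RNN step.

    checkpoints[i] is the (hypothetical) checkpoint path produced by step i.
    Every step after the first is initialized from the previous checkpoint.
    """
    if len(step_sizes) != len(checkpoints):
        raise ValueError("need one checkpoint path per step size")
    commands = []
    for i, step_size in enumerate(step_sizes):
        cmd = ["bazel", "run", "-c", "opt", ":pretrain_rotator", "--",
               "--step_size=%d" % step_size,
               "--inp_dir=%s" % inp_dir]
        if i > 0:
            # Assumption: the first step starts without an init_model.
            cmd.append("--init_model=%s" % checkpoints[i - 1])
        commands.append(cmd)
    return commands


# Placeholder values for illustration only.
cmds = pretrain_commands(
    step_sizes=[1, 2],
    inp_dir="/path/to/data",
    checkpoints=["/path/to/ckpt_step1", "/path/to/ckpt_step2"])
for cmd in cmds:
    print(" ".join(cmd))
```

The checkpoint from the final pretraining step is the one to pass as init_model when running train_ptn.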

Training: train_ptn.py with last pretrained model.

$ bazel run -c opt :train_ptn -- --init_model={}

Example TensorBoard Visualizations

To compare visualizations across runs, make sure to set a different value of the model_name flag for each parameter setting.
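
A simple way to guarantee distinct names is to derive model_name from the settings themselves. The sketch below is one possible convention, not something the scripts require: `make_model_name` is a hypothetical helper, and the keys `lr` and `bs` are illustrative labels rather than flags of this code.

```python
def make_model_name(prefix, settings):
    """Return a name such as 'ptn_bs32_lr0.001' from a dict of settings.

    Keys are sorted so the same settings always yield the same name.
    """
    parts = [prefix]
    for key in sorted(settings):
        parts.append("%s%s" % (key, settings[key]))
    return "_".join(parts)


# Two runs that differ in one setting get two different names.
name_a = make_model_name("ptn", {"lr": 0.001, "bs": 32})
name_b = make_model_name("ptn", {"lr": 0.0001, "bs": 32})
flag_a = "--model_name=%s" % name_a
```

Each run then appears under its own name, which makes the curves easier to tell apart in TensorBoard.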

This code adds summaries for each loss. For instance, these are the losses we encountered in distributed pretraining on the ShapeNet Chair dataset with 10 workers and 16 parameter servers:

After fine-tuning, you can expect images like these as "grid_vis" under Image summaries in TensorBoard. Here the third and fifth columns are the predicted masks and voxels respectively, alongside their ground truth values.

A similar image from a model trained on all ShapeNet categories (voxel visualizations might be skewed):
