Finding the Next Best View for Object Recognition through Maximum Entropy Viewpoint Selection

A collection of scripts related to my Master's thesis - a method for finding the most informative camera positions for multiview object recognition.

Thesis Report PDF: https://drive.google.com/file/d/1bxV0k1IZEmBeeNDRbrTXAjw1fXbQj_C8/view (soon to be published at https://fse.studenttheses.ub.rug.nl/31411/)

There are two methods: one based on differentiable rendering and one based on point clouds (see thesis report for details). They are both implemented in PyTorch.

Setup 🧑‍🔧

Create a conda environment:

conda create --name nbv_mevs_env python=3.8
conda activate nbv_mevs_env

Install PyTorch. The important part is to use a version that provides the entr function (available as torch.special.entr in recent releases):

pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu111/torch_nightly.html
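For context on why the entr function matters: torch.special.entr computes -x * ln(x) elementwise, returning 0 at x = 0 and negative infinity for negative inputs. Summing it over a probability vector gives the Shannon entropy in nats. The pure-Python sketch below mirrors that definition so it runs without PyTorch. The function names are illustrative and not part of this repository, and it is not the viewpoint selection method from the thesis.

```python
import math


def entr(x: float) -> float:
    """Entropy term: -x * ln(x) for x > 0, 0 at x == 0, -inf for x < 0."""
    if x > 0:
        return -x * math.log(x)
    if x == 0:
        return 0.0
    return -math.inf


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a probability vector, as a sum of entr terms."""
    return sum(entr(p) for p in probs)


# A uniform distribution over 4 classes has the maximum entropy, ln(4).
uniform = [0.25, 0.25, 0.25, 0.25]
# A confident prediction has much lower entropy.
peaked = [0.97, 0.01, 0.01, 0.01]

print(shannon_entropy(uniform))
print(shannon_entropy(peaked))
```

With PyTorch installed, the same quantity for a tensor of probabilities would be `torch.special.entr(p).sum(dim=-1)`.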

Install other dependencies using pip and conda (I would have preferred to use only conda, but the neural_renderer is not available there):
```
pip install -r requirements_pip.txt
conda install --file requirements.txt
```

You might need a separate conda environment for the method based on point clouds, since it was developed with a different version of PyTorch. For that, check out the am/thesis branch on my fork of PointNet2_PyTorch. The pipeline script (below) will work in either environment. The graph triangulation script build_graph_from_spherical_coords (which uses stripy) might also need a separate environment.

Usage 🧑‍💻

The main script is classification_pipeline in the pipeline directory. Given the paths of the object mesh, the checkpoint files and the desired method, it will run the pipeline for the given method. For more info, run:

python3 pipeline/classification_pipeline.py --help

The script evaluate_pipeline was used to run the pipeline on the entire test set.

Datasets are not tracked and can be obtained by running the scripts in the generate_datasets directory. You can get more info by running each script with the --help flag, e.g.:

python3 generate_datasets/generate_view_dataset.py --help

Note that you need ModelNet10 downloaded and extracted (classification_pipeline assumes it is in ~/datasets/ModelNet10). You can get it here.

When training, caching is used (with lmdb) to speed up loading data. However, the cache database files can get quite large for images (in my experience, 10-20 times the size of a png dataset), so make sure you have plenty of disk space.
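Before building the cache, a rough check like the one below can tell you whether the disk is likely to have enough room. It is only a sketch: it assumes the cache grows to about 20 times the size of the source image directory (the upper end of the range mentioned above), and the function names and the `factor` parameter are hypothetical, not part of this repository.

```python
import os
import shutil


def dir_size_bytes(root: str) -> int:
    """Total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def cache_fits(dataset_dir: str, cache_dir: str, factor: float = 20.0) -> bool:
    """Return True if the estimated cache size fits in the free space at cache_dir.

    The estimate is dataset size * factor; factor=20 is an assumed worst case.
    """
    estimated = dir_size_bytes(dataset_dir) * factor
    free = shutil.disk_usage(cache_dir).free
    return estimated <= free
```

For example, `cache_fits("path/to/png_dataset", ".")` compares the estimate against the free space on the disk holding the current directory.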

You might need to prepend PYTHONPATH=. to the commands for the imports to work.

Visualization 📊

The figures in the thesis can be generated using the scripts in the visualization directory. The output will be saved in the assets directory.

License

MIT License
