Sebastian Weiss, Philipp Hermüller, Rüdiger Westermann
This repository contains the code and settings to reproduce all figures (and more) from the paper. https://arxiv.org/abs/2112.01579
- NVIDIA GPU with RTX, e.g. RTX20xx or RTX30xx (we use an RTX2070)
- CUDA 11
- OpenGL with GLFW and GLM
- Python 3.8 or higher, see applications/env.txt for the required packages
Tested systems:
- Windows 10, Visual Studio 2019, CUDA 11.1, Python 3.9, PyTorch 1.9.0
- Ubuntu 20.04, gcc 9.3.0, CUDA 11.1, Python 3.8, PyTorch 1.8
The project consists of a C++/CUDA part that has to be compiled first:
- renderer: the renderer static library, see below for noteworthy files. Files ending in .cuh and .cu are CUDA kernel files.
- bindings: entry point to the Python bindings. After compilation, this yields the Python extension module pyrenderer, placed in bin.
- gui: the interactive GUI to design the config files and to explore the reference datasets and the trained networks. Requires OpenGL.
For compilation, we recommend CMake. For running on a headless server, specify -DRENDERER_BUILD_OPENGL_SUPPORT=Off -DRENDERER_BUILD_GUI=Off.
Alternatively, compile-library-server.sh is provided for compilation with the built-in extension compiler of PyTorch. We use this for compilation on our headless GPU server, as it avoids picking up wrong dependencies when different CUDA, Python or PyTorch versions coexist in separate virtualenvs or conda environments.
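For scripted builds, the configure step can be assembled programmatically. The following is only a sketch: the function name cmake_configure_command and the extra_defines parameter are hypothetical helpers, not part of this repository. The two headless options are the ones named above, and -S / -B are the standard CMake options for the source and build directories.

```python
import shlex
from pathlib import Path


def cmake_configure_command(source_dir, build_dir, headless=False, extra_defines=None):
    """Assemble a CMake configure command as an argument list.

    headless=True adds the two options this README names for servers
    without OpenGL. extra_defines is a hypothetical convenience for
    passing further -D options as a dict.
    """
    cmd = ["cmake", "-S", str(Path(source_dir)), "-B", str(Path(build_dir))]
    if headless:
        cmd += [
            "-DRENDERER_BUILD_OPENGL_SUPPORT=Off",
            "-DRENDERER_BUILD_GUI=Off",
        ]
    for key, value in (extra_defines or {}).items():
        cmd.append(f"-D{key}={value}")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(cmake_configure_command(".", "build", headless=True)))
```

The returned list can be passed to subprocess.run(cmd, check=True), followed by cmake --build build to compile. Run it from the console in which the intended Python environment is active.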
After compiling the C++ library, the network training and evaluation are performed in Python. The Python files are all found in applications:
- applications/volumes: the volumes used in the ablation studies
- applications/config-files: the config files
- applications/common: common utilities, especially utils.py for loading the pyrenderer library and other helpers
- applications/losses: the loss functions, including SSIM and LPIPS
- applications/volnet: the main network code for training and inference, see below.
To generate a stubfile with all the Python functions exposed by the renderer, launch python applications/common/create_sub.py
- Wrong Python version used: make sure to run cmake in the same console where the correct Python version is selected via conda or virtualenv. This must be done on a clean build folder, as the paths to the Python libraries are cached in the build/CMakeCache.txt file.
- Wrong PyTorch version used or no PyTorch found: sometimes, the automatic query of the PyTorch installation folder from the current Python installation fails. You can manually specify the path to PyTorch by passing -DTORCH_PATH=... as an argument to CMake.
- Libraries torch_cuda_cpp and torch_cuda_cu are not found: some PyTorch installations don't split the CUDA kernels into different libraries, but have a single monolithic library torch_cuda. In that case, simply remove torch_cuda_cpp and torch_cuda_cu from https://github.com/shamanDevel/fV-SRN/blob/master/CMakeLists.txt#L125. If you know a way to query whether PyTorch was built with the CUDA kernels split into multiple libraries, please write me or open an issue. Thanks!
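To find out which of the two layouts a given PyTorch installation uses, one can look at the shared libraries it ships. The sketch below is not a CMake solution and not part of this repository. It assumes that the installation keeps its libraries in the lib folder next to the torch package, and the function names are hypothetical.

```python
import os
import re

# Matches e.g. libtorch_cuda.so, libtorch_cuda_cpp.so, torch_cuda_cu.dll,
# but not unrelated libraries such as libtorch_cuda_linalg.so.
_CUDA_LIB = re.compile(r"^(?:lib)?(torch_cuda(?:_cpp|_cu)?)\.(?:so|dll|lib|dylib)")


def find_torch_cuda_libs(lib_dir):
    """Return the sorted base names of the torch_cuda* libraries in lib_dir."""
    names = set()
    for entry in os.listdir(lib_dir):
        match = _CUDA_LIB.match(entry)
        if match:
            names.add(match.group(1))
    return sorted(names)


def cuda_libs_are_split(lib_dir):
    """True if both torch_cuda_cpp and torch_cuda_cu are present."""
    names = find_torch_cuda_libs(lib_dir)
    return "torch_cuda_cpp" in names and "torch_cuda_cu" in names


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        # Assumption: the shared libraries live in <torch package>/lib.
        lib_dir = os.path.join(os.path.dirname(torch.__file__), "lib")
        if os.path.isdir(lib_dir):
            print(find_torch_cuda_libs(lib_dir), cuda_libs_are_split(lib_dir))
        else:
            print("No lib folder found next to the torch package")
```

If cuda_libs_are_split reports False and only torch_cuda is listed, the manual removal described above applies.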
Here we list and explain noteworthy files that contain important aspects of the presented method.
On the side of the C++/CUDA library, the following files in renderer/ are important. Note that multiple implementations exist for the various modules, e.g. for the transfer function (TF). Therefore, the CUDA kernels are assembled on demand using NVRTC runtime compilation.
- Image evaluators (iimage_evaluator.h), the entry point to the renderer. Only one implementation:
  - image_evaluator_simple.h, renderer_image_evaluator_simple.cuh: contains the loop over the pixels and generates the rays from the camera, possibly multisampled for Monte Carlo
- Ray evaluators (iray_evaluation.h), called per ray and returning the colors. They call the volume implementation to fetch the density.
  - ray_evaluation_stepping.h, renderer_ray_evaluation_stepping_iso.cuh, renderer_ray_evaluation_stepping_dvr.cuh: constant stepping for isosurfaces and DVR
  - ray_evaluation_monte_carlo.h: Monte Carlo path tracing with multiple bounces, delta tracking and various phase functions
- Volume interpolations (volume_interpolation.h). On the CUDA side, implementations provide a functor that evaluates a position and returns the density or color at that point.
  - Grid interpolation (volume_interpolation_grid.h): trilinear interpolation into a voxel grid stored in volume.h
  - Scene Reconstruction Networks (volume_interpolation_network.h): the SRNs as presented in the paper. See the header for the binary format of the .volnet file. The proposed tensor core implementation (Sec. 4.1) can be found in renderer_volume_tensorcores.cuh
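To illustrate what the grid interpolation functor computes, here is a minimal pure-Python sketch of trilinear interpolation. It is not the CUDA code from volume_interpolation_grid.h. The function names, the nested-list grid layout grid[x][y][z] and the clamping at the border are assumptions made for this example.

```python
import math


def _lerp(a, b, t):
    """Linear interpolation between a and b with weight t in [0, 1]."""
    return a + (b - a) * t


def _cell(p, n):
    """Clamp p to [0, n-1]; return lower index, upper index, fractional weight."""
    p = min(max(float(p), 0.0), float(n - 1))
    i0 = min(int(math.floor(p)), max(n - 2, 0))
    i1 = min(i0 + 1, n - 1)
    return i0, i1, p - i0


def trilinear_sample(grid, x, y, z):
    """Sample grid[ix][iy][iz] at a continuous position given in voxel units."""
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    x0, x1, fx = _cell(x, nx)
    y0, y1, fy = _cell(y, ny)
    z0, z1, fz = _cell(z, nz)
    # Interpolate along z first, then along y, then along x.
    c00 = _lerp(grid[x0][y0][z0], grid[x0][y0][z1], fz)
    c01 = _lerp(grid[x0][y1][z0], grid[x0][y1][z1], fz)
    c10 = _lerp(grid[x1][y0][z0], grid[x1][y0][z1], fz)
    c11 = _lerp(grid[x1][y1][z0], grid[x1][y1][z1], fz)
    return _lerp(_lerp(c00, c01, fy), _lerp(c10, c11, fy), fx)


# Example: a 2x2x2 grid holding the linear field x + 2y + 4z.
grid = [[[x + 2 * y + 4 * z for z in range(2)] for y in range(2)] for x in range(2)]
print(trilinear_sample(grid, 0.5, 0.5, 0.5))  # 3.5
```

Trilinear interpolation reproduces linear fields exactly, which is why the cell center evaluates to 0.5 + 2 * 0.5 + 4 * 0.5 = 3.5.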
On the Python side in applications/volnet/, the following files are important:
- train_volnet.py: the entry point for training
- inference.py: the entry point for inference, used in the scripts for evaluation. Also converts trained models into the binary format for the GUI
- network.py: the SRN network specification
- input_data.py: the loader of the input grids, possibly time-dependent
- training_data.py: world- and screen-space data loaders, contains routines for importance sampling / adaptive resampling. The rejection sampling is implemented in CUDA for performance and called from here
- raytracing.py: differentiable raytracing in PyTorch, including the memory optimization from Weiss & Westermann 2021, DiffDVR
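The importance sampling in training_data.py is based on rejection sampling. As a rough illustration of that general technique, the following sketch draws positions in the unit cube with a probability proportional to a density function. It is not the CUDA implementation used here, and the names rejection_sample, density, max_density and max_tries are hypothetical.

```python
import random


def rejection_sample(density, max_density, count, seed=0, max_tries=1_000_000):
    """Draw up to `count` points in [0, 1)^3 with probability proportional to density.

    density(p) must return a value in [0, max_density] for a point p = (x, y, z).
    A candidate is accepted with probability density(p) / max_density.
    Fewer than `count` points are returned if max_tries is exhausted.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(max_tries):
        if len(samples) == count:
            break
        p = (rng.random(), rng.random(), rng.random())
        if rng.random() * max_density < density(p):
            samples.append(p)
    return samples


if __name__ == "__main__":
    # Toy density: only the half of the cube with x < 0.5 is occupied.
    def half_cube(p):
        return 1.0 if p[0] < 0.5 else 0.0

    points = rejection_sample(half_cube, 1.0, 5)
    print(len(points), all(p[0] < 0.5 for p in points))
```

Regions with zero density never receive samples in this form. How the actual sampler treats such regions is not covered by this sketch.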
The training is launched via applications/volnet/train_volnet.py. Have a look at python train_volnet.py --help for the available command line parameters.
A typical invocation looks like this (this is how fV-SRN with Ejecta from Fig. 1 was trained):
python train_volnet.py
config-files/ejecta70-v6-dvr.json
--train:mode world # instead of 'screen', Sec. 5.4
--train:samples 256**3
--train:sampler_importance 0.01 # importance sampling based on the density, optional, see Section 5.3
--train:batchsize 64*64*128
--rebuild_dataset 51 # adaptive resampling after 51 epochs, see Section 5.3
--val:copy_and_split # for validation, use 20% of training samples
--outputmode density:direct # instead of e.g. 'color', Sec. 5.3
--lossmode density
--layers 32:32:32 # number of hidden feature layers -> that number + 1 for the number of linear layers / weight matrices.
--activation SnakeAlt:2
--fouriercount 14
--fourierstd -1 # -1 indicates NeRF-construction, positive values indicate sigma for random Fourier Features, see Sec. 5.5
--volumetric_features_resolution 32 # the grid specification, see Sec. 5.2
--volumetric_features_channels 16
-l1 1 #use L1-loss with weight 1
--lr 0.01
--lr_step 100 #lr reduction after 100 epochs, default lr is used
-i 200 # number of epochs
--save_frequency 20 # checkpoints + test visualization
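The arithmetic expressions in this invocation can be expanded to get a feeling for the sizes involved. The snippet below only evaluates the numbers from the command line above. It assumes that the validation split is ignored and that the latent grid is cubic with the given resolution per axis. Neither is confirmed by the training script.

```python
samples = 256 ** 3                  # --train:samples
batch_size = 64 * 64 * 128          # --train:batchsize
batches_per_epoch = samples // batch_size  # ignores the validation split

hidden_layers = [int(c) for c in "32:32:32".split(":")]  # --layers
linear_layers = len(hidden_layers) + 1  # hidden feature layers + 1

grid_resolution = 32                # --volumetric_features_resolution
grid_channels = 16                  # --volumetric_features_channels
latent_values = grid_resolution ** 3 * grid_channels  # assumes a cubic grid

print(samples, batch_size, batches_per_epoch, linear_layers, latent_values)
# 16777216 524288 32 4 524288
```

Under these assumptions, one pass over the 256^3 samples corresponds to 32 batches, and the latent grid holds 524288 feature values.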
After training, the resulting .hdf5 file contains the network weights and the latent grid and can be compiled to our binary format via inference.py. The resulting .volnet file can then be loaded in the GUI.
Each figure is associated with a respective script in applications/volnet. Those scripts include the training of the networks, evaluation, and plot generation. They have to be launched with the current path pointing to applications/. Note that some of those scripts take multiple hours due to the network training.
- Figure 1, teaser: applications/volnet/eval_CompressionTeaser.py
- Table 1, possible architectures: applications/volnet/collect_possible_layers.py
- Section 4.2, change to performance due to grid compression: applications/volnet/eval_VolumetricFeatures_GridEncoding
- Figure 4, performance of the networks: applications/volnet/eval_NetworkConfigsGrid.py
- Figure 5+6, latent grid, also includes other datasets: applications/volnet/eval_VolumetricFeatures.py
- Figure 7, Fourier features: applications/volnet/eval_Fourier_Grid.py, includes the datasets not shown in the paper for space reasons
- Figure 8, density-vs-color:
  - applications/volnet/eval_world_DensityVsColorGrid_NoImportance.py: without initial importance sampling and adaptive resampling (Fig. 6)
  - applications/volnet/eval_world_DensityVsColorGrid.py: includes initial importance sampling, not shown
  - applications/volnet/eval_world_DensityVsColorGrid_WithResampling.py: with initial importance sampling and adaptive resampling, improvement reported in Section 5.4
- Figure 9, gradient prediction: applications/volnet/eval_GradientNetworks1_v2.py
- Figure 10, curvature prediction: applications/volnet/eval_CurvatureNetworks2.py
- Figure 11, 12, comparison with baseline methods: applications/volnet/eval_CompressionExtended.py
- Figure 13, 14, time-dependent fields:
  - applications/volnet/eval_TimeVolumetricFeatures.py: train on every fifth timestep
  - applications/volnet/eval_TimeVolumetricFeatures2.py: train on every second timestep
  - applications/volnet/eval_TimeVolumetricFeatures_plotPaper.py: assembles the plot for Figure 14
Supplementary Paper:
- Section 1, study on the activation functions: applications/volnet/eval_ActivationFunctions.py
- Table 2, Figure 1, screen-vs-world: applications/volnet/eval_ScreenVsWorld_GridNeRF.py
- Figure 2-6, ablation study for gradients: applications/volnet/eval_GradientNetworks1_v2.py, applications/volnet/eval_GradientNetworks2.py
- Figure 8, comparison with baseline methods: applications/volnet/eval_CompressionExtended.py
The other eval_*.py scripts were cut from the paper due to space limitations. They match the tests above, except that no grid was used. Instead, they use the largest possible networks that fit into the TC architecture.
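Since the figure scripts expect applications/ as the current path, a small launcher can set the working directory explicitly. This is a sketch under assumptions: the helper name eval_script_invocation is hypothetical, and it assumes the scripts can be started directly by file path with the active Python interpreter.

```python
import sys
from pathlib import Path


def eval_script_invocation(repo_root, script_name):
    """Return (command, working_directory) for one figure script.

    The working directory is <repo_root>/applications and the script is
    addressed relative to it, as volnet/<script_name>.
    """
    app_dir = Path(repo_root) / "applications"
    command = [sys.executable, str(Path("volnet") / script_name)]
    return command, app_dir


if __name__ == "__main__":
    command, cwd = eval_script_invocation(".", "eval_CompressionTeaser.py")
    # Only print here; the scripts can run for several hours.
    print(command, "in", cwd)
```

To actually start a script, pass both values to subprocess.run(command, cwd=cwd, check=True).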
By default, building the other compression algorithms (TThresh, cudaCompress) for comparisons is disabled. They can be quite tricky to compile.
To enable them, set the CMake option RENDERER_BUILD_COMPRESSION to ON (pass -DRENDERER_BUILD_COMPRESSION=ON to cmake). Then re-run cmake, compile the whole project, extract the stubs anew (python applications/common/create_sub.py), and you can run the comparisons.
To measure the memory demands of the various compression algorithms, I had to hack new, malloc(), delete and free(). This can lead to build errors in Eigen, such as "std::free does not point to a function" or similar.
In this case, you have to patch the file third-party/cuMat/third-party/Eigen/src/Core/util/Memory.h and replace all occurrences of std::free by free and std::malloc by malloc (i.e., remove the explicit std:: namespace).
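The replacement can also be scripted. The following is a minimal sketch, not a tool shipped with this repository. The function names are hypothetical, and it assumes the header can be read as UTF-8 text. A backup copy with the suffix .orig is written next to the patched file.

```python
import re
from pathlib import Path

# Matches std::free and std::malloc as whole identifiers only,
# so e.g. std::realloc is left untouched.
_STD_ALLOC = re.compile(r"\bstd::(free|malloc)\b")


def strip_std_namespace(source):
    """Replace std::free by free and std::malloc by malloc.

    Returns the patched text and the number of replacements.
    """
    return _STD_ALLOC.subn(r"\1", source)


def patch_file(path):
    """Patch a file in place, keeping a .orig backup; return the replacement count."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    patched, count = strip_std_namespace(original)
    if count:
        path.with_name(path.name + ".orig").write_text(original, encoding="utf-8")
        path.write_text(patched, encoding="utf-8")
    return count


if __name__ == "__main__":
    text, n = strip_std_namespace("void* p = std::malloc(8); std::free(p);")
    print(n, text)  # 2 void* p = malloc(8); free(p);
```

Calling patch_file on third-party/cuMat/third-party/Eigen/src/Core/util/Memory.h from the repository root applies the change described above.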