diff --git a/.env.example b/.env.example index 60e9e0b..dd62a26 100644 --- a/.env.example +++ b/.env.example @@ -20,3 +20,6 @@ RABBITMQ_DEFAULT_PASS="your_password" RABBITMQ_IP="rabbitmq" # Any changes to Database or RabbitMQ ip address should be in configs/docker_out.json + +# Use GPU acceleration for SfM reconstruction (useful for high-resolution inputs) +SFM_USE_GPU = 0 \ No newline at end of file diff --git a/.gitignore b/.gitignore index 0020ba7..6455871 100644 --- a/.gitignore +++ b/.gitignore @@ -6,4 +6,6 @@ Pipfile # local env files # Do not commit any .env files to git, except for the .env.example file. .env -.env*.local \ No newline at end of file +.env*.local + +*.log diff --git a/.gitmodules b/.gitmodules index b26acc0..865147a 100644 --- a/.gitmodules +++ b/.gitmodules @@ -1,3 +1,9 @@ -[submodule "NeRF/TensoRF"] - path = NeRF/TensoRF - url = git@github.com:NeRF-or-Nothing/TensoRF.git +[submodule "go-web-server"] + path = go-web-server + url = https://github.com/NeRF-or-Nothing/go-web-server.git +[submodule "nerf-worker"] + path = nerf-worker + url = https://github.com/NeRF-or-Nothing/nerf-worker.git +[submodule "sfm-worker"] + path = sfm-worker + url = https://github.com/NeRF-or-Nothing/sfm-worker.git diff --git a/README.md b/README.md index 7d28839..199cdba 100644 --- a/README.md +++ b/README.md @@ -13,28 +13,28 @@ Logo -

NeRF or Nothing core repository

+

NeRF or Nothing backend core repository

A micro-services based project in rendering novel perspectives of input videos utilizing neural radiance fields.
- + Learn more about NeRFs »

- View Demo + View Demo · - Report Bug + Report Bug · - Request Feature + Request Feature

## About The Project -This repository contains the backend for the NeRf (Neural Radiance Fields) or Nothing +This repository contains the backend for the NeRF-or-Nothing (Neural Radiance Fields) web application that takes raw user video and renders a novel realistic view of the scene they captured. Neural Radiance Fields are a new technique in novel view synthesis that has recently reached state of the art results. @@ -59,6 +59,49 @@ the locations for each image are needed in order to train a NeRF, we get this data from running structure from motion (using COLMAP) on the input video. To learn more please visit the learning resources in the wiki. +## Gaussian Splatting Background +Gaussian splatting is a novel approach to neural scene representation that offers significant +improvements over traditional Neural Radiance Fields (NeRFs) in terms of rendering speed and +visual quality. Like NeRFs, Gaussian splatting starts with a set of input images capturing +different perspectives of the same scene, along with their corresponding camera positions and orientations. + +The key difference lies in how the scene is represented and rendered: + +1. **Scene Representation**: Instead of using a neural network to model the entire scene, Gaussian + splatting represents the scene as a collection of 3D Gaussian primitives. Each Gaussian + is defined by its position, covariance matrix (which determines its shape and orientation), and + appearance attributes (color and opacity). + +2. **Initialization**: The process begins by running structure from motion (using tools like COLMAP) on + the input images to obtain initial camera parameters and a sparse point cloud. This point + cloud is used to initialize the Gaussian primitives. + +3. **Training**: The system then optimizes these Gaussians to best reproduce the input images. This + involves adjusting the Gaussians' positions, shapes, and appearance attributes. + The training process is typically faster than NeRF training and can be done end-to-end using gradient descent. + +4. **Rendering**: To generate a new view, the Gaussians are projected onto the image plane of the +virtual camera. Each Gaussian splat contributes to the final image based on its projected size, shape, +and appearance. This process is highly parallelizable and can be efficiently implemented on GPUs, +resulting in real-time or near-real-time rendering speeds. + +5. **View-dependent Effects**: Gaussian splatting can model view-dependent effects by incorporating +additional per-Gaussian parameters (typically spherical harmonic coefficients), allowing realistic +representation of specular highlights and reflections. To take advantage of this, use .ply files; for quick, reflection-free +rendering, use .splat files. + +The resulting representation is compact, efficient to render, and capable of producing high-quality novel views. +Importantly, like NeRFs, Gaussian splatting requires accurate camera positions for the input images, + which are typically obtained through structure from motion techniques. + +Gaussian splatting offers several advantages over traditional NeRFs: +- Faster training times +- Real-time or near-real-time rendering of novel views +- Better preservation of fine details and sharp edges +- More compact scene representation + +To learn more about Gaussian splatting and its implementation details, please refer to the learning resources in the wiki. An illustrative code sketch is included after the pipeline overview below. + ### General Pipeline: 1. Run Structure from motion on input video (using COLMAP implementation) to @@ -81,20 +124,36 @@ aforementioned folders. 
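To make the Gaussian Splatting Background above a little more concrete, here is a minimal, illustrative sketch of a single Gaussian primitive and its contribution to one pixel. This is not the project's actual implementation (the backend presumably relies on the CUDA rasterizer from the gaussian-splatting project linked in the acknowledgments); the `Gaussian3D`, `project`, and `splat_at_pixel` names are hypothetical, colors are kept view-independent (no spherical harmonics), and a simple single-focal-length pinhole camera is assumed.

```
# Toy sketch of 3D Gaussian splatting, NumPy only (not the project's code).
from dataclasses import dataclass
import numpy as np

@dataclass
class Gaussian3D:
    position: np.ndarray    # (3,) world-space center
    covariance: np.ndarray  # (3, 3) shape and orientation of the primitive
    color: np.ndarray       # (3,) RGB, view-independent here for simplicity
    opacity: float          # base opacity in [0, 1]

def project(g, world_to_cam, focal):
    """Project the Gaussian's center and covariance into 2D screen space
    using a first-order (Jacobian) approximation of the pinhole camera."""
    p = world_to_cam[:3, :3] @ g.position + world_to_cam[:3, 3]
    x, y, z = p
    mean_2d = focal * p[:2] / z
    # Jacobian of the perspective projection evaluated at the center
    J = np.array([[focal / z, 0.0, -focal * x / z**2],
                  [0.0, focal / z, -focal * y / z**2]])
    R = world_to_cam[:3, :3]
    cov_2d = J @ R @ g.covariance @ R.T @ J.T
    return mean_2d, cov_2d

def splat_at_pixel(g, mean_2d, cov_2d, pixel):
    """Alpha contribution of one splat at one pixel: opacity times the 2D
    Gaussian falloff. A full renderer composites depth-sorted splats front
    to back: C = sum_i alpha_i * c_i * prod_{j<i} (1 - alpha_j)."""
    d = np.asarray(pixel, dtype=float) - mean_2d
    alpha = g.opacity * float(np.exp(-0.5 * d @ np.linalg.inv(cov_2d) @ d))
    return alpha, g.color

# Toy usage: one Gaussian placed one unit in front of an identity camera.
g = Gaussian3D(np.array([0.0, 0.0, 1.0]), 0.01 * np.eye(3),
               np.array([0.8, 0.2, 0.2]), 0.9)
mean_2d, cov_2d = project(g, np.eye(4), focal=500.0)
print(splat_at_pixel(g, mean_2d, cov_2d, pixel=[5.0, 0.0]))
```

A real renderer depth-sorts the splats per screen tile and alpha-composites them front to back on the GPU, which is what makes the real-time rendering speeds mentioned above possible.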
 ## Getting Started -To run the project install and run the web-server, the nerf worker, and the -colmap worker in any order by running their respective installations in their -READMEs. Once these are running the front-end can be started by visiting the -[front end repo](https://github.com/NeRF-or-Nothing/web-app). Once everything is -running the website should be available at `localhost:3000` and a video can -be uploaded to test the application. - ### Prerequisites 1. Have [Docker](https://www.docker.com/) installed locally -2. Install [COLMAP](https://colmap.github.io/) -3. Install [ffmpeg](https://ffmpeg.org/) -4. If you intend to run the NeRF and COLMAP workers locally ensure you have -NVIDIA GPUS with atleast 6GB of vram as these are resource intensive applications +2. Have a CUDA 11.7+ capable NVIDIA GPU (required to run training) +3. Follow the service prerequisites: + - [go-web-server]() + - [sfm-worker]() + - [nerf-worker]() **IMPORTANT READ** + +### Installation + +The project should be easy to install and run once you have completed the respective prerequisites. +The files `./docker-compose-go.yml` and `docker-compose-flask.yml` handle the setup depending on whether you want to run +V3 or V2 of the API, respectively. + +1. Clone this repository + ``` + git clone https://github.com/NeRF-or-Nothing/backend.git + ``` + +2. Compose the backend with the compose file for your chosen API version (`docker-compose-go.yml` for V3, `docker-compose-flask.yml` for V2). View the in-depth [instructions]() + ``` + docker compose -f docker-compose-go.yml up -d + ``` + +3. Follow the [frontend](https://github.com/NeRF-or-Nothing/frontend) installation. + +Once everything is running, the website should be available at `localhost:5173` and a video can +be uploaded to test the application. + ## Output Example @@ -105,7 +164,18 @@ dataset lego example to a video then running vidtonerf produces the following re ## Roadmap -TODO +- **Deployment**: The team has been toying with the idea of deploying for years now. In order to do so we need to get production ready: + 1. More request verification + 2. Reverse proxy + 3. TLS/SSL frontend + 4. Lock down communication +- **Colmap**: COLMAP is notoriously hard to please, and we should investigate how to make it more tolerant of user videos. See [Colmap Brainstorming]() to get started. +- **Expand functionality**: We could possibly expand into a more general-purpose deep-learning-powered video app. Some possibilities: + 1. Stylized Text-to-Scene: Recent research on text-based scene generation has shown rapid progress on stylized/themed scene generation +- **Testing and Cleanup**: We can always improve our codebase by implementing further testing. +- **CI/CD Pipelines**: Upon successful deployment we could set up dedicated testing pipelines. This would be a big stretch. For now, we could create + workflows to ensure code quality, security, and testing coverage for lighter parts of the system. +- **Docker Hub Image Generation**: Setting up image generation would allow users to easily start their own instance without build hassles. ## Contributing @@ -113,7 +183,7 @@ Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**. If you have a suggestion that would make this better, please fork the repo and -create a pull request. +create a pull request. Please go to the relevant repository and follow this process. 1. Fork the Project 2. 
Create your Feature Branch (`git checkout -b feature/AmazingFeature`) @@ -136,16 +206,17 @@ Or, inquire at: `nerf@quicktechtime.com` ## Acknowledgments * [TensoRF project](https://github.com/apchenstu/TensoRF) +* [Gaussian Splatting](https://github.com/graphdeco-inria/gaussian-splatting) * [Original NeRF](https://github.com/bmild/nerf) * [COLMAP](https://colmap.github.io/) -[contributors-shield]: https://img.shields.io/github/contributors/NeRF-or-Nothing/vidtonerf.svg?style=for-the-badge -[contributors-url]: https://github.com/NeRF-or-Nothing/vidtonerf/graphs/contributors -[forks-shield]: https://img.shields.io/github/forks/NeRF-or-Nothing/vidtonerf.svg?style=for-the-badge -[forks-url]: https://github.com/NeRF-or-Nothing/vidtonerf/network/members -[issues-shield]: https://img.shields.io/github/issues/NeRF-or-Nothing/vidtonerf.svg?style=for-the-badge -[issues-url]: https://github.com/NeRF-or-Nothing/vidtonerf/issues -[license-shield]: https://img.shields.io/github/license/NeRF-or-Nothing/vidtonerf.svg?style=for-the-badge -[license-url]: https://github.com/NeRF-or-Nothing/vidtonerf/blob/master/LICENSE.txt +[contributors-shield]: https://img.shields.io/github/contributors/NeRF-or-Nothing/backend.svg?style=for-the-badge +[contributors-url]: https://github.com/NeRF-or-Nothing/backend/graphs/contributors +[forks-shield]: https://img.shields.io/github/forks/NeRF-or-Nothing/backend.svg?style=for-the-badge +[forks-url]: https://github.com/NeRF-or-Nothing/backend/network/members +[issues-shield]: https://img.shields.io/github/issues/NeRF-or-Nothing/backend.svg?style=for-the-badge +[issues-url]: https://github.com/NeRF-or-Nothing/backend/issues +[license-shield]: https://img.shields.io/github/license/NeRF-or-Nothing/backend.svg?style=for-the-badge +[license-url]: https://github.com/NeRF-or-Nothing/backend/blob/master/LICENSE.txt diff --git a/TensoRF/.gitignore b/TensoRF/.gitignore deleted file mode 100644 index 3f91f5f..0000000 --- a/TensoRF/.gitignore +++ /dev/null @@ -1,8 +0,0 @@ -/log -/output -data -log -Pipfile* -__pycache__ -.vscode -*.log diff --git a/TensoRF/Dockerfile b/TensoRF/Dockerfile deleted file mode 100644 index a868f64..0000000 --- a/TensoRF/Dockerfile +++ /dev/null @@ -1,26 +0,0 @@ -# NVIDIA CUDA Toolkit 12.3 for ubuntu -FROM nvidia/cuda:12.3.2-devel-ubuntu22.04 - -WORKDIR /TensoRF - -RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/3bf863cc.pub -RUN export DEBIAN_FORNTEND=noninteractive && \ - apt-get update -y && \ - apt-get install libssl-dev -y && \ - apt-get install software-properties-common -y && \ - add-apt-repository ppa:deadsnakes/ppa && \ - apt-get update -y && \ - apt-get install curl -y && \ - apt-get install python3.10 -y && \ - apt-get install python3-pip -y - -RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10 - -COPY ./TensoRF/requirements.txt requirements.txt -RUN python3.10 -m pip install --upgrade -r requirements.txt - -# Overwritten by compose -COPY . . 
- -# TODO add config support -CMD ["python3.1-0", "main.py"] \ No newline at end of file diff --git a/TensoRF/LICENSE b/TensoRF/LICENSE deleted file mode 100644 index 3eac45c..0000000 --- a/TensoRF/LICENSE +++ /dev/null @@ -1,21 +0,0 @@ -MIT License - -Copyright (c) 2022 Anpei Chen - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. diff --git a/TensoRF/README.md b/TensoRF/README.md deleted file mode 100644 index fbf1522..0000000 --- a/TensoRF/README.md +++ /dev/null @@ -1,151 +0,0 @@ -# TensoRF -### This code is based on a research project located [here](https://apchenstu.github.io/TensoRF/) with the readme included below. -A single threaded worker that runs this project consuming jobs from RabbitMQ and submitting it back to a seperate completed job queue is located at `worker.py`. This worker loads the config file at `configs/workerconfig.txt` that defines the settings TensoRF should be run at to process each job. Right now the config is static but in the future these settings can be modified based on the job. - -The worker consumes jobs from RabbitMQ described via a json template that contains the following: -``` -{ - "id": String, - "vid_width": int, - "vid_height": int, - "trained_model_file": String(optional), - "intrinsic_matrix": float[[]], - "frames": [ - { - "file_path": String - "extrinsic_matrix": float[[]] - }, - ... - ] - } - ``` - -Once the worker is done generating the trained NeRF and rendering the desired video it submits a complete forum to RabbitMQ also in the json format that contains the following: -``` -{ - "id": String, - "model_file": String, - "video_file": String -} -``` - -# Usage of Local Worker -Here are some basic instructions on how to use the worker.py in local mode: -### Running worker.py -To run worker.py to train a new TensoRF and render a new video use the command: `python worker.py --config configs/localworkerconfig.txt`. - -If you only want to render a new video from a TensoRF model that has already been trained use the command: -`python worker.py --config configs/localworkerconfig.txt --ckpt [PATH TO TENSORF MODEL] --render_only 1` -This will load a model from the specified path and use it to render the camera motion specified in the `transforms_render.json` input file. - -Example for render only: `python worker.py --config configs/localworkerconfig.txt --ckpt log/tensorf_sfm_data_VM/tensorf_sfm_data_VM.th --render_only 1` -### Input data -The worker takes input from `data/sfm_data/`. 
Within this folder you should provide a json file named `transforms_train.json` which will contain the transformation data from structure from motion along with a subfolder labeled `train` that will contain all of the image files referenced in `transforms_train.json`. This will provide the worker with all the data it needs to train a TensoRF. Then once the TensoRF model is trained the worker will load the final file from the input data `transforms_render.json` which contains the desired camera path to be rendered in the same format as the training json (template above) - -Example input file structure: - -![Screenshot_20220729_065836](https://user-images.githubusercontent.com/49171429/181745902-920d5483-28e6-4412-bc07-9c770544057f.png) - -### Output data -The worker outputs final results to `log/tensorf_sfm_data_VM`. - -Within this folder the only relevate outputs for the worker are the rendered images and final video in the `imgs_render_all` folder and the trained TensoRF model that is saved at `tensorf_sfm_data.th`. This trained model can be reused by the worker using the checkpoint `--ckpt` flag. - - -## [Project page](https://apchenstu.github.io/TensoRF/) | [Paper](https://arxiv.org/abs/2203.09517) -This repository contains a pytorch implementation for the paper: [TensoRF: Tensorial Radiance Fields](https://arxiv.org/abs/2203.09517). Our work present a novel approach to model and reconstruct radiance fields, which achieves super -**fast** training process, **compact** memory footprint and **state-of-the-art** rendering quality.

- - -https://user-images.githubusercontent.com/16453770/158920837-3fafaa17-6ed9-4414-a0b1-a80dc9e10301.mp4 -## Installation - -#### Tested on Ubuntu 20.04 + Pytorch 1.10.1 - -Install environment: -``` -conda create -n TensoRF python=3.8 -conda activate TensoRF -pip install torch torchvision -pip install tqdm scikit-image opencv-python configargparse lpips imageio-ffmpeg kornia lpips tensorboard -pip install -r requirements.txt -``` - - -## Dataset -* [Synthetic-NeRF](https://drive.google.com/drive/folders/128yBriW1IG_3NJ5Rp7APSTZsJqdJdfc1) -* [Synthetic-NSVF](https://dl.fbaipublicfiles.com/nsvf/dataset/Synthetic_NSVF.zip) -* [Tanks&Temples](https://dl.fbaipublicfiles.com/nsvf/dataset/TanksAndTemple.zip) -* [Forward-facing](https://drive.google.com/drive/folders/128yBriW1IG_3NJ5Rp7APSTZsJqdJdfc1) - - - -## Quick Start -The training script is in `train.py`, to train a TensoRF: - -``` -python train.py --config configs/lego.txt -``` - - -we provide a few examples in the configuration folder, please note: - - `dataset_name`, choices = ['blender', 'llff', 'nsvf', 'tankstemple']; - - `shadingMode`, choices = ['MLP_Fea', 'SH']; - - `model_name`, choices = ['TensorVMSplit', 'TensorCP'], corresponding to the VM and CP decomposition. - You need to uncomment the last a few rows of the configuration file if you want to training with the TensorCP model; - - `n_lamb_sigma` and `n_lamb_sh` are string type refer to the basis number of density and appearance along XYZ -dimension; - - `N_voxel_init` and `N_voxel_final` control the resolution of matrix and vector; - - `N_vis` and `vis_every` control the visualization during training; - - You need to set `--render_test 1`/`--render_path 1` if you want to render testing views or path after training. - -More options refer to the `opt.py`. - -### For pretrained checkpoints and results please see: -[https://1drv.ms/u/s!Ard0t_p4QWIMgQ2qSEAs7MUk8hVw?e=dc6hBm](https://1drv.ms/u/s!Ard0t_p4QWIMgQ2qSEAs7MUk8hVw?e=dc6hBm) - - - -## Rendering - -``` -python train.py --config configs/lego.txt --ckpt path/to/your/checkpoint --render_only 1 --render_test 1 -``` - -You can just simply pass `--render_only 1` and `--ckpt path/to/your/checkpoint` to render images from a pre-trained -checkpoint. You may also need to specify what you want to render, like `--render_test 1`, `--render_train 1` or `--render_path 1`. -The rendering results are located in your checkpoint folder. - -## Extracting mesh -You can also export the mesh by passing `--export_mesh 1`: -``` -python train.py --config configs/lego.txt --ckpt path/to/your/checkpoint --export_mesh 1 -``` -Note: Please re-train the model and don't use the pretrained checkpoints provided by us for mesh extraction, -because some render parameters has changed. - -## Training with your own data -We provide two options for training on your own image set: - -1. Following the instructions in the [NSVF repo](https://github.com/facebookresearch/NSVF#prepare-your-own-dataset), then set the dataset_name to 'tankstemple'. -2. Calibrating images with the script from [NGP](https://github.com/NVlabs/instant-ngp/blob/master/docs/nerf_dataset_tips.md): -`python dataLoader/colmap2nerf.py --colmap_matcher exhaustive --run_colmap`, then adjust the datadir in `configs/your_own_data.txt`. Please check the `scene_bbox` and `near_far` if you get abnormal results. 
- - -## Citation -If you find our code or paper helps, please consider citing: -``` -@INPROCEEDINGS{Chen2022ECCV, - author = {Anpei Chen and Zexiang Xu and Andreas Geiger and Jingyi Yu and Hao Su}, - title = {TensoRF: Tensorial Radiance Fields}, - booktitle = {European Conference on Computer Vision (ECCV)}, - year = {2022} -} -``` diff --git a/TensoRF/configs/drums.txt b/TensoRF/configs/drums.txt deleted file mode 100644 index a32c93b..0000000 --- a/TensoRF/configs/drums.txt +++ /dev/null @@ -1,41 +0,0 @@ - -dataset_name = blender -datadir = ./data/nerf_synthetic/drums -expname = tensorf_lego_VM -basedir = ./log - -n_iters = 30000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = [2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -render_test = 1 - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] -model_name = TensorVMSplit - - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -L1_weight_inital = 8e-5 -L1_weight_rest = 4e-5 -rm_weight_mask_thre = 1e-4 - -## please uncomment following configuration if hope to training on cp model -#model_name = TensorCP -#n_lamb_sigma = [96] -#n_lamb_sh = [288] -#N_voxel_final = 125000000 # 500**3 -#L1_weight_inital = 1e-5 -#L1_weight_rest = 1e-5 diff --git a/TensoRF/configs/flower.txt b/TensoRF/configs/flower.txt deleted file mode 100644 index 3a1c8ee..0000000 --- a/TensoRF/configs/flower.txt +++ /dev/null @@ -1,35 +0,0 @@ - -dataset_name = llff -datadir = ./data/nerf_llff_data/flower -expname = tensorf_flower_VM -basedir = ./log - -downsample_train = 4.0 -ndc_ray = 1 - -n_iters = 25000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 262144000 # 640**3 -upsamp_list = [2000,3000,4000,5500] -update_AlphaMask_list = [2500] - -N_vis = -1 # vis all testing images -vis_every = 10000 - -render_test = 1 -render_path = 1 - -n_lamb_sigma = [16,4,4] -n_lamb_sh = [48,12,12] - -shadingMode = MLP_Fea -fea2denseAct = relu - -view_pe = 0 -fea_pe = 0 - -TV_weight_density = 1.0 -TV_weight_app = 1.0 - diff --git a/TensoRF/configs/lego.txt b/TensoRF/configs/lego.txt deleted file mode 100644 index a8fa9f2..0000000 --- a/TensoRF/configs/lego.txt +++ /dev/null @@ -1,41 +0,0 @@ - -dataset_name = blender -datadir = ./data/nerf_synthetic/lego -expname = tensorf_lego_VM -basedir = ./log - -n_iters = 3000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = [2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -render_test = 1 - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] -model_name = TensorVMSplit - - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -L1_weight_inital = 8e-5 -L1_weight_rest = 4e-5 -rm_weight_mask_thre = 1e-4 - -## please uncomment following configuration if hope to training on cp model -#model_name = TensorCP -#n_lamb_sigma = [96] -#n_lamb_sh = [288] -#N_voxel_final = 125000000 # 500**3 -#L1_weight_inital = 1e-5 -#L1_weight_rest = 1e-5 diff --git a/TensoRF/configs/localworkerconfig.txt b/TensoRF/configs/localworkerconfig.txt deleted file mode 100644 index 1f8bd2a..0000000 --- a/TensoRF/configs/localworkerconfig.txt +++ /dev/null @@ -1,36 +0,0 @@ - -dataset_name = sfm2nerf -datadir = ./data/sfm_data -expname = tensorf_sfm_data_VM -basedir = ./log - -n_iters = 3000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = 
[2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -render_test = 1 - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] -model_name = TensorVMSplit - - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -TV_weight_density = 0.1 -TV_weight_app = 0.01 - -#L1_weight_inital = 8e-5 -#L1_weight_rest = 4e-5 -rm_weight_mask_thre = 1e-4 diff --git a/TensoRF/configs/localworkerconfig_testsimon.txt b/TensoRF/configs/localworkerconfig_testsimon.txt deleted file mode 100644 index a31aec8..0000000 --- a/TensoRF/configs/localworkerconfig_testsimon.txt +++ /dev/null @@ -1,38 +0,0 @@ - -dataset_name = sfm2nerf -datadir = ./data/sfm_data -expname = tensorf_sfm_data_VM -basedir = ./log - -n_iters = 1000 -progress_refresh_rate = 100 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = [2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -#render_test = 1 -render_path = 0 - - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] -model_name = TensorVMSplit - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -TV_weight_density = 0.1 -TV_weight_app = 0.01 - -#L1_weight_inital = 8e-5 -#L1_weight_rest = 4e-5 -rm_weight_mask_thre = 1e-4 diff --git a/TensoRF/configs/truck.txt b/TensoRF/configs/truck.txt deleted file mode 100644 index 6a4545b..0000000 --- a/TensoRF/configs/truck.txt +++ /dev/null @@ -1,40 +0,0 @@ - - -dataset_name = tankstemple -datadir = ./data/TanksAndTemple/Truck -expname = tensorf_truck_VM -basedir = ./log - -n_iters = 30000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = [2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -render_test = 1 - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -TV_weight_density = 0.1 -TV_weight_app = 0.01 - -## please uncomment following configuration if hope to training on cp model -#model_name = TensorCP -#n_lamb_sigma = [96] -#n_lamb_sh = [288] -#N_voxel_final = 125000000 # 500**3 -#L1_weight_inital = 1e-5 -#L1_weight_rest = 1e-5 - diff --git a/TensoRF/configs/wineholder.txt b/TensoRF/configs/wineholder.txt deleted file mode 100644 index 4b945ea..0000000 --- a/TensoRF/configs/wineholder.txt +++ /dev/null @@ -1,39 +0,0 @@ - -dataset_name = nsvf -datadir = ./data/Synthetic_NSVF/Wineholder -expname = tensorf_Wineholder_VM -basedir = ./log - -n_iters = 30000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = [2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -render_test = 1 - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -L1_weight_inital = 8e-5 -L1_weight_rest = 4e-5 -rm_weight_mask_thre = 1e-4 - -## please uncomment following configuration if hope to training on cp model -#model_name = TensorCP -#n_lamb_sigma = [96] -#n_lamb_sh = [288] -#N_voxel_final = 125000000 # 500**3 -#L1_weight_inital = 1e-5 -#L1_weight_rest = 1e-5 diff --git a/TensoRF/configs/workerconfig.txt b/TensoRF/configs/workerconfig.txt deleted file mode 100644 index 6f83234..0000000 --- a/TensoRF/configs/workerconfig.txt +++ /dev/null @@ -1,37 +0,0 @@ - -dataset_name = own_data -datadir = ./data/xxx -expname = tensorf_xxx_VM -basedir = ./log - 
-n_iters = 30000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = [2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -render_test = 1 - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] -model_name = TensorVMSplit - - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -view_pe = 2 -fea_pe = 2 - -TV_weight_density = 0.1 -TV_weight_app = 0.01 - -rm_weight_mask_thre = 1e-4 diff --git a/TensoRF/configs/your_own_data.txt b/TensoRF/configs/your_own_data.txt deleted file mode 100644 index 6d3b0a2..0000000 --- a/TensoRF/configs/your_own_data.txt +++ /dev/null @@ -1,45 +0,0 @@ - -dataset_name = own_data -datadir = ./data/xxx -expname = tensorf_xxx_VM -basedir = ./log - -n_iters = 30000 -batch_size = 4096 - -N_voxel_init = 2097156 # 128**3 -N_voxel_final = 27000000 # 300**3 -upsamp_list = [2000,3000,4000,5500,7000] -update_AlphaMask_list = [2000,4000] - -N_vis = 5 -vis_every = 10000 - -render_test = 1 - -n_lamb_sigma = [16,16,16] -n_lamb_sh = [48,48,48] -model_name = TensorVMSplit - - -shadingMode = MLP_Fea -fea2denseAct = softplus - -view_pe = 2 -fea_pe = 2 - -view_pe = 2 -fea_pe = 2 - -TV_weight_density = 0.1 -TV_weight_app = 0.01 - -rm_weight_mask_thre = 1e-4 - -## please uncomment following configuration if hope to training on cp model -#model_name = TensorCP -#n_lamb_sigma = [96] -#n_lamb_sh = [288] -#N_voxel_final = 125000000 # 500**3 -#L1_weight_inital = 1e-5 -#L1_weight_rest = 1e-5 diff --git a/TensoRF/dataLoader/__init__.py b/TensoRF/dataLoader/__init__.py deleted file mode 100644 index 8ebab6a..0000000 --- a/TensoRF/dataLoader/__init__.py +++ /dev/null @@ -1,14 +0,0 @@ -from .llff import LLFFDataset -from .blender import BlenderDataset -from .nsvf import NSVF -from .tankstemple import TanksTempleDataset -from .your_own_data import YourOwnDataset -from .sfm2nerf import Sfm2Nerf - - -dataset_dict = {'blender': BlenderDataset, - 'llff':LLFFDataset, - 'tankstemple':TanksTempleDataset, - 'nsvf':NSVF, - 'own_data':YourOwnDataset, - 'sfm2nerf':Sfm2Nerf} \ No newline at end of file diff --git a/TensoRF/dataLoader/blender.py b/TensoRF/dataLoader/blender.py deleted file mode 100644 index 41ae903..0000000 --- a/TensoRF/dataLoader/blender.py +++ /dev/null @@ -1,127 +0,0 @@ -import torch,cv2 -from torch.utils.data import Dataset -import json -from tqdm import tqdm -import os -from PIL import Image -from torchvision import transforms as T - - -from .ray_utils import * - - -class BlenderDataset(Dataset): - def __init__(self, datadir, split='train', downsample=1.0, is_stack=False, N_vis=-1): - - self.N_vis = N_vis - self.root_dir = datadir - self.split = split - self.is_stack = is_stack - self.img_wh = (int(800/downsample),int(800/downsample)) - self.define_transforms() - - self.scene_bbox = torch.tensor([[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]]) - self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]) - self.downsample=downsample - self.read_meta() - self.define_proj_mat() - - self.white_bg = True - self.near_far = [2.0,6.0] - - self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3) - self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3) - - def read_depth(self, filename): - depth = np.array(read_pfm(filename)[0], dtype=np.float32) # (800, 800) - return depth - - def read_meta(self): - - with open(os.path.join(self.root_dir, f"transforms_{self.split}.json"), 'r') as f: - self.meta = json.load(f) - - w, h = 
self.img_wh - self.focal = 0.5 * 800 / np.tan(0.5 * self.meta['camera_angle_x']) # original focal length - self.focal *= self.img_wh[0] / 800 # modify focal length to match size self.img_wh - - - # ray directions for all pixels, same for all images (same H, W, focal) - self.directions = get_ray_directions(h, w, [self.focal,self.focal]) # (h, w, 3) - self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True) - self.intrinsics = torch.tensor([[self.focal,0,w/2],[0,self.focal,h/2],[0,0,1]]).float() - - self.image_paths = [] - self.poses = [] - self.all_rays = [] - self.all_rgbs = [] - self.all_masks = [] - self.all_depth = [] - #self.downsample=1.0 - - img_eval_interval = 1 if self.N_vis < 0 else len(self.meta['frames']) // self.N_vis - idxs = list(range(0, len(self.meta['frames']), img_eval_interval)) - for i in tqdm(idxs, desc=f'Loading data {self.split} ({len(idxs)})'):#img_list:# - - frame = self.meta['frames'][i] - pose = np.array(frame['transform_matrix']) @ self.blender2opencv - c2w = torch.FloatTensor(pose) - self.poses += [c2w] - - image_path = os.path.join(self.root_dir, f"{frame['file_path']}.png") - self.image_paths += [image_path] - img = Image.open(image_path) - - if self.downsample!=1.0: - img = img.resize(self.img_wh, Image.LANCZOS) - img = self.transform(img) # (4, h, w) - img = img.view(4, -1).permute(1, 0) # (h*w, 4) RGBA - img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:]) # blend A to RGB - self.all_rgbs += [img] - - - rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3) - self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 6) - - - self.poses = torch.stack(self.poses) - if not self.is_stack: - self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w, 3) - self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w, 3) - -# self.all_depth = torch.cat(self.all_depth, 0) # (len(self.meta['frames])*h*w, 3) - else: - self.all_rays = torch.stack(self.all_rays, 0) # (len(self.meta['frames]),h*w, 3) - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3) - # self.all_masks = torch.stack(self.all_masks, 0).reshape(-1,*self.img_wh[::-1]) # (len(self.meta['frames]),h,w,3) - - - def define_transforms(self): - self.transform = T.ToTensor() - - def define_proj_mat(self): - self.proj_mat = self.intrinsics.unsqueeze(0) @ torch.inverse(self.poses)[:,:3] - - def world2ndc(self,points,lindisp=None): - device = points.device - return (points - self.center.to(device)) / self.radius.to(device) - - def __len__(self): - return len(self.all_rgbs) - - def __getitem__(self, idx): - - if self.split == 'train': # use data in the buffers - sample = {'rays': self.all_rays[idx], - 'rgbs': self.all_rgbs[idx]} - - else: # create data for each image separately - - img = self.all_rgbs[idx] - rays = self.all_rays[idx] - mask = self.all_masks[idx] # for quantity evaluation - - sample = {'rays': rays, - 'rgbs': img, - 'mask': mask} - return sample diff --git a/TensoRF/dataLoader/colmap2nerf.py b/TensoRF/dataLoader/colmap2nerf.py deleted file mode 100644 index b91bbf0..0000000 --- a/TensoRF/dataLoader/colmap2nerf.py +++ /dev/null @@ -1,305 +0,0 @@ -#!/usr/bin/env python3 - -# Copyright (c) 2020-2022, NVIDIA CORPORATION. All rights reserved. -# -# NVIDIA CORPORATION and its licensors retain all intellectual property -# and proprietary rights in and to this software, related documentation -# and any modifications thereto. 
Any use, reproduction, disclosure or -# distribution of this software and related documentation without an express -# license agreement from NVIDIA CORPORATION is strictly prohibited. - -import argparse -import os -from pathlib import Path, PurePosixPath - -import numpy as np -import json -import sys -import math -import cv2 -import os -import shutil - -def parse_args(): - parser = argparse.ArgumentParser(description="convert a text colmap export to nerf format transforms.json; optionally convert video to images, and optionally run colmap in the first place") - - parser.add_argument("--video_in", default="", help="run ffmpeg first to convert a provided video file into a set of images. uses the video_fps parameter also") - parser.add_argument("--video_fps", default=2) - parser.add_argument("--time_slice", default="", help="time (in seconds) in the format t1,t2 within which the images should be generated from the video. eg: \"--time_slice '10,300'\" will generate images only from 10th second to 300th second of the video") - parser.add_argument("--run_colmap", action="store_true", help="run colmap first on the image folder") - parser.add_argument("--colmap_matcher", default="sequential", choices=["exhaustive","sequential","spatial","transitive","vocab_tree"], help="select which matcher colmap should use. sequential for videos, exhaustive for adhoc images") - parser.add_argument("--colmap_db", default="colmap.db", help="colmap database filename") - parser.add_argument("--images", default="images", help="input path to the images") - parser.add_argument("--text", default="colmap_text", help="input path to the colmap text files (set automatically if run_colmap is used)") - parser.add_argument("--aabb_scale", default=16, choices=["1","2","4","8","16"], help="large scene scale factor. 1=scene fits in unit cube; power of 2 up to 16") - parser.add_argument("--skip_early", default=0, help="skip this many images from the start") - parser.add_argument("--out", default="transforms.json", help="output path") - args = parser.parse_args() - return args - -def do_system(arg): - print(f"==== running: {arg}") - err = os.system(arg) - if err: - print("FATAL: command failed") - sys.exit(err) - -def run_ffmpeg(args): - if not os.path.isabs(args.images): - args.images = os.path.join(os.path.dirname(args.video_in), args.images) - images = args.images - video = args.video_in - fps = float(args.video_fps) or 1.0 - print(f"running ffmpeg with input video file={video}, output image folder={images}, fps={fps}.") - if (input(f"warning! folder '{images}' will be deleted/replaced. continue? (Y/n)").lower().strip()+"y")[:1] != "y": - sys.exit(1) - try: - shutil.rmtree(images) - except: - pass - do_system(f"mkdir {images}") - - time_slice_value = "" - time_slice = args.time_slice - if time_slice: - start, end = time_slice.split(",") - time_slice_value = f",select='between(t\,{start}\,{end})'" - do_system(f"ffmpeg -i {video} -qscale:v 1 -qmin 1 -vf \"fps={fps}{time_slice_value}\" {images}/%04d.jpg") - -def run_colmap(args): - db=args.colmap_db - images=args.images - db_noext=str(Path(db).with_suffix("")) - - if args.text=="text": - args.text=db_noext+"_text" - text=args.text - sparse=db_noext+"_sparse" - print(f"running colmap with:\n\tdb={db}\n\timages={images}\n\tsparse={sparse}\n\ttext={text}") - if (input(f"warning! folders '{sparse}' and '{text}' will be deleted/replaced. continue? 
(Y/n)").lower().strip()+"y")[:1] != "y": - sys.exit(1) - if os.path.exists(db): - os.remove(db) - do_system(f"colmap feature_extractor --ImageReader.camera_model OPENCV --SiftExtraction.estimate_affine_shape=true --SiftExtraction.domain_size_pooling=true --ImageReader.single_camera 1 --database_path {db} --image_path {images}") - do_system(f"colmap {args.colmap_matcher}_matcher --SiftMatching.guided_matching=true --database_path {db}") - try: - shutil.rmtree(sparse) - except: - pass - do_system(f"mkdir {sparse}") - do_system(f"colmap mapper --database_path {db} --image_path {images} --output_path {sparse}") - do_system(f"colmap bundle_adjuster --input_path {sparse}/0 --output_path {sparse}/0 --BundleAdjustment.refine_principal_point 1") - try: - shutil.rmtree(text) - except: - pass - do_system(f"mkdir {text}") - do_system(f"colmap model_converter --input_path {sparse}/0 --output_path {text} --output_type TXT") - -def variance_of_laplacian(image): - return cv2.Laplacian(image, cv2.CV_64F).var() - -def sharpness(imagePath): - image = cv2.imread(imagePath) - gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) - fm = variance_of_laplacian(gray) - return fm - -def qvec2rotmat(qvec): - return np.array([ - [ - 1 - 2 * qvec[2]**2 - 2 * qvec[3]**2, - 2 * qvec[1] * qvec[2] - 2 * qvec[0] * qvec[3], - 2 * qvec[3] * qvec[1] + 2 * qvec[0] * qvec[2] - ], [ - 2 * qvec[1] * qvec[2] + 2 * qvec[0] * qvec[3], - 1 - 2 * qvec[1]**2 - 2 * qvec[3]**2, - 2 * qvec[2] * qvec[3] - 2 * qvec[0] * qvec[1] - ], [ - 2 * qvec[3] * qvec[1] - 2 * qvec[0] * qvec[2], - 2 * qvec[2] * qvec[3] + 2 * qvec[0] * qvec[1], - 1 - 2 * qvec[1]**2 - 2 * qvec[2]**2 - ] - ]) - -def rotmat(a, b): - a, b = a / np.linalg.norm(a), b / np.linalg.norm(b) - v = np.cross(a, b) - c = np.dot(a, b) - s = np.linalg.norm(v) - kmat = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]]) - return np.eye(3) + kmat + kmat.dot(kmat) * ((1 - c) / (s ** 2 + 1e-10)) - -def closest_point_2_lines(oa, da, ob, db): # returns point closest to both rays of form o+t*d, and a weight factor that goes to 0 if the lines are parallel - da = da / np.linalg.norm(da) - db = db / np.linalg.norm(db) - c = np.cross(da, db) - denom = np.linalg.norm(c)**2 - t = ob - oa - ta = np.linalg.det([t, db, c]) / (denom + 1e-10) - tb = np.linalg.det([t, da, c]) / (denom + 1e-10) - if ta > 0: - ta = 0 - if tb > 0: - tb = 0 - return (oa+ta*da+ob+tb*db) * 0.5, denom - -if __name__ == "__main__": - args = parse_args() - if args.video_in != "": - run_ffmpeg(args) - if args.run_colmap: - run_colmap(args) - AABB_SCALE = int(args.aabb_scale) - SKIP_EARLY = int(args.skip_early) - IMAGE_FOLDER = args.images - TEXT_FOLDER = args.text - OUT_PATH = args.out - print(f"outputting to {OUT_PATH}...") - with open(os.path.join(TEXT_FOLDER,"cameras.txt"), "r") as f: - angle_x = math.pi / 2 - for line in f: - # 1 SIMPLE_RADIAL 2048 1536 1580.46 1024 768 0.0045691 - # 1 OPENCV 3840 2160 3178.27 3182.09 1920 1080 0.159668 -0.231286 -0.00123982 0.00272224 - # 1 RADIAL 1920 1080 1665.1 960 540 0.0672856 -0.0761443 - if line[0] == "#": - continue - els = line.split(" ") - w = float(els[2]) - h = float(els[3]) - fl_x = float(els[4]) - fl_y = float(els[4]) - k1 = 0 - k2 = 0 - p1 = 0 - p2 = 0 - cx = w / 2 - cy = h / 2 - if els[1] == "SIMPLE_PINHOLE": - cx = float(els[5]) - cy = float(els[6]) - elif els[1] == "PINHOLE": - fl_y = float(els[5]) - cx = float(els[6]) - cy = float(els[7]) - elif els[1] == "SIMPLE_RADIAL": - cx = float(els[5]) - cy = float(els[6]) - k1 = float(els[7]) - elif els[1] == "RADIAL": - cx = 
float(els[5]) - cy = float(els[6]) - k1 = float(els[7]) - k2 = float(els[8]) - elif els[1] == "OPENCV": - fl_y = float(els[5]) - cx = float(els[6]) - cy = float(els[7]) - k1 = float(els[8]) - k2 = float(els[9]) - p1 = float(els[10]) - p2 = float(els[11]) - else: - print("unknown camera model ", els[1]) - # fl = 0.5 * w / tan(0.5 * angle_x); - angle_x = math.atan(w / (fl_x * 2)) * 2 - angle_y = math.atan(h / (fl_y * 2)) * 2 - fovx = angle_x * 180 / math.pi - fovy = angle_y * 180 / math.pi - - print(f"camera:\n\tres={w,h}\n\tcenter={cx,cy}\n\tfocal={fl_x,fl_y}\n\tfov={fovx,fovy}\n\tk={k1,k2} p={p1,p2} ") - - with open(os.path.join(TEXT_FOLDER,"images.txt"), "r") as f: - i = 0 - bottom = np.array([0.0, 0.0, 0.0, 1.0]).reshape([1, 4]) - out = { - "camera_angle_x": angle_x, - "camera_angle_y": angle_y, - "fl_x": fl_x, - "fl_y": fl_y, - "k1": k1, - "k2": k2, - "p1": p1, - "p2": p2, - "cx": cx, - "cy": cy, - "w": w, - "h": h, - "aabb_scale": AABB_SCALE, - "frames": [], - } - - up = np.zeros(3) - for line in f: - line = line.strip() - if line[0] == "#": - continue - i = i + 1 - if i < SKIP_EARLY*2: - continue - if i % 2 == 1: - elems=line.split(" ") # 1-4 is quat, 5-7 is trans, 9ff is filename (9, if filename contains no spaces) - #name = str(PurePosixPath(Path(IMAGE_FOLDER, elems[9]))) - # why is this requireing a relitive path while using ^ - image_rel = os.path.relpath(IMAGE_FOLDER) - name = str(f"./{image_rel}/{'_'.join(elems[9:])}") - b=sharpness(name) - print(name, "sharpness=",b) - image_id = int(elems[0]) - qvec = np.array(tuple(map(float, elems[1:5]))) - tvec = np.array(tuple(map(float, elems[5:8]))) - R = qvec2rotmat(-qvec) - t = tvec.reshape([3,1]) - m = np.concatenate([np.concatenate([R, t], 1), bottom], 0) - c2w = np.linalg.inv(m) - c2w[0:3,2] *= -1 # flip the y and z axis - c2w[0:3,1] *= -1 - c2w = c2w[[1,0,2,3],:] # swap y and z - c2w[2,:] *= -1 # flip whole world upside down - - up += c2w[0:3,1] - - frame={"file_path":name,"sharpness":b,"transform_matrix": c2w} - out["frames"].append(frame) - nframes = len(out["frames"]) - up = up / np.linalg.norm(up) - print("up vector was", up) - R = rotmat(up,[0,0,1]) # rotate up vector to [0,0,1] - R = np.pad(R,[0,1]) - R[-1, -1] = 1 - - - for f in out["frames"]: - f["transform_matrix"] = np.matmul(R, f["transform_matrix"]) # rotate up to be the z axis - - # find a central point they are all looking at - print("computing center of attention...") - totw = 0.0 - totp = np.array([0.0, 0.0, 0.0]) - for f in out["frames"]: - mf = f["transform_matrix"][0:3,:] - for g in out["frames"]: - mg = g["transform_matrix"][0:3,:] - p, w = closest_point_2_lines(mf[:,3], mf[:,2], mg[:,3], mg[:,2]) - if w > 0.01: - totp += p*w - totw += w - totp /= totw - print(totp) # the cameras are looking at totp - for f in out["frames"]: - f["transform_matrix"][0:3,3] -= totp - - avglen = 0. 
- for f in out["frames"]: - avglen += np.linalg.norm(f["transform_matrix"][0:3,3]) - avglen /= nframes - print("avg camera distance from origin", avglen) - for f in out["frames"]: - f["transform_matrix"][0:3,3] *= 4.0 / avglen # scale to "nerf sized" - - for f in out["frames"]: - f["transform_matrix"] = f["transform_matrix"].tolist() - print(nframes,"frames") - print(f"writing {OUT_PATH}") - with open(OUT_PATH, "w") as outfile: - json.dump(out, outfile, indent=2) \ No newline at end of file diff --git a/TensoRF/dataLoader/llff.py b/TensoRF/dataLoader/llff.py deleted file mode 100644 index 3b31db9..0000000 --- a/TensoRF/dataLoader/llff.py +++ /dev/null @@ -1,242 +0,0 @@ -import torch -from torch.utils.data import Dataset -import glob -import numpy as np -import os -from PIL import Image -from torchvision import transforms as T - -from .ray_utils import * - - -def normalize(v): - """Normalize a vector.""" - return v / np.linalg.norm(v) - - -def average_poses(poses): - """ - Calculate the average pose, which is then used to center all poses - using @center_poses. Its computation is as follows: - 1. Compute the center: the average of pose centers. - 2. Compute the z axis: the normalized average z axis. - 3. Compute axis y': the average y axis. - 4. Compute x' = y' cross product z, then normalize it as the x axis. - 5. Compute the y axis: z cross product x. - - Note that at step 3, we cannot directly use y' as y axis since it's - not necessarily orthogonal to z axis. We need to pass from x to y. - Inputs: - poses: (N_images, 3, 4) - Outputs: - pose_avg: (3, 4) the average pose - """ - # 1. Compute the center - center = poses[..., 3].mean(0) # (3) - - # 2. Compute the z axis - z = normalize(poses[..., 2].mean(0)) # (3) - - # 3. Compute axis y' (no need to normalize as it's not the final output) - y_ = poses[..., 1].mean(0) # (3) - - # 4. Compute the x axis - x = normalize(np.cross(z, y_)) # (3) - - # 5. Compute the y axis (as z and x are normalized, y is already of norm 1) - y = np.cross(x, z) # (3) - - pose_avg = np.stack([x, y, z, center], 1) # (3, 4) - - return pose_avg - - -def center_poses(poses, blender2opencv): - """ - Center the poses so that we can use NDC. - See https://github.com/bmild/nerf/issues/34 - Inputs: - poses: (N_images, 3, 4) - Outputs: - poses_centered: (N_images, 3, 4) the centered poses - pose_avg: (3, 4) the average pose - """ - poses = poses @ blender2opencv - pose_avg = average_poses(poses) # (3, 4) - pose_avg_homo = np.eye(4) - pose_avg_homo[:3] = pose_avg # convert to homogeneous coordinate for faster computation - pose_avg_homo = pose_avg_homo - # by simply adding 0, 0, 0, 1 as the last row - last_row = np.tile(np.array([0, 0, 0, 1]), (len(poses), 1, 1)) # (N_images, 1, 4) - poses_homo = \ - np.concatenate([poses, last_row], 1) # (N_images, 4, 4) homogeneous coordinate - - poses_centered = np.linalg.inv(pose_avg_homo) @ poses_homo # (N_images, 4, 4) - # poses_centered = poses_centered @ blender2opencv - poses_centered = poses_centered[:, :3] # (N_images, 3, 4) - - return poses_centered, pose_avg_homo - - -def viewmatrix(z, up, pos): - vec2 = normalize(z) - vec1_avg = up - vec0 = normalize(np.cross(vec1_avg, vec2)) - vec1 = normalize(np.cross(vec2, vec0)) - m = np.eye(4) - m[:3] = np.stack([-vec0, vec1, vec2, pos], 1) - return m - - -def render_path_spiral(c2w, up, rads, focal, zdelta, zrate, N_rots=2, N=120): - render_poses = [] - rads = np.array(list(rads) + [1.]) - - for theta in np.linspace(0., 2. 
* np.pi * N_rots, N + 1)[:-1]: - c = np.dot(c2w[:3, :4], np.array([np.cos(theta), -np.sin(theta), -np.sin(theta * zrate), 1.]) * rads) - z = normalize(c - np.dot(c2w[:3, :4], np.array([0, 0, -focal, 1.]))) - render_poses.append(viewmatrix(z, up, c)) - return render_poses - - -def get_spiral(c2ws_all, near_fars, rads_scale=1.0, N_views=120): - # center pose - c2w = average_poses(c2ws_all) - - # Get average pose - up = normalize(c2ws_all[:, :3, 1].sum(0)) - - # Find a reasonable "focus depth" for this dataset - dt = 0.75 - close_depth, inf_depth = near_fars.min() * 0.9, near_fars.max() * 5.0 - focal = 1.0 / (((1.0 - dt) / close_depth + dt / inf_depth)) - - # Get radii for spiral path - zdelta = near_fars.min() * .2 - tt = c2ws_all[:, :3, 3] - rads = np.percentile(np.abs(tt), 90, 0) * rads_scale - render_poses = render_path_spiral(c2w, up, rads, focal, zdelta, zrate=.5, N=N_views) - return np.stack(render_poses) - - -class LLFFDataset(Dataset): - def __init__(self, datadir, split='train', downsample=4, is_stack=False, hold_every=8): - """ - spheric_poses: whether the images are taken in a spheric inward-facing manner - default: False (forward-facing) - val_num: number of val images (used for multigpu training, validate same image for all gpus) - """ - - self.root_dir = datadir - self.split = split - self.hold_every = hold_every - self.is_stack = is_stack - self.downsample = downsample - self.define_transforms() - - self.blender2opencv = np.eye(4)#np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]) - self.read_meta() - self.white_bg = False - - # self.near_far = [np.min(self.near_fars[:,0]),np.max(self.near_fars[:,1])] - self.near_far = [0.0, 1.0] - self.scene_bbox = torch.tensor([[-1.5, -1.67, -1.0], [1.5, 1.67, 1.0]]) - # self.scene_bbox = torch.tensor([[-1.67, -1.5, -1.0], [1.67, 1.5, 1.0]]) - self.center = torch.mean(self.scene_bbox, dim=0).float().view(1, 1, 3) - self.invradius = 1.0 / (self.scene_bbox[1] - self.center).float().view(1, 1, 3) - - def read_meta(self): - - - poses_bounds = np.load(os.path.join(self.root_dir, 'poses_bounds.npy')) # (N_images, 17) - self.image_paths = sorted(glob.glob(os.path.join(self.root_dir, 'images_4/*'))) - # load full resolution image then resize - if self.split in ['train', 'test']: - assert len(poses_bounds) == len(self.image_paths), \ - 'Mismatch between number of images and number of poses! Please rerun COLMAP!' 
- - poses = poses_bounds[:, :15].reshape(-1, 3, 5) # (N_images, 3, 5) - self.near_fars = poses_bounds[:, -2:] # (N_images, 2) - hwf = poses[:, :, -1] - - # Step 1: rescale focal length according to training resolution - H, W, self.focal = poses[0, :, -1] # original intrinsics, same for all images - self.img_wh = np.array([int(W / self.downsample), int(H / self.downsample)]) - self.focal = [self.focal * self.img_wh[0] / W, self.focal * self.img_wh[1] / H] - - # Step 2: correct poses - # Original poses has rotation in form "down right back", change to "right up back" - # See https://github.com/bmild/nerf/issues/34 - poses = np.concatenate([poses[..., 1:2], -poses[..., :1], poses[..., 2:4]], -1) - # (N_images, 3, 4) exclude H, W, focal - self.poses, self.pose_avg = center_poses(poses, self.blender2opencv) - - # Step 3: correct scale so that the nearest depth is at a little more than 1.0 - # See https://github.com/bmild/nerf/issues/34 - near_original = self.near_fars.min() - scale_factor = near_original * 0.75 # 0.75 is the default parameter - # the nearest depth is at 1/0.75=1.33 - self.near_fars /= scale_factor - self.poses[..., 3] /= scale_factor - - # build rendering path - N_views, N_rots = 120, 2 - tt = self.poses[:, :3, 3] # ptstocam(poses[:3,3,:].T, c2w).T - up = normalize(self.poses[:, :3, 1].sum(0)) - rads = np.percentile(np.abs(tt), 90, 0) - - self.render_path = get_spiral(self.poses, self.near_fars, N_views=N_views) - - # distances_from_center = np.linalg.norm(self.poses[..., 3], axis=1) - # val_idx = np.argmin(distances_from_center) # choose val image as the closest to - # center image - - # ray directions for all pixels, same for all images (same H, W, focal) - W, H = self.img_wh - self.directions = get_ray_directions_blender(H, W, self.focal) # (H, W, 3) - - average_pose = average_poses(self.poses) - dists = np.sum(np.square(average_pose[:3, 3] - self.poses[:, :3, 3]), -1) - i_test = np.arange(0, self.poses.shape[0], self.hold_every) # [np.argmin(dists)] - img_list = i_test if self.split != 'train' else list(set(np.arange(len(self.poses))) - set(i_test)) - - # use first N_images-1 to train, the LAST is val - self.all_rays = [] - self.all_rgbs = [] - for i in img_list: - image_path = self.image_paths[i] - c2w = torch.FloatTensor(self.poses[i]) - - img = Image.open(image_path).convert('RGB') - if self.downsample != 1.0: - img = img.resize(self.img_wh, Image.LANCZOS) - img = self.transform(img) # (3, h, w) - - img = img.view(3, -1).permute(1, 0) # (h*w, 3) RGB - self.all_rgbs += [img] - rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3) - rays_o, rays_d = ndc_rays_blender(H, W, self.focal[0], 1.0, rays_o, rays_d) - # viewdir = rays_d / torch.norm(rays_d, dim=-1, keepdim=True) - - self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 6) - - if not self.is_stack: - self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w, 3) - self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w,3) - else: - self.all_rays = torch.stack(self.all_rays, 0) # (len(self.meta['frames]),h,w, 3) - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3) - - - def define_transforms(self): - self.transform = T.ToTensor() - - def __len__(self): - return len(self.all_rgbs) - - def __getitem__(self, idx): - - sample = {'rays': self.all_rays[idx], - 'rgbs': self.all_rgbs[idx]} - - return sample \ No newline at end of file diff --git a/TensoRF/dataLoader/nsvf.py b/TensoRF/dataLoader/nsvf.py deleted file 
mode 100644 index f9dc0a9..0000000 --- a/TensoRF/dataLoader/nsvf.py +++ /dev/null @@ -1,160 +0,0 @@ -import torch -from torch.utils.data import Dataset -from tqdm import tqdm -import os -from PIL import Image -from torchvision import transforms as T - -from .ray_utils import * - -trans_t = lambda t : torch.Tensor([ - [1,0,0,0], - [0,1,0,0], - [0,0,1,t], - [0,0,0,1]]).float() - -rot_phi = lambda phi : torch.Tensor([ - [1,0,0,0], - [0,np.cos(phi),-np.sin(phi),0], - [0,np.sin(phi), np.cos(phi),0], - [0,0,0,1]]).float() - -rot_theta = lambda th : torch.Tensor([ - [np.cos(th),0,-np.sin(th),0], - [0,1,0,0], - [np.sin(th),0, np.cos(th),0], - [0,0,0,1]]).float() - - -def pose_spherical(theta, phi, radius): - c2w = trans_t(radius) - c2w = rot_phi(phi/180.*np.pi) @ c2w - c2w = rot_theta(theta/180.*np.pi) @ c2w - c2w = torch.Tensor(np.array([[-1,0,0,0],[0,0,1,0],[0,1,0,0],[0,0,0,1]])) @ c2w - return c2w - -class NSVF(Dataset): - """NSVF Generic Dataset.""" - def __init__(self, datadir, split='train', downsample=1.0, wh=[800,800], is_stack=False): - self.root_dir = datadir - self.split = split - self.is_stack = is_stack - self.downsample = downsample - self.img_wh = (int(wh[0]/downsample),int(wh[1]/downsample)) - self.define_transforms() - - self.white_bg = True - self.near_far = [0.5,6.0] - self.scene_bbox = torch.from_numpy(np.loadtxt(f'{self.root_dir}/bbox.txt')).float()[:6].view(2,3) - self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]) - self.read_meta() - self.define_proj_mat() - - self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3) - self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3) - - def bbox2corners(self): - corners = self.scene_bbox.unsqueeze(0).repeat(4,1,1) - for i in range(3): - corners[i,[0,1],i] = corners[i,[1,0],i] - return corners.view(-1,3) - - - def read_meta(self): - with open(os.path.join(self.root_dir, "intrinsics.txt")) as f: - focal = float(f.readline().split()[0]) - self.intrinsics = np.array([[focal,0,400.0],[0,focal,400.0],[0,0,1]]) - self.intrinsics[:2] *= (np.array(self.img_wh)/np.array([800,800])).reshape(2,1) - - pose_files = sorted(os.listdir(os.path.join(self.root_dir, 'pose'))) - img_files = sorted(os.listdir(os.path.join(self.root_dir, 'rgb'))) - - if self.split == 'train': - pose_files = [x for x in pose_files if x.startswith('0_')] - img_files = [x for x in img_files if x.startswith('0_')] - elif self.split == 'val': - pose_files = [x for x in pose_files if x.startswith('1_')] - img_files = [x for x in img_files if x.startswith('1_')] - elif self.split == 'test': - test_pose_files = [x for x in pose_files if x.startswith('2_')] - test_img_files = [x for x in img_files if x.startswith('2_')] - if len(test_pose_files) == 0: - test_pose_files = [x for x in pose_files if x.startswith('1_')] - test_img_files = [x for x in img_files if x.startswith('1_')] - pose_files = test_pose_files - img_files = test_img_files - - # ray directions for all pixels, same for all images (same H, W, focal) - self.directions = get_ray_directions(self.img_wh[1], self.img_wh[0], [self.intrinsics[0,0],self.intrinsics[1,1]], center=self.intrinsics[:2,2]) # (h, w, 3) - self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True) - - - self.render_path = torch.stack([pose_spherical(angle, -30.0, 4.0) for angle in np.linspace(-180,180,40+1)[:-1]], 0) - - self.poses = [] - self.all_rays = [] - self.all_rgbs = [] - - assert len(img_files) == len(pose_files) - for img_fname, pose_fname in 
tqdm(zip(img_files, pose_files), desc=f'Loading data {self.split} ({len(img_files)})'): - image_path = os.path.join(self.root_dir, 'rgb', img_fname) - img = Image.open(image_path) - if self.downsample!=1.0: - img = img.resize(self.img_wh, Image.LANCZOS) - img = self.transform(img) # (4, h, w) - img = img.view(img.shape[0], -1).permute(1, 0) # (h*w, 4) RGBA - if img.shape[-1]==4: - img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:]) # blend A to RGB - self.all_rgbs += [img] - - c2w = np.loadtxt(os.path.join(self.root_dir, 'pose', pose_fname)) #@ self.blender2opencv - c2w = torch.FloatTensor(c2w) - self.poses.append(c2w) # C2W - rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3) - self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 8) - -# w2c = torch.inverse(c2w) -# - - self.poses = torch.stack(self.poses) - if 'train' == self.split: - if self.is_stack: - self.all_rays = torch.stack(self.all_rays, 0).reshape(-1,*self.img_wh[::-1], 6) # (len(self.meta['frames])*h*w, 3) - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames])*h*w, 3) - else: - self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w, 3) - self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w, 3) - else: - self.all_rays = torch.stack(self.all_rays, 0) # (len(self.meta['frames]),h*w, 3) - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3) - - - def define_transforms(self): - self.transform = T.ToTensor() - - def define_proj_mat(self): - self.proj_mat = torch.from_numpy(self.intrinsics[:3,:3]).unsqueeze(0).float() @ torch.inverse(self.poses)[:,:3] - - def world2ndc(self, points): - device = points.device - return (points - self.center.to(device)) / self.radius.to(device) - - def __len__(self): - if self.split == 'train': - return len(self.all_rays) - return len(self.all_rgbs) - - def __getitem__(self, idx): - - if self.split == 'train': # use data in the buffers - sample = {'rays': self.all_rays[idx], - 'rgbs': self.all_rgbs[idx]} - - else: # create data for each image separately - - img = self.all_rgbs[idx] - rays = self.all_rays[idx] - - sample = {'rays': rays, - 'rgbs': img} - return sample \ No newline at end of file diff --git a/TensoRF/dataLoader/ray_utils.py b/TensoRF/dataLoader/ray_utils.py deleted file mode 100644 index c7f0437..0000000 --- a/TensoRF/dataLoader/ray_utils.py +++ /dev/null @@ -1,275 +0,0 @@ -import torch, re -import numpy as np -from torch import searchsorted -from kornia import create_meshgrid - - -# from utils import index_point_feature - -def depth2dist(z_vals, cos_angle): - # z_vals: [N_ray N_sample] - device = z_vals.device - dists = z_vals[..., 1:] - z_vals[..., :-1] - dists = torch.cat([dists, torch.Tensor([1e10]).to(device).expand(dists[..., :1].shape)], -1) # [N_rays, N_samples] - dists = dists * cos_angle.unsqueeze(-1) - return dists - - -def ndc2dist(ndc_pts, cos_angle): - dists = torch.norm(ndc_pts[:, 1:] - ndc_pts[:, :-1], dim=-1) - dists = torch.cat([dists, 1e10 * cos_angle.unsqueeze(-1)], -1) # [N_rays, N_samples] - return dists - - -def get_ray_directions(H, W, focal, center=None): - """ - Get ray directions for all pixels in camera coordinate. 
- Reference: https://www.scratchapixel.com/lessons/3d-basic-rendering/ - ray-tracing-generating-camera-rays/standard-coordinate-systems - Inputs: - H, W, focal: image height, width and focal length - Outputs: - directions: (H, W, 3), the direction of the rays in camera coordinate - """ - grid = create_meshgrid(H, W, normalized_coordinates=False)[0] + 0.5 - - i, j = grid.unbind(-1) - # the direction here is without +0.5 pixel centering as calibration is not so accurate - # see https://github.com/bmild/nerf/issues/24 - cent = center if center is not None else [W / 2, H / 2] - directions = torch.stack([(i - cent[0]) / focal[0], (j - cent[1]) / focal[1], torch.ones_like(i)], -1) # (H, W, 3) - - return directions - - -def get_ray_directions_blender(H, W, focal, center=None): - """ - Get ray directions for all pixels in camera coordinate. - Reference: https://www.scratchapixel.com/lessons/3d-basic-rendering/ - ray-tracing-generating-camera-rays/standard-coordinate-systems - Inputs: - H, W, focal: image height, width and focal length - Outputs: - directions: (H, W, 3), the direction of the rays in camera coordinate - """ - grid = create_meshgrid(H, W, normalized_coordinates=False)[0]+0.5 - i, j = grid.unbind(-1) - # the direction here is without +0.5 pixel centering as calibration is not so accurate - # see https://github.com/bmild/nerf/issues/24 - cent = center if center is not None else [W / 2, H / 2] - directions = torch.stack([(i - cent[0]) / focal[0], -(j - cent[1]) / focal[1], -torch.ones_like(i)], - -1) # (H, W, 3) - - return directions - - -def get_rays(directions, c2w): - """ - Get ray origin and normalized directions in world coordinate for all pixels in one image. - Reference: https://www.scratchapixel.com/lessons/3d-basic-rendering/ - ray-tracing-generating-camera-rays/standard-coordinate-systems - Inputs: - directions: (H, W, 3) precomputed ray directions in camera coordinate - c2w: (3, 4) transformation matrix from camera coordinate to world coordinate - Outputs: - rays_o: (H*W, 3), the origin of the rays in world coordinate - rays_d: (H*W, 3), the normalized direction of the rays in world coordinate - """ - # Rotate ray directions from camera coordinate to the world coordinate - rays_d = directions @ c2w[:3, :3].T # (H, W, 3) - # rays_d = rays_d / torch.norm(rays_d, dim=-1, keepdim=True) - # The origin of all rays is the camera origin in world coordinate - rays_o = c2w[:3, 3].expand(rays_d.shape) # (H, W, 3) - - rays_d = rays_d.view(-1, 3) - rays_o = rays_o.view(-1, 3) - - return rays_o, rays_d - - -def ndc_rays_blender(H, W, focal, near, rays_o, rays_d): - # Shift ray origins to near plane - t = -(near + rays_o[..., 2]) / rays_d[..., 2] - rays_o = rays_o + t[..., None] * rays_d - - # Projection - o0 = -1. / (W / (2. * focal)) * rays_o[..., 0] / rays_o[..., 2] - o1 = -1. / (H / (2. * focal)) * rays_o[..., 1] / rays_o[..., 2] - o2 = 1. + 2. * near / rays_o[..., 2] - - d0 = -1. / (W / (2. * focal)) * (rays_d[..., 0] / rays_d[..., 2] - rays_o[..., 0] / rays_o[..., 2]) - d1 = -1. / (H / (2. * focal)) * (rays_d[..., 1] / rays_d[..., 2] - rays_o[..., 1] / rays_o[..., 2]) - d2 = -2. * near / rays_o[..., 2] - - rays_o = torch.stack([o0, o1, o2], -1) - rays_d = torch.stack([d0, d1, d2], -1) - - return rays_o, rays_d - -def ndc_rays(H, W, focal, near, rays_o, rays_d): - # Shift ray origins to near plane - t = (near - rays_o[..., 2]) / rays_d[..., 2] - rays_o = rays_o + t[..., None] * rays_d - - # Projection - o0 = 1. / (W / (2. * focal)) * rays_o[..., 0] / rays_o[..., 2] - o1 = 1. / (H / (2. 
* focal)) * rays_o[..., 1] / rays_o[..., 2] - o2 = 1. - 2. * near / rays_o[..., 2] - - d0 = 1. / (W / (2. * focal)) * (rays_d[..., 0] / rays_d[..., 2] - rays_o[..., 0] / rays_o[..., 2]) - d1 = 1. / (H / (2. * focal)) * (rays_d[..., 1] / rays_d[..., 2] - rays_o[..., 1] / rays_o[..., 2]) - d2 = 2. * near / rays_o[..., 2] - - rays_o = torch.stack([o0, o1, o2], -1) - rays_d = torch.stack([d0, d1, d2], -1) - - return rays_o, rays_d - -# Hierarchical sampling (section 5.2) -def sample_pdf(bins, weights, N_samples, det=False, pytest=False): - device = weights.device - # Get pdf - weights = weights + 1e-5 # prevent nans - pdf = weights / torch.sum(weights, -1, keepdim=True) - cdf = torch.cumsum(pdf, -1) - cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], -1) # (batch, len(bins)) - - # Take uniform samples - if det: - u = torch.linspace(0., 1., steps=N_samples, device=device) - u = u.expand(list(cdf.shape[:-1]) + [N_samples]) - else: - u = torch.rand(list(cdf.shape[:-1]) + [N_samples], device=device) - - # Pytest, overwrite u with numpy's fixed random numbers - if pytest: - np.random.seed(0) - new_shape = list(cdf.shape[:-1]) + [N_samples] - if det: - u = np.linspace(0., 1., N_samples) - u = np.broadcast_to(u, new_shape) - else: - u = np.random.rand(*new_shape) - u = torch.Tensor(u) - - # Invert CDF - u = u.contiguous() - inds = searchsorted(cdf.detach(), u, right=True) - below = torch.max(torch.zeros_like(inds - 1), inds - 1) - above = torch.min((cdf.shape[-1] - 1) * torch.ones_like(inds), inds) - inds_g = torch.stack([below, above], -1) # (batch, N_samples, 2) - - matched_shape = [inds_g.shape[0], inds_g.shape[1], cdf.shape[-1]] - cdf_g = torch.gather(cdf.unsqueeze(1).expand(matched_shape), 2, inds_g) - bins_g = torch.gather(bins.unsqueeze(1).expand(matched_shape), 2, inds_g) - - denom = (cdf_g[..., 1] - cdf_g[..., 0]) - denom = torch.where(denom < 1e-5, torch.ones_like(denom), denom) - t = (u - cdf_g[..., 0]) / denom - samples = bins_g[..., 0] + t * (bins_g[..., 1] - bins_g[..., 0]) - - return samples - - -def dda(rays_o, rays_d, bbox_3D): - inv_ray_d = 1.0 / (rays_d + 1e-6) - t_min = (bbox_3D[:1] - rays_o) * inv_ray_d # N_rays 3 - t_max = (bbox_3D[1:] - rays_o) * inv_ray_d - t = torch.stack((t_min, t_max)) # 2 N_rays 3 - t_min = torch.max(torch.min(t, dim=0)[0], dim=-1, keepdim=True)[0] - t_max = torch.min(torch.max(t, dim=0)[0], dim=-1, keepdim=True)[0] - return t_min, t_max - - -def ray_marcher(rays, - N_samples=64, - lindisp=False, - perturb=0, - bbox_3D=None): - """ - sample points along the rays - Inputs: - rays: () - - Returns: - - """ - - # Decompose the inputs - N_rays = rays.shape[0] - rays_o, rays_d = rays[:, 0:3], rays[:, 3:6] # both (N_rays, 3) - near, far = rays[:, 6:7], rays[:, 7:8] # both (N_rays, 1) - - if bbox_3D is not None: - # cal aabb boundles - near, far = dda(rays_o, rays_d, bbox_3D) - - # Sample depth points - z_steps = torch.linspace(0, 1, N_samples, device=rays.device) # (N_samples) - if not lindisp: # use linear sampling in depth space - z_vals = near * (1 - z_steps) + far * z_steps - else: # use linear sampling in disparity space - z_vals = 1 / (1 / near * (1 - z_steps) + 1 / far * z_steps) - - z_vals = z_vals.expand(N_rays, N_samples) - - if perturb > 0: # perturb sampling depths (z_vals) - z_vals_mid = 0.5 * (z_vals[:, :-1] + z_vals[:, 1:]) # (N_rays, N_samples-1) interval mid points - # get intervals between samples - upper = torch.cat([z_vals_mid, z_vals[:, -1:]], -1) - lower = torch.cat([z_vals[:, :1], z_vals_mid], -1) - - perturb_rand = perturb * 
torch.rand(z_vals.shape, device=rays.device) - z_vals = lower + (upper - lower) * perturb_rand - - xyz_coarse_sampled = rays_o.unsqueeze(1) + \ - rays_d.unsqueeze(1) * z_vals.unsqueeze(2) # (N_rays, N_samples, 3) - - return xyz_coarse_sampled, rays_o, rays_d, z_vals - - -def read_pfm(filename): - file = open(filename, 'rb') - color = None - width = None - height = None - scale = None - endian = None - - header = file.readline().decode('utf-8').rstrip() - if header == 'PF': - color = True - elif header == 'Pf': - color = False - else: - raise Exception('Not a PFM file.') - - dim_match = re.match(r'^(\d+)\s(\d+)\s$', file.readline().decode('utf-8')) - if dim_match: - width, height = map(int, dim_match.groups()) - else: - raise Exception('Malformed PFM header.') - - scale = float(file.readline().rstrip()) - if scale < 0: # little-endian - endian = '<' - scale = -scale - else: - endian = '>' # big-endian - - data = np.fromfile(file, endian + 'f') - shape = (height, width, 3) if color else (height, width) - - data = np.reshape(data, shape) - data = np.flipud(data) - file.close() - return data, scale - - -def ndc_bbox(all_rays): - near_min = torch.min(all_rays[...,:3].view(-1,3),dim=0)[0] - near_max = torch.max(all_rays[..., :3].view(-1, 3), dim=0)[0] - far_min = torch.min((all_rays[...,:3]+all_rays[...,3:6]).view(-1,3),dim=0)[0] - far_max = torch.max((all_rays[...,:3]+all_rays[...,3:6]).view(-1, 3), dim=0)[0] - print(f'===> ndc bbox near_min:{near_min} near_max:{near_max} far_min:{far_min} far_max:{far_max}') - return torch.stack((torch.minimum(near_min,far_min),torch.maximum(near_max,far_max))) \ No newline at end of file diff --git a/TensoRF/dataLoader/sfm2nerf.py b/TensoRF/dataLoader/sfm2nerf.py deleted file mode 100644 index 8298672..0000000 --- a/TensoRF/dataLoader/sfm2nerf.py +++ /dev/null @@ -1,179 +0,0 @@ -import torch,cv2 -from torch.utils.data import Dataset -import json -from tqdm import tqdm -import os -from PIL import Image -from torchvision import transforms as T -import numpy as np - -import logging - -from .ray_utils import * - -trans_t = lambda t : torch.Tensor([ - [1,0,0,0], - [0,1,0,0], - [0,0,1,t], - [0,0,0,1]]).float() - -rot_phi = lambda phi : torch.Tensor([ - [1,0,0,0], - [0,np.cos(phi),-np.sin(phi),0], - [0,np.sin(phi), np.cos(phi),0], - [0,0,0,1]]).float() - -rot_theta = lambda th : torch.Tensor([ - [np.cos(th),0,-np.sin(th),0], - [0,1,0,0], - [np.sin(th),0, np.cos(th),0], - [0,0,0,1]]).float() - - -def pose_spherical(theta, phi, radius): - c2w = trans_t(radius) - c2w = rot_phi(phi/180.*np.pi) @ c2w - c2w = rot_theta(theta/180.*np.pi) @ c2w - c2w = torch.Tensor(np.array([[-1,0,0,0],[0,0,1,0],[0,1,0,0],[0,0,0,1]])) @ c2w - return c2w - -class Sfm2Nerf(Dataset): - def __init__(self, datadir, split='train', downsample=1.0, is_stack=False, N_vis=-1): - self.logger = logging.getLogger("nerf-worker") - self.logger.info("LOADING DATASET OF TYPE Sfm2Nerf:") - - self.N_vis = N_vis - self.root_dir = datadir - self.split = split - self.is_stack = is_stack - self.logger.info("DOWNSAMPLE: {}".format(downsample)) - self.downsample = downsample - self.define_transforms() - - if split == "train": - self.scene_bbox = torch.tensor([[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]]) - self.near_far = [0.1,100.0] - else: - self.scene_bbox = torch.tensor([[-1.5, -1.5, -1.5], - [1.0, 1.0, -0.25]]) - #self.scene_bbox = torch.tensor([[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]]) - #self.near_far = [1.0,1.5] - self.near_far = [0.1,100.0] - - self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 
0], [0, 0, 0, 1]]) - self.read_meta() - self.define_proj_mat() - - self.white_bg = True - #self.near_far = [0.1,100.0] - #self.near_far = [2.0,6.0] - - self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3) - self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3) - self.downsample=downsample - - def read_depth(self, filename): - depth = np.array(read_pfm(filename)[0], dtype=np.float32) # (800, 800) - return depth - - def read_meta(self): - self.logger.info("LOADING META DATA") - with open(os.path.join(self.root_dir, f"transforms_{self.split}.json"), 'r') as f: - self.meta = json.load(f) - - #w, h = int(800/self.downsample), int(800/self.downsample) - w, h = int(self.meta['vid_width']/self.downsample), int(self.meta['vid_height']/self.downsample) - self.img_wh = [w,h] - #self.focal_x = 0.5 * w / np.tan(0.5 * self.meta['intrinsic_matrix'][0][0]) # original focal length - #self.focal_y = 0.5 * h / np.tan(0.5 * self.meta['intrinsic_matrix'][1][1]) # original focal length - self.focal_x = float(self.meta['intrinsic_matrix'][0][0]) - self.focal_y = float(self.meta['intrinsic_matrix'][1][1]) - self.cx, self.cy = self.meta['intrinsic_matrix'][0][2],self.meta['intrinsic_matrix'][1][2] - - - # ray directions for all pixels, same for all images (same H, W, focal) - self.directions = get_ray_directions(h, w, [self.focal_x,self.focal_y], center=[self.cx, self.cy]) # (h, w, 3) - self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True) - self.intrinsics = torch.tensor([[self.focal_x,0,self.cx],[0,self.focal_y,self.cy],[0,0,1]]).float() - - self.render_path = torch.stack([pose_spherical(angle, 25.0, 1) for angle in np.linspace(-180,180,40+1)[:-1]], 0) - - self.image_paths = [] - self.poses = [] - self.all_rays = [] - self.all_rgbs = [] - self.all_masks = [] - self.all_depth = [] - - - img_eval_interval = 1 if self.N_vis < 0 else len(self.meta['frames']) // self.N_vis - idxs = list(range(0, len(self.meta['frames']), img_eval_interval)) - for i in tqdm(idxs, desc=f'Loading data {self.split} ({len(idxs)})'):#img_list:# - - frame = self.meta['frames'][i] - pose = np.array(frame['extrinsic_matrix']) @ self.blender2opencv - c2w = torch.FloatTensor(pose) - self.poses += [c2w] - - if self.split != 'render': - image_path = os.path.join(self.root_dir, f"{frame['file_path']}") - self.image_paths += [image_path] - img = Image.open(image_path) - - if self.downsample!=1.0: - img = img.resize(self.img_wh, Image.LANCZOS) - img = self.transform(img) # (4, h, w) - img = img.view(-1, w*h).permute(1, 0) # (h*w, 4) RGBA - if img.shape[-1]==4: - img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:]) # blend A to RGB - self.all_rgbs += [img] - - - rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3) - self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 6) - - - self.poses = torch.stack(self.poses) - if not self.is_stack: - self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w, 3) - if self.split != 'render': - self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w, 3) - -# self.all_depth = torch.cat(self.all_depth, 0) # (len(self.meta['frames])*h*w, 3) - else: - self.all_rays = torch.stack(self.all_rays, 0) # (len(self.meta['frames]),h*w, 3) - if self.split != 'render': - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3) - # self.all_masks = torch.stack(self.all_masks, 0).reshape(-1,*self.img_wh[::-1]) # (len(self.meta['frames]),h,w,3) - - - def 
define_transforms(self): - self.transform = T.ToTensor() - - def define_proj_mat(self): - self.proj_mat = self.intrinsics.unsqueeze(0) @ torch.inverse(self.poses)[:,:3] - - def world2ndc(self,points,lindisp=None): - device = points.device - return (points - self.center.to(device)) / self.radius.to(device) - - def __len__(self): - return len(self.all_rgbs) - - def __getitem__(self, idx): - - if self.split == 'train': # use data in the buffers - sample = {'rays': self.all_rays[idx], - 'rgbs': self.all_rgbs[idx]} - elif self.split == 'render': - sample = {'rays': self.all_rays[idx]} - - else: # create data for each image separately - - img = self.all_rgbs[idx] - rays = self.all_rays[idx] - mask = self.all_masks[idx] # for quantity evaluation - - sample = {'rays': rays, - 'rgbs': img} - return sample diff --git a/TensoRF/dataLoader/tankstemple.py b/TensoRF/dataLoader/tankstemple.py deleted file mode 100644 index 4215803..0000000 --- a/TensoRF/dataLoader/tankstemple.py +++ /dev/null @@ -1,216 +0,0 @@ -import torch -from torch.utils.data import Dataset -from tqdm import tqdm -import os -from PIL import Image -from torchvision import transforms as T - -from .ray_utils import * - - -def circle(radius=3.5, h=0.0, axis='z', t0=0, r=1): - if axis == 'z': - return lambda t: [radius * np.cos(r * t + t0), radius * np.sin(r * t + t0), h] - elif axis == 'y': - return lambda t: [radius * np.cos(r * t + t0), h, radius * np.sin(r * t + t0)] - else: - return lambda t: [h, radius * np.cos(r * t + t0), radius * np.sin(r * t + t0)] - - -def cross(x, y, axis=0): - T = torch if isinstance(x, torch.Tensor) else np - return T.cross(x, y, axis) - - -def normalize(x, axis=-1, order=2): - if isinstance(x, torch.Tensor): - l2 = x.norm(p=order, dim=axis, keepdim=True) - return x / (l2 + 1e-8), l2 - - else: - l2 = np.linalg.norm(x, order, axis) - l2 = np.expand_dims(l2, axis) - l2[l2 == 0] = 1 - return x / l2, - - -def cat(x, axis=1): - if isinstance(x[0], torch.Tensor): - return torch.cat(x, dim=axis) - return np.concatenate(x, axis=axis) - - -def look_at_rotation(camera_position, at=None, up=None, inverse=False, cv=False): - """ - This function takes a vector 'camera_position' which specifies the location - of the camera in world coordinates and two vectors `at` and `up` which - indicate the position of the object and the up directions of the world - coordinate system respectively. The object is assumed to be centered at - the origin. - The output is a rotation matrix representing the transformation - from world coordinates -> view coordinates. 
- Input: - camera_position: 3 - at: 1 x 3 or N x 3 (0, 0, 0) in default - up: 1 x 3 or N x 3 (0, 1, 0) in default - """ - - if at is None: - at = torch.zeros_like(camera_position) - else: - at = torch.tensor(at).type_as(camera_position) - if up is None: - up = torch.zeros_like(camera_position) - up[2] = -1 - else: - up = torch.tensor(up).type_as(camera_position) - - z_axis = normalize(at - camera_position)[0] - x_axis = normalize(cross(up, z_axis))[0] - y_axis = normalize(cross(z_axis, x_axis))[0] - - R = cat([x_axis[:, None], y_axis[:, None], z_axis[:, None]], axis=1) - return R - - -def gen_path(pos_gen, at=(0, 0, 0), up=(0, -1, 0), frames=180): - c2ws = [] - for t in range(frames): - c2w = torch.eye(4) - cam_pos = torch.tensor(pos_gen(t * (360.0 / frames) / 180 * np.pi)) - cam_rot = look_at_rotation(cam_pos, at=at, up=up, inverse=False, cv=True) - c2w[:3, 3], c2w[:3, :3] = cam_pos, cam_rot - c2ws.append(c2w) - return torch.stack(c2ws) - -class TanksTempleDataset(Dataset): - """NSVF Generic Dataset.""" - def __init__(self, datadir, split='train', downsample=1.0, wh=[1920,1080], is_stack=False): - self.root_dir = datadir - self.split = split - self.is_stack = is_stack - self.downsample = downsample - self.img_wh = (int(wh[0]/downsample),int(wh[1]/downsample)) - self.define_transforms() - - self.white_bg = True - self.near_far = [0.01,6.0] - self.scene_bbox = torch.from_numpy(np.loadtxt(f'{self.root_dir}/bbox.txt')).float()[:6].view(2,3)*1.2 - - self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]) - self.read_meta() - self.define_proj_mat() - - self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3) - self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3) - - def bbox2corners(self): - corners = self.scene_bbox.unsqueeze(0).repeat(4,1,1) - for i in range(3): - corners[i,[0,1],i] = corners[i,[1,0],i] - return corners.view(-1,3) - - - def read_meta(self): - - self.intrinsics = np.loadtxt(os.path.join(self.root_dir, "intrinsics.txt")) - self.intrinsics[:2] *= (np.array(self.img_wh)/np.array([1920,1080])).reshape(2,1) - pose_files = sorted(os.listdir(os.path.join(self.root_dir, 'pose'))) - img_files = sorted(os.listdir(os.path.join(self.root_dir, 'rgb'))) - - if self.split == 'train': - pose_files = [x for x in pose_files if x.startswith('0_')] - img_files = [x for x in img_files if x.startswith('0_')] - elif self.split == 'val': - pose_files = [x for x in pose_files if x.startswith('1_')] - img_files = [x for x in img_files if x.startswith('1_')] - elif self.split == 'test': - test_pose_files = [x for x in pose_files if x.startswith('2_')] - test_img_files = [x for x in img_files if x.startswith('2_')] - if len(test_pose_files) == 0: - test_pose_files = [x for x in pose_files if x.startswith('1_')] - test_img_files = [x for x in img_files if x.startswith('1_')] - pose_files = test_pose_files - img_files = test_img_files - - # ray directions for all pixels, same for all images (same H, W, focal) - self.directions = get_ray_directions(self.img_wh[1], self.img_wh[0], [self.intrinsics[0,0],self.intrinsics[1,1]], center=self.intrinsics[:2,2]) # (h, w, 3) - self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True) - - - - self.poses = [] - self.all_rays = [] - self.all_rgbs = [] - - assert len(img_files) == len(pose_files) - for img_fname, pose_fname in tqdm(zip(img_files, pose_files), desc=f'Loading data {self.split} ({len(img_files)})'): - image_path = os.path.join(self.root_dir, 'rgb', img_fname) - 
img = Image.open(image_path) - if self.downsample!=1.0: - img = img.resize(self.img_wh, Image.LANCZOS) - img = self.transform(img) # (4, h, w) - img = img.view(img.shape[0], -1).permute(1, 0) # (h*w, 4) RGBA - if img.shape[-1]==4: - img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:]) # blend A to RGB - self.all_rgbs.append(img) - - - c2w = np.loadtxt(os.path.join(self.root_dir, 'pose', pose_fname))# @ cam_trans - c2w = torch.FloatTensor(c2w) - self.poses.append(c2w) # C2W - rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3) - self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 8) - - self.poses = torch.stack(self.poses) - - center = torch.mean(self.scene_bbox, dim=0) - radius = torch.norm(self.scene_bbox[1]-center)*1.2 - up = torch.mean(self.poses[:, :3, 1], dim=0).tolist() - pos_gen = circle(radius=radius, h=-0.2*up[1], axis='y') - self.render_path = gen_path(pos_gen, up=up,frames=200) - self.render_path[:, :3, 3] += center - - - - if 'train' == self.split: - if self.is_stack: - self.all_rays = torch.stack(self.all_rays, 0).reshape(-1,*self.img_wh[::-1], 6) # (len(self.meta['frames])*h*w, 3) - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames])*h*w, 3) - else: - self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w, 3) - self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w, 3) - else: - self.all_rays = torch.stack(self.all_rays, 0) # (len(self.meta['frames]),h*w, 3) - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3) - - - def define_transforms(self): - self.transform = T.ToTensor() - - def define_proj_mat(self): - self.proj_mat = torch.from_numpy(self.intrinsics[:3,:3]).unsqueeze(0).float() @ torch.inverse(self.poses)[:,:3] - - def world2ndc(self, points): - device = points.device - return (points - self.center.to(device)) / self.radius.to(device) - - def __len__(self): - if self.split == 'train': - return len(self.all_rays) - return len(self.all_rgbs) - - def __getitem__(self, idx): - - if self.split == 'train': # use data in the buffers - sample = {'rays': self.all_rays[idx], - 'rgbs': self.all_rgbs[idx]} - - else: # create data for each image separately - - img = self.all_rgbs[idx] - rays = self.all_rays[idx] - - sample = {'rays': rays, - 'rgbs': img} - return sample \ No newline at end of file diff --git a/TensoRF/dataLoader/your_own_data.py b/TensoRF/dataLoader/your_own_data.py deleted file mode 100644 index 79313e2..0000000 --- a/TensoRF/dataLoader/your_own_data.py +++ /dev/null @@ -1,129 +0,0 @@ -import torch,cv2 -from torch.utils.data import Dataset -import json -from tqdm import tqdm -import os -from PIL import Image -from torchvision import transforms as T - - -from .ray_utils import * - - -class YourOwnDataset(Dataset): - def __init__(self, datadir, split='train', downsample=1.0, is_stack=False, N_vis=-1): - - self.N_vis = N_vis - self.root_dir = datadir - self.split = split - self.is_stack = is_stack - self.downsample = downsample - self.define_transforms() - - self.scene_bbox = torch.tensor([[-1.5, -1.5, -1.5], [1.5, 1.5, 1.5]]) - self.blender2opencv = np.array([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]) - self.read_meta() - self.define_proj_mat() - - self.white_bg = True - self.near_far = [0.1,100.0] - - self.center = torch.mean(self.scene_bbox, axis=0).float().view(1, 1, 3) - self.radius = (self.scene_bbox[1] - self.center).float().view(1, 1, 3) - self.downsample=downsample - - 
def read_depth(self, filename): - depth = np.array(read_pfm(filename)[0], dtype=np.float32) # (800, 800) - return depth - - def read_meta(self): - - with open(os.path.join(self.root_dir, f"transforms_{self.split}.json"), 'r') as f: - self.meta = json.load(f) - - w, h = int(self.meta['w']/self.downsample), int(self.meta['h']/self.downsample) - self.img_wh = [w,h] - self.focal_x = 0.5 * w / np.tan(0.5 * self.meta['camera_angle_x']) # original focal length - self.focal_y = 0.5 * h / np.tan(0.5 * self.meta['camera_angle_y']) # original focal length - self.cx, self.cy = self.meta['cx'],self.meta['cy'] - - - # ray directions for all pixels, same for all images (same H, W, focal) - self.directions = get_ray_directions(h, w, [self.focal_x,self.focal_y], center=[self.cx, self.cy]) # (h, w, 3) - self.directions = self.directions / torch.norm(self.directions, dim=-1, keepdim=True) - self.intrinsics = torch.tensor([[self.focal_x,0,self.cx],[0,self.focal_y,self.cy],[0,0,1]]).float() - - self.image_paths = [] - self.poses = [] - self.all_rays = [] - self.all_rgbs = [] - self.all_masks = [] - self.all_depth = [] - - - img_eval_interval = 1 if self.N_vis < 0 else len(self.meta['frames']) // self.N_vis - idxs = list(range(0, len(self.meta['frames']), img_eval_interval)) - for i in tqdm(idxs, desc=f'Loading data {self.split} ({len(idxs)})'):#img_list:# - - frame = self.meta['frames'][i] - pose = np.array(frame['transform_matrix']) @ self.blender2opencv - c2w = torch.FloatTensor(pose) - self.poses += [c2w] - - image_path = os.path.join(self.root_dir, f"{frame['file_path']}.png") - self.image_paths += [image_path] - img = Image.open(image_path) - - if self.downsample!=1.0: - img = img.resize(self.img_wh, Image.LANCZOS) - img = self.transform(img) # (4, h, w) - img = img.view(-1, w*h).permute(1, 0) # (h*w, 4) RGBA - if img.shape[-1]==4: - img = img[:, :3] * img[:, -1:] + (1 - img[:, -1:]) # blend A to RGB - self.all_rgbs += [img] - - - rays_o, rays_d = get_rays(self.directions, c2w) # both (h*w, 3) - self.all_rays += [torch.cat([rays_o, rays_d], 1)] # (h*w, 6) - - - self.poses = torch.stack(self.poses) - if not self.is_stack: - self.all_rays = torch.cat(self.all_rays, 0) # (len(self.meta['frames])*h*w, 3) - self.all_rgbs = torch.cat(self.all_rgbs, 0) # (len(self.meta['frames])*h*w, 3) - -# self.all_depth = torch.cat(self.all_depth, 0) # (len(self.meta['frames])*h*w, 3) - else: - self.all_rays = torch.stack(self.all_rays, 0) # (len(self.meta['frames]),h*w, 3) - self.all_rgbs = torch.stack(self.all_rgbs, 0).reshape(-1,*self.img_wh[::-1], 3) # (len(self.meta['frames]),h,w,3) - # self.all_masks = torch.stack(self.all_masks, 0).reshape(-1,*self.img_wh[::-1]) # (len(self.meta['frames]),h,w,3) - - - def define_transforms(self): - self.transform = T.ToTensor() - - def define_proj_mat(self): - self.proj_mat = self.intrinsics.unsqueeze(0) @ torch.inverse(self.poses)[:,:3] - - def world2ndc(self,points,lindisp=None): - device = points.device - return (points - self.center.to(device)) / self.radius.to(device) - - def __len__(self): - return len(self.all_rgbs) - - def __getitem__(self, idx): - - if self.split == 'train': # use data in the buffers - sample = {'rays': self.all_rays[idx], - 'rgbs': self.all_rgbs[idx]} - - else: # create data for each image separately - - img = self.all_rgbs[idx] - rays = self.all_rays[idx] - mask = self.all_masks[idx] # for quantity evaluation - - sample = {'rays': rays, - 'rgbs': img} - return sample diff --git a/TensoRF/extra/auto_run_paramsets.py b/TensoRF/extra/auto_run_paramsets.py 
deleted file mode 100644 index 52b4f1a..0000000 --- a/TensoRF/extra/auto_run_paramsets.py +++ /dev/null @@ -1,207 +0,0 @@ -import os -import threading, queue -import numpy as np -import time - - -def getFolderLocker(logFolder): - while True: - try: - os.makedirs(logFolder+"/lockFolder") - break - except: - time.sleep(0.01) - -def releaseFolderLocker(logFolder): - os.removedirs(logFolder+"/lockFolder") - -def getStopFolder(logFolder): - return os.path.isdir(logFolder+"/stopFolder") - - -def get_param_str(key, val): - if key == 'data_name': - return f'--datadir {datafolder}/{val} ' - else: - return f'--{key} {val} ' - -def get_param_list(param_dict): - param_keys = list(param_dict.keys()) - param_modes = len(param_keys) - param_nums = [len(param_dict[key]) for key in param_keys] - - param_ids = np.zeros(param_nums+[param_modes], dtype=int) - for i in range(param_modes): - broad_tuple = np.ones(param_modes, dtype=int).tolist() - broad_tuple[i] = param_nums[i] - broad_tuple = tuple(broad_tuple) - print(broad_tuple) - param_ids[...,i] = np.arange(param_nums[i]).reshape(broad_tuple) - param_ids = param_ids.reshape(-1, param_modes) - # print(param_ids) - print(len(param_ids)) - - params = [] - expnames = [] - for i in range(param_ids.shape[0]): - one = "" - name = "" - param_id = param_ids[i] - for j in range(param_modes): - key = param_keys[j] - val = param_dict[key][param_id[j]] - if type(key) is tuple: - assert len(key) == len(val) - for k in range(len(key)): - one += get_param_str(key[k], val[k]) - name += f'{val[k]},' - name=name[:-1]+'-' - else: - one += get_param_str(key, val) - name += f'{val}-' - params.append(one) - name=name.replace(' ','') - print(name) - expnames.append(name[:-1]) - # print(params) - return params, expnames - - - - - - - -if __name__ == '__main__': - - - - # nerf - expFolder = "nerf/" - # parameters to iterate, use tuple to couple multiple parameters - datafolder = '/mnt/new_disk_2/anpei/Dataset/nerf_synthetic/' - param_dict = { - 'data_name': ['ship', 'mic', 'chair', 'lego', 'drums', 'ficus', 'hotdog', 'materials'], - 'data_dim_color': [13, 27, 54] - } - - # n_iters = 30000 - # for data_name in ['Robot']:#'Bike','Lifestyle','Palace','Robot','Spaceship','Steamtrain','Toad','Wineholder' - # cmd = f'CUDA_VISIBLE_DEVICES={cuda} python train.py ' \ - # f'--dataset_name nsvf --datadir /mnt/new_disk_2/anpei/Dataset/TeRF/Synthetic_NSVF/{data_name} '\ - # f'--expname {data_name} --batch_size {batch_size} ' \ - # f'--n_iters {n_iters} ' \ - # f'--N_voxel_init {128**3} --N_voxel_final {300**3} '\ - # f'--N_vis {5} ' \ - # f'--n_lamb_sigma "[16,16,16]" --n_lamb_sh "[48,48,48]" ' \ - # f'--upsamp_list "[2000, 3000, 4000, 5500,7000]" --update_AlphaMask_list "[3000,4000]" ' \ - # f'--shadingMode MLP_Fea --fea2denseAct softplus --view_pe {2} --fea_pe {2} ' \ - # f'--L1_weight_inital {8e-5} --L1_weight_rest {4e-5} --rm_weight_mask_thre {1e-4} --add_timestamp 0 ' \ - # f'--render_test 1 ' - # print(cmd) - # os.system(cmd) - - # nsvf - # expFolder = "nsvf_0227/" - # datafolder = '/mnt/new_disk_2/anpei/Dataset/TeRF/Synthetic_NSVF/' - # param_dict = { - # 'data_name': ['Robot','Steamtrain','Bike','Lifestyle','Palace','Spaceship','Toad','Wineholder'],#'Bike','Lifestyle','Palace','Robot','Spaceship','Steamtrain','Toad','Wineholder' - # 'shadingMode': ['SH'], - # ('n_lamb_sigma', 'n_lamb_sh'): [ ("[8,8,8]", "[8,8,8]")], - # ('view_pe', 'fea_pe', 'featureC','fea2denseAct','N_voxel_init') : [(2, 2, 128, 'softplus',128**3)], - # ('L1_weight_inital', 'L1_weight_rest', 
'rm_weight_mask_thre'):[(4e-5, 4e-5, 1e-4)], - # ('n_iters','N_voxel_final'): [(30000,300**3)], - # ('dataset_name','N_vis','render_test') : [("nsvf",5,1)], - # ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[3000,4000]")] - # - # } - - # tankstemple - # expFolder = "tankstemple_0304/" - # datafolder = '/mnt/new_disk_2/anpei/Dataset/TeRF/TanksAndTemple/' - # param_dict = { - # 'data_name': ['Truck','Barn','Caterpillar','Family','Ignatius'], - # 'shadingMode': ['MLP_Fea'], - # ('n_lamb_sigma', 'n_lamb_sh'): [("[16,16,16]", "[48,48,48]")], - # ('view_pe', 'fea_pe','fea2denseAct','N_voxel_init','render_test') : [(2, 2, 'softplus',128**3,1)], - # ('TV_weight_density','TV_weight_app'):[(0.1,0.01)], - # # ('L1_weight_inital', 'L1_weight_rest', 'rm_weight_mask_thre'): [(4e-5, 4e-5, 1e-4)], - # ('n_iters','N_voxel_final'): [(15000,300**3)], - # ('dataset_name','N_vis') : [("tankstemple",5)], - # ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[2000,4000]")] - # } - - # llff - # expFolder = "real_iconic/" - # datafolder = '/mnt/new_disk_2/anpei/Dataset/MVSNeRF/real_iconic/' - # List = os.listdir(datafolder) - # param_dict = { - # 'data_name': List, - # ('shadingMode', 'view_pe', 'fea_pe','fea2denseAct', 'nSamples','N_voxel_init') : [('MLP_Fea', 0, 0, 'relu',512,128**3)], - # ('n_lamb_sigma', 'n_lamb_sh') : [("[16,4,4]", "[48,12,12]")], - # ('TV_weight_density', 'TV_weight_app'):[(1.0,1.0)], - # ('n_iters','N_voxel_final'): [(25000,640**3)], - # ('dataset_name','downsample_train','ndc_ray','N_vis','render_path') : [("llff",4.0, 1,-1,1)], - # ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[2500]")], - # } - - # expFolder = "llff/" - # datafolder = '/mnt/new_disk_2/anpei/Dataset/MVSNeRF/nerf_llff_data' - # param_dict = { - # 'data_name': ['fern', 'flower', 'room', 'leaves', 'horns', 'trex', 'fortress', 'orchids'],#'fern', 'flower', 'room', 'leaves', 'horns', 'trex', 'fortress', 'orchids' - # ('n_lamb_sigma', 'n_lamb_sh'): [("[16,4,4]", "[48,12,12]")], - # ('shadingMode', 'view_pe', 'fea_pe', 'featureC','fea2denseAct', 'nSamples','N_voxel_init') : [('MLP_Fea', 0, 0, 128, 'relu',512,128**3),('SH', 0, 0, 128, 'relu',512,128**3)], - # ('TV_weight_density', 'TV_weight_app'):[(1.0,1.0)], - # ('n_iters','N_voxel_final'): [(25000,640**3)], - # ('dataset_name','downsample_train','ndc_ray','N_vis','render_test','render_path') : [("llff",4.0, 1,-1,1,1)], - # ('upsamp_list','update_AlphaMask_list'): [("[2000,3000,4000,5500,7000]","[2500]")], - # } - - #setting available gpus - gpus_que = queue.Queue(3) - for i in [1,2,3]: - gpus_que.put(i) - - os.makedirs(f"log/{expFolder}", exist_ok=True) - - def run_program(gpu, expname, param): - cmd = f'CUDA_VISIBLE_DEVICES={gpu} python train.py ' \ - f'--expname {expname} --basedir ./log/{expFolder} --config configs/lego.txt ' \ - f'{param}' \ - f'> "log/{expFolder}{expname}/{expname}.txt"' - print(cmd) - os.system(cmd) - gpus_que.put(gpu) - - params, expnames = get_param_list(param_dict) - - - logFolder=f"log/{expFolder}" - os.makedirs(logFolder, exist_ok=True) - - ths = [] - for i in range(len(params)): - - if getStopFolder(logFolder): - break - - - targetFolder = f"log/{expFolder}{expnames[i]}" - gpu = gpus_que.get() - getFolderLocker(logFolder) - if os.path.isdir(targetFolder): - releaseFolderLocker(logFolder) - gpus_que.put(gpu) - continue - else: - os.makedirs(targetFolder, exist_ok=True) - print("making",targetFolder, "running",expnames[i], params[i]) - releaseFolderLocker(logFolder) - - - t 
= threading.Thread(target=run_program, args=(gpu, expnames[i], params[i]), daemon=True) - t.start() - ths.append(t) - - for th in ths: - th.join() \ No newline at end of file diff --git a/TensoRF/extra/compute_metrics.py b/TensoRF/extra/compute_metrics.py deleted file mode 100644 index 59efcb2..0000000 --- a/TensoRF/extra/compute_metrics.py +++ /dev/null @@ -1,182 +0,0 @@ -import os, math -import numpy as np -import scipy.signal -from typing import List, Optional -from PIL import Image -import os -import torch -import configargparse - -__LPIPS__ = {} -def init_lpips(net_name, device): - assert net_name in ['alex', 'vgg'] - import lpips - print(f'init_lpips: lpips_{net_name}') - return lpips.LPIPS(net=net_name, version='0.1').eval().to(device) - -def rgb_lpips(np_gt, np_im, net_name, device): - if net_name not in __LPIPS__: - __LPIPS__[net_name] = init_lpips(net_name, device) - gt = torch.from_numpy(np_gt).permute([2, 0, 1]).contiguous().to(device) - im = torch.from_numpy(np_im).permute([2, 0, 1]).contiguous().to(device) - return __LPIPS__[net_name](gt, im, normalize=True).item() - - -def findItem(items, target): - for one in items: - if one[:len(target)]==target: - return one - return None - - -''' Evaluation metrics (ssim, lpips) -''' -def rgb_ssim(img0, img1, max_val, - filter_size=11, - filter_sigma=1.5, - k1=0.01, - k2=0.03, - return_map=False): - # Modified from https://github.com/google/mipnerf/blob/16e73dfdb52044dcceb47cda5243a686391a6e0f/internal/math.py#L58 - assert len(img0.shape) == 3 - assert img0.shape[-1] == 3 - assert img0.shape == img1.shape - - # Construct a 1D Gaussian blur filter. - hw = filter_size // 2 - shift = (2 * hw - filter_size + 1) / 2 - f_i = ((np.arange(filter_size) - hw + shift) / filter_sigma)**2 - filt = np.exp(-0.5 * f_i) - filt /= np.sum(filt) - - # Blur in x and y (faster than the 2D convolution). - def convolve2d(z, f): - return scipy.signal.convolve2d(z, f, mode='valid') - - filt_fn = lambda z: np.stack([ - convolve2d(convolve2d(z[...,i], filt[:, None]), filt[None, :]) - for i in range(z.shape[-1])], -1) - mu0 = filt_fn(img0) - mu1 = filt_fn(img1) - mu00 = mu0 * mu0 - mu11 = mu1 * mu1 - mu01 = mu0 * mu1 - sigma00 = filt_fn(img0**2) - mu00 - sigma11 = filt_fn(img1**2) - mu11 - sigma01 = filt_fn(img0 * img1) - mu01 - - # Clip the variances and covariances to valid values. 
- # Variance must be non-negative: - sigma00 = np.maximum(0., sigma00) - sigma11 = np.maximum(0., sigma11) - sigma01 = np.sign(sigma01) * np.minimum( - np.sqrt(sigma00 * sigma11), np.abs(sigma01)) - c1 = (k1 * max_val)**2 - c2 = (k2 * max_val)**2 - numer = (2 * mu01 + c1) * (2 * sigma01 + c2) - denom = (mu00 + mu11 + c1) * (sigma00 + sigma11 + c2) - ssim_map = numer / denom - ssim = np.mean(ssim_map) - return ssim_map if return_map else ssim - - -if __name__ == '__main__': - - parser = configargparse.ArgumentParser() - parser.add_argument("--exp", type=str, help="folder of exps") - parser.add_argument("--paramStr", type=str, help="str of params") - args = parser.parse_args() - - - # datanames = ['drums','hotdog','materials','ficus','lego','mic','ship','chair'] #['ship']# - # gtFolder = "/home/code-base/user_space/codes/nerf/data/nerf_synthetic" - # expFolder = "/home/code-base/user_space/codes/TensoRF/log/"+args.exp - - # datanames = ['room','fortress', 'flower','orchids','leaves','horns','trex','fern'] #['ship']# - # gtFolder = "/mnt/new_disk_2/anpei/Dataset/MVSNeRF/nerf_llff_data/" - # expFolder = "/mnt/new_disk_2/anpei/code/TensoRF/log/"+args.exp - paramStr = args.paramStr - fileNum = 200 - - - expitems = os.listdir(expFolder) - finalFolder = f'{expFolder}/finals/{paramStr}' - outFile = f'{finalFolder}/{paramStr}_metrics.txt' - os.makedirs(finalFolder, exist_ok=True) - - expitems.sort(reverse=True) - - - with open(outFile, 'w') as f: - all_psnr = [] - all_ssim = [] - all_alex = [] - all_vgg = [] - for dataname in datanames: - - - gtstr = gtFolder+"/"+dataname+"/test/r_%d.png" - expname = findItem(expitems, f'{paramStr}-{dataname}') - print("expname: ", expname) - if expname is None: - print("no ",dataname, "exists") - continue - resultstr = expFolder+"/"+expname+"/imgs_test_all/"+ dataname+"-"+paramStr+ "_%03d.png" - metric_file = f'{expFolder}/{expname}/imgs_test_all/{paramStr}-{dataname}_mean.txt' - video_file = f'{expFolder}/{expname}/imgs_test_all/{paramStr}-{dataname}_video.mp4' - - exist_metric=False - if os.path.isfile(metric_file): - metrics = np.loadtxt(metric_file) - print(metrics, metrics.tolist()) - if metrics.size == 4: - psnr, ssim, l_a, l_v = metrics.tolist() - exist_metric = True - os.system(f"cp {video_file} {finalFolder}/") - - if not exist_metric: - psnrs = [] - ssims = [] - l_alex = [] - l_vgg = [] - for i in range(fileNum): - gt = np.asarray(Image.open(gtstr%i),dtype=np.float32) / 255.0 - gtmask = gt[...,[3]] - gt = gt[...,:3] - gt = gt*gtmask + (1-gtmask) - img = np.asarray(Image.open(resultstr%i),dtype=np.float32)[...,:3] / 255.0 - # print(gt[0,0],img[0,0],gt.shape, img.shape, gt.max(), img.max()) - - - psnr = -10. 
* np.log10(np.mean(np.square(img - gt))) - ssim = rgb_ssim(img, gt, 1) - lpips_alex = rgb_lpips(gt, img, 'alex','cuda') - lpips_vgg = rgb_lpips(gt, img, 'vgg','cuda') - - print(i, psnr, ssim, lpips_alex, lpips_vgg) - psnrs.append(psnr) - ssims.append(ssim) - l_alex.append(lpips_alex) - l_vgg.append(lpips_vgg) - psnr = np.mean(np.array(psnrs)) - ssim = np.mean(np.array(ssims)) - l_a = np.mean(np.array(l_alex)) - l_v = np.mean(np.array(l_vgg)) - - rS=f'{dataname} : psnr {psnr} ssim {ssim} l_a {l_a} l_v {l_v}' - print(rS) - f.write(rS+"\n") - - all_psnr.append(psnr) - all_ssim.append(ssim) - all_alex.append(l_a) - all_vgg.append(l_v) - - psnr = np.mean(np.array(all_psnr)) - ssim = np.mean(np.array(all_ssim)) - l_a = np.mean(np.array(all_alex)) - l_v = np.mean(np.array(all_vgg)) - - rS=f'mean : psnr {psnr} ssim {ssim} l_a {l_a} l_v {l_v}' - print(rS) - f.write(rS+"\n") \ No newline at end of file diff --git a/TensoRF/fileServer.py b/TensoRF/fileServer.py deleted file mode 100644 index ffcb633..0000000 --- a/TensoRF/fileServer.py +++ /dev/null @@ -1,41 +0,0 @@ -from flask import Flask, appcontext_popped -from flask import send_from_directory -import time -from multiprocessing import Process -import os - - -app = Flask(__name__) - -@app.route('/output/videos/') -def send_video(path): - return send_from_directory('output/videos',path) - -@app.route('/output/models/') -def send_model(path): - return send_from_directory('output/models',path) - -def dummy_nerf(): - count = 0 - while(True): - print(f"Job {count} Started ") - time.sleep(1) - print(f"Job {count} complete") - print() - count+=1 - -def start_flask(): - global app - app.run(debug=True) - -# Demonstrating how files will be pulled from the cache -"""if __name__ == "__main__": - flaskProcess = Process(target=start_flask, args= ()) - nerfProcess = Process(target=dummy_nerf, args= ()) - - flaskProcess.start() - nerfProcess.start() - - flaskProcess.join() - nerfProcess.join()""" - diff --git a/TensoRF/log.py b/TensoRF/log.py deleted file mode 100644 index cd84aac..0000000 --- a/TensoRF/log.py +++ /dev/null @@ -1,24 +0,0 @@ -import logging - -def nerf_worker_logger(name='root'): - """ - Initializer for a global nerf-worker logger. 
- -> - To initialize use: 'logger = log.nerf_worker_logger(name)' - To retrieve in different context: 'logger = logging.getLogger(name)' - """ - formatter = logging.Formatter(fmt='%(asctime)s - %(levelname)s - %(module)s - %(message)s') - handler = logging.FileHandler(name+'.log', mode='w') - handler.setFormatter(formatter) - - logger = logging.getLogger(name) - logger.setLevel(logging.DEBUG) - logger.addHandler(handler) - return logger - -if __name__ == "__main__": - theta = nerf_worker_logger('nerf-worker-test') - theta.info("info message") - theta.warning("warning message") - theta.error("error message") - theta.critical("critical message") \ No newline at end of file diff --git a/TensoRF/main.py b/TensoRF/main.py deleted file mode 100644 index c43a135..0000000 --- a/TensoRF/main.py +++ /dev/null @@ -1,189 +0,0 @@ -from flask import Flask, send_from_directory -from pathlib import Path -from log import nerf_worker_logger -from opt import config_parser -from worker import train_tensorf, render_novel_view -from dotenv import load_dotenv -import requests -import pika -import json -import time -import shutil -import os -import functools -import threading -import logging -import torch -import multiprocessing as mp - -app = Flask(__name__) -base_url = "http://nerf-worker:5200/" - -@app.route("/data/nerf_data/") -def send_video(path): - return send_from_directory("data/nerf_data/", path) - -def start_flask(): - global app - app.run(host="0.0.0.0", port=5200, debug=False) - -def on_message(channel, method, header, body, args): - logger = logging.getLogger('nerf-worker') - logger.info("Received message") - thrds = args - delivery_tag = method.delivery_tag - - t = threading.Thread(target = run_nerf_job, args=(channel, method, delivery_tag, body)) - t.start() - thrds.append(t) - -def ack_publish_message(channel, delivery_tag, body): - logger = logging.getLogger('nerf-worker') - logger.info("Publishing message") - - if body: - channel.basic_publish(exchange='', routing_key='nerf-out', body=body) - channel.basic_ack(delivery_tag=delivery_tag) - -def run_nerf_job(channel, method, properties, body): - logger = logging.getLogger('nerf-worker') - - args = config_parser( - "--config configs/localworkerconfig_testsimon.txt") - - nerf_data = json.loads(body.decode()) - id = nerf_data["id"] - width = nerf_data["vid_width"] - height = nerf_data["vid_height"] - intrinsic_matrix = nerf_data["intrinsic_matrix"] - frames = nerf_data["frames"] - - logger.info(f"Running nerf job for {id}") - - input_dir = Path(f"data/sfm_data/{id}") - os.makedirs(input_dir, exist_ok=True) - - for i, fr_ in enumerate(frames): - # Save copy of motion data - url = fr_["file_path"] - img = requests.get(url) - fr_["file_path"] = f"{i}.png" - img_file_path = input_dir / fr_["file_path"] - img_file_path.write_bytes(img.content) - - # Save copy of transform data - input_train = input_dir / f"transforms_train.json" - input_render = input_dir / f"transforms_render.json" - input_train.write_text(json.dumps(nerf_data, indent=4)) - input_render.write_text(json.dumps(nerf_data, indent=4)) - - logger.info("Saved motion and transorm data") - - # Run TensoRF algorithm, creates sfm2nerf datatype for training - args.datadir += f"/{id}" - args.expname = id - logfolder, tensorf_model = train_tensorf(args) - local_video_path = render_novel_view(args, logfolder, tensorf_model) - - # Clear from RAM/VRAM to prevent detached thread leak (can be >20GB) - torch.cuda.empty_cache() - del tensorf_model - - # Save model and video to nerf_data for retrieval - 
out_model_path = Path(f"data/nerf_data/{id}/model.th") - out_video_path = Path(f"data/nerf_data/{id}/video.mp4") - os.makedirs(f"data/nerf_data/{id}", exist_ok=True) - shutil.copy(f"{logfolder}/imgs_render_all/video.mp4", out_video_path) - shutil.copy(f"{logfolder}/{id}.th", out_model_path) - - out_model_path = base_url + str(out_model_path) - out_video_path = base_url + str(out_video_path) - - nerf_output_object = { - "id": id, - "model_filepath": out_model_path, - "rendered_video_path": out_video_path - } - - # Use threadsafe callback to ack message and publish nerf_output_object - callback = functools.partial( - ack_publish_message, - channel, - method.delivery_tag, - json.dumps(nerf_output_object)) - - channel.connection.add_callback_threadsafe(callback) - - -def nerf_worker(i, *args): - logger = nerf_worker_logger('nerf-worker') - logger.info("~NERF WORKER~") - logger.info(f"CUDA Available: {torch.cuda.is_available()}") - if torch.cuda.is_available(): - logger.info(f"Available CUDA devices: {torch.cuda.device_count()}") - for i in range(torch.cuda.device_count()): - logger.info(f"CUDA Device {i}: {torch.cuda.get_device_name(i)}") - - - # TODO: Communicate with rabbitmq server on port defined in web-server arguments - load_dotenv() - rabbitmq_domain = "rabbitmq" - credentials = pika.PlainCredentials( - str(os.getenv("RABBITMQ_DEFAULT_USER")), str(os.getenv("RABBITMQ_DEFAULT_PASS"))) - parameters = pika.ConnectionParameters( - rabbitmq_domain, 5672, '/', credentials, heartbeat=300 - ) - - # retries connection until connects or 2 minutes pass - timeout = time.time() + 60 * 2 - while True: - if time.time() > timeout: - logger.critical("nerf_worker took too long to connect to rabbitmq") - raise Exception( - "nerf_worker took too long to connect to rabbitmq") - try: - threads = [] - connection = pika.BlockingConnection(parameters) - channel = connection.channel() - channel.queue_declare(queue='nerf-in') - channel.queue_declare(queue='nerf-out') - - # Will block until it creates a separate thread for each message - # This is to prevent the main thread from blocking - channel.basic_qos(prefetch_count=1) - on_message_callback = functools.partial(on_message, args = (threads)) - channel.basic_consume(queue='nerf-in', on_message_callback=on_message_callback, auto_ack=False) - try: - channel.start_consuming() - except KeyboardInterrupt: - channel.stop_consuming() - connection.close() - for thread in threads: - thread.join() - - except pika.exceptions.AMQPConnectionError: - continue - - -if __name__ == "__main__": - # IMPORTANT: FOR CUDA DEVICE USAGE - # flask must run in a normally FORKED python.multiprocessing process - # training and pika must run in a SPAWNED torch.multiprocessing process - # else you will have issues with redeclaring cuda devices - # if flask is not in forked process, web-server cannot send get requests, - # but nerf-worker will be able to send get requests to web-server - - # additional note: spawn does not inherent memory, so need to reinitialize - # the logger in the spawned process. 
This creates issues with both file descriptors - # pointing to the same file, so the __main__ logger will not be able to write to the file - # for now I have moved the logger to the nerf_worker process as the flask process never used - # the logger - - flaskProcess = mp.Process(target=start_flask, args=()) - flaskProcess.start() - nerfProcess = torch.multiprocessing.spawn(fn=nerf_worker, args=()) - nerfProcess.start() - flaskProcess.join() - nerfProcess.join() - - \ No newline at end of file diff --git a/TensoRF/models/__init__.py b/TensoRF/models/__init__.py deleted file mode 100644 index e69de29..0000000 diff --git a/TensoRF/models/sh.py b/TensoRF/models/sh.py deleted file mode 100644 index 27e3cad..0000000 --- a/TensoRF/models/sh.py +++ /dev/null @@ -1,133 +0,0 @@ -import torch - -################## sh function ################## -C0 = 0.28209479177387814 -C1 = 0.4886025119029199 -C2 = [ - 1.0925484305920792, - -1.0925484305920792, - 0.31539156525252005, - -1.0925484305920792, - 0.5462742152960396 -] -C3 = [ - -0.5900435899266435, - 2.890611442640554, - -0.4570457994644658, - 0.3731763325901154, - -0.4570457994644658, - 1.445305721320277, - -0.5900435899266435 -] -C4 = [ - 2.5033429417967046, - -1.7701307697799304, - 0.9461746957575601, - -0.6690465435572892, - 0.10578554691520431, - -0.6690465435572892, - 0.47308734787878004, - -1.7701307697799304, - 0.6258357354491761, -] - -def eval_sh(deg, sh, dirs): - """ - Evaluate spherical harmonics at unit directions - using hardcoded SH polynomials. - Works with torch/np/jnp. - ... Can be 0 or more batch dimensions. - :param deg: int SH max degree. Currently, 0-4 supported - :param sh: torch.Tensor SH coeffs (..., C, (max degree + 1) ** 2) - :param dirs: torch.Tensor unit directions (..., 3) - :return: (..., C) - """ - assert deg <= 4 and deg >= 0 - assert (deg + 1) ** 2 == sh.shape[-1] - C = sh.shape[-2] - - result = C0 * sh[..., 0] - if deg > 0: - x, y, z = dirs[..., 0:1], dirs[..., 1:2], dirs[..., 2:3] - result = (result - - C1 * y * sh[..., 1] + - C1 * z * sh[..., 2] - - C1 * x * sh[..., 3]) - if deg > 1: - xx, yy, zz = x * x, y * y, z * z - xy, yz, xz = x * y, y * z, x * z - result = (result + - C2[0] * xy * sh[..., 4] + - C2[1] * yz * sh[..., 5] + - C2[2] * (2.0 * zz - xx - yy) * sh[..., 6] + - C2[3] * xz * sh[..., 7] + - C2[4] * (xx - yy) * sh[..., 8]) - - if deg > 2: - result = (result + - C3[0] * y * (3 * xx - yy) * sh[..., 9] + - C3[1] * xy * z * sh[..., 10] + - C3[2] * y * (4 * zz - xx - yy)* sh[..., 11] + - C3[3] * z * (2 * zz - 3 * xx - 3 * yy) * sh[..., 12] + - C3[4] * x * (4 * zz - xx - yy) * sh[..., 13] + - C3[5] * z * (xx - yy) * sh[..., 14] + - C3[6] * x * (xx - 3 * yy) * sh[..., 15]) - if deg > 3: - result = (result + C4[0] * xy * (xx - yy) * sh[..., 16] + - C4[1] * yz * (3 * xx - yy) * sh[..., 17] + - C4[2] * xy * (7 * zz - 1) * sh[..., 18] + - C4[3] * yz * (7 * zz - 3) * sh[..., 19] + - C4[4] * (zz * (35 * zz - 30) + 3) * sh[..., 20] + - C4[5] * xz * (7 * zz - 3) * sh[..., 21] + - C4[6] * (xx - yy) * (7 * zz - 1) * sh[..., 22] + - C4[7] * xz * (xx - 3 * yy) * sh[..., 23] + - C4[8] * (xx * (xx - 3 * yy) - yy * (3 * xx - yy)) * sh[..., 24]) - return result - -def eval_sh_bases(deg, dirs): - """ - Evaluate spherical harmonics bases at unit directions, - without taking linear combination. - At each point, the final result may the be - obtained through simple multiplication. - :param deg: int SH max degree. 
Currently, 0-4 supported - :param dirs: torch.Tensor (..., 3) unit directions - :return: torch.Tensor (..., (deg+1) ** 2) - """ - assert deg <= 4 and deg >= 0 - result = torch.empty((*dirs.shape[:-1], (deg + 1) ** 2), dtype=dirs.dtype, device=dirs.device) - result[..., 0] = C0 - if deg > 0: - x, y, z = dirs.unbind(-1) - result[..., 1] = -C1 * y; - result[..., 2] = C1 * z; - result[..., 3] = -C1 * x; - if deg > 1: - xx, yy, zz = x * x, y * y, z * z - xy, yz, xz = x * y, y * z, x * z - result[..., 4] = C2[0] * xy; - result[..., 5] = C2[1] * yz; - result[..., 6] = C2[2] * (2.0 * zz - xx - yy); - result[..., 7] = C2[3] * xz; - result[..., 8] = C2[4] * (xx - yy); - - if deg > 2: - result[..., 9] = C3[0] * y * (3 * xx - yy); - result[..., 10] = C3[1] * xy * z; - result[..., 11] = C3[2] * y * (4 * zz - xx - yy); - result[..., 12] = C3[3] * z * (2 * zz - 3 * xx - 3 * yy); - result[..., 13] = C3[4] * x * (4 * zz - xx - yy); - result[..., 14] = C3[5] * z * (xx - yy); - result[..., 15] = C3[6] * x * (xx - 3 * yy); - - if deg > 3: - result[..., 16] = C4[0] * xy * (xx - yy); - result[..., 17] = C4[1] * yz * (3 * xx - yy); - result[..., 18] = C4[2] * xy * (7 * zz - 1); - result[..., 19] = C4[3] * yz * (7 * zz - 3); - result[..., 20] = C4[4] * (zz * (35 * zz - 30) + 3); - result[..., 21] = C4[5] * xz * (7 * zz - 3); - result[..., 22] = C4[6] * (xx - yy) * (7 * zz - 1); - result[..., 23] = C4[7] * xz * (xx - 3 * yy); - result[..., 24] = C4[8] * (xx * (xx - 3 * yy) - yy * (3 * xx - yy)); - return result diff --git a/TensoRF/models/tensoRF.py b/TensoRF/models/tensoRF.py deleted file mode 100644 index 68250df..0000000 --- a/TensoRF/models/tensoRF.py +++ /dev/null @@ -1,434 +0,0 @@ -from .tensorBase import * - - -class TensorVM(TensorBase): - def __init__(self, aabb, gridSize, device, **kargs): - super(TensorVM, self).__init__(aabb, gridSize, device, **kargs) - - - def init_svd_volume(self, res, device): - self.plane_coef = torch.nn.Parameter( - 0.1 * torch.randn((3, self.app_n_comp + self.density_n_comp, res, res), device=device)) - self.line_coef = torch.nn.Parameter( - 0.1 * torch.randn((3, self.app_n_comp + self.density_n_comp, res, 1), device=device)) - self.basis_mat = torch.nn.Linear(self.app_n_comp * 3, self.app_dim, bias=False, device=device) - - - def get_optparam_groups(self, lr_init_spatialxyz = 0.02, lr_init_network = 0.001): - grad_vars = [{'params': self.line_coef, 'lr': lr_init_spatialxyz}, {'params': self.plane_coef, 'lr': lr_init_spatialxyz}, - {'params': self.basis_mat.parameters(), 'lr':lr_init_network}] - if isinstance(self.renderModule, torch.nn.Module): - grad_vars += [{'params':self.renderModule.parameters(), 'lr':lr_init_network}] - return grad_vars - - def compute_features(self, xyz_sampled): - - coordinate_plane = torch.stack((xyz_sampled[..., self.matMode[0]], xyz_sampled[..., self.matMode[1]], xyz_sampled[..., self.matMode[2]])).detach() - coordinate_line = torch.stack((xyz_sampled[..., self.vecMode[0]], xyz_sampled[..., self.vecMode[1]], xyz_sampled[..., self.vecMode[2]])) - coordinate_line = torch.stack((torch.zeros_like(coordinate_line), coordinate_line), dim=-1).detach() - - plane_feats = F.grid_sample(self.plane_coef[:, -self.density_n_comp:], coordinate_plane, align_corners=True).view( - -1, *xyz_sampled.shape[:1]) - line_feats = F.grid_sample(self.line_coef[:, -self.density_n_comp:], coordinate_line, align_corners=True).view( - -1, *xyz_sampled.shape[:1]) - - sigma_feature = torch.sum(plane_feats * line_feats, dim=0) - - - plane_feats = F.grid_sample(self.plane_coef[:, 
:self.app_n_comp], coordinate_plane, align_corners=True).view(3 * self.app_n_comp, -1) - line_feats = F.grid_sample(self.line_coef[:, :self.app_n_comp], coordinate_line, align_corners=True).view(3 * self.app_n_comp, -1) - - - app_features = self.basis_mat((plane_feats * line_feats).T) - - return sigma_feature, app_features - - def compute_densityfeature(self, xyz_sampled): - coordinate_plane = torch.stack((xyz_sampled[..., self.matMode[0]], xyz_sampled[..., self.matMode[1]], xyz_sampled[..., self.matMode[2]])).detach().view(3, -1, 1, 2) - coordinate_line = torch.stack((xyz_sampled[..., self.vecMode[0]], xyz_sampled[..., self.vecMode[1]], xyz_sampled[..., self.vecMode[2]])) - coordinate_line = torch.stack((torch.zeros_like(coordinate_line), coordinate_line), dim=-1).detach().view(3, -1, 1, 2) - - plane_feats = F.grid_sample(self.plane_coef[:, -self.density_n_comp:], coordinate_plane, align_corners=True).view( - -1, *xyz_sampled.shape[:1]) - line_feats = F.grid_sample(self.line_coef[:, -self.density_n_comp:], coordinate_line, align_corners=True).view( - -1, *xyz_sampled.shape[:1]) - - sigma_feature = torch.sum(plane_feats * line_feats, dim=0) - - - return sigma_feature - - def compute_appfeature(self, xyz_sampled): - coordinate_plane = torch.stack((xyz_sampled[..., self.matMode[0]], xyz_sampled[..., self.matMode[1]], xyz_sampled[..., self.matMode[2]])).detach().view(3, -1, 1, 2) - coordinate_line = torch.stack((xyz_sampled[..., self.vecMode[0]], xyz_sampled[..., self.vecMode[1]], xyz_sampled[..., self.vecMode[2]])) - coordinate_line = torch.stack((torch.zeros_like(coordinate_line), coordinate_line), dim=-1).detach().view(3, -1, 1, 2) - - plane_feats = F.grid_sample(self.plane_coef[:, :self.app_n_comp], coordinate_plane, align_corners=True).view(3 * self.app_n_comp, -1) - line_feats = F.grid_sample(self.line_coef[:, :self.app_n_comp], coordinate_line, align_corners=True).view(3 * self.app_n_comp, -1) - - - app_features = self.basis_mat((plane_feats * line_feats).T) - - - return app_features - - - def vectorDiffs(self, vector_comps): - total = 0 - - for idx in range(len(vector_comps)): - # print(self.line_coef.shape, vector_comps[idx].shape) - n_comp, n_size = vector_comps[idx].shape[:-1] - - dotp = torch.matmul(vector_comps[idx].view(n_comp,n_size), vector_comps[idx].view(n_comp,n_size).transpose(-1,-2)) - # print(vector_comps[idx].shape, vector_comps[idx].view(n_comp,n_size).transpose(-1,-2).shape, dotp.shape) - non_diagonal = dotp.view(-1)[1:].view(n_comp-1, n_comp+1)[...,:-1] - # print(vector_comps[idx].shape, vector_comps[idx].view(n_comp,n_size).transpose(-1,-2).shape, dotp.shape,non_diagonal.shape) - total = total + torch.mean(torch.abs(non_diagonal)) - return total - - def vector_comp_diffs(self): - - return self.vectorDiffs(self.line_coef[:,-self.density_n_comp:]) + self.vectorDiffs(self.line_coef[:,:self.app_n_comp]) - - - @torch.no_grad() - def up_sampling_VM(self, plane_coef, line_coef, res_target): - - for i in range(len(self.vecMode)): - vec_id = self.vecMode[i] - mat_id_0, mat_id_1 = self.matMode[i] - - plane_coef[i] = torch.nn.Parameter( - F.interpolate(plane_coef[i].data, size=(res_target[mat_id_1], res_target[mat_id_0]), mode='bilinear', - align_corners=True)) - line_coef[i] = torch.nn.Parameter( - F.interpolate(line_coef[i].data, size=(res_target[vec_id], 1), mode='bilinear', align_corners=True)) - - # plane_coef[0] = torch.nn.Parameter( - # F.interpolate(plane_coef[0].data, size=(res_target[1], res_target[0]), mode='bilinear', - # align_corners=True)) - # line_coef[0] = 
torch.nn.Parameter( - # F.interpolate(line_coef[0].data, size=(res_target[2], 1), mode='bilinear', align_corners=True)) - # plane_coef[1] = torch.nn.Parameter( - # F.interpolate(plane_coef[1].data, size=(res_target[2], res_target[0]), mode='bilinear', - # align_corners=True)) - # line_coef[1] = torch.nn.Parameter( - # F.interpolate(line_coef[1].data, size=(res_target[1], 1), mode='bilinear', align_corners=True)) - # plane_coef[2] = torch.nn.Parameter( - # F.interpolate(plane_coef[2].data, size=(res_target[2], res_target[1]), mode='bilinear', - # align_corners=True)) - # line_coef[2] = torch.nn.Parameter( - # F.interpolate(line_coef[2].data, size=(res_target[0], 1), mode='bilinear', align_corners=True)) - - return plane_coef, line_coef - - @torch.no_grad() - def upsample_volume_grid(self, res_target): - # self.app_plane, self.app_line = self.up_sampling_VM(self.app_plane, self.app_line, res_target) - # self.density_plane, self.density_line = self.up_sampling_VM(self.density_plane, self.density_line, res_target) - - scale = res_target[0]/self.line_coef.shape[2] #assuming xyz have the same scale - plane_coef = F.interpolate(self.plane_coef.detach().data, scale_factor=scale, mode='bilinear',align_corners=True) - line_coef = F.interpolate(self.line_coef.detach().data, size=(res_target[0],1), mode='bilinear',align_corners=True) - self.plane_coef, self.line_coef = torch.nn.Parameter(plane_coef), torch.nn.Parameter(line_coef) - self.compute_stepSize(res_target) - self.logger.info("upsampling to {}".format(res_target)) - - -class TensorVMSplit(TensorBase): - def __init__(self, aabb, gridSize, device, **kargs): - super(TensorVMSplit, self).__init__(aabb, gridSize, device, **kargs) - - - def init_svd_volume(self, res, device): - self.density_plane, self.density_line = self.init_one_svd(self.density_n_comp, self.gridSize, 0.1, device) - self.app_plane, self.app_line = self.init_one_svd(self.app_n_comp, self.gridSize, 0.1, device) - self.basis_mat = torch.nn.Linear(sum(self.app_n_comp), self.app_dim, bias=False).to(device) - - - def init_one_svd(self, n_component, gridSize, scale, device): - plane_coef, line_coef = [], [] - for i in range(len(self.vecMode)): - vec_id = self.vecMode[i] - mat_id_0, mat_id_1 = self.matMode[i] - plane_coef.append(torch.nn.Parameter( - scale * torch.randn((1, n_component[i], gridSize[mat_id_1], gridSize[mat_id_0])))) # - line_coef.append( - torch.nn.Parameter(scale * torch.randn((1, n_component[i], gridSize[vec_id], 1)))) - - return torch.nn.ParameterList(plane_coef).to(device), torch.nn.ParameterList(line_coef).to(device) - - - - def get_optparam_groups(self, lr_init_spatialxyz = 0.02, lr_init_network = 0.001): - grad_vars = [{'params': self.density_line, 'lr': lr_init_spatialxyz}, {'params': self.density_plane, 'lr': lr_init_spatialxyz}, - {'params': self.app_line, 'lr': lr_init_spatialxyz}, {'params': self.app_plane, 'lr': lr_init_spatialxyz}, - {'params': self.basis_mat.parameters(), 'lr':lr_init_network}] - if isinstance(self.renderModule, torch.nn.Module): - grad_vars += [{'params':self.renderModule.parameters(), 'lr':lr_init_network}] - return grad_vars - - - def vectorDiffs(self, vector_comps): - total = 0 - - for idx in range(len(vector_comps)): - n_comp, n_size = vector_comps[idx].shape[1:-1] - - dotp = torch.matmul(vector_comps[idx].view(n_comp,n_size), vector_comps[idx].view(n_comp,n_size).transpose(-1,-2)) - non_diagonal = dotp.view(-1)[1:].view(n_comp-1, n_comp+1)[...,:-1] - total = total + torch.mean(torch.abs(non_diagonal)) - return total - - def 
vector_comp_diffs(self): - return self.vectorDiffs(self.density_line) + self.vectorDiffs(self.app_line) - - def density_L1(self): - total = 0 - for idx in range(len(self.density_plane)): - total = total + torch.mean(torch.abs(self.density_plane[idx])) + torch.mean(torch.abs(self.density_line[idx]))# + torch.mean(torch.abs(self.app_plane[idx])) + torch.mean(torch.abs(self.density_plane[idx])) - return total - - def TV_loss_density(self, reg): - total = 0 - for idx in range(len(self.density_plane)): - total = total + reg(self.density_plane[idx]) * 1e-2 #+ reg(self.density_line[idx]) * 1e-3 - return total - - def TV_loss_app(self, reg): - total = 0 - for idx in range(len(self.app_plane)): - total = total + reg(self.app_plane[idx]) * 1e-2 #+ reg(self.app_line[idx]) * 1e-3 - return total - - def compute_densityfeature(self, xyz_sampled): - - # plane + line basis - coordinate_plane = torch.stack((xyz_sampled[..., self.matMode[0]], xyz_sampled[..., self.matMode[1]], xyz_sampled[..., self.matMode[2]])).detach().view(3, -1, 1, 2) - coordinate_line = torch.stack((xyz_sampled[..., self.vecMode[0]], xyz_sampled[..., self.vecMode[1]], xyz_sampled[..., self.vecMode[2]])) - coordinate_line = torch.stack((torch.zeros_like(coordinate_line), coordinate_line), dim=-1).detach().view(3, -1, 1, 2) - - sigma_feature = torch.zeros((xyz_sampled.shape[0],), device=xyz_sampled.device) - for idx_plane in range(len(self.density_plane)): - plane_coef_point = F.grid_sample(self.density_plane[idx_plane], coordinate_plane[[idx_plane]], - align_corners=True).view(-1, *xyz_sampled.shape[:1]) - line_coef_point = F.grid_sample(self.density_line[idx_plane], coordinate_line[[idx_plane]], - align_corners=True).view(-1, *xyz_sampled.shape[:1]) - sigma_feature = sigma_feature + torch.sum(plane_coef_point * line_coef_point, dim=0) - - return sigma_feature - - - def compute_appfeature(self, xyz_sampled): - - # plane + line basis - coordinate_plane = torch.stack((xyz_sampled[..., self.matMode[0]], xyz_sampled[..., self.matMode[1]], xyz_sampled[..., self.matMode[2]])).detach().view(3, -1, 1, 2) - coordinate_line = torch.stack((xyz_sampled[..., self.vecMode[0]], xyz_sampled[..., self.vecMode[1]], xyz_sampled[..., self.vecMode[2]])) - coordinate_line = torch.stack((torch.zeros_like(coordinate_line), coordinate_line), dim=-1).detach().view(3, -1, 1, 2) - - plane_coef_point,line_coef_point = [],[] - for idx_plane in range(len(self.app_plane)): - plane_coef_point.append(F.grid_sample(self.app_plane[idx_plane], coordinate_plane[[idx_plane]], - align_corners=True).view(-1, *xyz_sampled.shape[:1])) - line_coef_point.append(F.grid_sample(self.app_line[idx_plane], coordinate_line[[idx_plane]], - align_corners=True).view(-1, *xyz_sampled.shape[:1])) - plane_coef_point, line_coef_point = torch.cat(plane_coef_point), torch.cat(line_coef_point) - - - return self.basis_mat((plane_coef_point * line_coef_point).T) - - - - @torch.no_grad() - def up_sampling_VM(self, plane_coef, line_coef, res_target): - - for i in range(len(self.vecMode)): - vec_id = self.vecMode[i] - mat_id_0, mat_id_1 = self.matMode[i] - plane_coef[i] = torch.nn.Parameter( - F.interpolate(plane_coef[i].data, size=(res_target[mat_id_1], res_target[mat_id_0]), mode='bilinear', - align_corners=True)) - line_coef[i] = torch.nn.Parameter( - F.interpolate(line_coef[i].data, size=(res_target[vec_id], 1), mode='bilinear', align_corners=True)) - - - return plane_coef, line_coef - - @torch.no_grad() - def upsample_volume_grid(self, res_target): - self.app_plane, self.app_line = 
self.up_sampling_VM(self.app_plane, self.app_line, res_target) - self.density_plane, self.density_line = self.up_sampling_VM(self.density_plane, self.density_line, res_target) - - self.update_stepSize(res_target) - self.logger.info("upsampling to {}".format(res_target)) - - @torch.no_grad() - def shrink(self, new_aabb): - self.logger.info("====> shrinking ...") - xyz_min, xyz_max = new_aabb - t_l, b_r = (xyz_min - self.aabb[0]) / self.units, (xyz_max - self.aabb[0]) / self.units - # print(new_aabb, self.aabb) - # print(t_l, b_r,self.alphaMask.alpha_volume.shape) - t_l, b_r = torch.round(torch.round(t_l)).long(), torch.round(b_r).long() + 1 - b_r = torch.stack([b_r, self.gridSize]).amin(0) - - for i in range(len(self.vecMode)): - mode0 = self.vecMode[i] - self.density_line[i] = torch.nn.Parameter( - self.density_line[i].data[...,t_l[mode0]:b_r[mode0],:] - ) - self.app_line[i] = torch.nn.Parameter( - self.app_line[i].data[...,t_l[mode0]:b_r[mode0],:] - ) - mode0, mode1 = self.matMode[i] - self.density_plane[i] = torch.nn.Parameter( - self.density_plane[i].data[...,t_l[mode1]:b_r[mode1],t_l[mode0]:b_r[mode0]] - ) - self.app_plane[i] = torch.nn.Parameter( - self.app_plane[i].data[...,t_l[mode1]:b_r[mode1],t_l[mode0]:b_r[mode0]] - ) - - - if not torch.all(self.alphaMask.gridSize == self.gridSize): - t_l_r, b_r_r = t_l / (self.gridSize-1), (b_r-1) / (self.gridSize-1) - correct_aabb = torch.zeros_like(new_aabb) - correct_aabb[0] = (1-t_l_r)*self.aabb[0] + t_l_r*self.aabb[1] - correct_aabb[1] = (1-b_r_r)*self.aabb[0] + b_r_r*self.aabb[1] - self.logger.info("aabb {} \ncorrect aabb".format(new_aabb,correct_aabb)) - new_aabb = correct_aabb - - newSize = b_r - t_l - self.aabb = new_aabb - self.update_stepSize((newSize[0], newSize[1], newSize[2])) - - -class TensorCP(TensorBase): - def __init__(self, aabb, gridSize, device, **kargs): - super(TensorCP, self).__init__(aabb, gridSize, device, **kargs) - - - def init_svd_volume(self, res, device): - self.density_line = self.init_one_svd(self.density_n_comp[0], self.gridSize, 0.2, device) - self.app_line = self.init_one_svd(self.app_n_comp[0], self.gridSize, 0.2, device) - self.basis_mat = torch.nn.Linear(self.app_n_comp[0], self.app_dim, bias=False).to(device) - - - def init_one_svd(self, n_component, gridSize, scale, device): - line_coef = [] - for i in range(len(self.vecMode)): - vec_id = self.vecMode[i] - line_coef.append( - torch.nn.Parameter(scale * torch.randn((1, n_component, gridSize[vec_id], 1)))) - return torch.nn.ParameterList(line_coef).to(device) - - - def get_optparam_groups(self, lr_init_spatialxyz = 0.02, lr_init_network = 0.001): - grad_vars = [{'params': self.density_line, 'lr': lr_init_spatialxyz}, - {'params': self.app_line, 'lr': lr_init_spatialxyz}, - {'params': self.basis_mat.parameters(), 'lr':lr_init_network}] - if isinstance(self.renderModule, torch.nn.Module): - grad_vars += [{'params':self.renderModule.parameters(), 'lr':lr_init_network}] - return grad_vars - - def compute_densityfeature(self, xyz_sampled): - - coordinate_line = torch.stack((xyz_sampled[..., self.vecMode[0]], xyz_sampled[..., self.vecMode[1]], xyz_sampled[..., self.vecMode[2]])) - coordinate_line = torch.stack((torch.zeros_like(coordinate_line), coordinate_line), dim=-1).detach().view(3, -1, 1, 2) - - - line_coef_point = F.grid_sample(self.density_line[0], coordinate_line[[0]], - align_corners=True).view(-1, *xyz_sampled.shape[:1]) - line_coef_point = line_coef_point * F.grid_sample(self.density_line[1], coordinate_line[[1]], - align_corners=True).view(-1, 
*xyz_sampled.shape[:1]) - line_coef_point = line_coef_point * F.grid_sample(self.density_line[2], coordinate_line[[2]], - align_corners=True).view(-1, *xyz_sampled.shape[:1]) - sigma_feature = torch.sum(line_coef_point, dim=0) - - - return sigma_feature - - def compute_appfeature(self, xyz_sampled): - - coordinate_line = torch.stack( - (xyz_sampled[..., self.vecMode[0]], xyz_sampled[..., self.vecMode[1]], xyz_sampled[..., self.vecMode[2]])) - coordinate_line = torch.stack((torch.zeros_like(coordinate_line), coordinate_line), dim=-1).detach().view(3, -1, 1, 2) - - - line_coef_point = F.grid_sample(self.app_line[0], coordinate_line[[0]], - align_corners=True).view(-1, *xyz_sampled.shape[:1]) - line_coef_point = line_coef_point * F.grid_sample(self.app_line[1], coordinate_line[[1]], - align_corners=True).view(-1, *xyz_sampled.shape[:1]) - line_coef_point = line_coef_point * F.grid_sample(self.app_line[2], coordinate_line[[2]], - align_corners=True).view(-1, *xyz_sampled.shape[:1]) - - return self.basis_mat(line_coef_point.T) - - - @torch.no_grad() - def up_sampling_Vector(self, density_line_coef, app_line_coef, res_target): - - for i in range(len(self.vecMode)): - vec_id = self.vecMode[i] - density_line_coef[i] = torch.nn.Parameter( - F.interpolate(density_line_coef[i].data, size=(res_target[vec_id], 1), mode='bilinear', align_corners=True)) - app_line_coef[i] = torch.nn.Parameter( - F.interpolate(app_line_coef[i].data, size=(res_target[vec_id], 1), mode='bilinear', align_corners=True)) - - return density_line_coef, app_line_coef - - @torch.no_grad() - def upsample_volume_grid(self, res_target): - self.density_line, self.app_line = self.up_sampling_Vector(self.density_line, self.app_line, res_target) - - self.update_stepSize(res_target) - self.logger.info("upsampling to {}".format(res_target)) - - @torch.no_grad() - def shrink(self, new_aabb): - self.logger.info("====> shrinking ...") - xyz_min, xyz_max = new_aabb - t_l, b_r = (xyz_min - self.aabb[0]) / self.units, (xyz_max - self.aabb[0]) / self.units - - t_l, b_r = torch.round(torch.round(t_l)).long(), torch.round(b_r).long() + 1 - b_r = torch.stack([b_r, self.gridSize]).amin(0) - - - for i in range(len(self.vecMode)): - mode0 = self.vecMode[i] - self.density_line[i] = torch.nn.Parameter( - self.density_line[i].data[...,t_l[mode0]:b_r[mode0],:] - ) - self.app_line[i] = torch.nn.Parameter( - self.app_line[i].data[...,t_l[mode0]:b_r[mode0],:] - ) - - if not torch.all(self.alphaMask.gridSize == self.gridSize): - t_l_r, b_r_r = t_l / (self.gridSize-1), (b_r-1) / (self.gridSize-1) - correct_aabb = torch.zeros_like(new_aabb) - correct_aabb[0] = (1-t_l_r)*self.aabb[0] + t_l_r*self.aabb[1] - correct_aabb[1] = (1-b_r_r)*self.aabb[0] + b_r_r*self.aabb[1] - self.logger.info("aabb {} \ncorrect aabb {}".format(new_aabb,correct_aabb)) - new_aabb = correct_aabb - - newSize = b_r - t_l - self.aabb = new_aabb - self.update_stepSize((newSize[0], newSize[1], newSize[2])) - - def density_L1(self): - total = 0 - for idx in range(len(self.density_line)): - total = total + torch.mean(torch.abs(self.density_line[idx])) - return total - - def TV_loss_density(self, reg): - total = 0 - for idx in range(len(self.density_line)): - total = total + reg(self.density_line[idx]) * 1e-3 - return total - - def TV_loss_app(self, reg): - total = 0 - for idx in range(len(self.app_line)): - total = total + reg(self.app_line[idx]) * 1e-3 - return total \ No newline at end of file diff --git a/TensoRF/models/tensorBase.py b/TensoRF/models/tensorBase.py deleted file mode 100644 
index a988fd4..0000000 --- a/TensoRF/models/tensorBase.py +++ /dev/null @@ -1,473 +0,0 @@ -import torch -import torch.nn -import torch.nn.functional as F -from .sh import eval_sh_bases -import numpy as np -import time - -import logging - - -def positional_encoding(positions, freqs): - - freq_bands = (2**torch.arange(freqs).float()).to(positions.device) # (F,) - pts = (positions[..., None] * freq_bands).reshape( - positions.shape[:-1] + (freqs * positions.shape[-1], )) # (..., DF) - pts = torch.cat([torch.sin(pts), torch.cos(pts)], dim=-1) - return pts - -def raw2alpha(sigma, dist): - # sigma, dist [N_rays, N_samples] - alpha = 1. - torch.exp(-sigma*dist) - - T = torch.cumprod(torch.cat([torch.ones(alpha.shape[0], 1).to(alpha.device), 1. - alpha + 1e-10], -1), -1) - - weights = alpha * T[:, :-1] # [N_rays, N_samples] - return alpha, weights, T[:,-1:] - - -def SHRender(xyz_sampled, viewdirs, features): - sh_mult = eval_sh_bases(2, viewdirs)[:, None] - rgb_sh = features.view(-1, 3, sh_mult.shape[-1]) - rgb = torch.relu(torch.sum(sh_mult * rgb_sh, dim=-1) + 0.5) - return rgb - - -def RGBRender(xyz_sampled, viewdirs, features): - - rgb = features - return rgb - -class AlphaGridMask(torch.nn.Module): - def __init__(self, device, aabb, alpha_volume): - super(AlphaGridMask, self).__init__() - self.device = device - - self.aabb=aabb.to(self.device) - self.aabbSize = self.aabb[1] - self.aabb[0] - self.invgridSize = 1.0/self.aabbSize * 2 - self.alpha_volume = alpha_volume.view(1,1,*alpha_volume.shape[-3:]) - self.gridSize = torch.LongTensor([alpha_volume.shape[-1],alpha_volume.shape[-2],alpha_volume.shape[-3]]).to(self.device) - - def sample_alpha(self, xyz_sampled): - xyz_sampled = self.normalize_coord(xyz_sampled) - alpha_vals = F.grid_sample(self.alpha_volume, xyz_sampled.view(1,-1,1,1,3), align_corners=True).view(-1) - - return alpha_vals - - def normalize_coord(self, xyz_sampled): - return (xyz_sampled-self.aabb[0]) * self.invgridSize - 1 - - -class MLPRender_Fea(torch.nn.Module): - def __init__(self,inChanel, viewpe=6, feape=6, featureC=128): - super(MLPRender_Fea, self).__init__() - - self.in_mlpC = 2*viewpe*3 + 2*feape*inChanel + 3 + inChanel - self.viewpe = viewpe - self.feape = feape - layer1 = torch.nn.Linear(self.in_mlpC, featureC) - layer2 = torch.nn.Linear(featureC, featureC) - layer3 = torch.nn.Linear(featureC,3) - - self.mlp = torch.nn.Sequential(layer1, torch.nn.ReLU(inplace=True), layer2, torch.nn.ReLU(inplace=True), layer3) - torch.nn.init.constant_(self.mlp[-1].bias, 0) - - def forward(self, pts, viewdirs, features): - indata = [features, viewdirs] - if self.feape > 0: - indata += [positional_encoding(features, self.feape)] - if self.viewpe > 0: - indata += [positional_encoding(viewdirs, self.viewpe)] - mlp_in = torch.cat(indata, dim=-1) - rgb = self.mlp(mlp_in) - rgb = torch.sigmoid(rgb) - - return rgb - -class MLPRender_PE(torch.nn.Module): - def __init__(self,inChanel, viewpe=6, pospe=6, featureC=128): - super(MLPRender_PE, self).__init__() - - self.in_mlpC = (3+2*viewpe*3)+ (3+2*pospe*3) + inChanel # - self.viewpe = viewpe - self.pospe = pospe - layer1 = torch.nn.Linear(self.in_mlpC, featureC) - layer2 = torch.nn.Linear(featureC, featureC) - layer3 = torch.nn.Linear(featureC,3) - - self.mlp = torch.nn.Sequential(layer1, torch.nn.ReLU(inplace=True), layer2, torch.nn.ReLU(inplace=True), layer3) - torch.nn.init.constant_(self.mlp[-1].bias, 0) - - def forward(self, pts, viewdirs, features): - indata = [features, viewdirs] - if self.pospe > 0: - indata += [positional_encoding(pts, 
self.pospe)] - if self.viewpe > 0: - indata += [positional_encoding(viewdirs, self.viewpe)] - mlp_in = torch.cat(indata, dim=-1) - rgb = self.mlp(mlp_in) - rgb = torch.sigmoid(rgb) - - return rgb - -class MLPRender(torch.nn.Module): - def __init__(self,inChanel, viewpe=6, featureC=128): - super(MLPRender, self).__init__() - - self.in_mlpC = (3+2*viewpe*3) + inChanel - self.viewpe = viewpe - - layer1 = torch.nn.Linear(self.in_mlpC, featureC) - layer2 = torch.nn.Linear(featureC, featureC) - layer3 = torch.nn.Linear(featureC,3) - - self.mlp = torch.nn.Sequential(layer1, torch.nn.ReLU(inplace=True), layer2, torch.nn.ReLU(inplace=True), layer3) - torch.nn.init.constant_(self.mlp[-1].bias, 0) - - def forward(self, pts, viewdirs, features): - indata = [features, viewdirs] - if self.viewpe > 0: - indata += [positional_encoding(viewdirs, self.viewpe)] - mlp_in = torch.cat(indata, dim=-1) - rgb = self.mlp(mlp_in) - rgb = torch.sigmoid(rgb) - - return rgb - - - -class TensorBase(torch.nn.Module): - def __init__(self, aabb, gridSize, device, density_n_comp = 8, appearance_n_comp = 24, app_dim = 27, - shadingMode = 'MLP_PE', alphaMask = None, near_far=[2.0,6.0], - density_shift = -10, alphaMask_thres=0.001, distance_scale=25, rayMarch_weight_thres=0.0001, - pos_pe = 6, view_pe = 6, fea_pe = 6, featureC=128, step_ratio=2.0, - fea2denseAct = 'softplus'): - super(TensorBase, self).__init__() - - self.density_n_comp = density_n_comp - self.app_n_comp = appearance_n_comp - self.app_dim = app_dim - self.aabb = aabb - self.alphaMask = alphaMask - self.device=device - - self.density_shift = density_shift - self.alphaMask_thres = alphaMask_thres - self.distance_scale = distance_scale - self.rayMarch_weight_thres = rayMarch_weight_thres - self.fea2denseAct = fea2denseAct - - self.near_far = near_far - self.step_ratio = step_ratio - - self.logger = logging.getLogger('nerf-worker') - - self.update_stepSize(gridSize) - - self.matMode = [[0,1], [0,2], [1,2]] - self.vecMode = [2, 1, 0] - self.comp_w = [1,1,1] - - - self.init_svd_volume(gridSize[0], device) - - self.shadingMode, self.pos_pe, self.view_pe, self.fea_pe, self.featureC = shadingMode, pos_pe, view_pe, fea_pe, featureC - self.init_render_func(shadingMode, pos_pe, view_pe, fea_pe, featureC, device) - - - def init_render_func(self, shadingMode, pos_pe, view_pe, fea_pe, featureC, device): - if shadingMode == 'MLP_PE': - self.renderModule = MLPRender_PE(self.app_dim, view_pe, pos_pe, featureC).to(device) - elif shadingMode == 'MLP_Fea': - self.renderModule = MLPRender_Fea(self.app_dim, view_pe, fea_pe, featureC).to(device) - elif shadingMode == 'MLP': - self.renderModule = MLPRender(self.app_dim, view_pe, featureC).to(device) - elif shadingMode == 'SH': - self.renderModule = SHRender - elif shadingMode == 'RGB': - assert self.app_dim == 3 - self.renderModule = RGBRender - else: - self.logger.critical("Unrecognized shading module") - exit() - self.logger.info("pos_pe {} view_pe {} fea_pe {}".format(pos_pe,view_pe,fea_pe)) - self.logger.info(self.renderModule) - - def update_stepSize(self, gridSize): - self.logger.info("aabb {}".format(self.aabb.view(-1))) - self.logger.info("grid size {}".format(gridSize)) - self.aabbSize = self.aabb[1] - self.aabb[0] - self.invaabbSize = 2.0/self.aabbSize - self.gridSize= torch.LongTensor(gridSize).to(self.device) - self.units=self.aabbSize / (self.gridSize-1) - self.stepSize=torch.mean(self.units)*self.step_ratio - self.aabbDiag = torch.sqrt(torch.sum(torch.square(self.aabbSize))) - self.nSamples=int((self.aabbDiag / 
self.stepSize).item()) + 1 - self.logger.info("sampling step size: {}".format(self.stepSize)) - self.logger.info("sampling number: ".format(self.nSamples)) - - def init_svd_volume(self, res, device): - pass - - def compute_features(self, xyz_sampled): - pass - - def compute_densityfeature(self, xyz_sampled): - pass - - def compute_appfeature(self, xyz_sampled): - pass - - def normalize_coord(self, xyz_sampled): - return (xyz_sampled-self.aabb[0]) * self.invaabbSize - 1 - - def get_optparam_groups(self, lr_init_spatial = 0.02, lr_init_network = 0.001): - pass - - def get_kwargs(self): - return { - 'aabb': self.aabb, - 'gridSize':self.gridSize.tolist(), - 'density_n_comp': self.density_n_comp, - 'appearance_n_comp': self.app_n_comp, - 'app_dim': self.app_dim, - - 'density_shift': self.density_shift, - 'alphaMask_thres': self.alphaMask_thres, - 'distance_scale': self.distance_scale, - 'rayMarch_weight_thres': self.rayMarch_weight_thres, - 'fea2denseAct': self.fea2denseAct, - - 'near_far': self.near_far, - 'step_ratio': self.step_ratio, - - 'shadingMode': self.shadingMode, - 'pos_pe': self.pos_pe, - 'view_pe': self.view_pe, - 'fea_pe': self.fea_pe, - 'featureC': self.featureC - } - - def save(self, path): - kwargs = self.get_kwargs() - ckpt = {'kwargs': kwargs, 'state_dict': self.state_dict()} - if self.alphaMask is not None: - alpha_volume = self.alphaMask.alpha_volume.bool().cpu().numpy() - ckpt.update({'alphaMask.shape':alpha_volume.shape}) - ckpt.update({'alphaMask.mask':np.packbits(alpha_volume.reshape(-1))}) - ckpt.update({'alphaMask.aabb': self.alphaMask.aabb.cpu()}) - torch.save(ckpt, path) - - def load(self, ckpt): - if 'alphaMask.aabb' in ckpt.keys(): - length = np.prod(ckpt['alphaMask.shape']) - alpha_volume = torch.from_numpy(np.unpackbits(ckpt['alphaMask.mask'])[:length].reshape(ckpt['alphaMask.shape'])) - self.alphaMask = AlphaGridMask(self.device, ckpt['alphaMask.aabb'].to(self.device), alpha_volume.float().to(self.device)) - self.load_state_dict(ckpt['state_dict']) - - - def sample_ray_ndc(self, rays_o, rays_d, is_train=True, N_samples=-1): - N_samples = N_samples if N_samples > 0 else self.nSamples - near, far = self.near_far - interpx = torch.linspace(near, far, N_samples).unsqueeze(0).to(rays_o) - if is_train: - interpx += torch.rand_like(interpx).to(rays_o) * ((far - near) / N_samples) - - rays_pts = rays_o[..., None, :] + rays_d[..., None, :] * interpx[..., None] - mask_outbbox = ((self.aabb[0] > rays_pts) | (rays_pts > self.aabb[1])).any(dim=-1) - return rays_pts, interpx, ~mask_outbbox - - def sample_ray(self, rays_o, rays_d, is_train=True, N_samples=-1): - N_samples = N_samples if N_samples>0 else self.nSamples - stepsize = self.stepSize - near, far = self.near_far - vec = torch.where(rays_d==0, torch.full_like(rays_d, 1e-6), rays_d) - rate_a = (self.aabb[1] - rays_o) / vec - rate_b = (self.aabb[0] - rays_o) / vec - t_min = torch.minimum(rate_a, rate_b).amax(-1).clamp(min=near, max=far) - - rng = torch.arange(N_samples)[None].float() - if is_train: - rng = rng.repeat(rays_d.shape[-2],1) - rng += torch.rand_like(rng[:,[0]]) - step = stepsize * rng.to(rays_o.device) - interpx = (t_min[...,None] + step) - - rays_pts = rays_o[...,None,:] + rays_d[...,None,:] * interpx[...,None] - mask_outbbox = ((self.aabb[0]>rays_pts) | (rays_pts>self.aabb[1])).any(dim=-1) - - return rays_pts, interpx, ~mask_outbbox - - - def shrink(self, new_aabb, voxel_size): - pass - - @torch.no_grad() - def getDenseAlpha(self,gridSize=None): - gridSize = self.gridSize if gridSize is None else gridSize 
- - samples = torch.stack(torch.meshgrid( - torch.linspace(0, 1, gridSize[0]), - torch.linspace(0, 1, gridSize[1]), - torch.linspace(0, 1, gridSize[2]), - ), -1).to(self.device) - dense_xyz = self.aabb[0] * (1-samples) + self.aabb[1] * samples - - # dense_xyz = dense_xyz - # print(self.stepSize, self.distance_scale*self.aabbDiag) - # computes the mask that prunes the model (How?) - alpha = torch.zeros_like(dense_xyz[...,0]) - for i in range(gridSize[0]): - alpha[i] = self.compute_alpha(dense_xyz[i].view(-1,3), self.stepSize).view((gridSize[1], gridSize[2])) - return alpha, dense_xyz - - # triggered every thousand training iterations - # reduce model size - @torch.no_grad() - def updateAlphaMask(self, gridSize=(200,200,200)): - - alpha, dense_xyz = self.getDenseAlpha(gridSize) - dense_xyz = dense_xyz.transpose(0,2).contiguous() - alpha = alpha.clamp(0,1).transpose(0,2).contiguous()[None,None] - total_voxels = gridSize[0] * gridSize[1] * gridSize[2] - - ks = 3 - alpha = F.max_pool3d(alpha, kernel_size=ks, padding=ks // 2, stride=1).view(gridSize[::-1]) - alpha[alpha>=self.alphaMask_thres] = 1 - alpha[alpha<self.alphaMask_thres] = 0 - - self.alphaMask = AlphaGridMask(self.device, self.aabb, alpha) - - valid_xyz = dense_xyz[alpha>0.5] - - xyz_min = valid_xyz.amin(0) - xyz_max = valid_xyz.amax(0) - - new_aabb = torch.stack((xyz_min, xyz_max)) - - total = torch.sum(alpha) - self.logger.info(f"bbox: {xyz_min, xyz_max} alpha rest %%%f"%(total/total_voxels*100)) - return new_aabb - - @torch.no_grad() - def filtering_rays(self, all_rays, all_rgbs, N_samples=256, chunk=10240*5, bbox_only=False): - self.logger.info("========> filtering rays ...") - tt = time.time() - - N = torch.tensor(all_rays.shape[:-1]).prod() - - mask_filtered = [] - idx_chunks = torch.split(torch.arange(N), chunk) - for idx_chunk in idx_chunks: - rays_chunk = all_rays[idx_chunk].to(self.device) - - rays_o, rays_d = rays_chunk[..., :3], rays_chunk[..., 3:6] - if bbox_only: - vec = torch.where(rays_d == 0, torch.full_like(rays_d, 1e-6), rays_d) - rate_a = (self.aabb[1] - rays_o) / vec - rate_b = (self.aabb[0] - rays_o) / vec - t_min = torch.minimum(rate_a, rate_b).amax(-1)#.clamp(min=near, max=far) - t_max = torch.maximum(rate_a, rate_b).amin(-1)#.clamp(min=near, max=far) - mask_inbbox = t_max > t_min - - else: - xyz_sampled, _,_ = self.sample_ray(rays_o, rays_d, N_samples=N_samples, is_train=False) - mask_inbbox= (self.alphaMask.sample_alpha(xyz_sampled).view(xyz_sampled.shape[:-1]) > 0).any(-1) - - mask_filtered.append(mask_inbbox.cpu()) - - mask_filtered = torch.cat(mask_filtered).view(all_rgbs.shape[:-1]) - - self.logger.info(f'Ray filtering done! takes {time.time()-tt} s.
ray mask ratio: {torch.sum(mask_filtered) / N}') - return all_rays[mask_filtered], all_rgbs[mask_filtered] - - - def feature2density(self, density_features): - if self.fea2denseAct == "softplus": - return F.softplus(density_features+self.density_shift) - elif self.fea2denseAct == "relu": - return F.relu(density_features) - - - def compute_alpha(self, xyz_locs, length=1): - - if self.alphaMask is not None: - alphas = self.alphaMask.sample_alpha(xyz_locs) - alpha_mask = alphas > 0 - else: - alpha_mask = torch.ones_like(xyz_locs[:,0], dtype=bool) - - - sigma = torch.zeros(xyz_locs.shape[:-1], device=xyz_locs.device) - - if alpha_mask.any(): - xyz_sampled = self.normalize_coord(xyz_locs[alpha_mask]) - sigma_feature = self.compute_densityfeature(xyz_sampled) - validsigma = self.feature2density(sigma_feature) - sigma[alpha_mask] = validsigma - - - alpha = 1 - torch.exp(-sigma*length).view(xyz_locs.shape[:-1]) - - return alpha - - - def forward(self, rays_chunk, white_bg=True, is_train=False, ndc_ray=False, N_samples=-1): - - # sample points - viewdirs = rays_chunk[:, 3:6] - if ndc_ray: - xyz_sampled, z_vals, ray_valid = self.sample_ray_ndc(rays_chunk[:, :3], viewdirs, is_train=is_train,N_samples=N_samples) - dists = torch.cat((z_vals[:, 1:] - z_vals[:, :-1], torch.zeros_like(z_vals[:, :1])), dim=-1) - rays_norm = torch.norm(viewdirs, dim=-1, keepdim=True) - dists = dists * rays_norm - viewdirs = viewdirs / rays_norm - else: - xyz_sampled, z_vals, ray_valid = self.sample_ray(rays_chunk[:, :3], viewdirs, is_train=is_train,N_samples=N_samples) - dists = torch.cat((z_vals[:, 1:] - z_vals[:, :-1], torch.zeros_like(z_vals[:, :1])), dim=-1) - viewdirs = viewdirs.view(-1, 1, 3).expand(xyz_sampled.shape) - - if self.alphaMask is not None: - alphas = self.alphaMask.sample_alpha(xyz_sampled[ray_valid]) - alpha_mask = alphas > 0 - ray_invalid = ~ray_valid - ray_invalid[ray_valid] |= (~alpha_mask) - ray_valid = ~ray_invalid - - - sigma = torch.zeros(xyz_sampled.shape[:-1], device=xyz_sampled.device) - rgb = torch.zeros((*xyz_sampled.shape[:2], 3), device=xyz_sampled.device) - - if ray_valid.any(): - xyz_sampled = self.normalize_coord(xyz_sampled) - sigma_feature = self.compute_densityfeature(xyz_sampled[ray_valid]) - - validsigma = self.feature2density(sigma_feature) - sigma[ray_valid] = validsigma - - - alpha, weight, bg_weight = raw2alpha(sigma, dists * self.distance_scale) - - app_mask = weight > self.rayMarch_weight_thres - - if app_mask.any(): - app_features = self.compute_appfeature(xyz_sampled[app_mask]) - valid_rgbs = self.renderModule(xyz_sampled[app_mask], viewdirs[app_mask], app_features) - rgb[app_mask] = valid_rgbs - - acc_map = torch.sum(weight, -1) - rgb_map = torch.sum(weight[..., None] * rgb, -2) - - if white_bg or (is_train and torch.rand((1,))<0.5): - rgb_map = rgb_map + (1. - acc_map[..., None]) - - - rgb_map = rgb_map.clamp(0,1) - - with torch.no_grad(): - depth_map = torch.sum(weight * z_vals, -1) - depth_map = depth_map + (1. - acc_map) * rays_chunk[..., -1] - - return rgb_map, depth_map # rgb, sigma, alpha, weight, bg_weight - diff --git a/TensoRF/opt.py b/TensoRF/opt.py deleted file mode 100644 index cd727b9..0000000 --- a/TensoRF/opt.py +++ /dev/null @@ -1,139 +0,0 @@ -import configargparse - -def config_parser(cmd=None): - parser = configargparse.ArgumentParser() - - # Argument to turn off saving .PNG files. 
Useful for timesaving: 1 = save, 0 = don't save - parser.add_argument('--png_mode', type=int, default=1, - help='argument that turns off saving .png files') - - parser.add_argument('--config', is_config_file=True, - help='config file path') - parser.add_argument("--expname", type=str, - help='experiment name') - parser.add_argument("--basedir", type=str, default='./log', - help='where to store ckpts and logs') - parser.add_argument("--add_timestamp", type=int, default=0, - help='add timestamp to dir') - parser.add_argument("--datadir", type=str, default='./data/llff/fern', - help='input data directory') - parser.add_argument("--progress_refresh_rate", type=int, default=10, - help='how many iterations to show psnrs or iters') - - parser.add_argument('--with_depth', action='store_true') - parser.add_argument('--downsample_train', type=float, default=1.0) - parser.add_argument('--downsample_test', type=float, default=1.0) - - parser.add_argument('--model_name', type=str, default='TensorVMSplit', - choices=['TensorVMSplit', 'TensorCP']) - - # loader options - parser.add_argument("--batch_size", type=int, default=4096) - parser.add_argument("--n_iters", type=int, default=30000) - - parser.add_argument('--dataset_name', type=str, default='blender', - choices=['blender', 'llff', 'nsvf', 'dtu','tankstemple', 'own_data', 'sfm2nerf']) - - - # training options - # learning rate - parser.add_argument("--lr_init", type=float, default=0.02, - help='learning rate') - parser.add_argument("--lr_basis", type=float, default=1e-3, - help='learning rate') - parser.add_argument("--lr_decay_iters", type=int, default=-1, - help = 'number of iterations the lr will decay to the target ratio; -1 will set it to n_iters') - parser.add_argument("--lr_decay_target_ratio", type=float, default=0.1, - help='the target decay ratio; after decay_iters inital lr decays to lr*ratio') - parser.add_argument("--lr_upsample_reset", type=int, default=1, - help='reset lr to inital after upsampling') - - # loss - parser.add_argument("--L1_weight_inital", type=float, default=0.0, - help='loss weight') - parser.add_argument("--L1_weight_rest", type=float, default=0, - help='loss weight') - parser.add_argument("--Ortho_weight", type=float, default=0.0, - help='loss weight') - parser.add_argument("--TV_weight_density", type=float, default=0.0, - help='loss weight') - parser.add_argument("--TV_weight_app", type=float, default=0.0, - help='loss weight') - - # model - # volume options - parser.add_argument("--n_lamb_sigma", type=int, action="append") - parser.add_argument("--n_lamb_sh", type=int, action="append") - parser.add_argument("--data_dim_color", type=int, default=27) - - parser.add_argument("--rm_weight_mask_thre", type=float, default=0.0001, - help='mask points in ray marching') - parser.add_argument("--alpha_mask_thre", type=float, default=0.0001, - help='threshold for creating alpha mask volume') - parser.add_argument("--distance_scale", type=float, default=25, - help='scaling sampling distance for computation') - parser.add_argument("--density_shift", type=float, default=-10, - help='shift density in softplus; making density = 0 when feature == 0') - - # network decoder - parser.add_argument("--shadingMode", type=str, default="MLP_PE", - help='which shading mode to use') - parser.add_argument("--pos_pe", type=int, default=6, - help='number of pe for pos') - parser.add_argument("--view_pe", type=int, default=6, - help='number of pe for view') - parser.add_argument("--fea_pe", type=int, default=6, - help='number of pe for 
features') - parser.add_argument("--featureC", type=int, default=128, - help='hidden feature channel in MLP') - - - - parser.add_argument("--ckpt", type=str, default=None, - help='specific weights npy file to reload for coarse network') - parser.add_argument("--render_only", type=int, default=0) - parser.add_argument("--render_test", type=int, default=0) - parser.add_argument("--render_train", type=int, default=0) - parser.add_argument("--render_path", type=int, default=0) - parser.add_argument("--export_mesh", type=int, default=0) - - # rendering options - parser.add_argument('--lindisp', default=False, action="store_true", - help='use disparity depth sampling') - parser.add_argument("--perturb", type=float, default=1., - help='set to 0. for no jitter, 1. for jitter') - parser.add_argument("--accumulate_decay", type=float, default=0.998) - parser.add_argument("--fea2denseAct", type=str, default='softplus') - parser.add_argument('--ndc_ray', type=int, default=0) - parser.add_argument('--nSamples', type=int, default=1e6, - help='sample point each ray, pass 1e6 if automatic adjust') - parser.add_argument('--step_ratio',type=float,default=0.5) - - - ## blender flags - parser.add_argument("--white_bkgd", action='store_true', - help='set to render synthetic data on a white bkgd (always use for dvoxels)') - - - - parser.add_argument('--N_voxel_init', - type=int, - default=100**3) - parser.add_argument('--N_voxel_final', - type=int, - default=300**3) - parser.add_argument("--upsamp_list", type=int, action="append") - parser.add_argument("--update_AlphaMask_list", type=int, action="append") - - parser.add_argument('--idx_view', - type=int, - default=0) - # logging/saving options - parser.add_argument("--N_vis", type=int, default=5, - help='N images to vis') - parser.add_argument("--vis_every", type=int, default=10000, - help='frequency of visualize the image') - if cmd is not None: - return parser.parse_args(cmd) - else: - return parser.parse_args() \ No newline at end of file diff --git a/TensoRF/renderer.py b/TensoRF/renderer.py deleted file mode 100644 index 64ab94a..0000000 --- a/TensoRF/renderer.py +++ /dev/null @@ -1,145 +0,0 @@ -import torch,os,imageio,sys -from tqdm.auto import tqdm -from dataLoader.ray_utils import get_rays -from models.tensoRF import TensorVM, TensorCP, raw2alpha, TensorVMSplit, AlphaGridMask -from utils import * -from dataLoader.ray_utils import ndc_rays_blender - - -def OctreeRender_trilinear_fast(rays, tensorf, chunk=4096, N_samples=-1, ndc_ray=False, white_bg=True, is_train=False, device='cuda'): - - rgbs, alphas, depth_maps, weights, uncertainties = [], [], [], [], [] - N_rays_all = rays.shape[0] - for chunk_idx in range(N_rays_all // chunk + int(N_rays_all % chunk > 0)): - rays_chunk = rays[chunk_idx * chunk:(chunk_idx + 1) * chunk].to(device) - - rgb_map, depth_map = tensorf(rays_chunk, is_train=is_train, white_bg=white_bg, ndc_ray=ndc_ray, N_samples=N_samples) - - rgbs.append(rgb_map) - depth_maps.append(depth_map) - - return torch.cat(rgbs), None, torch.cat(depth_maps), None, None - -@torch.no_grad() -def evaluation(test_dataset,tensorf, args, renderer, savePath=None, N_vis=5, prtx='', N_samples=-1, - white_bg=False, ndc_ray=False, compute_extra_metrics=True, device='cuda', save_imgs=0): - PSNRs, rgb_maps, depth_maps = [], [], [] - ssims,l_alex,l_vgg=[],[],[] - os.makedirs(savePath, exist_ok=True) - os.makedirs(savePath+"/rgbd", exist_ok=True) - - try: - tqdm._instances.clear() - except Exception: - pass - - near_far = test_dataset.near_far - img_eval_interval = 
1 if N_vis < 0 else max(test_dataset.all_rays.shape[0] // N_vis,1) - idxs = list(range(0, test_dataset.all_rays.shape[0], img_eval_interval)) - for idx, samples in tqdm(enumerate(test_dataset.all_rays[0::img_eval_interval]), file=sys.stdout): - - W, H = test_dataset.img_wh - rays = samples.view(-1,samples.shape[-1]) - - rgb_map, _, depth_map, _, _ = renderer(rays, tensorf, chunk=4096, N_samples=N_samples, - ndc_ray=ndc_ray, white_bg = white_bg, device=device) - rgb_map = rgb_map.clamp(0.0, 1.0) - - rgb_map, depth_map = rgb_map.reshape(H, W, 3).cpu(), depth_map.reshape(H, W).cpu() - - depth_map, _ = visualize_depth_numpy(depth_map.numpy(),near_far) - if len(test_dataset.all_rgbs): - gt_rgb = test_dataset.all_rgbs[idxs[idx]].view(H, W, 3) - loss = torch.mean((rgb_map - gt_rgb) ** 2) - PSNRs.append(-10.0 * np.log(loss.item()) / np.log(10.0)) - - if compute_extra_metrics: - ssim = rgb_ssim(rgb_map, gt_rgb, 1) - l_a = rgb_lpips(gt_rgb.numpy(), rgb_map.numpy(), 'alex', tensorf.device) - l_v = rgb_lpips(gt_rgb.numpy(), rgb_map.numpy(), 'vgg', tensorf.device) - ssims.append(ssim) - l_alex.append(l_a) - l_vgg.append(l_v) - - rgb_map = (rgb_map.numpy() * 255).astype('uint8') - # rgb_map = np.concatenate((rgb_map, depth_map), axis=1) - rgb_maps.append(rgb_map) - depth_maps.append(depth_map) - - if (save_imgs == 1): - if savePath is not None: - imageio.imwrite(f'{savePath}/{prtx}{idx:03d}.png', rgb_map) - rgb_map = np.concatenate((rgb_map, depth_map), axis=1) - imageio.imwrite(f'{savePath}/rgbd/{prtx}{idx:03d}.png', rgb_map) - - imageio.mimwrite(f'{savePath}/{prtx}video.mp4', np.stack(rgb_maps), fps=30, quality=10) - imageio.mimwrite(f'{savePath}/{prtx}depthvideo.mp4', np.stack(depth_maps), fps=30, quality=10) - - if PSNRs: - psnr = np.mean(np.asarray(PSNRs)) - if compute_extra_metrics: - ssim = np.mean(np.asarray(ssims)) - l_a = np.mean(np.asarray(l_alex)) - l_v = np.mean(np.asarray(l_vgg)) - np.savetxt(f'{savePath}/{prtx}mean.txt', np.asarray([psnr, ssim, l_a, l_v])) - else: - np.savetxt(f'{savePath}/{prtx}mean.txt', np.asarray([psnr])) - return PSNRs - -@torch.no_grad() -def evaluation_path(test_dataset,tensorf, c2ws, renderer, savePath=None, N_vis=5, prtx='', N_samples=-1, - white_bg=False, ndc_ray=False, compute_extra_metrics=True, device='cuda', save_imgs=0): - PSNRs, rgb_maps, depth_maps = [], [], [] - ssims,l_alex,l_vgg=[],[],[] - os.makedirs(savePath, exist_ok=True) - os.makedirs(savePath+"/rgbd", exist_ok=True) - - try: - tqdm._instances.clear() - except Exception: - pass - - near_far = test_dataset.near_far - for idx, c2w in tqdm(enumerate(c2ws)): - - W, H = test_dataset.img_wh - - c2w = torch.FloatTensor(c2w) - rays_o, rays_d = get_rays(test_dataset.directions, c2w) # both (h*w, 3) - if ndc_ray: - rays_o, rays_d = ndc_rays_blender(H, W, test_dataset.focal[0], 1.0, rays_o, rays_d) - rays = torch.cat([rays_o, rays_d], 1) # (h*w, 6) - - rgb_map, _, depth_map, _, _ = renderer(rays, tensorf, chunk=8192, N_samples=N_samples, - ndc_ray=ndc_ray, white_bg = white_bg, device=device) - rgb_map = rgb_map.clamp(0.0, 1.0) - - rgb_map, depth_map = rgb_map.reshape(H, W, 3).cpu(), depth_map.reshape(H, W).cpu() - - depth_map, _ = visualize_depth_numpy(depth_map.numpy(),near_far) - - rgb_map = (rgb_map.numpy() * 255).astype('uint8') - # rgb_map = np.concatenate((rgb_map, depth_map), axis=1) - rgb_maps.append(rgb_map) - depth_maps.append(depth_map) - - if (save_imgs == 1): - if savePath is not None: - imageio.imwrite(f'{savePath}/{prtx}{idx:03d}.png', rgb_map) - rgb_map = np.concatenate((rgb_map, depth_map), 
axis=1) - imageio.imwrite(f'{savePath}/rgbd/{prtx}{idx:03d}.png', rgb_map) - - imageio.mimwrite(f'{savePath}/{prtx}video.mp4', np.stack(rgb_maps), fps=30, quality=8) - imageio.mimwrite(f'{savePath}/{prtx}depthvideo.mp4', np.stack(depth_maps), fps=30, quality=8) - - if PSNRs: - psnr = np.mean(np.asarray(PSNRs)) - if compute_extra_metrics: - ssim = np.mean(np.asarray(ssims)) - l_a = np.mean(np.asarray(l_alex)) - l_v = np.mean(np.asarray(l_vgg)) - np.savetxt(f'{savePath}/{prtx}mean.txt', np.asarray([psnr, ssim, l_a, l_v])) - else: - np.savetxt(f'{savePath}/{prtx}mean.txt', np.asarray([psnr])) - return PSNRs - diff --git a/TensoRF/requirements.txt b/TensoRF/requirements.txt deleted file mode 100644 index f8c4231..0000000 --- a/TensoRF/requirements.txt +++ /dev/null @@ -1,45 +0,0 @@ -absl-py==1.1.0 -cachetools==5.2.0 -certifi==2022.6.15 -charset-normalizer==2.0.12 -ConfigArgParse==1.5.3 -google-auth==2.8.0 -google-auth-oauthlib==0.4.6 -grpcio==1.46.3 -idna==3.3 -imageio==2.19.3 -imageio-ffmpeg==0.4.7 -kornia==0.6.5 -lpips==0.1.4 -Markdown==3.3.7 -networkx==2.8.4 -numpy==1.22.4 -oauthlib==3.2.0 -opencv-python-headless==4.9.0.80 -packaging==21.3 -Pillow==9.1.1 -pika==1.2.0 -plyfile==0.7.4 -protobuf==3.19.4 -pyasn1==0.4.8 -pyasn1-modules==0.2.8 -pyparsing==3.0.9 -PyWavelets==1.3.0 -requests==2.28.0 -requests-oauthlib==1.3.1 -rsa==4.8 -scikit-image==0.19.3 -scipy==1.8.1 -six==1.16.0 -tensorboard==2.9.1 -tensorboard-data-server==0.6.1 -tensorboard-plugin-wit==1.8.1 -tifffile==2022.5.4 -torch==2.2.0 -torchvision==0.17.0 -tqdm==4.64.0 -typing_extensions==4.8.0 -urllib3==1.26.9 -Werkzeug==2.1.2 -Flask==2.1.3 -python-dotenv \ No newline at end of file diff --git a/TensoRF/scripts/synthetic_to_ours.py b/TensoRF/scripts/synthetic_to_ours.py deleted file mode 100644 index f8678bb..0000000 --- a/TensoRF/scripts/synthetic_to_ours.py +++ /dev/null @@ -1,28 +0,0 @@ -import json -import sys - -from cv2 import inpaint - -if __name__ =='__main__': - print("Starting conversion") - input_file = sys.argv[1] - - input_str = open(input_file) - input = json.loads(input_str.read()) - input["vid_width"] = 800 - input["vid_height"] = 800 - focal = input["camera_angle_x"] - input["intrinsic_matrix"] = [[focal, 0, 400], - [0, focal, 0, 400], - [0,0,1]] - for f in input["frames"]: - f["extrinsic_matrix"] = f["transform_matrix"] - - print(json.dumps(input)) - - with open("new_"+input_file, "w") as outfile: - outfile.write(json.dumps(input,indent=4)) - - - - diff --git a/TensoRF/train.py b/TensoRF/train.py deleted file mode 100644 index 5f005fc..0000000 --- a/TensoRF/train.py +++ /dev/null @@ -1,322 +0,0 @@ - -import datetime -import os -import json, random -import sys - -from tqdm.auto import tqdm -from opt import config_parser -from renderer import * -from utils import * -from torch.utils.tensorboard import SummaryWriter -from dataLoader import dataset_dict - -import logging - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") - -renderer = OctreeRender_trilinear_fast - - -class SimpleSampler: - def __init__(self, total, batch): - self.total = total - self.batch = batch - self.curr = total - self.ids = None - - def nextids(self): - self.curr+=self.batch - if self.curr + self.batch > self.total: - self.ids = torch.LongTensor(np.random.permutation(self.total)) - self.curr = 0 - return self.ids[self.curr:self.curr+self.batch] - - -@torch.no_grad() -def export_mesh(args): - - ckpt = torch.load(args.ckpt, map_location=device) - kwargs = ckpt['kwargs'] - kwargs.update({'device': device}) - tensorf = 
eval(args.model_name)(**kwargs) - tensorf.load(ckpt) - - alpha,_ = tensorf.getDenseAlpha() - convert_sdf_samples_to_ply(alpha.cpu(), f'{args.ckpt[:-3]}.ply',bbox=tensorf.aabb.cpu(), level=0.005) - - -@torch.no_grad() -def render_test(args): - logger = logging.getLogger('nerf-worker') - # init dataset - dataset = dataset_dict[args.dataset_name] - test_dataset = dataset(args.datadir, split='test', downsample=args.downsample_train, is_stack=True) - white_bg = test_dataset.white_bg - ndc_ray = args.ndc_ray - - if not os.path.exists(args.ckpt): - logger.error("the ckpt path does not exist!") - return - - ckpt = torch.load(args.ckpt, map_location=device) - kwargs = ckpt['kwargs'] - kwargs.update({'device': device}) - tensorf = eval(args.model_name)(**kwargs) - tensorf.load(ckpt) - - logfolder = os.path.dirname(args.ckpt) - if args.render_train: - os.makedirs(f'{logfolder}/imgs_train_all', exist_ok=True) - train_dataset = dataset(args.datadir, split='train', downsample=args.downsample_train, is_stack=True) - PSNRs_test = evaluation(train_dataset,tensorf, args, renderer, f'{logfolder}/imgs_train_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs=args.png_mode) - logger.info("======> {} train all psnr: {} <========================".format(args.expname,np.mean(PSNRs_test))) - - if args.render_test: - os.makedirs(f'{logfolder}/{args.expname}/imgs_test_all', exist_ok=True) - evaluation(test_dataset,tensorf, args, renderer, f'{logfolder}/{args.expname}/imgs_test_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs=args.png_mode) - - if args.render_path: - c2ws = test_dataset.render_path - os.makedirs(f'{logfolder}/{args.expname}/imgs_path_all', exist_ok=True) - evaluation_path(test_dataset,tensorf, c2ws, renderer, f'{logfolder}/{args.expname}/imgs_path_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs=args.png_mode) - -def reconstruction(args): - logger = logging.getLogger('nerf-worker') - # init dataset - dataset = dataset_dict[args.dataset_name] - train_dataset = dataset(args.datadir, split='train', downsample=args.downsample_train, is_stack=False) - test_dataset = dataset(args.datadir, split='test', downsample=args.downsample_train, is_stack=True) - white_bg = train_dataset.white_bg - near_far = train_dataset.near_far - ndc_ray = args.ndc_ray - - # init resolution - upsamp_list = args.upsamp_list - update_AlphaMask_list = args.update_AlphaMask_list - n_lamb_sigma = args.n_lamb_sigma - n_lamb_sh = args.n_lamb_sh - - - if args.add_timestamp: - logfolder = f'{args.basedir}/{args.expname}{datetime.datetime.now().strftime("-%Y%m%d-%H%M%S")}' - else: - logfolder = f'{args.basedir}/{args.expname}' - - - # init log file - os.makedirs(logfolder, exist_ok=True) - os.makedirs(f'{logfolder}/imgs_vis', exist_ok=True) - os.makedirs(f'{logfolder}/imgs_rgba', exist_ok=True) - os.makedirs(f'{logfolder}/rgba', exist_ok=True) - summary_writer = SummaryWriter(logfolder) - - - - # init parameters - # tensorVM, renderer = init_parameters(args, train_dataset.scene_bbox.to(device), reso_list[0]) - aabb = train_dataset.scene_bbox.to(device) - reso_cur = N_to_reso(args.N_voxel_init, aabb) - nSamples = min(args.nSamples, cal_n_samples(reso_cur,args.step_ratio)) - - - if args.ckpt is not None: - ckpt = torch.load(args.ckpt, map_location=device) - kwargs = ckpt['kwargs'] - kwargs.update({'device':device}) - tensorf = eval(args.model_name)(**kwargs) - tensorf.load(ckpt) - else: - tensorf = 
eval(args.model_name)(aabb, reso_cur, device, - density_n_comp=n_lamb_sigma, appearance_n_comp=n_lamb_sh, app_dim=args.data_dim_color, near_far=near_far, - shadingMode=args.shadingMode, alphaMask_thres=args.alpha_mask_thre, density_shift=args.density_shift, distance_scale=args.distance_scale, - pos_pe=args.pos_pe, view_pe=args.view_pe, fea_pe=args.fea_pe, featureC=args.featureC, step_ratio=args.step_ratio, fea2denseAct=args.fea2denseAct) - - - grad_vars = tensorf.get_optparam_groups(args.lr_init, args.lr_basis) - if args.lr_decay_iters > 0: - lr_factor = args.lr_decay_target_ratio**(1/args.lr_decay_iters) - else: - args.lr_decay_iters = args.n_iters - lr_factor = args.lr_decay_target_ratio**(1/args.n_iters) - - logger.info("lr decay {} {}".format(args.lr_decay_target_ratio, args.lr_decay_iters)) - - optimizer = torch.optim.Adam(grad_vars, betas=(0.9,0.99)) - - - #linear in logrithmic space - N_voxel_list = (torch.round(torch.exp(torch.linspace(np.log(args.N_voxel_init), np.log(args.N_voxel_final), len(upsamp_list)+1))).long()).tolist()[1:] - - - torch.cuda.empty_cache() - PSNRs,PSNRs_test = [],[0] - - allrays, allrgbs = train_dataset.all_rays, train_dataset.all_rgbs - if not args.ndc_ray: - allrays, allrgbs = tensorf.filtering_rays(allrays, allrgbs, bbox_only=True) - trainingSampler = SimpleSampler(allrays.shape[0], args.batch_size) - - Ortho_reg_weight = args.Ortho_weight - logger.info("initial Ortho_reg_weight {}".format(Ortho_reg_weight)) - - L1_reg_weight = args.L1_weight_inital - logger.info("initial L1_reg_weight {}".format(L1_reg_weight)) - TV_weight_density, TV_weight_app = args.TV_weight_density, args.TV_weight_app - tvreg = TVLoss() - logger.info("initial TV_weight density: {} appearance: {}".format(TV_weight_density,TV_weight_app)) - - - pbar = tqdm(range(args.n_iters), miniters=args.progress_refresh_rate, file=sys.stdout) - - # Main training loop - for iteration in pbar: - # sample image ray pair to train on (could batch this process) - ray_idx = trainingSampler.nextids() - rays_train, rgb_train = allrays[ray_idx], allrgbs[ray_idx].to(device) - - #rgb_map, alphas_map, depth_map, weights, uncertainty - rgb_map, alphas_map, depth_map, weights, uncertainty = renderer(rays_train, tensorf, chunk=args.batch_size, - N_samples=nSamples, white_bg = white_bg, ndc_ray=ndc_ray, device=device, is_train=True) - - # The primary loss is MSE of the rendered image vs the ground truth - loss = torch.mean((rgb_map - rgb_train) ** 2) - - # loss - total_loss = loss - if Ortho_reg_weight > 0: - loss_reg = tensorf.vector_comp_diffs() - total_loss += Ortho_reg_weight*loss_reg - summary_writer.add_scalar('train/reg', loss_reg.detach().item(), global_step=iteration) - if L1_reg_weight > 0: - loss_reg_L1 = tensorf.density_L1() - total_loss += L1_reg_weight*loss_reg_L1 - summary_writer.add_scalar('train/reg_l1', loss_reg_L1.detach().item(), global_step=iteration) - - if TV_weight_density>0: - TV_weight_density *= lr_factor - loss_tv = tensorf.TV_loss_density(tvreg) * TV_weight_density - total_loss = total_loss + loss_tv - summary_writer.add_scalar('train/reg_tv_density', loss_tv.detach().item(), global_step=iteration) - if TV_weight_app>0: - TV_weight_app *= lr_factor - loss_tv = tensorf.TV_loss_app(tvreg)*TV_weight_app - total_loss = total_loss + loss_tv - summary_writer.add_scalar('train/reg_tv_app', loss_tv.detach().item(), global_step=iteration) - - # backprop step that optimizes the radiance field model - optimizer.zero_grad() - total_loss.backward() - optimizer.step() - - # detach loss from gradiant 
graph and converts it to a float - loss = loss.detach().item() - - # logging - PSNRs.append(-10.0 * np.log(loss) / np.log(10.0)) - summary_writer.add_scalar('train/PSNR', PSNRs[-1], global_step=iteration) - summary_writer.add_scalar('train/mse', loss, global_step=iteration) - - # update learning rate - for param_group in optimizer.param_groups: - param_group['lr'] = param_group['lr'] * lr_factor - - # Print the current values of the losses. - if iteration % args.progress_refresh_rate == 0: - pbar.set_description( - f'Iteration {iteration:05d}:' - + f' train_psnr = {float(np.mean(PSNRs)):.2f}' - + f' test_psnr = {float(np.mean(PSNRs_test)):.2f}' - + f' mse = {loss:.6f}' - ) - PSNRs = [] - - - if iteration % args.vis_every == args.vis_every - 1 and args.N_vis!=0: - PSNRs_test = evaluation(test_dataset,tensorf, args, renderer, f'{logfolder}/imgs_vis/', N_vis=args.N_vis, - prtx=f'{iteration:06d}_', N_samples=nSamples, white_bg = white_bg, ndc_ray=ndc_ray, compute_extra_metrics=False, save_imgs=args.png_mode) - summary_writer.add_scalar('test/psnr', np.mean(PSNRs_test), global_step=iteration) - - - # Every few thousand iterations mask voxel representation to lower memory consumption - if iteration in update_AlphaMask_list: - - if reso_cur[0] * reso_cur[1] * reso_cur[2]<256**3:# update volume resolution - reso_mask = reso_cur - new_aabb = tensorf.updateAlphaMask(tuple(reso_mask)) - if iteration == update_AlphaMask_list[0]: - tensorf.shrink(new_aabb) - # tensorVM.alphaMask = None - L1_reg_weight = args.L1_weight_rest - logger.info("continuing L1_reg_weight {}".format(L1_reg_weight)) - - - if not args.ndc_ray and iteration == update_AlphaMask_list[1]: - # filter rays outside the bbox - allrays,allrgbs = tensorf.filtering_rays(allrays,allrgbs) - trainingSampler = SimpleSampler(allrgbs.shape[0], args.batch_size) - - # potential hyper parameter tuning - # Gradually increase from initial to final voxel count - if iteration in upsamp_list: - n_voxels = N_voxel_list.pop(0) - reso_cur = N_to_reso(n_voxels, tensorf.aabb) - nSamples = min(args.nSamples, cal_n_samples(reso_cur,args.step_ratio)) - tensorf.upsample_volume_grid(reso_cur) - - if args.lr_upsample_reset: - logger.info("reset lr to initial") - lr_scale = 1 #0.1 ** (iteration / args.n_iters) - else: - lr_scale = args.lr_decay_target_ratio ** (iteration / args.n_iters) - grad_vars = tensorf.get_optparam_groups(args.lr_init*lr_scale, args.lr_basis*lr_scale) - optimizer = torch.optim.Adam(grad_vars, betas=(0.9, 0.99)) - - - tensorf.save(f'{logfolder}/{args.expname}.th') - - - if args.render_train: - os.makedirs(f'{logfolder}/imgs_train_all', exist_ok=True) - train_dataset = dataset(args.datadir, split='train', downsample=args.downsample_train, is_stack=True) - PSNRs_test = evaluation(train_dataset,tensorf, args, renderer, f'{logfolder}/imgs_train_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs=args.png_mode) - logger.info("======> {} test all psnr: {} <========================".format(args.expname,np.mean(PSNRs_test))) - - if args.render_test: - os.makedirs(f'{logfolder}/imgs_test_all', exist_ok=True) - PSNRs_test = evaluation(test_dataset,tensorf, args, renderer, f'{logfolder}/imgs_test_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs=args.png_mode) - summary_writer.add_scalar('test/psnr_all', np.mean(PSNRs_test), global_step=iteration) - logger.info("======> {} test all psnr: {} <========================".format(args.expname,np.mean(PSNRs_test))) - - if
args.render_path: - c2ws = test_dataset.render_path - # c2ws = test_dataset.poses - logger.info("========>".format(c2ws.shape)) - os.makedirs(f'{logfolder}/imgs_path_all', exist_ok=True) - evaluation_path(test_dataset,tensorf, c2ws, renderer, f'{logfolder}/imgs_path_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs=args.png_mode) - - -if __name__ == '__main__': - logger = logging.getLogger("nerf-worker") - - torch.set_default_dtype(torch.float32) - torch.manual_seed(20211202) - np.random.seed(20211202) - - args = config_parser() - logger.info(args) - - if args.export_mesh: - export_mesh(args) - - if args.render_only and (args.render_test or args.render_path): - render_test(args) - else: - reconstruction(args) - diff --git a/TensoRF/utils.py b/TensoRF/utils.py deleted file mode 100644 index cdde1c2..0000000 --- a/TensoRF/utils.py +++ /dev/null @@ -1,229 +0,0 @@ -import cv2,torch -import numpy as np -from PIL import Image -import torchvision.transforms as T -import torch.nn.functional as F -import scipy.signal - -import logging - -mse2psnr = lambda x : -10. * torch.log(x) / torch.log(torch.Tensor([10.])) - - -def visualize_depth_numpy(depth, minmax=None, cmap=cv2.COLORMAP_JET): - """ - depth: (H, W) - """ - - x = np.nan_to_num(depth) # change nan to 0 - if minmax is None: - mi = np.min(x[x>0]) # get minimum positive depth (ignore background) - ma = np.max(x) - else: - mi,ma = minmax - - x = (x-mi)/(ma-mi+1e-8) # normalize to 0~1 - x = (255*x).astype(np.uint8) - x_ = cv2.applyColorMap(x, cmap) - return x_, [mi,ma] - -def init_log(log, keys): - for key in keys: - log[key] = torch.tensor([0.0], dtype=float) - return log - -def visualize_depth(depth, minmax=None, cmap=cv2.COLORMAP_JET): - """ - depth: (H, W) - """ - if type(depth) is not np.ndarray: - depth = depth.cpu().numpy() - - x = np.nan_to_num(depth) # change nan to 0 - if minmax is None: - mi = np.min(x[x>0]) # get minimum positive depth (ignore background) - ma = np.max(x) - else: - mi,ma = minmax - - x = (x-mi)/(ma-mi+1e-8) # normalize to 0~1 - x = (255*x).astype(np.uint8) - x_ = Image.fromarray(cv2.applyColorMap(x, cmap)) - x_ = T.ToTensor()(x_) # (3, H, W) - return x_, [mi,ma] - -# params: bounding box and number of voxels -# returns: number of voxels in each axis -def N_to_reso(n_voxels, bbox): - xyz_min, xyz_max = bbox - dim = len(xyz_min) - voxel_size = ((xyz_max - xyz_min).prod() / n_voxels).pow(1 / dim) # size of one axis of each voxel - return ((xyz_max - xyz_min) / voxel_size).long().tolist() - -def cal_n_samples(reso, step_ratio=0.5): - return int(np.linalg.norm(reso)/step_ratio) - - - - -__LPIPS__ = {} -def init_lpips(net_name, device): - logger = logging.getLogger('nerf-worker') - assert net_name in ['alex', 'vgg'] - import lpips - logger.info("init_lpips: lpips_{}".format(net_name)) - return lpips.LPIPS(net=net_name, version='0.1').eval().to(device) - -def rgb_lpips(np_gt, np_im, net_name, device): - if net_name not in __LPIPS__: - __LPIPS__[net_name] = init_lpips(net_name, device) - gt = torch.from_numpy(np_gt).permute([2, 0, 1]).contiguous().to(device) - im = torch.from_numpy(np_im).permute([2, 0, 1]).contiguous().to(device) - return __LPIPS__[net_name](gt, im, normalize=True).item() - - -def findItem(items, target): - for one in items: - if one[:len(target)]==target: - return one - return None - - -''' Evaluation metrics (ssim, lpips) -''' -def rgb_ssim(img0, img1, max_val, - filter_size=11, - filter_sigma=1.5, - k1=0.01, - k2=0.03, - return_map=False): - # Modified from 
https://github.com/google/mipnerf/blob/16e73dfdb52044dcceb47cda5243a686391a6e0f/internal/math.py#L58 - assert len(img0.shape) == 3 - assert img0.shape[-1] == 3 - assert img0.shape == img1.shape - - # Construct a 1D Gaussian blur filter. - hw = filter_size // 2 - shift = (2 * hw - filter_size + 1) / 2 - f_i = ((np.arange(filter_size) - hw + shift) / filter_sigma)**2 - filt = np.exp(-0.5 * f_i) - filt /= np.sum(filt) - - # Blur in x and y (faster than the 2D convolution). - def convolve2d(z, f): - return scipy.signal.convolve2d(z, f, mode='valid') - - filt_fn = lambda z: np.stack([ - convolve2d(convolve2d(z[...,i], filt[:, None]), filt[None, :]) - for i in range(z.shape[-1])], -1) - mu0 = filt_fn(img0) - mu1 = filt_fn(img1) - mu00 = mu0 * mu0 - mu11 = mu1 * mu1 - mu01 = mu0 * mu1 - sigma00 = filt_fn(img0**2) - mu00 - sigma11 = filt_fn(img1**2) - mu11 - sigma01 = filt_fn(img0 * img1) - mu01 - - # Clip the variances and covariances to valid values. - # Variance must be non-negative: - sigma00 = np.maximum(0., sigma00) - sigma11 = np.maximum(0., sigma11) - sigma01 = np.sign(sigma01) * np.minimum( - np.sqrt(sigma00 * sigma11), np.abs(sigma01)) - c1 = (k1 * max_val)**2 - c2 = (k2 * max_val)**2 - numer = (2 * mu01 + c1) * (2 * sigma01 + c2) - denom = (mu00 + mu11 + c1) * (sigma00 + sigma11 + c2) - ssim_map = numer / denom - ssim = np.mean(ssim_map) - return ssim_map if return_map else ssim - - -import torch.nn as nn -class TVLoss(nn.Module): - def __init__(self,TVLoss_weight=1): - super(TVLoss,self).__init__() - self.TVLoss_weight = TVLoss_weight - - def forward(self,x): - batch_size = x.size()[0] - h_x = x.size()[2] - w_x = x.size()[3] - count_h = self._tensor_size(x[:,:,1:,:]) - count_w = self._tensor_size(x[:,:,:,1:]) - count_w = max(count_w, 1) - h_tv = torch.pow((x[:,:,1:,:]-x[:,:,:h_x-1,:]),2).sum() - w_tv = torch.pow((x[:,:,:,1:]-x[:,:,:,:w_x-1]),2).sum() - return self.TVLoss_weight*2*(h_tv/count_h+w_tv/count_w)/batch_size - - def _tensor_size(self,t): - return t.size()[1]*t.size()[2]*t.size()[3] - - - -import plyfile -import skimage.measure -def convert_sdf_samples_to_ply( - pytorch_3d_sdf_tensor, - ply_filename_out, - bbox, - level=0.5, - offset=None, - scale=None, -): - """ - Convert sdf samples to .ply - - :param pytorch_3d_sdf_tensor: a torch.FloatTensor of shape (n,n,n) - :voxel_grid_origin: a list of three floats: the bottom, left, down origin of the voxel grid - :voxel_size: float, the size of the voxels - :ply_filename_out: string, path of the filename to save to - - This function adapted from: https://github.com/RobotLocomotion/spartan - """ - - logger = logging.getLogger('nerf-worker') - - numpy_3d_sdf_tensor = pytorch_3d_sdf_tensor.numpy() - voxel_size = list((bbox[1]-bbox[0]) / np.array(pytorch_3d_sdf_tensor.shape)) - - verts, faces, normals, values = skimage.measure.marching_cubes( - numpy_3d_sdf_tensor, level=level, spacing=voxel_size - ) - faces = faces[...,::-1] # inverse face orientation - - # transform from voxel coordinates to camera coordinates - # note x and y are flipped in the output of marching_cubes - mesh_points = np.zeros_like(verts) - mesh_points[:, 0] = bbox[0,0] + verts[:, 0] - mesh_points[:, 1] = bbox[0,1] + verts[:, 1] - mesh_points[:, 2] = bbox[0,2] + verts[:, 2] - - # apply additional offset and scale - if scale is not None: - mesh_points = mesh_points / scale - if offset is not None: - mesh_points = mesh_points - offset - - # try writing to the ply file - - num_verts = verts.shape[0] - num_faces = faces.shape[0] - - verts_tuple = np.zeros((num_verts,), 
dtype=[("x", "f4"), ("y", "f4"), ("z", "f4")]) - - for i in range(0, num_verts): - verts_tuple[i] = tuple(mesh_points[i, :]) - - faces_building = [] - for i in range(0, num_faces): - faces_building.append(((faces[i, :].tolist(),))) - faces_tuple = np.array(faces_building, dtype=[("vertex_indices", "i4", (3,))]) - - el_verts = plyfile.PlyElement.describe(verts_tuple, "vertex") - el_faces = plyfile.PlyElement.describe(faces_tuple, "face") - - ply_data = plyfile.PlyData([el_verts, el_faces]) - logger.info("saving mesh to {}".format(ply_filename_out)) - ply_data.write(ply_filename_out) diff --git a/TensoRF/worker.py b/TensoRF/worker.py deleted file mode 100644 index 3f078cd..0000000 --- a/TensoRF/worker.py +++ /dev/null @@ -1,332 +0,0 @@ -# Based on train.py -import datetime -import os -import json, random -import sys - -from tqdm.auto import tqdm -from opt import config_parser -from renderer import * -from utils import * -from torch.utils.tensorboard import SummaryWriter -from dataLoader import dataset_dict - -import logging - -device = torch.device("cuda" if torch.cuda.is_available() else "cpu") - -renderer = OctreeRender_trilinear_fast - -class SimpleSampler: - def __init__(self, total, batch): - self.total = total - self.batch = batch - self.curr = total - self.ids = None - - def nextids(self): - self.curr+=self.batch - if self.curr + self.batch > self.total: - self.ids = torch.LongTensor(np.random.permutation(self.total)) - self.curr = 0 - return self.ids[self.curr:self.curr+self.batch] - - -@torch.no_grad() -def export_mesh(args): - - ckpt = torch.load(args.ckpt, map_location=device) - kwargs = ckpt['kwargs'] - kwargs.update({'device': device}) - tensorf = eval(args.model_name)(**kwargs) - tensorf.load(ckpt) - - alpha,_ = tensorf.getDenseAlpha() - convert_sdf_samples_to_ply(alpha.cpu(), f'{args.ckpt[:-3]}.ply',bbox=tensorf.aabb.cpu(), level=0.005) - - -@torch.no_grad() -def render_novel_view(args, logfolder, tensorf_model): - logger = logging.getLogger('nerf-worker') - - # init dataset under a "test" annotation - dataset = dataset_dict[args.dataset_name] - test_dataset = dataset(args.datadir, split='render', downsample=args.downsample_train, is_stack=True) - white_bg = test_dataset.white_bg - ndc_ray = args.ndc_ray - - tensorf_model.aabb = test_dataset.scene_bbox.to(device) - - - logger.info("Rendering scene to be saved at: {}".format(logfolder)) - # render path and save images to imgs_path_all - if args.render_path: - logger.info("Rendering path") - c2ws = test_dataset.render_path - os.makedirs(f'{logfolder}/{args.expname}/imgs_path_all', exist_ok=True) - evaluation_path(test_dataset,tensorf_model, c2ws, renderer, f'{logfolder}/{args.expname}/imgs_path_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs = args.png_mode) - else: - logger.info("Rendering all") - os.makedirs(f'{logfolder}/imgs_render_all', exist_ok=True) - evaluation(test_dataset,tensorf_model, args, renderer, f'{logfolder}/imgs_render_all/', - N_vis=-1, N_samples=-1, white_bg = white_bg, ndc_ray=ndc_ray,device=device, save_imgs = args.png_mode) - - # video saved to {logfolder}/{args.expname}/imgs_path_all/video.mp4 - return f'{logfolder}/imgs_path_all/video.mp4' - -# Build radiance Field -def train_tensorf(args): - logger = logging.getLogger('nerf-worker') - logger.info("Training TensoRF") - # init dataset - dataset = dataset_dict[args.dataset_name] - train_dataset = dataset(args.datadir, split='train', downsample=args.downsample_train, is_stack=False) - white_bg = 
train_dataset.white_bg - near_far = train_dataset.near_far - ndc_ray = args.ndc_ray - - # init resolution - upsamp_list = args.upsamp_list - update_AlphaMask_list = args.update_AlphaMask_list - n_lamb_sigma = args.n_lamb_sigma - n_lamb_sh = args.n_lamb_sh - - - if args.add_timestamp: - logfolder = f'{args.basedir}/{args.expname}{datetime.datetime.now().strftime("-%Y%m%d-%H%M%S")}' - else: - logfolder = f'{args.basedir}/{args.expname}' - - - # init log file - os.makedirs(logfolder, exist_ok=True) - os.makedirs(f'{logfolder}/imgs_vis', exist_ok=True) - os.makedirs(f'{logfolder}/imgs_rgba', exist_ok=True) - os.makedirs(f'{logfolder}/rgba', exist_ok=True) - summary_writer = SummaryWriter(logfolder) - - - - # init parameters - # tensorVM, renderer = init_parameters(args, train_dataset.scene_bbox.to(device), reso_list[0]) - aabb = train_dataset.scene_bbox.to(device) - reso_cur = N_to_reso(args.N_voxel_init, aabb) - nSamples = min(args.nSamples, cal_n_samples(reso_cur,args.step_ratio)) - - # Load model checkpoint - if args.ckpt is not None: - ckpt = torch.load(args.ckpt, map_location=device) - kwargs = ckpt['kwargs'] - kwargs.update({'device':device}) - tensorf = eval(args.model_name)(**kwargs) - tensorf.load(ckpt) - else: - tensorf = eval(args.model_name)(aabb, reso_cur, device, - density_n_comp=n_lamb_sigma, appearance_n_comp=n_lamb_sh, app_dim=args.data_dim_color, near_far=near_far, - shadingMode=args.shadingMode, alphaMask_thres=args.alpha_mask_thre, density_shift=args.density_shift, distance_scale=args.distance_scale, - pos_pe=args.pos_pe, view_pe=args.view_pe, fea_pe=args.fea_pe, featureC=args.featureC, step_ratio=args.step_ratio, fea2denseAct=args.fea2denseAct) - logger.info(f"type of tensorf: {type(tensorf)}") - - # learning rate for Adam optimizer - grad_vars = tensorf.get_optparam_groups(args.lr_init, args.lr_basis) - if args.lr_decay_iters > 0: - lr_factor = args.lr_decay_target_ratio**(1/args.lr_decay_iters) - else: - args.lr_decay_iters = args.n_iters - lr_factor = args.lr_decay_target_ratio**(1/args.n_iters) - - logger.info("lr decay {} {}".format(args.lr_decay_target_ratio, args.lr_decay_iters)) - - # modifying the optimizer is a potential avenue for further performance gains - optimizer = torch.optim.Adam(grad_vars, betas=(0.9,0.99)) - - - #linear in logrithmic space - # Defines the number of voxels to up-scale to iteratively throughout training - N_voxel_list = (torch.round(torch.exp(torch.linspace(np.log(args.N_voxel_init), np.log(args.N_voxel_final), len(upsamp_list)+1))).long()).tolist()[1:] - - - torch.cuda.empty_cache() - PSNRs,PSNRs_test = [],[0] - - allrays, allrgbs = train_dataset.all_rays, train_dataset.all_rgbs - if not args.ndc_ray: - allrays, allrgbs = tensorf.filtering_rays(allrays, allrgbs, bbox_only=True) - trainingSampler = SimpleSampler(allrays.shape[0], args.batch_size) - - Ortho_reg_weight = args.Ortho_weight - logger.info("initial Ortho_reg_weight {}".format(Ortho_reg_weight)) - - L1_reg_weight = args.L1_weight_inital - logger.info("initial L1_reg_weight {}".format(L1_reg_weight)) - TV_weight_density, TV_weight_app = args.TV_weight_density, args.TV_weight_app - tvreg = TVLoss() - logger.info("initial TV_weight density: {} appearance: {}".format(TV_weight_density,TV_weight_app)) - - - pbar = tqdm(range(args.n_iters), miniters=args.progress_refresh_rate, file=sys.stdout) - - # Main training loop - for iteration in pbar: - # sample image ray pair to train on (could batch this process) - ray_idx = trainingSampler.nextids() - rays_train, rgb_train = 
allrays[ray_idx], allrgbs[ray_idx].to(device) - - #rgb_map, alphas_map, depth_map, weights, uncertainty - rgb_map, alphas_map, depth_map, weights, uncertainty = renderer(rays_train, tensorf, chunk=args.batch_size, - N_samples=nSamples, white_bg = white_bg, ndc_ray=ndc_ray, device=device, is_train=True) - - # The primary loss is MSE of the rendered image vs the ground truth - loss = torch.mean((rgb_map - rgb_train) ** 2) - - # loss - total_loss = loss - if Ortho_reg_weight > 0: - loss_reg = tensorf.vector_comp_diffs() - total_loss += Ortho_reg_weight*loss_reg - summary_writer.add_scalar('train/reg', loss_reg.detach().item(), global_step=iteration) - if L1_reg_weight > 0: - loss_reg_L1 = tensorf.density_L1() - total_loss += L1_reg_weight*loss_reg_L1 - summary_writer.add_scalar('train/reg_l1', loss_reg_L1.detach().item(), global_step=iteration) - - if TV_weight_density>0: - TV_weight_density *= lr_factor - loss_tv = tensorf.TV_loss_density(tvreg) * TV_weight_density - total_loss = total_loss + loss_tv - summary_writer.add_scalar('train/reg_tv_density', loss_tv.detach().item(), global_step=iteration) - if TV_weight_app>0: - TV_weight_app *= lr_factor - loss_tv = loss_tv + tensorf.TV_loss_app(tvreg)*TV_weight_app - total_loss = total_loss + loss_tv - summary_writer.add_scalar('train/reg_tv_app', loss_tv.detach().item(), global_step=iteration) - - # backprop step that optimizes the radiance field model - optimizer.zero_grad() - total_loss.backward() - optimizer.step() - - # detach loss from gradiant graph and converts it to a float - loss = loss.detach().item() - - # logging - PSNRs.append(-10.0 * np.log(loss) / np.log(10.0)) - summary_writer.add_scalar('train/PSNR', PSNRs[-1], global_step=iteration) - summary_writer.add_scalar('train/mse', loss, global_step=iteration) - - # update learning rate - for param_group in optimizer.param_groups: - param_group['lr'] = param_group['lr'] * lr_factor - - # Print the current values of the losses. 
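# Every `progress_refresh_rate` iterations the accumulated PSNR values are averaged,
# shown on the progress bar, and then cleared so the next report starts from a fresh window.
# (PSNR above is computed from the MSE as -10 * log10(mse), which assumes pixel values
# normalized to [0, 1].)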
- if iteration % args.progress_refresh_rate == 0: - pbar.set_description( - f'Iteration {iteration:05d}:' - + f' train_psnr = {float(np.mean(PSNRs)):.2f}' - + f' test_psnr = {float(np.mean(PSNRs_test)):.2f}' - + f' mse = {loss:.6f}' - ) - PSNRs = [] - logger.info( - f'Iteration {iteration:05d}:' - + f' train_psnr = {float(np.mean(PSNRs)):.2f}' - + f' test_psnr = {float(np.mean(PSNRs_test)):.2f}' - + f' mse = {loss:.6f}' - ) - - - # Every few thousand iterations mask voxel representation to lower memory consumption - if iteration in update_AlphaMask_list: - - if reso_cur[0] * reso_cur[1] * reso_cur[2]<256**3:# update volume resolution - reso_mask = reso_cur - new_aabb = tensorf.updateAlphaMask(tuple(reso_mask)) - if iteration == update_AlphaMask_list[0]: - tensorf.shrink(new_aabb) - # tensorVM.alphaMask = None - L1_reg_weight = args.L1_weight_rest - logger.info("continuing L1_reg_weight {}".format(L1_reg_weight)) - - - if not args.ndc_ray and iteration == update_AlphaMask_list[1]: - # filter rays outside the bbox - allrays,allrgbs = tensorf.filtering_rays(allrays,allrgbs) - trainingSampler = SimpleSampler(allrgbs.shape[0], args.batch_size) - - # potential hyper parameter tuning - # Gradually increase from initial to final voxel count - if iteration in upsamp_list: - n_voxels = N_voxel_list.pop(0) - reso_cur = N_to_reso(n_voxels, tensorf.aabb) - nSamples = min(args.nSamples, cal_n_samples(reso_cur,args.step_ratio)) - tensorf.upsample_volume_grid(reso_cur) - - if args.lr_upsample_reset: - logger.info("reset lr to initial") - lr_scale = 1 #0.1 ** (iteration / args.n_iters) - else: - lr_scale = args.lr_decay_target_ratio ** (iteration / args.n_iters) - grad_vars = tensorf.get_optparam_groups(args.lr_init*lr_scale, args.lr_basis*lr_scale) - optimizer = torch.optim.Adam(grad_vars, betas=(0.9, 0.99)) - - - # save model to file - tensorf.save(f'{logfolder}/{args.expname}.th') - return logfolder, tensorf - - -from fileServer import start_flask -from multiprocessing import Process -import threading -import time -# Operates in two modes, trains a new model or loads from a file -def main(): - - logger = logging.getLogger('nerf-worker') - #flaskProcess = threading.Thread(target=start_flask, args= ()) - #flaskProcess.start() - #time.sleep(5) - - # wait for que and load images into local directory - torch.set_default_dtype(torch.float32) - torch.manual_seed(20211202) - np.random.seed(20211202) - # format images and training and test json - - # load worker config - args = config_parser() - logger.info(args) - - - if args.render_only: - if not os.path.exists(args.ckpt): - logger.error("the ckpt path does not exist!") - return - - ckpt = torch.load(args.ckpt, map_location=device) - kwargs = ckpt['kwargs'] - kwargs.update({'device': device}) - tensorf_model = eval(args.model_name)(**kwargs) - tensorf_model.load(ckpt) - logfolder = os.path.dirname(args.ckpt) - else: - # train TensoRF on all input data and saves model to file - # (in the future train on part, test to confirm performance, then train on test set) - logfolder, tensorf_model = train_tensorf(args) - - - # Render new video (can be combined with train) - # Currently evaluation_path takes in a dataset object that has desired rays to render - video_filepath = render_novel_view(args, logfolder, tensorf_model) - - logger.info("Video rendered at :{}".format(video_filepath)) - # add results to que and clean up local files - - #flaskProcess.join() - - - - -if __name__ == '__main__': - main() \ No newline at end of file diff --git a/colmap/.gitignore 
b/colmap/.gitignore deleted file mode 100644 index f760688..0000000 --- a/colmap/.gitignore +++ /dev/null @@ -1,3 +0,0 @@ -data -__pycache__/ -*.log \ No newline at end of file diff --git a/colmap/Dockerfile b/colmap/Dockerfile deleted file mode 100644 index 368d788..0000000 --- a/colmap/Dockerfile +++ /dev/null @@ -1,30 +0,0 @@ -#FROM nvidia/cuda:10.2-devel-ubuntu18.04 -FROM colmap/colmap:latest - -WORKDIR /colmap - -# https://forums.developer.nvidia.com/t/gpg-error-http-developer-download-nvidia-com-compute-cuda-repos-ubuntu1804-x86-64/212904 -# Updating GPG Key for Nvidia manually -RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/3bf863cc.pub -RUN export DEBIAN_FRONTEND=noninteractive && \ - apt-get update -y && \ - apt-get install libssl-dev -y && \ - apt-get install software-properties-common -y && \ - add-apt-repository ppa:deadsnakes/ppa && \ - apt-get update -y && \ - apt-get install curl -y && \ - apt-get install python3.10 -y && \ - apt-get install python3-pip -y - - -RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10 -# Requirements.txt copied seperately to utilize docker build cache - -# Change for deployment to local directory -COPY ./colmap/requirements.txt requirements.txt -RUN python3.10 -m pip install --upgrade -r requirements.txt - -COPY . . - -#CMD ["python3", "-V"] -CMD ["python3.10", "main.py", "--config=configs/default.txt"] \ No newline at end of file diff --git a/colmap/Pipfile.lock b/colmap/Pipfile.lock deleted file mode 100644 index 47febe5..0000000 --- a/colmap/Pipfile.lock +++ /dev/null @@ -1,232 +0,0 @@ -{ - "_meta": { - "hash": { - "sha256": "93675b4afc21ab3060f05fdd174997fa40fc108ea20bdfbd45adae5c0998e040" - }, - "pipfile-spec": 6, - "requires": { - "python_version": "3.10" - }, - "sources": [ - { - "name": "pypi", - "url": "https://pypi.org/simple", - "verify_ssl": true - } - ] - }, - "default": { - "click": { - "hashes": [ - "sha256:7682dc8afb30297001674575ea00d1814d808d6a36af415a82bd481d37ba7b8e", - "sha256:bb4d8133cb15a609f44e8213d9b391b0809795062913b383c62be0ee95b1db48" - ], - "markers": "python_version >= '3.7'", - "version": "==8.1.3" - }, - "flask": { - "hashes": [ - "sha256:315ded2ddf8a6281567edb27393010fe3406188bafbfe65a3339d5787d89e477", - "sha256:fad5b446feb0d6db6aec0c3184d16a8c1f6c3e464b511649c8918a9be100b4fe" - ], - "index": "pypi", - "version": "==2.1.2" - }, - "itsdangerous": { - "hashes": [ - "sha256:2c2349112351b88699d8d4b6b075022c0808887cb7ad10069318a8b0bc88db44", - "sha256:5dbbc68b317e5e42f327f9021763545dc3fc3bfe22e6deb96aaf1fc38874156a" - ], - "markers": "python_version >= '3.7'", - "version": "==2.1.2" - }, - "jinja2": { - "hashes": [ - "sha256:31351a702a408a9e7595a8fc6150fc3f43bb6bf7e319770cbc0db9df9437e852", - "sha256:6088930bfe239f0e6710546ab9c19c9ef35e29792895fed6e6e31a023a182a61" - ], - "markers": "python_version >= '3.7'", - "version": "==3.1.2" - }, - "markupsafe": { - "hashes": [ - "sha256:0212a68688482dc52b2d45013df70d169f542b7394fc744c02a57374a4207003", - "sha256:089cf3dbf0cd6c100f02945abeb18484bd1ee57a079aefd52cffd17fba910b88", - "sha256:10c1bfff05d95783da83491be968e8fe789263689c02724e0c691933c52994f5", - "sha256:33b74d289bd2f5e527beadcaa3f401e0df0a89927c1559c8566c066fa4248ab7", - "sha256:3799351e2336dc91ea70b034983ee71cf2f9533cdff7c14c90ea126bfd95d65a", - "sha256:3ce11ee3f23f79dbd06fb3d63e2f6af7b12db1d46932fe7bd8afa259a5996603", - "sha256:421be9fbf0ffe9ffd7a378aafebbf6f4602d564d34be190fc19a193232fd12b1", - 
"sha256:43093fb83d8343aac0b1baa75516da6092f58f41200907ef92448ecab8825135", - "sha256:46d00d6cfecdde84d40e572d63735ef81423ad31184100411e6e3388d405e247", - "sha256:4a33dea2b688b3190ee12bd7cfa29d39c9ed176bda40bfa11099a3ce5d3a7ac6", - "sha256:4b9fe39a2ccc108a4accc2676e77da025ce383c108593d65cc909add5c3bd601", - "sha256:56442863ed2b06d19c37f94d999035e15ee982988920e12a5b4ba29b62ad1f77", - "sha256:671cd1187ed5e62818414afe79ed29da836dde67166a9fac6d435873c44fdd02", - "sha256:694deca8d702d5db21ec83983ce0bb4b26a578e71fbdbd4fdcd387daa90e4d5e", - "sha256:6a074d34ee7a5ce3effbc526b7083ec9731bb3cbf921bbe1d3005d4d2bdb3a63", - "sha256:6d0072fea50feec76a4c418096652f2c3238eaa014b2f94aeb1d56a66b41403f", - "sha256:6fbf47b5d3728c6aea2abb0589b5d30459e369baa772e0f37a0320185e87c980", - "sha256:7f91197cc9e48f989d12e4e6fbc46495c446636dfc81b9ccf50bb0ec74b91d4b", - "sha256:86b1f75c4e7c2ac2ccdaec2b9022845dbb81880ca318bb7a0a01fbf7813e3812", - "sha256:8dc1c72a69aa7e082593c4a203dcf94ddb74bb5c8a731e4e1eb68d031e8498ff", - "sha256:8e3dcf21f367459434c18e71b2a9532d96547aef8a871872a5bd69a715c15f96", - "sha256:8e576a51ad59e4bfaac456023a78f6b5e6e7651dcd383bcc3e18d06f9b55d6d1", - "sha256:96e37a3dc86e80bf81758c152fe66dbf60ed5eca3d26305edf01892257049925", - "sha256:97a68e6ada378df82bc9f16b800ab77cbf4b2fada0081794318520138c088e4a", - "sha256:99a2a507ed3ac881b975a2976d59f38c19386d128e7a9a18b7df6fff1fd4c1d6", - "sha256:a49907dd8420c5685cfa064a1335b6754b74541bbb3706c259c02ed65b644b3e", - "sha256:b09bf97215625a311f669476f44b8b318b075847b49316d3e28c08e41a7a573f", - "sha256:b7bd98b796e2b6553da7225aeb61f447f80a1ca64f41d83612e6139ca5213aa4", - "sha256:b87db4360013327109564f0e591bd2a3b318547bcef31b468a92ee504d07ae4f", - "sha256:bcb3ed405ed3222f9904899563d6fc492ff75cce56cba05e32eff40e6acbeaa3", - "sha256:d4306c36ca495956b6d568d276ac11fdd9c30a36f1b6eb928070dc5360b22e1c", - "sha256:d5ee4f386140395a2c818d149221149c54849dfcfcb9f1debfe07a8b8bd63f9a", - "sha256:dda30ba7e87fbbb7eab1ec9f58678558fd9a6b8b853530e176eabd064da81417", - "sha256:e04e26803c9c3851c931eac40c695602c6295b8d432cbe78609649ad9bd2da8a", - "sha256:e1c0b87e09fa55a220f058d1d49d3fb8df88fbfab58558f1198e08c1e1de842a", - "sha256:e72591e9ecd94d7feb70c1cbd7be7b3ebea3f548870aa91e2732960fa4d57a37", - "sha256:e8c843bbcda3a2f1e3c2ab25913c80a3c5376cd00c6e8c4a86a89a28c8dc5452", - "sha256:efc1913fd2ca4f334418481c7e595c00aad186563bbc1ec76067848c7ca0a933", - "sha256:f121a1420d4e173a5d96e47e9a0c0dcff965afdf1626d28de1460815f7c4ee7a", - "sha256:fc7b548b17d238737688817ab67deebb30e8073c95749d55538ed473130ec0c7" - ], - "markers": "python_version >= '3.7'", - "version": "==2.1.1" - }, - "numpy": { - "hashes": [ - "sha256:092f5e6025813e64ad6d1b52b519165d08c730d099c114a9247c9bb635a2a450", - "sha256:196cd074c3f97c4121601790955f915187736f9cf458d3ee1f1b46aff2b1ade0", - "sha256:1c29b44905af288b3919803aceb6ec7fec77406d8b08aaa2e8b9e63d0fe2f160", - "sha256:2b2da66582f3a69c8ce25ed7921dcd8010d05e59ac8d89d126a299be60421171", - "sha256:5043bcd71fcc458dfb8a0fc5509bbc979da0131b9d08e3d5f50fb0bbb36f169a", - "sha256:58bfd40eb478f54ff7a5710dd61c8097e169bc36cc68333d00a9bcd8def53b38", - "sha256:79a506cacf2be3a74ead5467aee97b81fca00c9c4c8b3ba16dbab488cd99ba10", - "sha256:94b170b4fa0168cd6be4becf37cb5b127bd12a795123984385b8cd4aca9857e5", - "sha256:97a76604d9b0e79f59baeca16593c711fddb44936e40310f78bfef79ee9a835f", - "sha256:98e8e0d8d69ff4d3fa63e6c61e8cfe2d03c29b16b58dbef1f9baa175bbed7860", - "sha256:ac86f407873b952679f5f9e6c0612687e51547af0e14ddea1eedfcb22466babd", - 
"sha256:ae8adff4172692ce56233db04b7ce5792186f179c415c37d539c25de7298d25d", - "sha256:bd3fa4fe2e38533d5336e1272fc4e765cabbbde144309ccee8675509d5cd7b05", - "sha256:d0d2094e8f4d760500394d77b383a1b06d3663e8892cdf5df3c592f55f3bff66", - "sha256:d54b3b828d618a19779a84c3ad952e96e2c2311b16384e973e671aa5be1f6187", - "sha256:d6ca8dabe696c2785d0c8c9b0d8a9b6e5fdbe4f922bde70d57fa1a2848134f95", - "sha256:d8cc87bed09de55477dba9da370c1679bd534df9baa171dd01accbb09687dac3", - "sha256:f0f18804df7370571fb65db9b98bf1378172bd4e962482b857e612d1fec0f53e", - "sha256:f1d88ef79e0a7fa631bb2c3dda1ea46b32b1fe614e10fedd611d3d5398447f2f", - "sha256:f9c3fc2adf67762c9fe1849c859942d23f8d3e0bee7b5ed3d4a9c3eeb50a2f07", - "sha256:fc431493df245f3c627c0c05c2bd134535e7929dbe2e602b80e42bf52ff760bc", - "sha256:fe8b9683eb26d2c4d5db32cd29b38fdcf8381324ab48313b5b69088e0e355379" - ], - "index": "pypi", - "version": "==1.23.0" - }, - "pymongo": { - "hashes": [ - "sha256:019a4c13ef1d9accd08de70247068671b116a0383adcd684f6365219f29f41cd", - "sha256:07f50a3b8a3afb086089abcd9ab562fb2a27b63fd7017ca13dfe7b663c8f3762", - "sha256:08a619c92769bd7346434dfc331a3aa8dc63bee80ed0be250bb0e878c69a6f3e", - "sha256:0a3474e6a0df0077a44573727341df6627042df5ca61ea5373c157bb6512ccc7", - "sha256:0b8a1c766de29173ddbd316dbd75a97b19a4cf9ac45a39ad4f53426e5df1483b", - "sha256:0f7e3872fb7b61ec574b7e04302ea03928b670df583f8691cb1df6e54cd42b19", - "sha256:17df40753085ccba38a0e150001f757910d66440d9b5deced30ed4cc8b45b6f3", - "sha256:298908478d07871dbe17e9ccd37a10a27ad3f37cc1faaf0cc4d205da3c3e8539", - "sha256:302ac0f4825501ab0900b8f1a2bb2dc7d28f69c7f15fbc799fb26f9b9ebb1ecb", - "sha256:303d1b3da2461586379d98b344b529598c8156857285ba5bd156dab1c875d1f6", - "sha256:306336dab4537b2343e52ec34017c3051c3aee5a961fff4915ab27f7e6d9b1e9", - "sha256:30d35a8855f328a85e5002f0908b24e500efdf8f5f78b73098995ce111baa2a9", - "sha256:3139c9ddee379c22a9109a0b3bf4cdb64597db2bbd3909f7a2825b47226977a4", - "sha256:32e785c37f6a0e844788c6085ea2c9c0c528348c22cebe91896705a92f2b1b26", - "sha256:33a5693e8d1fbb7743b7e867d43c1095652a0c6fedddab6cefe6020bee2ca393", - "sha256:35d02603c2318676fca5049cdc722bb2e7a378eaccf139ad767365e0eb3bcdbe", - "sha256:4516a5ce2beaebddc74d6e304ed520324dda99573c310ef4078284b026f81e93", - "sha256:49bb36986f11da2da190a2e777a411c0a28eeb8623850091ea8099b84e3860c7", - "sha256:4aa4800530782f7d38aeb169476a5bc692aacc394686f0ca3866e4bb85c9aa3f", - "sha256:4d1cdece06156542c18b691511a01fe78a694b9fa287ffd8e15680dbf2beeed5", - "sha256:4e4d2babb8737d650250d0fa940ffa1b88aa92b8eb399af093734950a1eeca45", - "sha256:4fd5c4f25d8d488ee5701c3ec786f52907dca653b47ce8709bcc2bfb0f5506ae", - "sha256:52c8b7bffd2140818ade2aa28c24cfe47935a7273a3bb976d1d8fb17e716536f", - "sha256:56b856a459762a3c052987e28ed2bd4b874f0be6671d2cc4f74c4891f47f997a", - "sha256:571a3e1ef4abeb4ac719ac381f5aada664627b4ee048d9995e93b4bcd0f70601", - "sha256:5cae9c935cdc53e4729920543b7d990615a115d85f32144773bc4b2b05144628", - "sha256:5d6ef3fa41f3e3be93483a77f81dea8c7ce5ed4411382a31af2b09b9ec5d9585", - "sha256:6396f0db060db9d8751167ea08f3a77a41a71cd39236fade4409394e57b377e8", - "sha256:69beffb048de19f7c18617b90e38cbddfac20077b1826c27c3fe2e3ef8ac5a43", - "sha256:7507439cd799295893b5602f438f8b6a0f483efb00720df1aa33a39102b41bcf", - "sha256:7aa40509dd9f75c256f0a7533d5e2ccef711dbbf0d91c13ac937d21d76d71656", - "sha256:7d69a3d980ecbf7238ab37b9027c87ad3b278bb3742a150fc33b5a8a9d990431", - "sha256:7dae2cf84a09329617b08731b95ad1fc98d50a9b40c2007e351438bd119a2f7a", - "sha256:7f36eacc70849d40ce86c85042ecfcbeab810691b1a3b08062ede32a2d6521ac", - 
"sha256:7f55a602d55e8f0feafde533c69dfd29bf0e54645ab0996b605613cda6894a85", - "sha256:8357aa727094798f1d831339ecfd8b3e388c01db6015a3cbd51790cb75e39994", - "sha256:84dc6bfeaeba98fe93fc837b12f9af4842694cdbde18083f150e80aec3de88f9", - "sha256:86b18420f00d5977bda477369ac85e04185ef94046a04ae0d85f5a807d1a8eb4", - "sha256:89f32d8450e15b0c11efdc81e2704d68c502c889d48415a50add9fa031144f75", - "sha256:8a1de8931cdad8cd12724e12a6167eef8cb478cc3ee5d2c9f4670c934f2975e1", - "sha256:8f106468062ac7ff03e3522a66cb7b36c662326d8eb7af1be0f30563740ff002", - "sha256:9a4ea87a0401c06b687db29e2ae836b2b58480ab118cb6eea8ac2ef45a4345f8", - "sha256:9ee1b019a4640bf39c0705ab65e934cfe6b89f1a8dc26f389fae3d7c62358d6f", - "sha256:a0d7c6d6fbca62508ea525abd869fca78ecf68cd3bcf6ae67ec478aa37cf39c0", - "sha256:a1417cb339a367a5dfd0e50193a1c0e87e31325547a0e7624ee4ff414c0b53b3", - "sha256:a35f1937b0560587d478fd2259a6d4f66cf511c9d28e90b52b183745eaa77d95", - "sha256:a4a35e83abfdac7095430e1c1476e0871e4b234e936f4a7a7631531b09a4f198", - "sha256:a7d1c8830a7bc10420ceb60a256d25ab5b032a6dad12a46af6ab2e470cee9124", - "sha256:a938d4d5b530f8ea988afb80817209eabc150c53b8c7af79d40080313a35e470", - "sha256:a9a2c377106fe01a57bad0f703653de286d56ee5285ed36c6953535cfa11f928", - "sha256:baf7546afd27be4f96f23307d7c295497fb512875167743b14a7457b95761294", - "sha256:bb21e2f35d6f09aa4a6df0c716f41e036cfcf05a98323b50294f93085ad775e9", - "sha256:bc62ba37bcb42e4146b853940b65a2de31c2962d2b6da9bc3ce28270d13b5c4e", - "sha256:be3ba736aabf856195199208ed37459408c932940cbccd2dc9f6ff2e800b0261", - "sha256:c03eb43d15c8af58159e7561076634d565530aaacaf48cf4e070c3501e88a372", - "sha256:c1349331fa743eed4042f9652200e60596f8beb957554acbcbb42aad4272c606", - "sha256:c3637cfce519560e2a2579d05eb81e912d109283b8ddc8de46f57ec20d273d92", - "sha256:c481cd1af2a77f58f495f7f87c2d715c6f1179d07c1ec927cca1f7977a2d99aa", - "sha256:c575f9499e5f540e034ff87bef894f031ae613a98b0d1d3afcc1f482527d5f1c", - "sha256:c604831daf2e7e5979ecd97a90cb8c4a7bae208ff45bc792e32eae09c3281afb", - "sha256:c759e1e0333664831d8d1d6b26cf59f23f3707758f696c71f506504b33130f81", - "sha256:c8a2743dd50629c0222f26c5f55975e45841d985b4b1c7a54b3f03b53de3427d", - "sha256:cbcac9263f500da94405cc9fc7e7a42a3ba6c2fe88b2cd7039737cba44c66889", - "sha256:cce1b7a680653e31ff2b252f19a39f1ded578a35a96c419ddb9632c62d2af7d8", - "sha256:cf96799b3e5e2e2f6dbca015f72b28e7ae415ce8147472f89a3704a035d6336d", - "sha256:d06ed18917dbc7a938c4231cbbec52a7e474be270b2ef9208abb4d5a34f5ceb9", - "sha256:d4ba5b4f1a0334dbe673f767f28775744e793fcb9ea57a1d72bc622c9f90e6b4", - "sha256:d7b8f25c9b0043cbaf77b8b895814e33e7a3c807a097377c07e1bd49946030d5", - "sha256:d86511ef8217822fb8716460aaa1ece31fe9e8a48900e541cb35acb7c35e9e2e", - "sha256:db8a9cbe965c7343feab2e2bf9a3771f303f8a7ca401dececb6ef28e06b3b18c", - "sha256:dbe92a8808cefb284e235b8f82933d7d2e24ff929fe5d53f1fd3ca55fced4b58", - "sha256:deb83cc9f639045e2febcc8d4306d4b83893af8d895f2ed70aa342a3430b534c", - "sha256:df9084e06efb3d59608a6a443faa9861828585579f0ae8e95f5a4dab70f1a00f", - "sha256:dfb89e92746e4a1e0d091cba73d6cc1e16b4094ebdbb14c2e96a80320feb1ad7", - "sha256:e13ddfe2ead9540e8773cae098f54c5206d6fcef64846a3e5042db47fc3a41ed", - "sha256:e4956384340eec7b526149ac126c8aa11d32441cb3ce77a690cb4821d0d0635c", - "sha256:e6eecd027b6ba5617ea6af3e12e20d578d8f4ad1bf51a9abe69c6fd4835ea532", - "sha256:eff9818b7671a55f1ce781398607e0d8c304cd430c0581fbe15b868a7a371c27", - "sha256:f0aea377b9dfc166c8fa05bb158c30ee3d53d73f0ed2fc05ba6c638d9563422f", - "sha256:f1fba193ab2f25849e24caa4570611aa2f80bc1c1ba791851523734b4ed69e43", - 
"sha256:f6db4f00d3baad615e99a865539391243d12b113fb628ebda1d7794ce02d5a10", - "sha256:f9405c02af86850e0a8a8ba777b7e7609e0d07bff46adc4f78892cc2d5456018", - "sha256:fb4445e3721720c5ca14c0650f35c263b3430e6e16df9d2504618df914b3fb99" - ], - "index": "pypi", - "version": "==4.1.1" - }, - "python-magic": { - "hashes": [ - "sha256:c1ba14b08e4a5f5c31a302b7721239695b2f0f058d125bd5ce1ee36b9d9d3c3b", - "sha256:c212960ad306f700aa0d01e5d7a325d20548ff97eb9920dcd29513174f0294d3" - ], - "index": "pypi", - "version": "==0.4.27" - }, - "werkzeug": { - "hashes": [ - "sha256:1ce08e8093ed67d638d63879fd1ba3735817f7a80de3674d293f5984f25fb6e6", - "sha256:72a4b735692dd3135217911cbeaa1be5fa3f62bffb8745c5215420a03dc55255" - ], - "index": "pypi", - "version": "==2.1.2" - } - }, - "develop": {} -} diff --git a/colmap/README.md b/colmap/README.md deleted file mode 100644 index cb446e0..0000000 --- a/colmap/README.md +++ /dev/null @@ -1,73 +0,0 @@ -### COLMAP -COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline -that is used for estimating three-dimensional structures from two-dimensional image -sequences. The process of the reconstruction process includes the following: -- Data Collection → Images collected by user -- Extraction of Image Features → Extract eigenvalues from the image -- Feature Point Matching → Matches corresponding points between images -- Sparse Reconstruction (SfM) → Restore the three-dimensional structure of the scene and camera attitude -- Depth Map Estimation → Restore the depth information of the reference image -- Dense Reconstruction (MVS) → Obtain camera pose and calculate 3D points corresponding to each pixel in the image - -This project will be using the image and camera data output from COLMAP and -implementing it to the NeRF. To achieve this, COLMAP will be run from command line using colmap_runner.py. Then using the generate output of each images, process the data using image_position_extractor.py to extract the quaternion and transpose vector. Finally, feed the data to matrix.py that converts the data to intrinsic and extrinsic matrices to be used by NeRF. - - -### Data Processing - -Image list with two lines of data per image: -IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME\ -POINTS2D[ ] as (X, Y, POINT3D_ID) - -Quaternion: QW, QX, QY, QZ\ -Transpose Vector: TX, TY, TZ - --r^t * T = coordinate of project/camera center -- r^t = inverse/transpose of the 3x3 rotation matrix composed from the quaternion -- T = translation vector - -The quaternion elements will first be converted to a rotational matrix. Then, the coordinate of the images' camera center can be computed by taking the inverse of this rotational matrix multiplied by the translation vector. This is essentially the extrinsic matrix that describes the camera's location in the world, and what direction it's pointing. - -Quaternion can also be used to calculate the Euler angles which contains the roll, pitch, and yaw of the three rotational axes. -- Roll is rotation around x in radians (counterclockwise) -- Pitch is rotation around y in radians (counterclockwise) -- Yaw is rotation around z in radians (counterclockwise) - -**Projection Matrix** -- Each 2D image coordinates have intrinsic and extrinsic properties that can describe the 3D world coordinates. The intrinsic matrix is mainly used to scale and calibrate the optical center while the extrinsic properties contains the rotation and translation to project an image to the 3D world view. 
- -![My Image](projection_matrix.jpg) -Source:https://www.cc.gatech.edu/classes/AY2016/cs4476_fall/results/proj3/html/agartia3/index.html - - -**Projection Diagram** -- The image below a pinhole camera model that is used in this project. The real-word coordinates X, Y and Z of point P is used to calculate the image coordinates X' and Y' of the point P′ projected onto the image plane using the focal length. - -![My Image](projection_diagram.jpg) -Source: https://www.sciencedirect.com/topics/engineering/intrinsic-parameter - -### Good Data Collection Tips -- Try to take pictures/videos in a clean background that is not messy -- Take structure in an environment that has opposing color to the structure itself -- Take a decent amount of pictures that covers a wide range of position -- Try to take images with the same angles among an axis - -### Script Usage -- colmap_runner.py → run colmap from command line -- image_position_extractor.py → used in matrix.py to extract colmap output -- matrix.py → outputs json object of intrinsic and extrinsic matrix -- main.py → starts worker to process requests automatically from the web-server (start the web-server before running) - -### Reference for Additional Research -- COLMAP Installation\ -https://colmap.github.io/install.html -- COLMAP Data Output Format\ -https://colmap.github.io/format.html -- Quaternion to Rotational Matrix\ -https://www.euclideanspace.com/maths/geometry/rotations/conversions/quaternionToMatrix/index.htm -- Quaternion to Euler Angle\ -https://en.wikipedia.org/wiki/Conversion_between_quaternions_and_Euler_angles -- Extrinsic Matrix\ -https://ksimek.github.io/2012/08/22/extrinsic/ - - diff --git a/colmap/colmap_runner.py b/colmap/colmap_runner.py deleted file mode 100644 index a57ca5a..0000000 --- a/colmap/colmap_runner.py +++ /dev/null @@ -1,138 +0,0 @@ -import subprocess -import os -import sys -import logging -from pathlib import Path - -#Usage: python colmap_runner.py --flags -#Flags: --colmap_exe_path "path" ==> Path to the colmap executeable. -# > Defaults to looking for COLMAP.bat in a folder called COLMAP in the -# > folder this script is in -# -# --image_path "path" ==> Path to the folder containing the images for COLMAP's input -# > Defailts to looking for a folder called "Images" in the folder -# > this script is in -# -# --name "name" ==> Name of the folder to be created to store the data for this instance -# > of colmap. -# > Defaults to "colmap_output" -# -# --output_folder "path" ==> Directory to where colmap will put its output. 
-# > Defaults to the folder where this script is - - - -#run_colmap function: -# -#creates a new folder called instance_name in output_path and fills it with the colmap data -# generated by the exe at colmap_path with data from images_path -# -#returns a status code - - -# 0 = Success -# 1 = Unspecified error -# 2 = FileExistsError; happens when you try to create data in an already existing folder -# 3 = FileNotFoundError; happens when you try to use an output folder that does not exist - -def run_colmap(colmap_path, images_path, output_path): - ### Create a new folder to store our data - # TODO: determine GPU use on start - use_gpu = "false" - Path(f"{output_path}").mkdir(parents=True, exist_ok=True) - - # sfm-worker logger - logger = logging.getLogger('sfm-worker') - - - logger.info("run_colmap()-colmap_path: " + colmap_path) - logger.info("run_colmap()-images_path: " + images_path) - logger.info("run_colmap()-output_path: " + colmap_path) - - #Creating a new database for colmap - try: - database_path = output_path + "/database.db" - subprocess.call([colmap_path, "database_creator", "--database_path", database_path]) - logger.info("Created DB") - except: - logger.error("DB Creation Failed") - return 1 - - #Feature extracting - try: - # --SiftExtraction.use_gpu=false for docker - # TODO: make gpu use dynamic - subprocess.call([colmap_path, "feature_extractor","--ImageReader.camera_model","PINHOLE",f"--SiftExtraction.use_gpu={use_gpu}","--ImageReader.single_camera=1", "--database_path", database_path, "--image_path", images_path]) - logger.info("Features Extracted") - except: - logger.error("Features unable to be extracted") - return 1 - - #Feature matching - try: - subprocess.call([colmap_path, "exhaustive_matcher",f"--SiftMatching.use_gpu={use_gpu}", "--database_path", database_path]) - logger.info("Feature Matched") - except: - logger.error("Features unable to be matched") - return 1 - - #Generating model - try: - subprocess.call([colmap_path, "mapper", "--database_path", database_path, "--image_path", images_path, "--output_path", output_path]) - logger.info("Model generated") - except: - logger.error("Model unable to be generated") - return 1 - - #Getting model as text - try: - # TODO: no longer works on windows fix file paths or run in docker - subprocess.call([colmap_path, "model_converter", "--input_path", output_path + r"/0", "--output_path", output_path, "--output_type", "TXT"]) - logger.info("Model as text successful") - except: - logger.error("Model as text unsuccessful") - return 1 - - logger.info("run_colmap successfully executed") - return 0 - - -if __name__ == '__main__': - #Default flags - instance_name = "colmap_output" - output_path = "./" - colmap_path = r".\COLMAP\COLMAP.bat" - images_path = r".\Images" - - logger = logging.getLogger('sfm-worker') - - #Parse flags - #Flag format up top - for i in range (len(sys.argv)): - if i == 0: - continue - if sys.argv[i].startswith("--"): - match sys.argv[i]: - case "--output_folder": - output_path = sys.argv[i+1] - case "--name": - instance_name = sys.argv[i+1] - case "--colmap_exe_path": - colmap_path = sys.argv[i+1] - case "--image_path": - images_path = sys.argv[i+1] - case _: - logger.error("ERROR: Unrecognized flag {}".format(sys.argv[i])) - quit() - - #Run COLMAP :) - status = run_colmap(instance_name, output_path, colmap_path, images_path) - if status == 0: - logger.info("COLMAP ran successfully.") - elif status == 1: - logger.error("ERROR: There was an unknown error running COLMAP") - elif status == 2: - 
logger.error("ERROR: COLMAP - file {}/{} already exists.".format(output_path,instance_name)) - elif status == 3: - logger.error("ERROR: COLMAP - file {} could not be found.".format(output_path)) - elif status == 4: - logger.error("ERROR: COLMAP - Video was too blurry for computation.") diff --git a/colmap/colmap_tests.py b/colmap/colmap_tests.py deleted file mode 100644 index 790637b..0000000 --- a/colmap/colmap_tests.py +++ /dev/null @@ -1,55 +0,0 @@ -import unittest -import numpy as np -from matrix import euler_from_quaternion, quaternion_rotation_matrix, rotation_matrix_from_vectors - -class TestMatrixFunctions(unittest.TestCase): - - def test_euler_from_quaternion(self): - # Test case 1 - roll, pitch, yaw = euler_from_quaternion(0, 0, 0.7072, 0.7072) - self.assertAlmostEqual(roll, 0) - self.assertAlmostEqual(pitch, 0.0) - self.assertAlmostEqual(yaw, 1.5710599372799763) - - # Test case 2 - roll, pitch, yaw = euler_from_quaternion(0.5, 0.5, 0.5, 0.5) - self.assertAlmostEqual(roll, 1.5707963267948966) - self.assertAlmostEqual(pitch, 0.0) - self.assertAlmostEqual(yaw, 1.5707963267948966) - - def test_quaternion_rotation_matrix(self): - # Test case 1 - rotation_matrix = quaternion_rotation_matrix(0.5, 0.5, 0.5, 0.5) - expected_matrix = np.array([[0.0, 0.0, 1.0], - [1.0, 0.0, 0.0], - [0.0, 1.0, 0.0]]) - np.testing.assert_array_almost_equal(rotation_matrix, expected_matrix) - - # Test case 2 - rotation_matrix = quaternion_rotation_matrix(0.0, 0.0, 0.0, 1.0) - expected_matrix = np.array([[-1.0, 0.0, 0.0], - [0.0, -1.0, 0.0], - [0.0, 0.0, 1.0]]) - np.testing.assert_array_almost_equal(rotation_matrix, expected_matrix) - - def test_rotation_matrix_from_vectors(self): - # Test case 1 - vec1 = np.array([1.0, 0.0, 0.0]) - vec2 = np.array([0.0, 1.0, 0.0]) - rotation_matrix = rotation_matrix_from_vectors(vec1, vec2) - expected_matrix = np.array([[0.0, -1.0, 0.0], - [1.0, 0.0, 0.0], - [0.0, 0.0, 1.0]]) - np.testing.assert_array_almost_equal(rotation_matrix, expected_matrix) - - # Test case 2 - vec1 = np.array([1.0, 0.0, 0.0]) - vec2 = np.array([0.0, 0.0, 1.0]) - rotation_matrix = rotation_matrix_from_vectors(vec1, vec2) - expected_matrix = np.array([[0.0, 0.0, -1.0], - [0.0, 1.0, 0.0], - [1.0, 0.0, 0.0]]) - np.testing.assert_array_almost_equal(rotation_matrix, expected_matrix) - -if __name__ == '__main__': - unittest.main() \ No newline at end of file diff --git a/colmap/configs/default.txt b/colmap/configs/default.txt deleted file mode 100644 index e69de29..0000000 diff --git a/colmap/configs/local.txt b/colmap/configs/local.txt deleted file mode 100644 index ef9728f..0000000 --- a/colmap/configs/local.txt +++ /dev/null @@ -1,5 +0,0 @@ -#set local run status / run colmap worker with or without webserver -local_run = True - -#Specify input data file path used for local runs ONLY -input_data_path = data/inputs/input.mp4 \ No newline at end of file diff --git a/colmap/image_position_extractor.py b/colmap/image_position_extractor.py deleted file mode 100644 index 69a5ed5..0000000 --- a/colmap/image_position_extractor.py +++ /dev/null @@ -1,85 +0,0 @@ -import sys -import csv - -#Usage: python image_position_extractor.py InputFile.txt OutputFile.csv -# -#InputFile.txt ==> Should be images.txt generated by COLMAP. -#OutputFile.csv ==> Output file for the script. Will create or overrite the given file. - - - -#extract_position_data function: -# -#reads the data from infile (assumed to be an output file from COLMAP) and -# puts the data into outfile. 
-# -#returns a status code - - -# 0 = Success -# 1 = Unspecified error -# 2 = FileNotFoundError; happens when infile cannot be found - -def extract_position_data(infile, outfile, debug=False): - #Open input file - try: - file = open(infile, 'r') - except FileNotFoundError: - return 2 - except: - return 1 - - lines = file.readlines() - images = [] - - #Read each line of the file and add proper data to the images[] array - compatible_img_formats = (".jpg\n",".png\n") - for line in lines: - words = line.split(' ') - if (words[-1].lower()).endswith(compatible_img_formats): - image_data = {} - qw, qx, qy, qz, tx, ty, tz = words[1:8] - imageName = words[-1][:-1] - - if debug: - print(imageName + "- QW:" + qw + ", QX:" + qx + ", QY:" + qy + ", QZ:" + qz + ", TX:" + tx +", TY:" + ty +", TZ:" + tz) - - image_data["Image_Name"] = imageName - image_data["QW"] = qw - image_data["QX"] = qx - image_data["QY"] = qy - image_data["QZ"] = qz - image_data["TX"] = tx - image_data["TY"] = ty - image_data["TZ"] = tz - images.append(image_data) - - #Write to output csv - #TODO: Make this output to JSON with format from Eric - with open(outfile, mode='w', newline='') as csv_file: - csv_file.truncate(0) - fieldnames = ['Image_Name', 'QW', 'QX', 'QY', 'QZ', 'TX', 'TY', 'TZ'] - writer = csv.DictWriter(csv_file, fieldnames=fieldnames) - - writer.writeheader() - for image in images: - writer.writerow(image) - - return 0 - -if __name__ == '__main__': - if len(sys.argv) != 3: - print("Invalid usage. Correct usage: python ImagePositionExtractor \"DataFile.txt\" \"Output.csv\"") - exit() - - infile = sys.argv[1] - outfile = sys.argv[2] - - #Extract data and print status - status = extract_position_data(infile, outfile, True) - if status == 0: - print(f"Data successfully extracted from {infile} and placed in {outfile}") - elif status == 1: - print(f"ERROR: Unspecified error extracting from {infile} or placing in {outfile}") - elif status == 2: - print(f"ERROR: Could not file file {infile}") - diff --git a/colmap/log.py b/colmap/log.py deleted file mode 100644 index dbc0d8a..0000000 --- a/colmap/log.py +++ /dev/null @@ -1,24 +0,0 @@ -import logging - -def sfm_worker_logger(name='root'): - """ - Initializer for a global sfm-worker logger. 
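Writes DEBUG-and-above records to '<name>.log' (overwritten on each run) using an
'asctime - levelname - module - message' format.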
- -> - To initialize use: 'logger = log.sfm_worker_logger(name)' - To retrieve in different context: 'logger = logging.getLogger(name)' - """ - formatter = logging.Formatter(fmt='%(asctime)s - %(levelname)s - %(module)s - %(message)s') - handler = logging.FileHandler(name+'.log', mode='w') - handler.setFormatter(formatter) - - logger = logging.getLogger(name) - logger.setLevel(logging.DEBUG) - logger.addHandler(handler) - return logger - -if __name__ == "__main__": - theta = sfm_worker_logger('sfm-worker-test') - theta.info("info message") - theta.warning("warning message") - theta.error("error message") - theta.critical("critical message") \ No newline at end of file diff --git a/colmap/main.py b/colmap/main.py deleted file mode 100644 index 81e1a3f..0000000 --- a/colmap/main.py +++ /dev/null @@ -1,218 +0,0 @@ -from flask import Flask -from flask import send_from_directory -from pathlib import Path -from video_to_images import split_video_into_frames -from colmap_runner import run_colmap -from matrix import get_json_matrices -from image_position_extractor import extract_position_data -from opt import config_parser -import requests -import pika -import json -import time -from multiprocessing import Process -import os -import argparse -import sys - -import logging -from log import sfm_worker_logger - -from dotenv import load_dotenv - -app = Flask(__name__) -# base_url = "http://host.docker.internal:5000/" -base_url = "http://sfm-worker:5100/" - - -@app.route("/data/outputs/") -def send_video(path): - logger.info(f"Sending video: {path}") - return send_from_directory("data/outputs/", path) - - -def start_flask(): - global app - app.run(host="0.0.0.0", port=5100, debug=True) - - -def to_url(local_file_path: str): - return base_url + local_file_path - - -def run_full_sfm_pipeline(id, video_file_path, input_data_dir, output_data_dir): - # run colmap and save data to custom directory - # Create output directory under data/output_data_dir/id - # TODO: use library to fix filepath joining - if not output_data_dir.endswith(("\\", "/")) and not id.startswith(("\\", "/")): - output_data_dir = output_data_dir + "/" - output_path = output_data_dir + id - Path(f"{output_path}").mkdir(parents=True, exist_ok=True) - - # Get logger - logger = logging.getLogger('sfm-worker') - - # (1) vid_to_images.py - imgs_folder = os.path.join(output_path, "imgs") - logger.info("Video file path:{}".format(video_file_path)) - - split_status = split_video_into_frames(video_file_path, imgs_folder, 100) - # Catches flag for blurriness - if split_status == 4: - logger.error("Video is too blurry.") - # motion_data flag option determines the status of the job - # flag = 4 means the video was too blurry - motion_data = {"flag":4,"id":id} - return motion_data, None - - # imgs are now in output_data_dir/id - - # (2) colmap_runner.py - colmap_path = "/usr/local/bin/colmap" - status = run_colmap(colmap_path, imgs_folder, output_path) - if status == 0: - logger.info("COLMAP ran successfully.") - elif status == 1: - logger.error("ERROR: There was an unknown error running COLMAP") - - # (3) matrix.py - initial_motion_path = os.path.join(output_path, "images.txt") - camera_stats_path = os.path.join(output_path, "cameras.txt") - parsed_motion_path = os.path.join(output_path, "parsed_data.csv") - - extract_position_data(initial_motion_path, parsed_motion_path) - motion_data = get_json_matrices(camera_stats_path, parsed_motion_path) - motion_data["id"] = id - motion_data["flag"] = 0 - - # Save copy of motion data - with 
open(os.path.join(output_path, "transforms_data.json"), "w") as outfile: - outfile.write(json.dumps(motion_data, indent=4)) - - return motion_data, imgs_folder - - -def colmap_worker(): - load_dotenv() - input_data_dir = "data/inputs/" - output_data_dir = "data/outputs/" - Path(f"{input_data_dir}").mkdir(parents=True, exist_ok=True) - Path(f"{output_data_dir}").mkdir(parents=True, exist_ok=True) - - logger = logging.getLogger('sfm-worker') - - def process_colmap_job(ch, method, properties, body): - logger.info("Starting New Job") - logger.info(body.decode()) - job_data = json.loads(body.decode()) - id = job_data["id"] - - logger.info(f"Running New Job With ID: {id}") - - # TODO: Handle exceptions and enable steaming to make safer - video = requests.get(job_data["file_path"], timeout=10) - logger.info("Web server pinged") - video_file_path = f"{input_data_dir}{id}.mp4" - logger.info("Saving video to: {video_file_path}") - open(video_file_path, "wb").write(video.content) - - logger.info("Video downloaded") - - # RUNS COLMAP AND CONVERSION CODE - motion_data, imgs_folder = run_full_sfm_pipeline( - id, video_file_path, input_data_dir, output_data_dir - ) - # Catch incomplete videos by flag != 1 and return here - if motion_data["flag"] != 0: - logger.error("An error was found. Ending process.") - channel.basic_publish( - exchange="", routing_key="sfm-out", body=json.dumps(motion_data) - ) - ch.basic_ack(delivery_tag=method.delivery_tag) - return - - # create links to local data to serve - for i, frame in enumerate(motion_data["frames"]): - file_name = frame["file_path"] - file_path = os.path.join(imgs_folder, file_name) - file_url = to_url(file_path) - motion_data["frames"][i]["file_path"] = file_url - - json_motion_data = json.dumps(motion_data) - channel.basic_publish( - exchange="", routing_key="sfm-out", body=json_motion_data - ) - - # confirm to rabbitmq job is done - ch.basic_ack(delivery_tag=method.delivery_tag) - logger.info("Job complete") - - - rabbitmq_domain = "rabbitmq" - credentials = pika.PlainCredentials( - str(os.getenv("RABBITMQ_DEFAULT_USER")), str(os.getenv("RABBITMQ_DEFAULT_PASS"))) - parameters = pika.ConnectionParameters( - rabbitmq_domain, 5672, '/', credentials, heartbeat=300 - ) - - # retries connection until connects or 2 minutes pass - timeout = time.time() + 60 * 2 - while True: - if time.time() > timeout: - raise Exception( - "nerf_worker took too long to connect to rabbitmq") - try: - connection = pika.BlockingConnection(parameters) - channel = connection.channel() - channel.queue_declare(queue='sfm-in') - channel.queue_declare(queue='sfm-out') - - # Will block and call process_nerf_job repeatedly - channel.basic_qos(prefetch_count=1) - channel.basic_consume( - queue='sfm-in', on_message_callback=process_colmap_job, auto_ack=False) - try: - channel.start_consuming() - except KeyboardInterrupt: - channel.stop_consuming() - connection.close() - break - except pika.exceptions.AMQPConnectionError: - continue - - channel.basic_qos(prefetch_count=1) - channel.basic_consume(queue="sfm-in", on_message_callback=process_colmap_job) - channel.start_consuming() - logger.critical("Should not get here: After consumption of RabbitMQ.") - -if __name__ == "__main__": - input_data_dir = "data/inputs/" - output_data_dir = "data/outputs/" - Path(f"{input_data_dir}").mkdir(parents=True, exist_ok=True) - Path(f"{output_data_dir}").mkdir(parents=True, exist_ok=True) - - """ - STARTING LOGGER - """ - logger = sfm_worker_logger('sfm-worker') - logger.info("~SFM WORKER~") - - # Load args 
from config file - args = config_parser() - - # Local run behavior - if args.local_run == True: - motion_data, imgs_folder = run_full_sfm_pipeline( - "Local_Test", args.input_data_path, input_data_dir, output_data_dir - ) - logger.info("MOTION DATA: {}".format(motion_data)) - json_motion_data = json.dumps(motion_data) - - # Standard webserver run behavior - else: - sfmProcess = Process(target=colmap_worker, args=()) - flaskProcess = Process(target=start_flask, args=()) - flaskProcess.start() - sfmProcess.start() - flaskProcess.join() - sfmProcess.join() diff --git a/colmap/matrix.py b/colmap/matrix.py deleted file mode 100644 index 8bbcaf7..0000000 --- a/colmap/matrix.py +++ /dev/null @@ -1,308 +0,0 @@ -#!/usr/bin/env python3 -""" --r^t * T = Coordinate of project/camera center: -r^t = inverse/transpose of the 3x3 rotation matrix composed from the quaternion -T = translation vector -python3 matrix.py output.csv >out.txt -combined filed parsing and matrix formation -command: python3 matrix.py images.txt camera.txt -parse_data.py will first parse images.txt and redirect to parsed_data.csv -matrix.py will then take the parsed_data.csv and redirect intrinsic and extrinsic -matrix to out_matrix.txt with the following info: -1. camera model -2. resolution -3. 1 intrinsic matrix -4. extrinsic matrix for each image w/ image name -""" - -import sys -import csv -import math -import numpy as np -import image_position_extractor -import json -import os -import logging - - -# https://en.wikipedia.org/wiki/Conversion_between_quaternions_and_Euler_angles -def euler_from_quaternion(x, y, z, w): - """ - Convert a quaternion into euler angles (roll, pitch, yaw) - - Parameters - ---------- - x, y, z, w : float - A 4 element array representing the quaternion (x,y,z,w) - - Returns - ------- - roll: float - rotation around x in radians (counterclockwise) - pitch: float - rotation around y in radians (counterclockwise) - yaw: float - rotation around z in radians (counterclockwise) - """ - - # roll (x-axis rotation) - sinr_cosp = +2.0 * (w * x + y * z) - cosr_cosp = +1.0 - 2.0 * (x * x + y * y) - roll_x = math.atan2(sinr_cosp, cosr_cosp) - - # pitch (y-axis rotation) - sinp = +2.0 * (w * y - z * x) - sinp = +1.0 if sinp > +1.0 else sinp - sinp = -1.0 if sinp < -1.0 else sinp - pitch_y = math.asin(sinp) - - # yaw (x-axis rotation) - siny_cosp = +2.0 * (w * z + x * y) - cosy_cosp = +1.0 - 2.0 * (y * y + z * z) - yaw_z = math.atan2(siny_cosp, cosy_cosp) - - return roll_x, pitch_y, yaw_z # in radians - - -# https://www.euclideanspace.com/maths/geometry/rotations/conversions/quaternionToMatrix/index.htm -def quaternion_rotation_matrix(qw, qx, qy, qz) -> np.ndarray: - # w x y z - """ - Convert a quaternion into a full three-dimensional rotation matrix. - - Parameters - ---------- - qw, qx, qy, qz : float - A 4 element array representing the quaternion (qw,qx,qy,qz) - - Returns - ------- - rotation_matrix : np.ndarray - A 3x3 element matrix representing the full 3D rotation matrix. - This rotation matrix converts a point in the local reference - frame to a point in the global reference frame. 
- """ - - # First row of the rotation matrix - r00 = 1 - 2 * qy**2 - 2 * qz**2 - r01 = 2 * qx * qy - 2 * qz * qw - r02 = 2 * qx * qz + 2 * qy * qw - - # Second row of the rotation matrix - r10 = 2 * qx * qy + 2 * qz * qw - r11 = 1 - 2 * qx**2 - 2 * qz**2 - r12 = 2 * qy * qz - 2 * qx * qw - - # Third row of the rotation matrix - r20 = 2 * qx * qz - 2 * qy * qw - r21 = 2 * qy * qz + 2 * qx * qw - r22 = 1 - 2 * qx**2 - 2 * qy**2 - - # 3x3 rotation matrix - rotation_matrix = np.array([[r00, r01, r02], - [r10, r11, r12], - [r20, r21, r22]]) - # np.set_printoptions(threshold=sys.maxsize) - return rotation_matrix - -# Function from https://stackoverflow.com/questions/45142959/calculate-rotation-matrix-to-align-two-vectors-in-3d-space -# Authored by Peter -def rotation_matrix_from_vectors(vec1, vec2): - """ Find the rotation matrix that aligns vec1 to vec2 - :param vec1: A 3d "source" vector - :param vec2: A 3d "destination" vector - :return mat: A transform matrix (3x3) which when applied to vec1, aligns it with vec2. - """ - a, b = (vec1 / np.linalg.norm(vec1)).reshape(3), (vec2 / np.linalg.norm(vec2)).reshape(3) - v = np.cross(a, b) - c = np.dot(a, b) - s = np.linalg.norm(v) - kmat = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]]) - rotation_matrix = np.eye(3) + kmat + kmat.dot(kmat) * ((1 - c) / (s ** 2)) - return rotation_matrix - -def get_extrinsic(center_point, fp: str = "parsed_data.csv"): - - # sfm-worker logger - logger = logging.getLogger('sfm-worker') - - # contrains filepath and extrinsic matrix - filepaths = [] - extrinsic_matrices = [] - with open(fp) as csv_file: - csv_reader = csv.reader(csv_file, delimiter=",") - row = next(csv_reader) # start with second line - - for row in csv_reader: - image_name = str(row[0]) - filepaths.append(image_name) - - qw = float(row[1]) - qx = float(row[2]) - qy = float(row[3]) - qz = float(row[4]) - - tx = float(row[5]) - ty = float(row[6]) - tz = float(row[7]) - - # find extrinsic matrix - T = np.array([tx, ty, tz]) - r = quaternion_rotation_matrix(qw, qx, qy, qz) # rotational matrix - - - extrinsic = np.zeros((4,4)) - extrinsic[0:3,0:3] = r - extrinsic[0:3,3] = T - - #T is not the position of the camera - #r_t = r.transpose() - #camera = -r_t @T - extrinsic[3][3] = 1 - c2w = np.linalg.inv(extrinsic) - - # convert from OPENCV to OPENGL coordinates - conversion = np.array([[1,0,0,0], - [0,-1,0,0], - [0,0,-1,0], - [0,0,0,1]]) - # flips y and z coords - c2w = c2w @ conversion - - extrinsic_matrices.append(c2w) - - # Center extrinsics around center point: - #extrinsic[0:3,3] += center_point - - # stack all extrinsic to perform faster transformations to the whole stack - extrinsic_matrices = np.stack(extrinsic_matrices,axis=0) - logger.info("Extrinsic shape: {extrinsic_matrices.shape}") - avg_y_axis = np.sum(extrinsic_matrices[:,0:3,1], axis=0) - avg_y_axis = avg_y_axis/np.linalg.norm(avg_y_axis) - logger.info("Consensus Y axis:: {avg_y_axis}") - - # Find a matrix to rotate the average y axis with the y-axis unit vector thus aligning every extrinsic to point in the same direction - Rot = np.zeros((4,4)) - Rot[0:3,0:3] = rotation_matrix_from_vectors(avg_y_axis,np.asarray([0,0,1])) - Rot[-1,-1] = 1 - Rot = np.expand_dims(Rot,axis=0) - - # Rotate Extrinsic to all face up - extrinsic_matrices = Rot @ extrinsic_matrices - - # Adjust extrinsic to center around the central point - #center_point = np.average(extrinsic_matrices[:,0:3,3],axis=0) - logger.info(center_point.shape) - logger.info("center point {}".format(center_point)) - 
extrinsic_matrices[:,0:3,3] -= center_point - - # Z offset assuming cameras are never below the object - extrinsic_matrices[:,2,3] -= min(extrinsic_matrices[:,2,3].min(),0) - - # Normalize extrinsic transformation to remain within bounding box - translation_magnitudes = np.linalg.norm(extrinsic_matrices[:,0:3,3],axis=1) - avg_translation_magnitude = np.average(translation_magnitudes) - logger.info("Translation mag: {}".format(avg_translation_magnitude)) - extrinsic_matrices[:,0:3,3] /= avg_translation_magnitude - - # scale back up TODO: make dynamic - extrinsic_matrices[:,0:3,3] *= 4 - - logger.info("Max {}".format(extrinsic_matrices[:,0:3,3].max())) - logger.info("Min {}".format(extrinsic_matrices[:,0:3,3].min())) - logger.info("avg {}".format(np.average(extrinsic_matrices[:,0:3,3]))) - - # Convert to json - frames = [] - for extrin, file_path in zip(extrinsic_matrices,filepaths): - extrinsic_list = extrin.tolist() # convert to list for json - - img_frame = { "file_path": file_path, - "extrinsic_matrix": extrinsic_list} - - frames.append(img_frame) - - return frames -# add the video name thing -def get_intrinsic(fp: str = "cameras.txt"): - infile = open(fp, "r") - lines = infile.readlines() - - for line in lines: - data = line.split(" ") - if not data[0].startswith("#"): - camera = data[1] - width = int(data[2]) - height = int(data[3]) - - fx = float(data[4]) - fy = float(data[5]) - x0 = float(data[6]) - y0 = float(data[7]) - - intrinsic = np.array([[fx, 0, x0], - [0, fy, y0], - [0, 0, 1]]) - intrinsic_list = intrinsic.tolist() # convert to list for json - - intrinsic = { "vid_width": width, - "vid_height": height, - "intrinsic_matrix": intrinsic_list - } - - return intrinsic - -# COLMAP TO NDC -def get_extrinsics_center(fp: str = "points3D.txt"): - # sfm-worker logger - logger = logging.getLogger('sfm-worker') - - infile = open(fp, "r") - lines = infile.readlines() - point_count = 0 - - central_point = np.zeros(3) - # find center of all the points - for line in lines: - data = line.split(" ") - if not data[0].startswith("#"): - # X Y Z - central_point[0] += float(data[1]) - central_point[1] += float(data[2]) - central_point[2] += float(data[3]) - point_count+=1 - - central_point /= point_count - logger.info("Central point: {}".format(central_point)) - return central_point - - -def get_json_matrices(camera_file, motion_data ): - point_path = os.path.join(os.path.dirname(camera_file),"points3D.txt") - center_point = get_extrinsics_center(point_path) - intrinsic = get_intrinsic(camera_file) - extrinsic = get_extrinsic(center_point,motion_data) - intrinsic["frames"] = extrinsic - - return intrinsic - -def main(): - logger = logging.getLogger('sfm-worker') - # check for input argument - if len(sys.argv) != 3: - logger.error("ERROR: Bad arguments. 
Usage: python3 %s images.txt camera.txt" % sys.argv[0]) - sys.exit(1) - - center_point = get_extrinsics_center() - intrinsic = get_intrinsic() - extrinsic = get_extrinsic(center_point) - intrinsic["frames"] = extrinsic - json_object= json.dumps(intrinsic, indent=4) - - with open('data.json', 'w') as outfile: - outfile.write(json_object) - -if __name__ == "__main__": - image_position_extractor.extract_position_data("images.txt", "parsed_data.csv") - main() - diff --git a/colmap/opt.py b/colmap/opt.py deleted file mode 100644 index 3bcae4b..0000000 --- a/colmap/opt.py +++ /dev/null @@ -1,18 +0,0 @@ -import configargparse - -def config_parser(cmd=None): - parser = configargparse.ArgumentParser() - - parser.add_argument('--config', is_config_file = True, default = 'configs/default.txt', - help = 'config file path') - - parser.add_argument('--local_run', type = bool, default = False, - help = 'run colmap worker locally') - - parser.add_argument('--input_data_path', default = 'data/inputs/input.mp4', - help = 'input data path for local runs') - - if cmd is not None: - return parser.parse_args(cmd) - else: - return parser.parse_args() \ No newline at end of file diff --git a/colmap/projection_diagram.jpg b/colmap/projection_diagram.jpg deleted file mode 100644 index 708e45d..0000000 Binary files a/colmap/projection_diagram.jpg and /dev/null differ diff --git a/colmap/projection_matrix.jpg b/colmap/projection_matrix.jpg deleted file mode 100644 index 887a64e..0000000 Binary files a/colmap/projection_matrix.jpg and /dev/null differ diff --git a/colmap/requirements.txt b/colmap/requirements.txt deleted file mode 100644 index d96265a..0000000 --- a/colmap/requirements.txt +++ /dev/null @@ -1,16 +0,0 @@ --i https://pypi.org/simple -requests==2.28.1 -click==8.1.3 -flask==2.1.2 -itsdangerous==2.1.2 -jinja2==3.1.2 -markupsafe==2.1.1 -numpy==1.23.0 -pymongo==4.1.1 -python-magic==0.4.27 -werkzeug==2.1.2 -opencv-python==4.6.0.66 -pika==1.3.0 -configargparse==1.5.3 -python-dotenv -kneed diff --git a/colmap/to_cam.py b/colmap/to_cam.py deleted file mode 100644 index 15b4a9b..0000000 --- a/colmap/to_cam.py +++ /dev/null @@ -1,119 +0,0 @@ -import json -import sys -import numpy as np -import math -import matplotlib as mpl -import matplotlib.pyplot as plt -from matplotlib.patches import Patch -from mpl_toolkits.mplot3d.art3d import Poly3DCollection - -class CameraPoseVisualizer: - def __init__(self, xlim, ylim, zlim): - self.fig = plt.figure(figsize=(18, 7)) - self.ax = self.fig.add_subplot(projection='3d') - self.ax.set_aspect("auto") - self.ax.set_xlim(xlim) - self.ax.set_ylim(ylim) - self.ax.set_zlim(zlim) - self.ax.set_xlabel('x') - self.ax.set_ylabel('y') - self.ax.set_zlabel('z') - - # Create axis - axes = [3, 3, 3] - - # Create Data - data = np.ones(axes) - print(data.shape) - - # Control Transparency - alpha = 0.9 - - # Control colour - colors = np.empty(axes + [4], dtype=np.float32) - - colors[:] = [1, 0, 0, alpha] # red - x,y,z = np.indices((4,4,4),dtype='float32') - x-= 1.5 - y-= 1.5 - z-= 1.5 - self.ax.voxels(x,y,z,data, facecolors=colors, edgecolors='grey') - print('initialize camera pose visualizer') - - def extrinsic2pyramid(self, extrinsic, color='r', focal_len_scaled=1, aspect_ratio=0.3): - vertex_std = np.array([[0, 0, 0, 1], - [focal_len_scaled * aspect_ratio, -focal_len_scaled * aspect_ratio, focal_len_scaled, 1], - [focal_len_scaled * aspect_ratio, focal_len_scaled * aspect_ratio, focal_len_scaled, 1], - [-focal_len_scaled * aspect_ratio, focal_len_scaled * aspect_ratio, focal_len_scaled, 1], 
- [-focal_len_scaled * aspect_ratio, -focal_len_scaled * aspect_ratio, focal_len_scaled, 1]]) - vertex_transformed = vertex_std @ extrinsic.T - meshes = [[vertex_transformed[0, :-1], vertex_transformed[1][:-1], vertex_transformed[2, :-1]], - [vertex_transformed[0, :-1], vertex_transformed[2, :-1], vertex_transformed[3, :-1]], - [vertex_transformed[0, :-1], vertex_transformed[3, :-1], vertex_transformed[4, :-1]], - [vertex_transformed[0, :-1], vertex_transformed[4, :-1], vertex_transformed[1, :-1]], - [vertex_transformed[1, :-1], vertex_transformed[2, :-1], vertex_transformed[3, :-1], vertex_transformed[4, :-1]]] - self.ax.add_collection3d( - Poly3DCollection(meshes, facecolors=color, linewidths=0.3, edgecolors=color, alpha=0.15)) - - def customize_legend(self, list_label): - list_handle = [] - for idx, label in enumerate(list_label): - color = plt.cm.rainbow(idx / len(list_label)) - patch = Patch(color=color, label=label) - list_handle.append(patch) - plt.legend(loc='right', bbox_to_anchor=(1.8, 0.5), handles=list_handle) - - def colorbar(self, max_frame_length): - cmap = mpl.cm.rainbow - norm = mpl.colors.Normalize(vmin=0, vmax=max_frame_length) - self.fig.colorbar(mpl.cm.ScalarMappable(norm=norm, cmap=cmap), orientation='vertical', label='Frame Number') - - def plot_cam(self, cam, color="blue"): - self.ax.scatter(cam[0],cam[1],cam[2], color= color) - - - def show(self): - print("Displaying Data") - plt.title('Extrinsic Parameters') - plt.show() - -if __name__ =='__main__': - print("Starting conversion") - input_file = sys.argv[1] - - input_str = open(input_file) - input = json.loads(input_str.read()) - - extrins = [] - for f in input["frames"]: - extrinsic = np.array(f["extrinsic_matrix"]) - extrins+=[ extrinsic ] - - visualizer = CameraPoseVisualizer([-5, 5], [-5, 5], [0, 5]) - cams = [] - for i,e in enumerate(extrins): - if i%3 == 0: - color = plt.cm.rainbow(i / len(extrins)) - visualizer.extrinsic2pyramid(e, color) - primary_point = np.asarray([0,0,-2,1]) - - r = e[0:3,0:3] - t = e[0:3,3] - c = -r.T @ t - print("Rotation:\n",r) - print("Translation:\n",t) - print("Cam:\n",c) - print() - visualizer.plot_cam(e @ primary_point, color) - secondary_point = np.asarray([0,0,-3,1]) - visualizer.plot_cam(e @ secondary_point, color) - - - cams.append(c) - visualizer.show() - - - - - - diff --git a/colmap/video_to_images.py b/colmap/video_to_images.py deleted file mode 100644 index 4c4b4a9..0000000 --- a/colmap/video_to_images.py +++ /dev/null @@ -1,224 +0,0 @@ -from genericpath import exists -import subprocess -import os -import sys -import logging -from pathlib import Path - -# new imports -import cv2 -from random import sample -#Usage: python video_to_images.py --flags -#Flags: --ffmpeg_exe_path "path" ==> Path to the ffmpeg executeable. -# > Defaults to looking for ffmpeg.exe in the folder this script is in. -# -# --wanted_frames "uint" ==> Number of frames we want to use -# > If total frames < wanted_frames, we default to total frames -# > Defaults to 200. -# -# --name "name" ==> Name of the folder to be created to store the data for this instance -# > of ffmpeg. -# > Defaults to "ffmpeg_output" -# -# --output_folder "path" ==> Directory to where ffmpeg will put its output. -# > Defaults to the folder where this script is -# -# --video_path "path" ==> Path to the video to be converted into its composite images. -# > Defaults to looking for "video.mp4" in the folder this script -# > is in. 
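# For reference, a typical invocation using the flags documented above might look like:
#   python video_to_images.py --video_path ./video.mp4 --output_folder ./frames --name my_frames --wanted_frames 100
# (illustrative only -- the flag parsing in __main__ further below is commented out,
# so these flags are effectively documentation rather than live options)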
- - - -#split_video_into_frames function: -# -#creates a new folder called instance_name in output_path and fills it with the frames -# of the video at video_path. Samples wanted_frames amount of frames, -# or 200 frames by default -# -#returns a status code - -# 0 = Success -# 1 = Unspecified error -# 2 = FileExistsError; happens when you try to create data in an already existing folder -# 3 = FileNotFoundError; happens when you try to use an output folder that does not exist - -def split_video_into_frames(video_path, output_path, max_frames=200): - ## determines whether image is blurry or not. - # uses the variance of a laplacian transform to check for edges and returns true - # if the variance is less than the threshold and the video is determined to be blurry - - # Get Logger: - logger = logging.getLogger('sfm-worker') - - def is_blurry(image, THRESHOLD): - ## Convert image to grayscale - gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) - ## run the variance of the laplacian transform to test blurriness - laplacian_var = cv2.Laplacian(gray, cv2.CV_64F).var() - return laplacian_var < THRESHOLD - - ## determines amount of blurriness - # see IS_BLURRY for more information - def blurriness(image): - ## Convert image to grayscale - gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) - ## run the variance of the laplacian transform to test blurriness - laplacian_var = cv2.Laplacian(gray, cv2.CV_64F).var() - return laplacian_var - - # Create output folder - Path(f"{output_path}").mkdir(parents=True, exist_ok=True) - - ## determine video length: - # TODO: Check video type to ensure it is supported - vidcap = cv2.VideoCapture(video_path ) - frame_count = vidcap.get(cv2.CAP_PROP_FRAME_COUNT) - frame_count = int(frame_count) - - ## sample up to max frame count - sample_count = min(frame_count,max_frames) - logger.info("SAMPLE COUNTER {}".format(sample_count)) - - success, image = vidcap.read() - img_height = image.shape[0] - img_width = image.shape[1] - - ## Rank all images based off bluriness - blur_list = [] - ## check blurriness of all images and sort to caluculate threshold - while success: - image_blur = blurriness(image) - blur_list.append(image_blur) - success, image = vidcap.read() - - vidcap.release() - sorted_list = sorted(blur_list) - ## we want the remaining best images - ## e.g, if we want 75 images out of 100, threshold should be 25th image - threshold_img = len(blur_list) - sample_count - THRESHOLD = sorted_list[threshold_img] - - ## checks number of images within the threshold - count_good_img = 0 - for i in blur_list: - if i >= THRESHOLD: - count_good_img += 1 - - ## account for not enough images in threshold so that we return the exact number of images - if count_good_img > sample_count: - for i in range(count_good_img - sample_count): - for val in blur_list: - if val >= THRESHOLD: - val = 0 - break - - - ## If this threshold is too low, completely reject video - avg_threshold = (sorted_list[-1] + THRESHOLD)/2 - if avg_threshold < 100: - # ERROR: Video is too blurry. Please try again. 
- return 4 - - - needs_adjust = False ## determines if we need to adjust - aspect_ratio = img_height / img_width - #print (f"aspect ratio: {aspect_ratio}") - #print (f"img_width: {img_width}") - #print (f"img_height: {img_height}") - ## adjust as necessaryx - MAX_WIDTH = 200 - MAX_HEIGHT = 208 - - ## for resizing images - if (img_height > MAX_HEIGHT): - scaler = MAX_HEIGHT / img_height - img_height = (int) (img_height * scaler) - needs_adjust = True - - if (img_width > MAX_WIDTH): - scaler = MAX_WIDTH / img_width - img_width = (int) (scaler * img_width) - needs_adjust = True - - ## applying aspect ratio - if (aspect_ratio > 1): - img_width = (int) (img_width / aspect_ratio) - else: - img_height = (int) (img_height * aspect_ratio) - - #print(f"new img height: {img_height}") - #print(f"new img width: {img_width}") - dimensions = (img_width, img_height) - - - count = 0 - - ## write to the folder the images we want - vidcap = cv2.VideoCapture(video_path) - success, image = vidcap.read() - while success: - if (blur_list[count] >= THRESHOLD): - if (needs_adjust == True): - image = cv2.resize(image, dimensions, interpolation=cv2.INTER_LANCZOS4) - cv2.imwrite(f"{output_path}/img_{count}.png", image) - logger.info("Saved image {}".format(count)) - success, image = vidcap.read() - - count += 1 - vidcap.release() - - #Sucess, return 0 - ## can return img_width, img_height, and wanted_frames - return 0 - -def test(): - instance_name = "test" - output_path = "test_out" - ffmpeg_path = "" - video_path = "landscape_video" # change to whatever vid you want - wanted_frames = 200 - split_video_into_frames(instance_name, output_path, ffmpeg_path, video_path, wanted_frames) - -if __name__ == '__main__': - #Default flags - instance_name = "ffmpeg_output" - output_path = "./" - ffmpeg_path = r".\ffmpeg.exe" - video_path = r".\video.mp4" - wanted_frames = 24 - - logger = logging.getLogger('sfm-worker') - - #Parse flags - #Flag format up top - """ - for i in range (len(sys.argv)): - if i == 0: - continue - if sys.argv[i].startswith("--"): - match sys.argv[i]: - case "--output_folder": - output_path = sys.argv[i+1] - case "--name": - instance_name = sys.argv[i+1] - case "--ffmpeg_exe_path": - ffmpeg_path = sys.argv[i+1] - case "--video_path": - video_path = sys.argv[i+1] - case "--fps": - fps = sys.argv[i+1] - case _: - print("ERROR: Unrecognized flag", sys.argv[i]) - quit()""" - - #Calling split_video_into_frames - status = split_video_into_frames(instance_name, output_path, ffmpeg_path, video_path, wanted_frames=200) - if status == 0: - logger.info("ffmpeg ran successfully.") - elif status == 1: - logger.error("ERROR: There was an unknown error running ffmpeg") - elif status == 2: - logger.error("ERROR: ffmpeg - file {}/{} already exists.".format(output_path,instance_name)) - elif status == 3: - logger.error("ERROR: ffmpeg - file {} could not be found.".format(output_path)) - elif status == 4: - logger.error("ERROR: Video is too blurry.") \ No newline at end of file diff --git a/docker-compose-flask.yml b/docker-compose-flask.yml new file mode 100644 index 0000000..3e74f78 --- /dev/null +++ b/docker-compose-flask.yml @@ -0,0 +1,121 @@ +x-environment: &environment + COMPOSE_DOCKER_CLI_BUILD: 1 # Enables DOCKER_BUILDKIT (Used to cache go mod/py pip dependencies) + +services: + mongodb: + image: mongo:latest + container_name: mongodb + environment: + <<: *environment + MONGO_INITDB_ROOT_USERNAME: ${MONGO_INITDB_ROOT_USERNAME} + MONGO_INITDB_ROOT_PASSWORD: ${MONGO_INITDB_ROOT_PASSWORD} + ports: + - 27017:27017 + 
volumes: + - mongodb_data_container:/data/db + networks: + - backend + + rabbitmq: + container_name: rabbitmq + image: rabbitmq:3.8-management-alpine + environment: + <<: *environment + RABBITMQ_DEFAULT_USER: ${RABBITMQ_DEFAULT_USER} + RABBITMQ_DEFAULT_PASS: ${RABBITMQ_DEFAULT_PASS} + ports: + # AMQP protocol port + - "5672:5672" + # HTTP management UI + - "15672:15672" + networks: + - backend + + web-server: + build: + context: ./web-server + dockerfile: ./Dockerfile + image: web-server-img + container_name: web-server + depends_on: + - rabbitmq + - mongodb + environment: + <<: *environment + APP_PORT: 5000 + MONGO_INITDB_ROOT_USERNAME: ${MONGO_INITDB_ROOT_USERNAME} + MONGO_INITDB_ROOT_PASSWORD: ${MONGO_INITDB_ROOT_PASSWORD} + MONGO_IP: mongodb + JWT_SECRET: ${JWT_SECRET_KEY} + volumes: + - ./web-server/data:/app/data + - ./web-server/web-server.log:/app/web-server.log + ports: + - 5000:5000 + networks: + - backend + + sfm-worker: + build: + context: ./sfm-worker + dockerfile: Dockerfile + image: sfm-worker-img + container_name: sfm-worker + depends_on: + - rabbitmq + - web-server + environment: + <<: *environment + RABBITMQ_DEFAULT_USER: ${RABBITMQ_DEFAULT_USER} + RABBITMQ_DEFAULT_PASS: ${RABBITMQ_DEFAULT_PASS} + SFM_USE_GPU: ${SFM_USE_GPU} + volumes: + - ./sfm-worker:/app + ports: + - 5100:5100 + networks: + - backend + command: python3.10 main.py --config=configs/default.txt + deploy: # Use NVIDIA GPUs + resources: + reservations: + devices: + - driver: nvidia + count: all + capabilities: [gpu] + + nerf-worker: + build: + context: ./nerf-worker + dockerfile: Dockerfile + image: nerf-worker-img + container_name: nerf-worker + depends_on: + - rabbitmq + - web-server + volumes: + - ./nerf-worker:/app + environment: + <<: *environment + RABBITMQ_DEFAULT_USER: ${RABBITMQ_DEFAULT_USER} + RABBITMQ_DEFAULT_PASS: ${RABBITMQ_DEFAULT_PASS} + ports: + - 5200:5200 + networks: + - backend + command: python3.8 main.py + deploy: # Use NVIDIA GPUs + resources: + reservations: + devices: + - driver: nvidia + count: all + capabilities: [gpu] + +networks: + backend: + name: backend-network + driver: bridge + +volumes: + mongodb_data_container: \ No newline at end of file diff --git a/docker-compose-go.yml b/docker-compose-go.yml new file mode 100644 index 0000000..7e31038 --- /dev/null +++ b/docker-compose-go.yml @@ -0,0 +1,121 @@ +x-environment: &environment + COMPOSE_DOCKER_CLI_BUILD: 1 # Enables DOCKER_BUILDKIT (Used to cache go mod/py pip dependencies) + +services: + mongodb: + image: mongo:latest + container_name: mongodb + environment: + <<: *environment + MONGO_INITDB_ROOT_USERNAME: ${MONGO_INITDB_ROOT_USERNAME} + MONGO_INITDB_ROOT_PASSWORD: ${MONGO_INITDB_ROOT_PASSWORD} + ports: + - 27017:27017 + volumes: + - mongodb_data_container:/data/db + networks: + - backend + + rabbitmq: + container_name: rabbitmq + image: rabbitmq:3.8-management-alpine + environment: + <<: *environment + RABBITMQ_DEFAULT_USER: ${RABBITMQ_DEFAULT_USER} + RABBITMQ_DEFAULT_PASS: ${RABBITMQ_DEFAULT_PASS} + ports: + # AMQP protocol port + - "5672:5672" + # HTTP management UI + - "15672:15672" + networks: + - backend + + web-server: + build: + context: ./go-web-server + dockerfile: ./Dockerfile + image: web-server-img + container_name: web-server + depends_on: + - rabbitmq + - mongodb + environment: + <<: *environment + APP_PORT: 5000 + MONGO_INITDB_ROOT_USERNAME: ${MONGO_INITDB_ROOT_USERNAME} + MONGO_INITDB_ROOT_PASSWORD: ${MONGO_INITDB_ROOT_PASSWORD} + MONGO_IP: mongodb + JWT_SECRET: ${JWT_SECRET_KEY} + volumes: + - 
./go-web-server/data:/app/data + - ./go-web-server/web-server.log:/app/web-server.log + ports: + - 5000:5000 + networks: + - backend + + sfm-worker: + build: + context: ./sfm-worker + dockerfile: Dockerfile + image: sfm-worker-img + container_name: sfm-worker + depends_on: + - rabbitmq + - web-server + environment: + <<: *environment + RABBITMQ_DEFAULT_USER: ${RABBITMQ_DEFAULT_USER} + RABBITMQ_DEFAULT_PASS: ${RABBITMQ_DEFAULT_PASS} + SFM_USE_GPU: ${SFM_USE_GPU} + volumes: + - ./sfm-worker:/app + ports: + - 5100:5100 + networks: + - backend + command: python3.10 main.py --config=configs/default.txt + deploy: # Use NVIDIA GPUs + resources: + reservations: + devices: + - driver: nvidia + count: all + capabilities: [gpu] + + nerf-worker: + build: + context: ./nerf-worker + dockerfile: Dockerfile + image: nerf-worker-img + container_name: nerf-worker + depends_on: + - rabbitmq + - web-server + volumes: + - ./nerf-worker:/app + environment: + <<: *environment + RABBITMQ_DEFAULT_USER: ${RABBITMQ_DEFAULT_USER} + RABBITMQ_DEFAULT_PASS: ${RABBITMQ_DEFAULT_PASS} + ports: + - 5200:5200 + networks: + - backend + command: python3.8 main.py + deploy: # Use NVIDIA GPUs + resources: + reservations: + devices: + - driver: nvidia + count: all + capabilities: [gpu] + +networks: + backend: + name: backend-network + driver: bridge + +volumes: + mongodb_data_container: \ No newline at end of file diff --git a/docker-compose.yml b/docker-compose.yml deleted file mode 100644 index ebd8d73..0000000 --- a/docker-compose.yml +++ /dev/null @@ -1,103 +0,0 @@ -version: '3.7' -services: - mongodb: - image: mongo:latest - container_name: mongodb - environment: - MONGO_INITDB_ROOT_USERNAME: ${MONGO_INITDB_ROOT_USERNAME} - MONGO_INITDB_ROOT_PASSWORD: ${MONGO_INITDB_ROOT_PASSWORD} - ports: - - 27017:27017 - volumes: - - mongodb_data_container:/data/db - networks: - - backend - - rabbitmq: - container_name: rabbitmq - image: rabbitmq:3.8-management-alpine - environment: - RABBITMQ_DEFAULT_USER: ${RABBITMQ_DEFAULT_USER} - RABBITMQ_DEFAULT_PASS: ${RABBITMQ_DEFAULT_PASS} - ports: - # AMQP protocol port - - '5672:5672' - # HTTP management UI - - '15672:15672' - networks: - - backend - - web-server: - build: - context: . - dockerfile: ./web-server/Dockerfile - image: web-server-img - container_name: web-server - depends_on: - - rabbitmq - - mongodb - environment: - APP_PORT: 5000 - volumes: - # Host directory : Container directory - - ./web-server:/web-server - ports: - - "5000:5000" - networks: - - backend - command: python3 main.py --configip configs/docker_in.json - - sfm-worker: - build: - context: . - dockerfile: ./colmap/Dockerfile - - image: sfm-worker-img - container_name: sfm-worker - depends_on: - - rabbitmq - - web-server - volumes: - - ./colmap:/colmap - - ports: - - 5100:5100 - networks: - - backend - command: python3.10 main.py --config=configs/default.txt - - nerf-worker: - build: - context: .
- dockerfile: ./TensoRF/Dockerfile - - image: nerf-worker-img - container_name: nerf-worker - depends_on: - - rabbitmq - - web-server - volumes: - - ./TensoRF:/TensoRF - - ports: - - 5200:5200 - networks: - - backend - command: python3.10 main.py - - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: all - capabilities: [gpu] - -networks: - backend: - name: backend-network - driver: bridge - -volumes: - mongodb_data_container: - diff --git a/CODE_OF_CONDUCT.md b/docs/CODE_OF_CONDUCT.md similarity index 100% rename from CODE_OF_CONDUCT.md rename to docs/CODE_OF_CONDUCT.md diff --git a/Contributing.md b/docs/Contributing.md similarity index 100% rename from Contributing.md rename to docs/Contributing.md diff --git a/README_diagram.md b/docs/README_diagram.md similarity index 100% rename from README_diagram.md rename to docs/README_diagram.md diff --git a/SSH_SETUP.md b/docs/SSH_SETUP.md similarity index 100% rename from SSH_SETUP.md rename to docs/SSH_SETUP.md diff --git a/empty_data.sh b/empty_data.sh deleted file mode 100644 index 1d294ff..0000000 --- a/empty_data.sh +++ /dev/null @@ -1,12 +0,0 @@ -# Description: This script empties all the local run data, this -# is particularly useful when you want to keep the filesize small -# while debugging or testing the code. - -find ./colmap/data/inputs -type f -delete -find ./colmap/data/outputs -mindepth 1 -delete -find ./TensoRF/data/sfm_data -mindepth 1 -delete -find ./TensoRF/data/nerf_data -mindepth 1 -delete -find ./TensoRF/log -mindepth 1 -delete -find ./web-server/data/nerf -mindepth 1 -delete -find ./web-server/data/raw/videos -mindepth 1 -delete -find ./web-server/data/sfm -mindepth 1 -delete \ No newline at end of file diff --git a/go-web-server b/go-web-server new file mode 160000 index 0000000..c499c9e --- /dev/null +++ b/go-web-server @@ -0,0 +1 @@ +Subproject commit c499c9e8cefee4bc9d77aa0625ff0dc54b0c939c diff --git a/nerf-worker b/nerf-worker new file mode 160000 index 0000000..530b6b6 --- /dev/null +++ b/nerf-worker @@ -0,0 +1 @@ +Subproject commit 530b6b6d3dd661d53708bb07466e652d607a3967 diff --git a/scripts/empty_data.sh b/scripts/empty_data.sh new file mode 100755 index 0000000..5fc1c5b --- /dev/null +++ b/scripts/empty_data.sh @@ -0,0 +1,21 @@ +# Description: This script empties all the local run data. This +# is particularly useful when using volume mapping and you want to keep +# disk impact small while debugging or testing the code. + +# Note: this does not clear the references to this data in MongoDB, +# so you may need to manually clear the database if you want to start +# fresh.
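# For reference, one way to clear those stale references is with mongosh inside the
# mongodb container. This is only an illustrative sketch: the database name below is a
# placeholder, and the "scenes" collection name follows the old web-server docs, so
# adjust both to match whatever your web server actually writes. It also assumes the
# credentials from your .env are exported in your shell.
#
#   docker compose exec mongodb mongosh -u "$MONGO_INITDB_ROOT_USERNAME" \
#     -p "$MONGO_INITDB_ROOT_PASSWORD" --authenticationDatabase admin \
#     --eval 'db.getSiblingDB("<your-db>").scenes.deleteMany({})'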
+ +find ./sfm-worker/data/inputs -type f -delete +find ./sfm-worker/data/outputs -mindepth 1 -delete + +find ./nerf-worker/data/nerf -mindepth 1 -delete +find ./nerf-worker/data/sfm -mindepth 1 -delete + +find ./web-server/data/nerf -mindepth 1 -delete +find ./web-server/data/raw/videos -mindepth 1 -delete +find ./web-server/data/sfm -mindepth 1 -delete + +find ./go-web-server/data/nerf -mindepth 1 -delete +find ./go-web-server/data/raw/videos -mindepth 1 -delete +find ./go-web-server/data/sfm -mindepth 1 -delete diff --git a/sfm-worker b/sfm-worker new file mode 160000 index 0000000..035daae --- /dev/null +++ b/sfm-worker @@ -0,0 +1 @@ +Subproject commit 035daae43065860d99e8e53bd84b55756770d500 diff --git a/web-server/.gitignore b/web-server/.gitignore deleted file mode 100644 index b687e58..0000000 --- a/web-server/.gitignore +++ /dev/null @@ -1,6 +0,0 @@ -__pycache__ -data/raw/videos/* -data/sfm/* -data/nerf/* -Pipfile -*.log \ No newline at end of file diff --git a/web-server/DOCKER-COMPOSE-SETUP.md b/web-server/DOCKER-COMPOSE-SETUP.md deleted file mode 100644 index 36d02cb..0000000 --- a/web-server/DOCKER-COMPOSE-SETUP.md +++ /dev/null @@ -1,44 +0,0 @@ -## Start docker-compose file - -docker-compose up -d - -#### Update container configuration - -If you want to change some container configuration: - -```shell -docker-compose up -d --no-deps --build {service-name} -``` - -#### Connect to container: - -```shell -docker-compose exec {service-name} bash -``` -rabbitmq: -```shell -docker-compose exec rabbitmq3 bash -``` -mongodb: -```shell -docker-compose exec mongodb_container bash -``` - -#### Restart everything: - -```shell -docker-compose restart -``` - -#### Stop everything: - -```shell -docker-compose stop -``` - -#### Remove everything: - -```shell -docker-compose down -v -``` - diff --git a/web-server/Dockerfile b/web-server/Dockerfile deleted file mode 100644 index 1cc5faa..0000000 --- a/web-server/Dockerfile +++ /dev/null @@ -1,13 +0,0 @@ -FROM python:3.10.8-slim - -WORKDIR /web-server - -# Change for deployment to local directory -COPY ./web-server/requirements.txt requirements.txt -RUN pip3 install --upgrade pip -RUN pip3 install -r requirements.txt - -# Overwritten by compose -COPY . . - -CMD ["python3", "main.py"] diff --git a/web-server/LOGLEVELS.md b/web-server/LOGLEVELS.md deleted file mode 100644 index 26201ba..0000000 --- a/web-server/LOGLEVELS.md +++ /dev/null @@ -1,9 +0,0 @@ -# Logging Levels - -| Level | Numeric Value | -| :--- | ---: | -| CRITICAL | 50 | -| ERROR | 40 | -| WARNING | 30 | -| INFO | 20 | -| DEBUG | 10 | diff --git a/web-server/README.md b/web-server/README.md deleted file mode 100644 index 732f7b9..0000000 --- a/web-server/README.md +++ /dev/null @@ -1,142 +0,0 @@ -# Web-Server - -This Flask-based web server coordinates with the workers using RabbitMQ, sending jobs to async workers that process user videos. - -## Description -The webserver is based on a model view controller architecture with an additional services layer on top of the model. Following the MVC design pattern, the controller defines the API and takes user requests from the front end. The controller then utilizes the services and views to generate its response to the user. The model is the Python interface to MongoDB and the View (not implemented yet) contains rendering templates to return to the user.
The services layer is where the majority of the logic is held and has a synchronous portion responsible for publishing work requests to the asynchronous workers and an asynchronous portion responsible for listening to the output queues to receive data from the workers when they are complete. -### Web-server Structure - -![](../pics/Webserver.png) - -### Running the WebServer -``` -python main.py --configip configs/docker_in.json -``` -``` -python main.py --configip configs/docker_out.json -``` -The config file used is based on where you are running the code: - docker_in.json - if you want to run inside a docker container - docker_out.json - if you want to run locally - -### Config Files Structure -``` -{ - webserver: IP1, - mongodb: IP2, - rabbitmq: IP3 -} -``` -The config files are JSON files and are structured almost like a dictionary. -They have keys and values. The keys and values for our purposes are strings: - keys - Identifier for the IP - values - IP - -### Local File structure -When running the web-server, a local file structure should be created to match the following layout. This is where the server will store all images, videos, and non-text based data. -``` -├── data -│ ├── raw -│ │ ├── videos -│ ├── sfm -│ │ ├── JOBID -│ │ ├── imgs -│ ├── nerf -│ │ ├── JOBID - -``` -### DB structure -All the data in MongoDB is in a single collection labeled scenes. A scene represents a single request from the user to render a scene and contains the following data: - -Scene: -``` -{ - "id":str, - "status":int, - "video":