Colab demo? / Headless server version? #147

INF800 · 2022-01-14T15:51:17Z

INF800
Jan 14, 2022

If there isn't a colab demo already, will send a PR. Please let me know if there will be any OOM issues or any technical issues that I many face.

Answered by myagues

Jan 19, 2022

Can confirm it works in Colab (link) (with a T4), only downside is it takes some 5-10 min of compile time given that Colab allocates only 2 CPUs.

Maybe an approach could be copying the compiled folder to the user's GDrive, so it could be reusable in next runs, and avoid recompilation, hoping you get the same GPU in the Colab lottery.

View full answer

Tom94 · 2022-01-14T17:17:25Z

Tom94
Jan 14, 2022
Maintainer

No, we haven't looked into colab at all, actually. It would be great to have, thank you!

The CPU memory usage is relatively tame. GPU memory usage is on the order of the size of the dataset (plus a few GB for temporary training, inference, and render buffers).

Unfortunately, the codebase is not particularly optimized for being memory-economical. We've been spoiled by the 3090's 24GB.

Example GPU RAM usages:

NeRF (bundled fox dataset): 7.56 GB
SDF (bundled armadillo mesh): 1.73 GB

You can read out the GPU RAM usage at the top of the UI

0 replies

pwais · 2022-01-14T19:58:04Z

pwais
Jan 14, 2022

with Colab, you might need to get lucky and get a V100 to get anywhere (might be Colab Pro only?)... the P100s and K80s don't have Tensor cores, and somebody else found you can't seem to build tiny-cuda-nn with Pascal or Maxwell: NVlabs/tiny-cuda-nn#10

Tensor cores were introduced in Volta? So you'd need a V100, Titan V, or RTX 20xx or better to try this project.
Edit: sounds like tiny-cuda-nn might require Turing tensor cores, so no V100 support :( #13

What would be really cool is if tiny-cuda-nn and/or this project could provide a fused ops / network that does not require tensor cores and can work for the older GPU architectures-- it will be slower but would still probably be faster than altneratives (pytorch / tensorflow etc). TensorRT has fused ops for the older architectures and these might provide easy drop-ins (at least, likely for inference).

0 replies

myagues · 2022-01-17T19:40:52Z

myagues
Jan 17, 2022

It should be possible to run on Colab now that lower compute capabilities are allowed, but I'm stuck at compilation with the following error:

[ 98%] Linking CXX executable testbed
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libGL.so: undefined reference to `_glapi_tls_Current'
collect2: error: ld returned 1 exit status
CMakeFiles/testbed.dir/build.make:115: recipe for target 'testbed' failed
make[2]: *** [testbed] Error 1
CMakeFiles/Makefile2:199: recipe for target 'CMakeFiles/testbed.dir/all' failed
make[1]: *** [CMakeFiles/testbed.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[100%] Linking CXX shared library pyngp.cpython-37m-x86_64-linux-gnu.so
[100%] Built target pyngp
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2

Here is a link for reproducing it.

0 replies

Tom94 · 2022-01-17T19:53:51Z

Tom94
Jan 17, 2022
Maintainer

Progress! Thanks for reporting!

Does python3 scripts/run.py --scene data/nerf/fox run?
There's currently a half-working CMake option NGP_BUILD_WITH_GUI=OFF that's supposed to build the project without GUI (and thus without linking to OpenGL and GLFW). I'm saying "half-working", because I haven't removed all references to GL-related symbols yet -- in light of your problem I'll prioritize this tomorrow and report back.

Edit: you can now run cmake -DNGP_BUILD_WITH_GUI=OFF <remaining params> to build instant-ngp without linking GLFW, ImGUI, and OpenGL for headless operation. Hopefully this will work around the linker error you encountered.

0 replies

pwais · 2022-01-17T22:12:43Z

pwais
Jan 17, 2022

FWIW, at least EGL works in Colab, see e.g. the pyrender demo notebook: https://colab.research.google.com/drive/1pcndwqeY8vker3bLKQNJKr3B-7-SYenE?usp=sharing

There's no X11 though. It would be pretty nice to have imgui over websocket for Colab / Jupyter (e.g. via https://github.com/ggerganov/imgui-ws -- see in-browser demos ) but I don't see anybody has tried that yet.

0 replies

Tom94 · 2022-01-18T10:37:25Z

Tom94
Jan 18, 2022
Maintainer

If someone with access to a K80 machine could check whether it runs now, that'd be appreciated. :)

0 replies

pwais · 2022-01-18T22:04:10Z

pwais
Jan 18, 2022

🔥 🔥 🔥 Thanks @Tom94 !! 🔥 🔥 🔥

I had to remove transforms_val.json from the lego scene to avoid an OOM during train, and I made some small mods for test time. I think the testbed just loads all images / rays into GPU memory (even temp memory) for all dataset transform_*.json meta files no matter if they get used or not, so skipping that might help avoid some OOMs.

Overall: The K80 is about 60x slower than a 30-series GPU, but also about 60x cheaper at the time of writing (YMMV but check ebay). I've seen 100x slowdown for pytorch stuff so 60x is pretty good.

(What about K40? Note that K40 seems to be compute 35, while K80 is compute 37. A K80 is basically two K40s on the same card. At the time of writing, an AWS p2.xlarge with a single K80 (two separate devices, 11gb memory each) is ~$0.90/hr or $0.25/hr spot price. In Google Colab free version or Kaggle, you're likely to get a K80 or slightly better).

Other than that one training change, here's what I see for Nerf lego train out of the box on a K80 :

python3 scripts/run.py --scene=data/nerf_synthetic/lego/ --mode=nerf --screenshot_transforms=data/nerf_synthetic/lego/transforms_test.json --screenshot_w=800 --screenshot_h=800 --screenshot_dir=data/nerf_synthetic/lego/screenshots --save_snapshot=data/nerf_synthetic/lego/snapshot.msgpack --n_steps=1000
21:38:39 INFO     Loading NeRF dataset from
21:38:39 INFO       data/nerf_synthetic/lego/transforms_test.json
21:38:39 INFO       data/nerf_synthetic/lego/transforms_train.json
21:38:39 SUCCESS  Loaded 300 images of size 800x800 after 0s
21:38:39 INFO       cam_aabb=[min=[0.5,0.5,0.5], max=[0.5,0.5,0.5]]
21:38:39 INFO     Loading network config from: /opt/instant-ngp/configs/nerf/base.json
21:38:39 INFO     GridEncoding:  Nmin=16 b=1.38191 F=2 T=2^19 L=16
Warning: FullyFusedMLP is not supported for the selected architecture 37. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
Warning: FullyFusedMLP is not supported for the selected architecture 37. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
21:38:39 INFO     Density model: 3--[HashGrid]-->32--[FullyFusedMLP(neurons=64,layers=3)]-->1
21:38:39 INFO     Color model:   3--[SphericalHarmonics]-->16+16--[FullyFusedMLP(neurons=64,layers=4)]-->3
21:38:39 INFO       total_encoding_params=12196240 total_network_params=9728
Screenshot transforms from  data/nerf_synthetic/lego/transforms_test.json
Training:  34%|███████████████████████████████▉                                                               | 336/1000 [01:44<03:45,  2.94step/s, loss=0.00474]

Final train time (as reported) was 05:54 with loss=0.00275.

nvidia-smi during training:

|   1  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   45C    P0    92W / 149W |  11271MiB / 11441MiB |    100%      Default |
|                               |                      |                  N/A |

So the K80 is about 60x slower than a 30-series GPU (6 seconds -> 360 seconds). In my experience, pytorch stuff (high i/o) is a 50x-100x lag, so this is pretty nice! Clearly the implementation helps a ton.

Once the model finishes training, I do get an OOM when rendering tries to start:
RuntimeError: Could not allocate memory: CUDA Error: cudaMalloc(&rawptr, n_bytes+DEBUG_GUARD_SIZE*2) failed with error out of memory

For rendering, I did this:

swapped the above command's save_snapshot for load_snapshot
commented out the train loop in run.py:

instant-ngp/scripts/run.py

Line 167 in 9ad813b

if n_steps > 0:
added a tqdm to monitor rendering progress:

instant-ngp/scripts/run.py

Line 288 in 9ad813b

for idx in args.screenshot_frames:

I see moderate GPU memory usage:

|   1  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   43C    P0    93W / 149W |   3319MiB / 11441MiB |    100%      Default |
|                               |                      |                  N/A |

Rendering is about 11/sec per frame: 5/200 [00:55<36:09, 11.12s/it]

Most importantly, the render looks good, no different for 1000 iters as other GPU:

0 replies

Tom94 · 2022-01-19T06:39:09Z

Tom94
Jan 19, 2022
Maintainer

Awesome, thank you so much for testing!

You don’t actually need to delete transforms_val.json et al. You can directly pass a path to the training transforms to testbed — then it will train from just that one .json file rather than all it finds in the folder.

In the above, I believe you ended up training using also the testing transforms, so there’s more memory to be saved by not loading their respective images.

0 replies

pwais · 2022-01-19T07:51:59Z

pwais
Jan 19, 2022

@Tom94 oh my bad! I have not been able to use the GUI yet so I didn't know --scene would concatenate everything (perhaps that doesn't happen in the GUI?). The README just provides the lego dir. So the command should be:

python3 scripts/run.py --scene=data/nerf_synthetic/lego/transforms_train.json ...

That does save memory, and, erm, results in more correct training too :) 🎉

0 replies

myagues · 2022-01-19T11:58:18Z

myagues
Jan 19, 2022

Can confirm it works in Colab (link) (with a T4), only downside is it takes some 5-10 min of compile time given that Colab allocates only 2 CPUs.

Maybe an approach could be copying the compiled folder to the user's GDrive, so it could be reusable in next runs, and avoid recompilation, hoping you get the same GPU in the Colab lottery.

4 replies

1kaiser Feb 21, 2022

https://colab.research.google.com/drive/10TgQ4gyVejlHiinrmm5XOvQQmgVziK3i

can you update it aa little so that

!mv {compiled files} {some_folder_in_google_drive}

can move those compiled files

and
!mv {compiled_files_in_google_drive} {current_session_file_system}

for drive to running session in their respective folders

useronym Feb 21, 2022

I have a copy here that does what you ask https://colab.research.google.com/drive/1g8mB_RAvyqn73y1EChV-sUQrKLw4uY1d?usp=sharing

It clones the repo into your google drive and does the compilation there

1kaiser Feb 21, 2022

okay 🙏🏻 cool , it keeps the compiled files required in google drive. thats a cool approach >>> doing everything insiide google drive and reconnecting the drive anywhere we like !!!!

1kaiser Oct 30, 2022

nowwe can run this instant ngp called NerfStudio but it still requires to be run inside google drive <<<

pwais · 2022-01-19T20:53:08Z

pwais
Jan 19, 2022

the repo builds & works in docker with nvidia/cuda base image(s) ( #20 ) so it's likely that building a binary in a CUDA 11.2 base image (seems that's what Colab was using there) could work in Colab.

5-10 mins isn't that bad tho, there are many Colab notebooks like Nerfies ( https://colab.research.google.com/github/google/nerfies/blob/main/notebooks/Nerfies_Capture_Processing.ipynb ) that can take 30 mins or more to set up or run.

Huggingface spaces wouldn't offer the notebook environment, but since this project has its own nice GUI, it might be a better match https://huggingface.co/spaces/launch

0 replies

tlightsky · 2022-01-30T03:56:44Z

tlightsky
Jan 30, 2022

It should be possible to run on Colab now that lower compute capabilities are allowed, but I'm stuck at compilation with the following error:

[ 98%] Linking CXX executable testbed
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libGL.so: undefined reference to `_glapi_tls_Current'
collect2: error: ld returned 1 exit status
CMakeFiles/testbed.dir/build.make:115: recipe for target 'testbed' failed
make[2]: *** [testbed] Error 1
CMakeFiles/Makefile2:199: recipe for target 'CMakeFiles/testbed.dir/all' failed
make[1]: *** [CMakeFiles/testbed.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[100%] Linking CXX shared library pyngp.cpython-37m-x86_64-linux-gnu.so
[100%] Built target pyngp
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2

Here is a link for reproducing it.

met exact issue in colab

0 replies

Tom94 · 2022-01-30T07:16:11Z

Tom94
Jan 30, 2022
Maintainer

Hi there, you can avoid this error by compiling testbed without GUI support

cmake -DNGP_BUILD_WITH_GUI=OFF <remaining params>

This way, it won't try to link to OpenGL, which you presumably don't need when running in colab. (You can still render out images as numpy arrays.)

0 replies

loboere · 2022-02-01T22:58:04Z

loboere
Feb 1, 2022

how to see the rendering in colab?

0 replies

pwais · 2022-02-01T23:56:54Z

pwais
Feb 1, 2022

how to see the rendering in colab?

It's gonna be really hard to do that :( There might be a path thru websockets (e.g. https://github.com/ggerganov/imgui-ws ) or perhaps some way of standing up an X server / VNC on colab. The GUI is pretty killer though, it could be worth the hassle.

1 reply

cduguet Feb 17, 2022

how to see the rendering in colab?

It's gonna be really hard to do that :( There might be a path thru websockets (e.g. https://github.com/ggerganov/imgui-ws ) or perhaps some way of standing up an X server / VNC on colab. The GUI is pretty killer though, it could be worth the hassle.

I managed to have it running on AWS with NICE-DCV. Just follow the instructions in their guide and you're pretty much set. You could try vnc or other options as well.

I'm testing Ubuntu 18.04 with an NVIDIA A10, and it's working wonderfully.

Tom94 · 2022-02-02T07:42:40Z

Tom94
Feb 2, 2022
Maintainer

If rendering only a single image (or a handful) is desired, you can call testbed.render(width, height, spp=8, linear=False) to get a numpy array that you can imshow or similar.

spp refers to "samples per pixel" and it'll mostly wind up doing anti-aliasing for you, aside from getting rid of a little bit of raymarching noise. If performance is a concern, you can set it to 1 for fastest rendering.

Note that the returned colors will be sRGB if linear == False, which is likely what you want if you'd like to directly display the image or save it as png/jpg. Use linear colors only if you want to tonemap the image yourself.

0 replies

loboere · 2022-02-05T05:48:07Z

loboere
Feb 5, 2022

If rendering only a single image (or a handful) is desired, you can call testbed.render(width, height, spp=8, linear=False) to get a numpy array that you can imshow or similar.

spp refers to "samples per pixel" and it'll mostly wind up doing anti-aliasing for you, aside from getting rid of a little bit of raymarching noise. If performance is a concern, you can set it to 1 for fastest rendering.

Note that the returned colors will be sRGB if linear == False, which is likely what you want if you'd like to directly display the image or save it as png/jpg. Use linear colors only if you want to tonemap the image yourself.

how run testbed.render(width, height, spp=8, linear=False) ?
i get this error

NameError Traceback (most recent call last)
in ()
----> 1 testbed.render(width, height, spp=8, linear=False)

NameError: name 'testbed' is not defined

0 replies

Tom94 · 2022-02-05T08:36:48Z

Tom94
Feb 5, 2022
Maintainer

You'll have to first instantiate a testbed object and train it (or load a snapshot) before rendering makes sense. I recommend consulting scripts/run.py for an example of how it can be used. Its screenshot functionality uses the .render method.

0 replies

GeorvityLabs · 2022-05-07T22:37:43Z

GeorvityLabs
May 7, 2022

@myagues can you add a feature to stop at say 20k iterations and automatically save tghe .msgpack file?
that way we could train on colab gpu and test it on local system , by loading in the snapshot.

3 replies

myagues May 12, 2022

I think this can be done with:

!./build/testbed --scene data/nerf/fox --n_steps 20000 --save_snapshot fox_20k.msgpack

and use --load_snapshot if you want to resume training:

!./build/testbed --scene data/nerf/fox --n_steps 20000 --load_snapshot fox_20k.msgpack --save_snapshot fox_40k.msgpack

You can find the list of arguments in run.py.

enn72 Aug 5, 2022

I tried this but testbed doesn't seem to have n_steps and save_snapshot flags.

enn72 Aug 6, 2022

I need help on how to change the number of iterations and where to find the output. Also, is it possible to get mesh as the output?

3a1b2c3 · 2022-06-01T23:20:26Z

3a1b2c3
Jun 1, 2022

I can train nicely on colab (Linux machine). Because i want to use the gui locally with the snapshot i downloaded *.msgpack to my windows build but the snapshot doesnt seem compatible.
What graphics card did you pick on NICE Dev? https://aws.amazon.com/marketplace/pp/prodview-rqbx26yyzexuu

Need to go through the code to understand that rendering required the width flag, would be nice in the help.

08:42:31 INFO Loading network config from: d:\pworkspace\seeingSpace\nerfs\candidates\selected__\instant-ngp\data\nerf\fox\chair\chair\chairabb8.msgpack
Traceback (most recent call last):
File "./scripts/run.py", line 117, in
testbed.load_snapshot(args.load_snapshot)
RuntimeError: [json.exception.type_error.302] type must be number, but is null

0 replies

innovinitylabs · 2022-07-22T01:49:47Z

GeorvityLabs · 2022-08-05T19:11:29Z

GeorvityLabs
Aug 5, 2022

what about completely headless, i.e : upload .mp4 to colab , convert to images , get annotations using colmap , get .msgpack file , and then finally render it out in the browser using MobileNeRF?
https://github.com/google-research/jax3d/tree/main/jax3d/projects/mobilenerf

2 replies

3a1b2c3 Aug 6, 2022

Avoids setting up cameras, so no user input needed. Completely different tech stack though?

tlightsky Aug 6, 2022

what about completely headless, i.e : upload .mp4 to colab , convert to images , get annotations using colmap , get .msgpack file , and then finally render it out in the browser using MobileNeRF? https://github.com/google-research/jax3d/tree/main/jax3d/projects/mobilenerf

can mobile nerf use model from instant-ngp?

Colab demo? / Headless server version? #147

Replies: 22 comments · 12 replies

Tom94 Jan 14, 2022 Maintainer

Tom94 Jan 17, 2022 Maintainer

Tom94 Jan 18, 2022 Maintainer

Tom94 Jan 19, 2022 Maintainer

Tom94 Jan 30, 2022 Maintainer

Tom94 Feb 2, 2022 Maintainer

how run testbed.render(width, height, spp=8, linear=False) ? i get this error

Tom94 Feb 5, 2022 Maintainer

Replies: 22 comments 12 replies

Tom94
Jan 14, 2022
Maintainer

Tom94
Jan 17, 2022
Maintainer

Tom94
Jan 18, 2022
Maintainer

Tom94
Jan 19, 2022
Maintainer

Tom94
Jan 30, 2022
Maintainer

Tom94
Feb 2, 2022
Maintainer

how run testbed.render(width, height, spp=8, linear=False) ?
i get this error

Tom94
Feb 5, 2022
Maintainer