Colab demo? / Headless server version? #6
If there isn't a Colab demo already, I will send a PR. Please let me know if there are any OOM issues or other technical issues that I may face.
No, we haven't looked into Colab at all, actually. It would be great to have, thank you! CPU memory usage is relatively tame. GPU memory usage is on the order of the size of the dataset (plus a few GB for temporary training, inference, and render buffers). Unfortunately, the codebase is not particularly optimized for memory economy; we've been spoiled by the 3090's 24 GB. Example GPU RAM usages:
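Since GPU memory is the main constraint, it's worth checking what the allocated card has free before loading a dataset. A minimal check via PyTorch, which comes preinstalled on Colab (mem_get_info needs PyTorch 1.10+):

```python
# Report free/total GPU memory to gauge whether the dataset will fit.
import torch

free, total = torch.cuda.mem_get_info(0)  # bytes
print(f"GPU memory: {free / 2**30:.1f} GiB free of {total / 2**30:.1f} GiB")
```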
With Colab, you might need to get lucky and draw a V100 to get anywhere (might be Colab Pro only?)... the P100s and K80s don't have tensor cores, and somebody else found that tiny-cuda-nn doesn't seem to build on Pascal or Maxwell: NVlabs/tiny-cuda-nn#10. Tensor cores were introduced with Volta, I believe, so you'd need a V100, Titan V, or RTX 20xx or better to try this project. What would be really cool is if tiny-cuda-nn and/or this project could provide a fused ops / network that does not require tensor cores and works on the older GPU architectures; it would be slower, but probably still faster than alternatives (PyTorch / TensorFlow, etc.). TensorRT has fused ops for the older architectures, and these might provide easy drop-ins (at least for inference).
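If you're not sure what the Colab lottery handed you, checking the device's compute capability tells you whether tensor cores are present (7.0, i.e. Volta, and newer). A quick check using PyTorch:

```python
# Check whether the allocated GPU has tensor cores (compute capability >= 7.0).
import torch

major, minor = torch.cuda.get_device_capability(0)
print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")
print("Tensor cores available" if major >= 7 else "No tensor cores (pre-Volta)")
```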
It should be possible to run on Colab now that lower compute capabilities are allowed, but I'm stuck at compilation with the following error:

```
[ 98%] Linking CXX executable testbed
/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libGL.so: undefined reference to `_glapi_tls_Current'
collect2: error: ld returned 1 exit status
CMakeFiles/testbed.dir/build.make:115: recipe for target 'testbed' failed
make[2]: *** [testbed] Error 1
CMakeFiles/Makefile2:199: recipe for target 'CMakeFiles/testbed.dir/all' failed
make[1]: *** [CMakeFiles/testbed.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[100%] Linking CXX shared library pyngp.cpython-37m-x86_64-linux-gnu.so
[100%] Built target pyngp
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2
```

Here is a link for reproducing it.
Progress! Thanks for reporting!
Edit: you can now run
FWIW, at least EGL works in Colab; see e.g. the pyrender demo notebook: https://colab.research.google.com/drive/1pcndwqeY8vker3bLKQNJKr3B-7-SYenE?usp=sharing There's no X11, though. It would be pretty nice to have imgui over WebSocket for Colab / Jupyter (e.g. via https://github.com/ggerganov/imgui-ws, see the in-browser demos), but I don't see that anybody has tried that yet.
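For headless GL on Colab, the usual trick is selecting the EGL platform before any GL context is created. A small sketch with pyrender (the library from the linked notebook); the PYOPENGL_PLATFORM variable is pyrender's documented offscreen mechanism, while the scene contents here are just illustrative:

```python
# Use EGL for offscreen rendering (no X11 needed); set before any GL import.
import os
os.environ["PYOPENGL_PLATFORM"] = "egl"

import numpy as np
import trimesh
import pyrender

scene = pyrender.Scene(ambient_light=np.ones(3))
scene.add(pyrender.Mesh.from_trimesh(trimesh.creation.icosphere()))
pose = np.eye(4)
pose[2, 3] = 3.0  # pull the camera back so the sphere is in view
scene.add(pyrender.PerspectiveCamera(yfov=np.pi / 3.0), pose=pose)

color, depth = pyrender.OffscreenRenderer(640, 480).render(scene)
print(color.shape)  # (480, 640, 3) uint8
```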
If someone with access to a K80 machine could check whether it runs now, that'd be appreciated. :)
🔥 🔥 🔥 Thanks @Tom94 !! 🔥 🔥 🔥

I had to remove `transforms_val.json` et al. to keep training within memory (though see the reply below).

Overall: the K80 is about 60x slower than a 30-series GPU, but also about 60x cheaper at the time of writing (YMMV, but check eBay). I've seen 100x slowdowns for PyTorch stuff, so 60x is pretty good.

(What about the K40? Note that the K40 seems to be compute 3.5, while the K80 is compute 3.7; a K80 is basically two K40s on the same card. At the time of writing, an AWS p2.xlarge with a single K80 (two separate devices, 11 GB memory each) is ~$0.90/hr, or $0.25/hr spot price. In the Google Colab free version or on Kaggle, you're likely to get a K80 or slightly better.)

Other than that one training change, here's what I see for the NeRF lego training out of the box on a K80:
Final train time (as reported) was 05:54. nvidia-smi during training:
So the K80 is about 60x slower than a 30-series GPU (6 seconds -> 360 seconds). In my experience, PyTorch stuff (high I/O) lags by 50x-100x, so this is pretty nice! Clearly the implementation helps a ton. Once the model finishes training, I do get an OOM when rendering tries to start. For rendering, I did this:
I see moderate GPU memory usage:
Rendering takes about 11 seconds per frame. Most importantly, the render looks good, no different at 1000 iterations from the other GPU:
Awesome, thank you so much for testing! You don't actually need to delete transforms_val.json et al. You can directly pass a path to the training transforms to testbed; it will then train from just that one .json file rather than all the ones it finds in the folder. In the above, I believe you ended up training on the testing transforms as well, so there's more memory to be saved by not loading their respective images.
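A sketch of what that can look like through the Python bindings (the pyngp module built above); treat the class and method names here (Testbed, load_training_data, frame) as assumptions based on this thread, and check the project's Python scripts for the authoritative API:

```python
# Hedged sketch: train on ONLY the training transforms so the validation/test
# images are never loaded into GPU memory.
import pyngp as ngp  # the pyngp.cpython-*.so built alongside testbed

testbed = ngp.Testbed(ngp.TestbedMode.Nerf)
# Pass the single .json instead of the whole scene folder:
testbed.load_training_data("data/nerf/lego/transforms_train.json")

testbed.shall_train = True
while testbed.frame():  # one training iteration per call
    if testbed.training_step >= 1000:
        break
```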
@Tom94 oh, my bad! I have not been able to use the GUI yet, so I didn't know. That does save memory, and, erm, results in more correct training too :) 🎉
Can confirm it works in Colab (link) (with a T4). The only downside is that it takes some 5-10 minutes of compile time, given that Colab allocates only 2 CPUs. One approach could be copying the compiled folder to the user's GDrive so it can be reused in later runs to avoid recompilation, hoping you get the same GPU in the Colab lottery; see the sketch below.
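A hedged sketch of that caching idea; the repo path and Drive layout are assumptions, and keying the cache by GPU model means a mismatched allocation simply falls back to a fresh compile:

```python
# Hypothetical sketch: cache the compiled build/ directory in Google Drive,
# keyed by GPU model, so later sessions with the same GPU skip recompilation.
import os
import shutil
import subprocess
from google.colab import drive

drive.mount("/content/drive")
gpu = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"]
).decode().strip().replace(" ", "_")
cache = f"/content/drive/MyDrive/ngp_build_cache/{gpu}"
build = "/content/instant-ngp/build"  # assumed checkout location

if os.path.isdir(cache):
    shutil.copytree(cache, build)  # restore a previous build for this GPU
else:
    # ... compile here (cmake + make), then save the result for next time:
    shutil.copytree(build, cache)
```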
The repo builds & works in Docker, and 5-10 minutes isn't that bad, though; there are many Colab notebooks, like Nerfies (https://colab.research.google.com/github/google/nerfies/blob/main/notebooks/Nerfies_Capture_Processing.ipynb), that can take 30 minutes or more to set up or run. Hugging Face Spaces wouldn't offer the notebook environment, but since this project has its own nice GUI, it might be a better match: https://huggingface.co/spaces/launch
I met the exact same issue in Colab.
Hi there, you can avoid this error by compiling testbed without GUI support:

```
cmake -DNGP_BUILD_WITH_GUI=OFF <remaining params>
```

This way, it won't try to link against OpenGL, which you presumably don't need when running in Colab. (You can still render out images as numpy arrays.)
How can I see the rendering in Colab?
It's gonna be really hard to do that :( There might be a path through WebSockets (e.g. https://github.com/ggerganov/imgui-ws) or perhaps some way of standing up an X server / VNC on Colab. The GUI is pretty killer, though, so it could be worth the hassle.
If rendering only a single image (or a handful) is desired, you can call
Note that the returned colors will be sRGB if |
You'll have to first instantiate a testbed object and train it (or load a snapshot) before rendering makes sense. I recommend consulting
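The inline code references in the comment above were lost in this export. Based on the surrounding thread, a hedged sketch of single-image headless rendering through the Python bindings might look like the following; the method names (load_snapshot, render) and the render signature are assumptions, so defer to the project's Python scripts for the real API:

```python
# Hedged sketch: render a single frame headlessly to a numpy array.
import numpy as np
import pyngp as ngp

testbed = ngp.Testbed(ngp.TestbedMode.Nerf)
testbed.load_snapshot("lego.msgpack")  # or train first, as sketched earlier

# Assumed signature: width, height, samples per pixel, linear output.
image = testbed.render(1920, 1080, 8, True)  # float HxWx4 numpy array

# With linear output, apply a rough gamma for viewing/saving as sRGB.
srgb = np.clip(image[..., :3], 0.0, 1.0) ** (1.0 / 2.2)
```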
This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →