
GPU memory not freed when closing recording, but keeping viewer open #6755

Closed
roym899 opened this issue Jul 4, 2024 · 3 comments · Fixed by #7531
Labels
🪳 bug Something isn't working 📉 performance Optimization, memory use, etc 🔺 re_renderer affects re_renderer itself

Comments

roym899 (Collaborator) commented Jul 4, 2024

Describe the bug
It seems like GPU memory is not freed for meshes when closing the recording (only noticed/tested for meshes with a large number of vertices; not sure how generally this applies).

As a workaround I have to close the viewer occasionally.

To Reproduce
Run the following code a few times while monitoring GPU memory usage. Memory keeps growing and closing the recordings does not free it.

import numpy as np
import rerun as rr

rr.init("Large Mesh", spawn=True)
# ~9 million random vertices (~216 MB as float64 on the CPU side).
vertices = np.random.rand(3 * 3000000, 3)
rr.log("mesh", rr.Mesh3D(vertex_positions=vertices))

Expected behavior
Memory should be freed when closing a recording in the viewer.

Screenshots

[video attachment: vram.mp4]

[screenshot: memory panel]
Note that "Counted GPU" only goes up, never down.

Backtrace

Desktop (please complete the following information):

  • OS: tested on Ubuntu 22.04 and Arch Linux

Rerun version

rerun-cli 0.17.0-alpha.8 [rustc 1.76.0 (07dca489a 2024-02-04), LLVM 17.0.6] x86_64-unknown-linux-gnu 6649/merge a6a666f, built 2024-06-25T19:47:07Z

Also tested with 0.16.1.

@roym899 roym899 added 🪳 bug Something isn't working 👀 needs triage This issue needs to be triaged by the Rerun team labels Jul 4, 2024
@Wumpf Wumpf added 🔺 re_renderer affects re_renderer itself and removed 👀 needs triage This issue needs to be triaged by the Rerun team labels Jul 4, 2024
Wumpf (Member) commented Jul 4, 2024

Likely at least partially related to

we're also not good at freeing up anything in that staging belt.

However, one would expect at least the meshes to disappear from the MeshCache; that's apparently not happening!

@emilk emilk added the 📉 performance Optimization, memory use, etc label Jul 4, 2024
@Wumpf Wumpf added this to the 0.18 - Chunks milestone Jul 8, 2024
@Wumpf Wumpf mentioned this issue Sep 25, 2024
teh-cmc (Member) commented Sep 26, 2024

Why did #7513 not help with this at all?

@Wumpf Wumpf self-assigned this Sep 26, 2024
@Wumpf Wumpf mentioned this issue Sep 27, 2024
Wumpf (Member) commented Sep 27, 2024

I thought #7531 would fix this issue; however, it turns out it is already fixed. The ratcheting behavior of the CpuWriteGpuRead belt is still significant, though.
That said: for large/many meshes this now works out fine on main

[video attachment: memory.free.mp4]
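
The "ratcheting" mentioned above, roughly: the staging-belt pool allocates buffer chunks on demand and recycles them across frames, but never returns them, so its footprint stays at the high-water mark. A loose Python analogy of that behavior (an illustration only, not the actual re_renderer code):

class StagingBeltAnalogy:
    """Chunks are reused across frames but never freed: footprint == peak usage."""

    def __init__(self, chunk_size: int) -> None:
        self.chunk_size = chunk_size
        self.free_chunks = []  # recycled chunks, ready for reuse
        self.in_use = []       # chunks handed out this frame

    def allocate(self) -> bytearray:
        chunk = self.free_chunks.pop() if self.free_chunks else bytearray(self.chunk_size)
        self.in_use.append(chunk)
        return chunk

    def end_frame(self) -> None:
        # Everything goes back to the free list; nothing is ever released.
        self.free_chunks += self.in_use
        self.in_use = []

    def footprint_bytes(self) -> int:
        return (len(self.free_chunks) + len(self.in_use)) * self.chunk_size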

@Wumpf Wumpf closed this as completed Sep 27, 2024
@Wumpf Wumpf removed their assignment Sep 27, 2024
teh-cmc added a commit that referenced this issue Sep 27, 2024
### What

* ~Fixes #6755~
   * upon a counter-check with `main`, this turned out to be already fixed

Removes the re_renderer `MeshManager` in its entirety.

The vision here was at some point to provide a manager that deals with both
short-lived and long-lived resources identified by a content hash/identifier,
similar to the texture manager. However, as things shifted around this didn't
end up being useful: _all_ meshes ever supplied were marked as long-lived and
held alive by a ref counter. We ended up just holding on to the ref count in a
central place rather than tracking resource creation & destruction properly.
The overall architecture of GPU resource handling leaned more and more towards
shared ownership via caches, making the `MeshManager` an unnecessary layer.
The drawback of this is that we have a lot more `Arc`s, which might bite us at
some point (overhead + hard to track who owns what), but I don't see that
happening any time soon (I'm sure this is gonna get quoted a year from now
when it stops working 😁).
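
For readers following along, a loose Python analogy of the "shared ownership via caches" idea described above (an illustration only, with made-up names, not the actual Rust implementation): the cache keeps only weak references, so a GPU mesh is freed as soon as the last holder of a handle, e.g. a closed recording, drops it.

import weakref


class GpuMesh:
    def __init__(self, name: str, vertex_count: int) -> None:
        self.name = name
        self.vertex_count = vertex_count
        print(f"uploaded {name} ({vertex_count} vertices) to the GPU")

    def __del__(self) -> None:
        print(f"freed GPU memory of {self.name}")


class MeshCache:
    def __init__(self) -> None:
        # Weak references only: the cache never keeps a mesh alive by itself.
        self._entries = weakref.WeakValueDictionary()

    def get_or_create(self, name: str, vertex_count: int) -> GpuMesh:
        mesh = self._entries.get(name)
        if mesh is None:
            mesh = GpuMesh(name, vertex_count)
            self._entries[name] = mesh
        return mesh


cache = MeshCache()
handle = cache.get_or_create("large_mesh", 9_000_000)  # a recording holds this handle
del handle  # closing the recording drops the last reference -> the mesh is freed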

---------

Co-authored-by: Clement Rey <cr.rey.clement@gmail.com>