[Shadow of the Tomb Raider] fossilize eats all RAM until it OOMs #194

Open
philipl opened this issue Jun 20, 2022 · 11 comments

philipl commented Jun 20, 2022

The symptoms here seem to be the same as #84, but that one was fixed and closed, so I was asked to file a new issue.

In the last month or two, I've noticed that fossilize seems happy to consume all system memory without bound. This is particularly evident with Shadow of the Tomb Raider (native Linux version): fossilize consistently gobbles up all the memory and eventually gets OOM killed, after which I have to kick it by turning background processing off and on again. It does eventually complete, because it doesn't start from zero each time, but it makes the system unusable for anything else until it finally finishes. I've been preemptively turning background processing off and on while watching the memory usage to get it to complete a bit faster, but that's no fun.

I suspect that the behaviour here is not unique to SotTR; it's just that this title seems to have enough work to do that it can saturate my system.

System Details:

  • CPU: Core i9-12900K
  • Memory: 64GB (no swap)
  • GPU: Nvidia (driver 515.48.07)
  • OS: Ubuntu 22.04 with 5.18.4 mainline kernel
  • Steam Client: Beta 2022-06-18 00:17:34 1655513879

Thanks.

kakra commented Jun 21, 2022

I wonder if adding 1-2 GB of swap would work around the issue? The Linux kernel memory manager does not really like working completely without swap unless you carefully control the memory resources of your processes (multi-generational LRU memory management should solve this, afaik; Google developers are currently upstreaming such a patch, which is also, or will be, used for Android). Also, if you have /proc/pressure/{io,memory}, fossilize should be able to control its memory usage before it runs the system into an OOM situation. Those files can be enabled through the kernel's PSI feature.
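
In case it's useful to see what that PSI-based throttling could look like, here is a minimal sketch (not Fossilize's actual implementation, and the threshold/backoff values are arbitrary): before doing more work, poll /proc/pressure/memory and sleep while the 10-second "some" average is high.

```python
import time

PSI_MEMORY = "/proc/pressure/memory"   # present when the kernel has PSI enabled

def memory_pressure_avg10():
    """Return the 'some avg10' value from PSI, or None if unavailable."""
    try:
        with open(PSI_MEMORY) as f:
            for line in f:
                if line.startswith("some"):
                    # e.g. "some avg10=1.23 avg60=0.45 avg300=0.10 total=123456"
                    fields = dict(kv.split("=") for kv in line.split()[1:])
                    return float(fields["avg10"])
    except OSError:
        pass
    return None

def throttle_on_memory_pressure(threshold=10.0, backoff_s=5.0):
    """Sleep while memory pressure is above the (arbitrary) threshold."""
    while True:
        avg10 = memory_pressure_avg10()
        if avg10 is None or avg10 < threshold:
            return
        time.sleep(backoff_s)

if __name__ == "__main__":
    throttle_on_memory_pressure()
    print("memory pressure low enough to keep working:", memory_pressure_avg10())
```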

philipl commented Jun 22, 2022

The kernel I've been using has /proc/pressure/{io,memory}, so by itself that was certainly not enough to prevent this behaviour.

philipl commented Jul 21, 2022

I've tested with 2GB of swap and 32GB of swap, and even then it's happy to gobble up all memory and then OOM. The swapfile makes no difference to how fossilize behaves.
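
For reference, a quick sanity check that the swap is actually active (and how much of it is in use) while fossilize runs, reading /proc/meminfo and /proc/swaps:

```python
# Values reported by /proc/meminfo are in kB.
def meminfo():
    with open("/proc/meminfo") as f:
        return {line.split(":")[0]: line.split()[1] for line in f}

info = meminfo()
print("SwapTotal:", info["SwapTotal"], "kB")
print("SwapFree: ", info["SwapFree"], "kB")

# /proc/swaps lists each active swap device/file with its size and usage.
with open("/proc/swaps") as f:
    print(f.read())
```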

kakra commented Jul 21, 2022

What's the resident and virtual size of the fossilize processes when the problem builds up? Is it really fossilize itself, or is the memory rather dominated by page cache and dirty pages?
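
One way to capture those numbers is to read them straight out of /proc. This is a small sketch and assumes the workers show up with "fossilize" in their process name; a State of T would mean a worker has been stopped/throttled:

```python
import os

FIELDS = ("Name", "State", "VmSize", "VmRSS", "VmSwap")

for pid in filter(str.isdigit, os.listdir("/proc")):
    status = {}
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                key, _, value = line.partition(":")
                if key in FIELDS:
                    status[key] = value.strip()
    except OSError:
        continue  # process exited while we were iterating
    if "fossilize" in status.get("Name", ""):
        print(pid, status)
```

If the workers' own VmRSS accounts for the missing memory, that points at fossilize itself; if their RSS stays small and the memory instead shows up as cache/dirty pages in /proc/meminfo, that points at the latter.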

philipl commented Jul 22, 2022

It's basically just fossilize itself. Here is the state of affairs just before I run out of 64GB on my system:

[screenshot: memory usage of the fossilize processes just before the 64GB is exhausted]

philipl commented Jul 22, 2022

FWIW, when all is said and done, my on-disk cache size for SotTR is 1.3GB.

kakra commented Jul 22, 2022

Did you manually increase the fossilize worker count, "to speed things up"? It should run two workers by default and put workers into the T (stopped) state if something runs out of control. None of that seems to be working for you. Or maybe you're running a flatpak version of Steam, which may not be able to access PSI?

philipl commented Jul 22, 2022

This is an Ubuntu system with the standard deb bootstrap package, which then downloads and runs the client. No flatpak cleverness or anything like that. I have not tried to tweak the number of workers - I didn't even know that was possible. It's just doing whatever it does.

philipl commented Aug 3, 2022

Just for fun, I added 128GB of swap and it was still happy to OOM.

kakra commented Aug 22, 2022

Maybe this is related to #196, which reproduces it using the pipeline cache and running fossilize manually?

philipl commented Oct 15, 2022

Latest update: with a 6.0.x kernel, the OOM killer no longer kicks in. I still experience multiple seconds of system unresponsiveness when memory is exhausted, but the fossilize processes do back off in response to pressure and nothing gets killed. I guess the kernel requirements here are exceptionally steep.
