Mac client can stall entire game loop #1547

Open
colincornaby opened this issue Dec 27, 2023 · 1 comment

@colincornaby

This is one more follow-up to the MSAA performance issues that were found while testing the Mac client. The Mac client was enabling 8x MSAA on older hardware that only supports 4x on the Windows side in Plasma.

During that investigation, resource loading turned out to be extremely slow. Resource loading shares the main render thread, so if rendering stalls, resource loading stalls too.

It seems like there are a few issues at play. Metal requires explicit management of display refresh sync and front/back buffers - so there are some extra challenges here. Some of these issues also apply to OpenGL on the Mac or possibly other platforms.

Metal stalls when it runs out of buffers

Metal performs rendering on a secondary thread once all the render commands are encoded. However - if Metal runs out of buffers, the request for the next one will wait up to a second for a buffer to become available.

The timeout does not seem to be configurable. It can only be turned off. But if the timeout is turned off - Metal will wait forever for another framebuffer.

https://developer.apple.com/documentation/quartzcore/cametallayer/2887086-allowsnextdrawabletimeout?language=objc

If the GPU is overwhelmed (e.g. because of 8x MSAA) and cannot free framebuffers fast enough, this wait will cause the CPU to stall.

We might need a new way to do framebuffer swaps that avoids the stall. We can request the next framebuffer on a secondary thread - so we could force only the secondary thread to stall. However, framebuffer requests cannot be cancelled - so this would need to be done carefully. Done badly, this could cause a secondary starvation problem by repeatedly requesting framebuffers that never get used.

Game loop is vsynced

On macOS, in all renderers, the game loop itself is vsynced. This means that if frame deadlines are missed, macOS will begin servicing the game loop less often.

There may need to be a secondary gate that lets the game loop run continuously - but only enters the renderer's draw code when the system delivers a vsync callback. The vsync timer is universal to all renderers on macOS and is not Metal specific.
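
A minimal sketch of what such a gate could look like. The names (`VsyncGate`, `RunOneIteration`) are made up for illustration and do not exist in Plasma; the assumption is that the system's vsync callback can call `SignalVsync()` from whatever thread it runs on.

```cpp
#include <atomic>

class VsyncGate {
public:
    // Assumed to be called from the system's vsync callback, on any thread.
    void SignalVsync() { fPending.store(true, std::memory_order_release); }

    // Called from the game loop. Returns true at most once per pending
    // signal, so several vsyncs that arrive between checks collapse into
    // a single draw.
    bool ConsumeVsync() {
        return fPending.exchange(false, std::memory_order_acq_rel);
    }

private:
    std::atomic<bool> fPending{false};
};

// Hypothetical single pass of a game loop: the update work always runs,
// the draw work only runs when a vsync is pending.
template <typename Update, typename Draw>
void RunOneIteration(VsyncGate& gate, Update&& update, Draw&& draw) {
    update(); // resource loading, networking, etc. are never gated
    if (gate.ConsumeVsync())
        draw();
}
```

With this shape, a slow GPU only reduces how often `draw` runs; the rest of the loop keeps being serviced. A real loop would also need to sleep or yield between iterations so it does not spin a core.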

I've done some initial research into how this is handled in the D3D9 pipeline. It looks like Windows might have similar issues. It looks like on Windows - vsync will stall the render present command until the next frame is ready. By my understanding - this would block the game loop until the next frame is ready. Plasma could be mitigating this - I have not studied the D3D pipeline deeply enough to know whether it is.

@colincornaby

Digging into this more - it's possible that one is the solution for the other.

Apple suggests we feed our performance measurements back into the vsync timer. This will cause the vsync timer to self-limit - and not try to call us at an FPS beyond what we are currently capable of. In turn - this should mitigate the problem of us causing a framebuffer underflow when the GPU is drowning in work. We should only be receiving draw events at a rate that matches how quickly framebuffers can be supplied.
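
To illustrate the kind of measurement that could be fed back, here is a sketch that smooths recent frame times and picks the highest whole-number fraction of the display refresh rate the renderer can currently sustain. This is an assumption about what the feedback might look like, not Apple's recommended algorithm: `FrameRateEstimator`, the smoothing factor, and the 15 Hz floor are all invented for illustration, and how the resulting rate gets handed to the system's vsync timer is deliberately left out.

```cpp
#include <algorithm>

class FrameRateEstimator {
public:
    explicit FrameRateEstimator(double smoothing = 0.1) : fSmoothing(smoothing) {}

    // Record how long the last frame took to render, in seconds.
    // Uses an exponential moving average so one slow frame does not
    // immediately drop the target rate.
    void AddSample(double frameSeconds) {
        if (fAverage == 0.0)
            fAverage = frameSeconds;
        else
            fAverage += fSmoothing * (frameSeconds - fAverage);
    }

    // Returns the highest rate of the form displayHz / n (n = 1, 2, ...)
    // whose frame interval is at least the average render time, never
    // going below minHz.
    double SuggestRate(double displayHz, double minHz = 15.0) const {
        if (fAverage <= 0.0)
            return displayHz; // no measurements yet
        double rate = displayHz;
        for (int divisor = 1; rate > minHz; ++divisor) {
            rate = displayHz / divisor;
            if (1.0 / rate >= fAverage)
                return rate;
        }
        return std::min(minHz, displayHz);
    }

private:
    double fSmoothing;
    double fAverage = 0.0;
};
```

For example, on a 60 Hz display an average frame time of 25 ms would suggest 30 Hz, since a 60 Hz interval (about 16.7 ms) is too short and a 30 Hz interval (about 33.3 ms) is long enough.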

This still runs into the issue of the client not being able to pass vsync events into the renderer though. D3D regulates itself - so Plasma was never designed for platform/renderer agnostic vsync support.
