Increased latency in Fullscreen Display Mode after v6.0 #1304

Closed
serdargitthub opened this issue Jun 6, 2024 · 14 comments
@serdargitthub

serdargitthub commented Jun 6, 2024

Fullscreen Display Mode acts as if I am in Borderless Windowed Mode.

Windowed Mode causes my decoding time and general latency to increase.
The previous version (5.0.1) does not have this issue.

Affected games
Everything is affected

Moonlight settings

Have any settings been adjusted from defaults?
YES. Same modified settings from previous vers.

Client PC details

OS: Windows 11 23H2
Moonlight Version: 6.0
GPU: Intel UHD Graphics 630

Server PC details

OS: Windows 11 23H2
Sunshine or GeForce Experience version: Sunshine v0.23.1
GPU: AMD Radeon RX 6800 XT
GPU driver: 24.5.1

@cgutman
Member

cgutman commented Jun 6, 2024

The similarity between fullscreen and borderless windowed is intentional as a result of switching from D3D9 to D3D11, which uses newer flip model swapchains. Prior versions of Moonlight already used D3D11 for borderless windowed, windowed, and full-screen exclusive with V-Sync disabled, but v6.0.0 is the first version to also use it for full-screen exclusive mode with V-Sync enabled.

The latency regression between D3D9 and D3D11 is not expected though and doesn't occur on any of my test hardware in any test scenarios I have tried. I will try to reproduce the issue here if you can provide some more information about your setup.

For me to debug this issue, please provide:

  • Moonlight log file of an affected stream from your %TEMP% folder on the client (it will be called Moonlight-<some numbers>.log)
  • CPU model number in your client
  • Resolution, frame rate, and bitrate selected in Moonlight
  • Whether HDR is enabled or not in Moonlight
  • The version of your GPU driver (this can be found on the Performance tab of Task Manager if you click on your GPU, or via Device Manager)
  • Does disabling V-sync in Moonlight make a difference?
  • Does your client PC have a discrete GPU also, or just an integrated GPU?
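
For anyone unsure which file to attach, the newest log can be located with a short script. This is a minimal sketch and not part of Moonlight: the function name `newest_moonlight_log` is hypothetical, and it relies only on the `Moonlight-<some numbers>.log` naming and the %TEMP% location described above.

```python
import tempfile
from pathlib import Path
from typing import Optional


def newest_moonlight_log(folder: Optional[str] = None) -> Optional[Path]:
    """Return the most recently modified Moonlight-*.log in folder, or None."""
    # On Windows, tempfile.gettempdir() normally resolves to the %TEMP% folder.
    base = Path(folder) if folder else Path(tempfile.gettempdir())
    logs = list(base.glob("Moonlight-*.log"))
    if not logs:
        return None
    # Pick the file with the latest modification time.
    return max(logs, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(newest_moonlight_log())
```

Run it right after an affected streaming session so the newest file is the one that covers the problem.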

@serdargitthub
Author

serdargitthub commented Jun 7, 2024

Hello, thank you for the reply!

When using Moonlight version 5.0.1, I get high decoding times (50-100 ms) in all display modes except Fullscreen.
With Fullscreen selected, everything works normally (even at 100 FPS).

In Moonlight version 6, I get high decoding times in all display modes, including Fullscreen.
GPU utilization exceeds 80-90%.
If I reduce the FPS setting to 60, the decoding time drops to normal values.

Do you think the issue is poor performance of the client GPU? Thank you!

CPU model number in your client

Computer Model : HP ProDesk 400 G5 mini (Intel Core i5-9500T)

Resolution, frame rate, and bitrate selected in Moonlight

3440x1440, 80 FPS and 100 FPS (tried both), Intel UHD Graphics 630, 60 Mbps and 100 Mbps (tried both)

Whether HDR is enabled or not in Moonlight

HDR is disabled

The version of your GPU driver

Intel UHD Graphics 630
31.0.101.2115

Does disabling V-sync in Moonlight make a difference?

No

Does your client PC have a discrete GPU also, or just an integrated GPU?

Integrated GPU . Intel UHD Graphics 630

@cgutman cgutman changed the title from "Fullscreen Display Mode not working as intended after v6.0" to "Increased latency in Fullscreen Display Mode after v6.0" Jun 16, 2024
@cgutman cgutman pinned this issue Jun 16, 2024
@makedir

makedir commented Jun 16, 2024

"which uses newer flip model swapchains." Wouldn't that mean it would trigger this old bug with Intel GPUs?

https://issues.chromium.org/issues/40140837

Decode swap chains are broken in the driver on most Intel GPUs, so Chrome implemented a workaround that disables decode swap chains for video on most Intel GPUs, I think all up to UHD 630.

Plus, there is a bug with DX11 decode where videos are blurry when not maximized. You can see this in Chrome, for example. I also reported it here:

https://bugs.chromium.org/p/chromium/issues/detail?id=1483750

I am sure the same now applies to Moonlight. It was a bad idea to switch to DX11. Maybe you can keep DX9 and add a setting to choose which one to use.

@serdargitthub
Author

The issue is increased decoding times due to high GPU usage after 6.0, which affects every aspect of streaming :)
The title does not reflect the real issue.

@cgutman
Member

cgutman commented Jun 17, 2024

Yeah, technically it's additional GPU usage, which I believe is because we're using DXVA video processing for YUV to RGB conversion and scaling on DX9, but we're using a shader for that with DX11. I'm going to write a code path that uses DXVA video processing on DX11 and see if that helps bring the GPU load back down. I think these weaker iGPUs have dedicated hardware for this that DXVA VP can use to deliver higher performance than running generic pixel shaders like we do now.

The issue title doesn't capture all this detail but I think that's fine. The current title should help other users find it which is the biggest priority at this point.
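
For readers unfamiliar with the YUV to RGB step mentioned above, the per-pixel arithmetic can be sketched as follows. This is an illustration only, not Moonlight's shader or the DXVA code path: it assumes 8-bit BT.709 limited-range video, and the function name `yuv_to_rgb_bt709_limited` is hypothetical. A pixel shader or a fixed-function video processor has to perform work of this kind for every pixel of every frame, in addition to scaling.

```python
from typing import Tuple


def yuv_to_rgb_bt709_limited(y: int, cb: int, cr: int) -> Tuple[int, int, int]:
    """Convert one 8-bit limited-range BT.709 YCbCr pixel to 8-bit RGB."""
    # Remove the limited-range offsets (luma starts at 16, chroma is centered on 128).
    c = y - 16
    d = cb - 128
    e = cr - 128

    # Standard BT.709 limited-range coefficients (assumed colorspace).
    r = 1.164 * c + 1.793 * e
    g = 1.164 * c - 0.213 * d - 0.533 * e
    b = 1.164 * c + 2.112 * d

    def clamp(v: float) -> int:
        # Round to the nearest integer and keep the result within 0..255.
        return max(0, min(255, round(v)))

    return clamp(r), clamp(g), clamp(b)


if __name__ == "__main__":
    print(yuv_to_rgb_bt709_limited(16, 128, 128))   # black
    print(yuv_to_rgb_bt709_limited(235, 128, 128))  # white
```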

@SkipXS

SkipXS commented Jun 17, 2024

I'm noticing the same issue. I bought an Intel N100 mini PC with Windows 11 to stream to my living room.
I'm using a 4K 60 FPS AV1 stream. Trying out the display modes, I get really high decoding times in all modes except windowed mode with the window resized to be smaller. Then I get sub-1 ms decoding times.

It seems upscaling has the issue while downscaling works fine.

I also tried Fedora Linux and got a perfect 4K60 stream with < 1 ms decoding, although I ran into audio dropouts, but that's another topic.

Moonlight-1718615781.log

@SkipXS

SkipXS commented Jun 17, 2024

Tried again with Moonlight 5.0 and the Intel Arc overlay. It seems that no matter what I do on Windows, I get 99% GPU utilization and bad performance in fullscreen mode with Moonlight 5, Moonlight 6, and even Steam Link.

Not sure if this may be an Intel Windows issue. Using Fedora the 4k60 stream is fine despite my audio issues.

@cgutman cgutman added this to the v6.0.1 milestone Jun 19, 2024
@jimlwk

jimlwk commented Jun 22, 2024

I also encountered similar latency issues yesterday using v6.0.0. My settings were 4K 60 FPS. After reducing the resolution to 1080p or 720p, the latency did not occur. Nonetheless, I would like to play in 4K, so I reverted to the previous version and all is perfect again.

CPU model number in your client
N100

Resolution, frame rate, and bitrate selected in Moonlight
4k 60fps 20-150 mbps shows latency

Whether HDR is enabled or not in Moonlight
No

The version of your GPU driver (this can be found on the Performance tab of Task Manager if you click on your GPU, or via Device Manager)
Intel® UHD Graphics

Does disabling V-sync in Moonlight make a difference?
No

Does your client PC have a discrete GPU also, or just an integrated GPU?
Intel® UHD Graphics

@cgutman
Member

cgutman commented Jun 22, 2024

Okay, this turned out to be quite an interesting bug. I profiled Moonlight on an affected Intel Celeron J4125 using Intel's Graphics Performance Analyzer. What I found was that a fix I made years ago (a6fccf9) to the D3D11 renderer, way back in Moonlight v4.3.0, turned out to have a massive performance hit on these low-end Intel GPUs. This bug was latent in the code for years because full-screen mode (the default) used D3D9 as long as HDR wasn't used and a dGPU wasn't present (very uncommon for these low-end CPUs).

Users like @serdargitthub had unknowingly encountered the same bug on v5.0.1 but it went unreported since using full-screen was fine. When I finally flipped the switch in Moonlight v6.0.0 to use the D3D11 renderer everywhere, these machines which had dodged the bug by pure luck now hit the slow code path.

To fix the performance issue, I replaced the old code with newer code that handles the condition without requiring each frame to be copied to another buffer. The new fix (94943d2) not only fixes the fullscreen regression in v6.0.0, but also fixes the performance issues with windowed, borderless windowed, and HDR on these Intel GPUs too.

Please try the fixed build and let me know how it goes:
https://ci.appveyor.com/project/cgutman/moonlight-qt/builds/50073991/job/31fylmre0mjdqnvf/artifacts
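
To give a rough sense of why a per-frame copy like the one described above can hurt on low-end iGPUs, here is a back-of-envelope estimate. It is a sketch under stated assumptions, not a measurement of Moonlight: it assumes 8-bit 4:2:0 frames (such as NV12, 1.5 bytes per pixel) and a copy of the whole frame, and the function name `frame_copy_bytes_per_second` is hypothetical.

```python
def frame_copy_bytes_per_second(width: int, height: int, fps: int,
                                bytes_per_pixel: float = 1.5) -> int:
    """Bytes written per second when every decoded frame is copied once.

    bytes_per_pixel=1.5 assumes an 8-bit 4:2:0 format such as NV12.
    """
    return int(width * height * bytes_per_pixel * fps)


if __name__ == "__main__":
    # Stream settings reported in this thread.
    for name, (w, h, fps) in {
        "4K at 60 FPS": (3840, 2160, 60),
        "3440x1440 at 100 FPS": (3440, 1440, 100),
    }.items():
        mb_per_s = frame_copy_bytes_per_second(w, h, fps) / 1e6
        print(f"{name}: about {mb_per_s:.0f} MB/s copied")
```

Because a copy both reads and writes each frame, the memory traffic is roughly twice the figure printed, and an integrated GPU shares that memory bandwidth with the CPU.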

@serdargitthub
Author

@cgutman, I can't download the build. It says the download limit (1024 MB/day) has been exceeded.

@cgutman
Member

cgutman commented Jun 23, 2024

@serdargitthub I reached out to AppVeyor and they raised our download limit for us, so you should be able to download it now :)

@makedir

makedir commented Jun 23, 2024

@cgutman I tested this build. I don't see any difference in latency on my XPS 15 laptop (8750H, Intel UHD 630) at 1080p with H.265; both builds are equally good,

but it fixes this issue I reported before for my Intel N100 PCs with AV1:

These are all borderless windowed mode:

#1296

Nightly build: latency still sometimes jumps to 2 ms on AV1 when the image does not change, but it is not as bad as on 6.0.0

[Screenshot: Moonlight_2024-06-23_04-36-59]

Decoding time jumps up to 90 ms on 6.0.0 with N100 and AV1 when the image does not change, for example on the Diablo 4 map:

[Screenshot: Moonlight_2024-06-23_04-37-34]

It also lowers iGPU usage by ~40-50%:

[Screenshot: iGPU usage]

@serdargitthub
Author

@cgutman,
Fortunately, the new fix solved my poor decoding performance with 6.0. I am now able to stream in any display mode without any issue. I can now also use the V-Sync and HDR options, which gave me performance issues on all previous versions.

However, I had to close and reopen the fixed build 3-4 times for the changes to take effect, FYI.

thank you for spending time on this issue

@siw1973

siw1973 commented Jun 24, 2024

The portable hotfix seems a bit better than the main 6.0.0 release, but it still seems to have more lag than pre-6.0 versions.

That said, I am using it on a fairly slow Intel Celeron N4020 with 4 GB RAM and Intel UHD 600.

@cgutman cgutman closed this as completed Jun 29, 2024
@cgutman cgutman unpinned this issue Aug 16, 2024