Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve latency, fix rectangular "tearing", better performance metrics with frame pacing #968

Merged
merged 16 commits into from
Jun 12, 2023

Conversation

yshui
Copy link
Owner

@yshui yshui commented Dec 13, 2022

Latency

Rendering a frame too early can cause long latency. Generally we want to render a frame as late as possible, as long as it finishes in time before the vblank period.

Tearing

Sometimes user will see rectangular shaped tearing when use-damage is enabled. This is because we started rendering early, before all the damages for that frame has arrived. Resulting in only part of an updated window being presented on to the screen.

Metrics

Previously, we never measured how long each frame takes to render. We need this information to pace the frames properly, i.e. to achieve the things listed above. So as a nice side effect, we now know exactly how inefficient we actually are 🥲

Known problem

  • Need to detect dpms turning screen off, which would throw off our refresh rate estimations

@yshui yshui changed the title Improve latency, fix blocky looking "tearing", understand performance better with frame pacing Improve latency, fix rectangular "tearing", understand performance better with frame pacing Dec 13, 2022
@yshui yshui changed the title Improve latency, fix rectangular "tearing", understand performance better with frame pacing Improve latency, fix rectangular "tearing", better performance metrics with frame pacing Dec 13, 2022
@yshui
Copy link
Owner Author

yshui commented Dec 13, 2022

Render times don't look good. It is ~5ms/frame here. Need to investigate what's so slow.

@yshui
Copy link
Owner Author

yshui commented Dec 14, 2022

I wonder how well this works for NVIDIA users. I would be surprised if this works for them at all 🤔

@absolutelynothelix
Copy link
Collaborator

I wonder how well this works for NVIDIA users. I would be surprised if this works for them at all 🤔

if there is no time limit for this pull request to be merged i can test it in maybe 5 days on an average nvidia gpu (gtx 1650). unfortunately, i’m ill now and my pc is not with me.

also, is it possible to test/debug picom remotely? i think i could give you remote access to my pc from time to time via rdp or something like this. if you’re interested you can contact me in telegram with the same login.

@yshui
Copy link
Owner Author

yshui commented Dec 14, 2022

@absolutelynothelix no need to hurry! i still need to tweak this PR. i would be interested to see your results.

@yshui
Copy link
Owner Author

yshui commented Dec 14, 2022

for me, if i can get the render time down to reasonable levels (transparent-clipping helps, because there's a lot less to draw), the latency feels pretty low, it's actually quite nice.

@yshui
Copy link
Owner Author

yshui commented Dec 14, 2022

With dual_kawase blur strength 20, the render time goes up to ~11ms 🥲

@tryone144
Copy link
Collaborator

I wonder how well this works for NVIDIA users. I would be surprised if this works for them at all 🤔

At least on my GTX 1070 with driver 515.43.04 it compiles and draws fine. 🤷 Just a very preliminary test though.

Logs
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Found GLX extension GLX_SGI_video_sync.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Found GLX extension GLX_SGI_swap_control.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Missing GLX extension GLX_OML_sync_control.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Missing GLX extension GLX_MESA_swap_control.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Found GLX extension GLX_EXT_swap_control.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Found GLX extension GLX_EXT_texture_from_pixmap.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Found GLX extension GLX_ARB_create_context.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Found GLX extension GLX_EXT_buffer_age.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Found GLX extension GLX_ARB_create_context_robustness.
[ 14.12.2022 22:02:43.318 glx_has_extension INFO ] Missing GLX extension GLX_MESA_query_renderer.
[ 14.12.2022 22:02:43.755 glGetUniformLocationChecked INFO ] Failed to get location of uniform 'time'. This is normal when using custom shaders.
[ 14.12.2022 22:02:43.757 gl_init INFO ] Using back buffer format 0x8051
[ 14.12.2022 22:02:43.757 gl_has_extension INFO ] Missing GL extension GL_GREMEDY_string_marker.
[ 14.12.2022 22:02:43.757 gl_init INFO ] GL vendor is NVIDIA, enable xrender sync fence.
[ 14.12.2022 22:02:43.757 gl_has_extension INFO ] Missing GL extension GL_EXT_EGL_image_storage.
[ 14.12.2022 22:02:43.757 gl_create_kernel_blur_context INFO ] Uniform locations: 3 2 1 0 4
[ 14.12.2022 22:02:43.757 gl_create_kernel_blur_context INFO ] Uniform locations: 3 2 1 0 4

followed by the occasional A frame is still being rendered, not queueing redraw. warning.

@yshui
Copy link
Owner Author

yshui commented Dec 15, 2022

@tryone144 nice! i was worried that NVIDIA's present extension implementation might use a different time base from CLOCK_MONOTONIC.

@codecov
Copy link

codecov bot commented Dec 15, 2022

Codecov Report

Merging #968 (d0c121e) into next (3aed559) will decrease coverage by 0.12%.
The diff coverage is 35.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##             next     #968      +/-   ##
==========================================
- Coverage   37.76%   37.64%   -0.12%     
==========================================
  Files          48       49       +1     
  Lines       10869    11137     +268     
==========================================
+ Hits         4105     4193      +88     
- Misses       6764     6944     +180     
Impacted Files Coverage Δ
src/backend/driver.c 34.14% <ø> (+0.81%) ⬆️
src/backend/gl/egl.c 26.38% <ø> (+0.24%) ⬆️
src/backend/gl/gl_common.h 30.23% <ø> (ø)
src/backend/gl/glx.c 38.74% <ø> (+0.25%) ⬆️
src/common.h 70.37% <ø> (-5.63%) ⬇️
src/config.c 51.26% <ø> (ø)
src/config.h 23.52% <ø> (ø)
src/options.c 19.15% <0.00%> (-0.05%) ⬇️
src/render.c 1.30% <0.00%> (ø)
src/utils.h 29.41% <0.00%> (-18.98%) ⬇️
... and 6 more

... and 1 file with indirect coverage changes

@yshui yshui force-pushed the pacing branch 5 times, most recently from fadea0f to 572b17e Compare December 15, 2022 19:04
@yshui yshui marked this pull request as ready for review December 16, 2022 09:13
@yshui yshui force-pushed the pacing branch 2 times, most recently from e9064a2 to dfa9dbe Compare December 16, 2022 20:29
@yshui
Copy link
Owner Author

yshui commented Dec 16, 2022

hmph, there is a raging memory leak somewhere i need to fix.

@yshui yshui force-pushed the pacing branch 3 times, most recently from 640addb to d72f835 Compare December 17, 2022 19:38
yshui added 5 commits June 8, 2023 20:35
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Timer can sometimes drift randomly, like when the system is suspended.
we don't want to disable frame pacing for that

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Sometimes X sends bogus complete notifies with invalid msc number.
Instead use a saved msc number to get the next complete notify.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
So when the screen is off, we calls queue_redraw, hoping draw_callback
will be called and unredirects the screen. However, queue_redraw doesn't
queue another redraw when one is already queued. Redraws can be queued
in two ways: one is timer based, which is fine, because it will be
triggered no matter what; the other is frame based, which is triggered
by Present events.

When the screen is off, X server, depends on the driver, could send
abnormal Present events, which we ignore. But that also means queued
frame based redraw will stop being triggered.

Those two factors combined means sometimes unredirection does not happen
when the screen is off. Which means we aren't going to free the GL
context, which are still receiving Present events, but can't handle
them, because we are not rendering anything with GL. In the end all
these causes memory usage to balloon, until the screen is turned on and
we start rendering again.

And all these is not caught by leak checkers because this technically is
not a leak, as everything is eventually freed.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
We only checked render time. If we don't have frame time estimates, we
would divide by zero and end up with wild scheduling delays.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
@yshui
Copy link
Owner Author

yshui commented Jun 8, 2023

BTW, I plotted some render times:

Figure_1

Looks multimodal to me. A running maximum would force us to be too pessimistic (those tiny bars at the right end). But I don't know what the best strategy is. Current thought is maybe use a 95 percentile.

@yshui yshui force-pushed the pacing branch 4 times, most recently from d72e80f to 3872813 Compare June 10, 2023 13:31
@yshui
Copy link
Owner Author

yshui commented Jun 10, 2023

@absolutelynothelix hey, i made some effort to make things smoother. can you try to see if it's better for you?

Add a render_statistics type to encapsulate all the statistics done on
rendering times. And use that to estimate the time budget for rendering
and frame pacing.

Tweak the rolling window utilities a bit so we can reuse one rolling
window for multiple statistics.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Sometimes a scheduled render can end up doing nothing, e.g. if the
damage region is empty. In that case we don't have valid data to
collect and thus shouldn't update the statistics.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Make picom realtime to reduce latency, and make rendering times more
predictable to help pacing.

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
@absolutelynothelix
Copy link
Collaborator

@yshui, i'd say it feels pretty much same. i can record a new video if you wish.

@yshui
Copy link
Owner Author

yshui commented Jun 10, 2023

@absolutelynothelix the logs would be more useful.

@absolutelynothelix
Copy link
Collaborator

absolutelynothelix commented Jun 10, 2023

the logs would be more useful.

@yshui, are you interested in particular messages or in all messages introduced with this pull request?

@yshui
Copy link
Owner Author

yshui commented Jun 10, 2023

@absolutelynothelix just throw the whole trace logs here, i know how to use grep ;)

@absolutelynothelix
Copy link
Collaborator

i was moving a small terminal window from where i launched picom in front of a big terminal window with nonsense activity. i stopped picom some time after window movement started to lag, so all the interesting information should be somewhere in the end.

picom.log

p.s.:

@yshui
Copy link
Owner Author

yshui commented Jun 10, 2023

@absolutelynothelix yeah the frames just took too long to render, that's a different problem we need to resolve aside from this PR

[ 06/10/2023 19:19:45.794 schedule_render TRACE ] Delay: 0.011906 s, last_msc: 23556707859, render_budget: 16658, frame_time: 16654, now_us: 23556712603, next_msc: 23556741167, target_msc: 1413461, divisor: 2

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
@yshui
Copy link
Owner Author

yshui commented Jun 11, 2023

@EpsilonKu I joined! My discord username is yshui#7811, can you make me a moderator? I will also add a link to it in README.

@yshui yshui merged commit 608423a into next Jun 12, 2023
@yshui yshui deleted the pacing branch June 12, 2023 10:46
@yshui
Copy link
Owner Author

yshui commented Jun 17, 2023

@tryone144 do you have a discord account by any chance?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants