Picom spams sgi_video_sync_scheduler_callback with latest nvidia driver #1265

ShadiestGoat · 2024-05-21T21:14:26Z

Platform

Arch (Linux 6.9.1-arch1-1)

GPU, drivers, and screen setup

NVIDIA GeForce GTX 1650 Ti Mobile
Single Monitor
Laptop (w/ optimus-manager)
Also integrated GPU: AMD Radeon RX Vega 6 (Ryzen 4000/5000 Mobile Series)
nvidia-dkms 550.78-1
xf86-video-amdgpu v23.0.0-2
mesa 1:24.0.7-3

glxinfo -B

name of display: :0
display: :0  screen: 0
direct rendering: Yes
Memory info (GL_NVX_gpu_memory_info):
    Dedicated video memory: 4096 MB
    Total available memory: 4096 MB
    Currently available dedicated video memory: 3521 MB
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: NVIDIA GeForce GTX 1650 Ti/PCIe/SSE2
OpenGL core profile version string: 4.6.0 NVIDIA 550.78
OpenGL core profile shading language version string: 4.60 NVIDIA
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.6.0 NVIDIA 550.78
OpenGL shading language version string: 4.60 NVIDIA
OpenGL context flags: (none)
OpenGL profile mask: (none)

OpenGL ES profile version string: OpenGL ES 3.2 NVIDIA 550.78
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20

Environment

Bspwm

picom version

vgit-9a839

Diagnostics

Version: vgit-9a839

Extensions:

Shape: Yes
RandR: Yes
Present: Present

Misc:

Use Overlay: Yes
Config file specified: None
Config file used: /home/shady/.config/picom/picom.conf

Drivers (inaccurate):

NVIDIA, modesetting

Backend: glx

Driver vendors:
GLX: NVIDIA Corporation
GL: NVIDIA Corporation
GL renderer: NVIDIA GeForce GTX 1650 Ti/PCIe/SSE2

Backend: egl

Driver vendors:
EGL: NVIDIA
GL: NVIDIA Corporation
GL renderer: NVIDIA GeForce GTX 1650 Ti/PCIe/SSE2

Configuration:

Configuration file

shadow = false;

corner-radius = 8;
rounded-corners-exclude = [
  "class_g = 'Polybar'"
];
round-borders = 8;

fading = true;
no-fading-openclose = false;
fade-in-step = 0.1;
fade-out-step = 0.1;
fade-delta = 9;

# nice kawase blur
blur: {
  method = "dual_kawase";
  strength = 3;
  background = false;
  background-frame = false;
  background-fixed = false;
}

backend = "glx";
vsync = true

mark-wmwin-focused = true;
mark-ovredir-focused = true;
detect-client-opacity = true;
detect-client-leader = true;

blur-background-exclude = [
  "class_g ?= 'zoom'",
  "name = 'rect-overlay'",
  "_GTK_FRAME_EXTENTS@:c",
  "class_g = 'LibreWolf'",
  "window_type *= 'menu'",
#  "(class_g = 'Firefox' || class_g = 'Thunderbird') && (window_type = 'utility' || window_type = 'popup_menu') && argb",
#  "window_type = 'menu'",
#  "window_type = 'dropdown_menu'",
#  "window_type = 'popup_menu'",
#  "window_type = 'tooltip'",
];

Steps of reproduction

Start picom
Get output

Expected behavior

No warning spam

Current Behavior

Spams:

[ 05/21/2024 22:12:02.008 c2_parse_target WARN ] Type specifier is deprecated. Type "c" specified on target "_GTK_FRAME_EXTENTS" will be ignored, you can remove it.
[ 05/21/2024 22:12:02.334 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 0. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.334 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 05/21/2024 22:12:02.408 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 28826. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.408 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 05/21/2024 22:12:02.457 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 28826. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.457 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 05/21/2024 22:12:02.505 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 28826. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.505 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 05/21/2024 22:12:02.552 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 28826. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.552 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 05/21/2024 22:12:02.597 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 28826. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.597 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 05/21/2024 22:12:02.647 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 28826. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.647 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 05/21/2024 22:12:02.694 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 28826. Possible NVIDIA bug?
[ 05/21/2024 22:12:02.694 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler

Additionally, things like glxgears & vkcube slow to a crawl when picom is active, though I haven't yet tested whether other configurations fix all my problems!


ShadiestGoat · 2024-05-21T21:36:26Z

Update: this doesn't happen if vsync = false. The same goes for the massive performance hit mentioned at the end of the report.

noctuid · 2024-06-01T13:10:58Z

Welp, I guess I'll set vsync false for now. I'm also seeing this, though my system was often completely freezing, not just slowing down.

yshui/picom#1265 After a somewhat recent update, picom with vsync enabled causes slowdown/freezing, and I'm having other nvidia issues (e.g. various issues with mpv failing to open). Because I've disabled vsync, screen tearing is very bad by default, so I'm enabling force full composition pipeline now to prevent it.

wmkmn · 2024-07-23T19:06:28Z

I started seeing the same issue after upgrading to Nvidia 555.58.02. (Not sure what version I was running before that).

The release notes for that version of the drivers mention: "Updated glXWaitVideoSyncSGI() to be more efficient. This reduces frame stutter in some KDE configurations with GSP offload." That sounds like it might be related.

I tried disabling GSP using the nvidia NVreg_EnableGpuFirmware=0 kernel option. Unfortunately that did not seem to make a difference.

I also tried running picom using the --no-frame-pacing option. That seems to have helped quite a bit. I only got the Duplicate vblank event found message once at startup. No more after that.

absolutelynothelix · 2024-07-23T19:28:57Z

it's known to spam in certain cases e.g. when the monitor is turned off. does it spam when you just use the pc normally?

I also tried running picom using the --no-frame-pacing option. That seems to have helped quite a bit. I only got the Duplicate vblank event found message once at startup. No more after that.

afaik --no-frame-pacing should turn off the vblank scheduler so having this message, even only once, is kinda weird.

absolutelynothelix · 2024-07-23T19:32:47Z

if you'd like to keep frame pacing enabled (which is recommended ig, as it reduces latency) and like experiments, you can try setting the PICOM_DEBUG environment variable to force_vblank_sched=present, e.g. PICOM_DEBUG=force_vblank_sched=present picom .... picom will then use the x present extension vblank scheduler instead of the glx_sgi_video_sync glx extension one. although yshui claims it is unreliable (see this thread for example), i didn't notice much difference myself.

absolutelynothelix · 2024-07-23T19:35:23Z

@ShadiestGoat, @noctuid, a less harmful fix for this should be no-frame-pacing.

wmkmn · 2024-07-23T19:51:38Z

afaik --no-frame-pacing should turn off the vblank scheduler so having this message, even only once, is kinda weird.

Makes sense. I probably misattributed that log message to the wrong picom invocation then.

wmkmn · 2024-07-23T20:48:36Z

I also tried running picom with force_vblank_sched=present (and frame-pacing enabled again) and that seems to work well for me. No vsync messages in the logs and rendering is smooth, without apparent frame drops.

Thanks for the suggestion @absolutelynothelix.

yshui · 2024-07-23T21:03:42Z

Huh, maybe we should just make present vsync the default?

absolutelynothelix · 2024-07-23T21:27:06Z

@yshui, iirc my very scientific very helix benchmarks showed that it's the smart frame pacing that sucks on nvidia no matter what vblank scheduler is used. without it both vblank schedulers work more or less the same at least for me. but maybe more testing from nvidia users is needed.

yshui · 2024-08-06T04:36:51Z

@awused oh btw, can you try PICOM_DEBUG=force_vblank_sched=present too?

awused · 2024-08-06T06:47:47Z

Sure, though I did notice that, based on the timestamps, #1306 happened while the computer was almost completely idle and no one was home. I would have been connected over ssh but not doing anything graphically intensive, at worst just running a few compilations.

yshui · 2024-08-06T10:42:22Z

This problem is usually triggered by monitors turning off.

pijulius · 2024-08-06T11:18:30Z

hi @yshui can confirm this happening to me too:
[ 08/06/2024 14:04:05.212 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 0. Possible NVIDIA bug?
[ 08/06/2024 14:04:05.212 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler
[ 08/06/2024 14:04:05.889 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 0. Possible NVIDIA bug?
[ 08/06/2024 14:04:05.889 sgi_video_sync_scheduler_callback WARN ] Resetting the vblank scheduler

and indeed it happens when the monitor is turned off using dpms and then turned back on. Unfortunately this freezes picom, so I need to kill it to get a working display again.

This started happening just 2 days ago, so it seems to have been triggered by something in the latest changes. I really hope it can be fixed; if there is anything I can do to help you track it down, please let me know and I will see what I can find.

Thanks again for all your hard work!

pijulius · 2024-08-06T11:27:52Z

OOO, seems like this fixes the problem:

PICOM_DEBUG=force_vblank_sched=present
but it makes the animations look really bad, like they are skipping frames. PS: this is on a 120hz monitor. I also tried this:

--no-frame-pacing
and this seems to fix it too, and animations still work just fine, so I will keep running it like this and let you know if I still face the problem.

absolutelynothelix · 2024-08-06T11:47:33Z

jesus christ, how about we just pretend that nvidia doesn't exist?

yshui · 2024-08-06T16:43:23Z

guess the idea of using present by default is out the window then :/

need to find a better way to mitigate the bad nvidia behavior...

yshui · 2024-08-07T20:38:28Z

OK, I am going to test some different strategies and see what works.

I created a branch called nvidia-pain, please try that and tell me what it looks like. It's probably not going to work, but it would be useful to know how it reacts.

pijulius · 2024-08-10T12:35:43Z

hi @yshui Thanks for the quick fix, can confirm that nvidia-pain does fix the problem and all works well BUT: the problem for me wasn't caused by your code but by the new nVidia driver.

So with the latest NVIDIA-Linux-x86_64-550.107.02 I got the following errors:

The above error when waking up from suspend
vsync=true makes all animations sluggish (looks really bad)
vsync=false makes animations look better, but cpu usage goes up when the monitor is off and also when e.g. i3lock is used

ALL these errors go away if I revert to NVIDIA-Linux-x86_64-550.100, so I'm not sure what nvidia did, but the new driver seems to cause a lot of problems unrelated to your code.

yshui · 2024-08-11T03:29:30Z

@pijulius

wasn't caused by your code but by the new nVidia driver.

yeah it is the new nvidia driver we are trying to fix (or rather, work around) here.

I got the following errors:

Are these results from when you run without PICOM_DEBUG=force_vblank_sched=present?

pijulius · 2024-08-11T09:51:44Z

@yshui

Are these results from when you run without PICOM_DEBUG=force_vblank_sched=present?

yes, simply running picom-nvidia-pain without any arguments at all. If I run normal picom with the same config on the 550.100 driver there are no problems at all.

please note: it's not just the above errors. e.g. full screen animations like i3lock are slow as hell, keypress events show up with a noticeable delay when unlocking, and flameshot is also waay slower, both when launching and when selecting an area on the screen, so it seems like something global is going on.

pijulius · 2024-08-13T14:47:05Z

hi @yshui this
next...nvidia-pain

seems to be good to go in, as I do get better results with this version than with the old one. Unfortunately I noticed some inconsistency between the old and new nvidia drivers: even the new one does work from time to time and the old one does break from time to time, so I can't replicate that reliably. But I do notice that with nvidia-pain I get better results: so far it has always woken up with this version, but did not with the old versions.

thank you for the quick fix!

awused · 2024-09-18T05:10:19Z

After updating to nvidia 560, the issue I had in #1306 seemed to go away, even without the nvidia-pain branch or any special parameters. But since nvidia 560 is broken in other ways and I downgraded to 555 that issue has come back so I'm back to using special parameters.

I'm not sure whether the changes in nvidia-pain actually need to be merged into the main branch. nvidia has just had a bad couple of drivers and it seems like it'll probably work itself out. That, or 560's breakage could have been masking the breakage in here or #1306. Hard to say with confidence until the next driver revision.

bubbleguuum · 2024-10-17T12:59:58Z

Continuing discussion from #1367.

Any plan for a workaround that detects sgi_video_sync_scheduler_callback being called repeatedly in a busy loop when the screen is off, and avoids the high CPU usage it causes?

yshui · 2024-10-17T13:13:24Z

@bubbleguuum can you test the nvidia-pain branch mentioned earlier in this issue? remember to remove PICOM_DEBUG=force_vblank_sched=present

bubbleguuum · 2024-10-17T13:41:55Z

With the nvidia-pain version, when the screen is off the console still spams:

[ 10/17/2024 15:35:11.574 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 41168. Possible NVIDIA bug?
[ 10/17/2024 15:35:11.590 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 41169. Possible NVIDIA bug?
[ 10/17/2024 15:35:11.607 sgi_video_sync_scheduler_callback WARN ] Duplicate vblank event found with msc 41170. Possible NVIDIA bug?

However (with screen off), picom CPU usage (monitored via ssh terminal running on another machine) is low, being most of the time at 0% and occasionally reported at 10%.

When I wake up the screen, the log traces above keep being spammed with CPU usage being constant 10%.

That's different from v12.3, where the log spam stops immediately when I wake up the screen and CPU usage goes back to normal.

yshui · 2024-10-17T13:50:52Z

ok, at least it fixes the 100% cpu usage. i think i know what is going on with the log spam.

yshui · 2024-10-17T14:30:12Z

@bubbleguuum i've updated nvidia-pain, can you test again?

bubbleguuum · 2024-10-17T14:46:56Z

I confirm it fixes the problem: no more log spam and normal CPU usage when waking up the screen.

bubbleguuum · 2024-10-17T14:55:44Z

Also, commenting out the line below, which is spammed when the screen is off, makes picom use 0% CPU all the time while the screen is off (otherwise it reports about 10% briefly once every 20-30 seconds).

log_warn("Duplicate vblank event found with msc %d. Possible NVIDIA bug? "
               "Number of duplicates so far: %d",
               msc, sched->vblank_inserted);

yshui · 2024-10-17T14:58:44Z

i feel 10% cpu every 20-30 seconds is a fair price to pay if we get to complain about nvidia 😈

bubbleguuum · 2024-10-17T15:01:35Z

I would only display it once, but it's your call. I could see it spamming log files. Thanks for the fix anyway!

yshui · 2024-10-17T15:20:57Z

that's very fair, i will do that.
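
For illustration, the "only display it once" idea could look roughly like the sketch below. This is not the code that went into picom: the struct, its fields, and the function name are hypothetical, and plain `fprintf` stands in for picom's `log_warn`. It keeps counting duplicates but prints the warning only the first time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical bookkeeping; picom's real scheduler struct is not shown in
 * this thread. */
struct dup_warning {
    bool warned;          /* true once the warning has been printed */
    unsigned long count;  /* duplicates seen so far */
};

/* Record one duplicate vblank event. Prints the warning only on the first
 * call for a given struct and returns true if a line was printed. */
static bool warn_duplicate_once(struct dup_warning *w, unsigned long long msc) {
    assert(w != NULL);
    w->count++;
    if (w->warned) {
        return false; /* already warned; just keep counting */
    }
    w->warned = true;
    fprintf(stderr,
            "Duplicate vblank event found with msc %llu. Possible NVIDIA bug?\n",
            msc);
    return true;
}
```

Keeping the counter around means the total could still be reported later (for example at a lower log level) without writing a line per vblank.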

yshui · 2024-10-17T15:34:00Z

closed, thanks for testing the fix

We used to tear down the whole vblank thread and restart it every time we got a duplicate msc. This used to work OK, but newer NVIDIA drivers broke it. And to recap, simply waiting for vblank again upon reception of duplicate MSCs _does not_ work either, and will just get us stuck in an infinite loop.

After some experimentation, I found that rendering a new frame gets us out of the infinite duplicate msc loop. So that's what we do now, i.e. inserting a synthetic vblank to trigger a new frame. Some care is taken to make sure synthetic vblanks' msc numbers don't conflict with real ones.

Fixes #1265

Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com>
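
The approach described in the commit message can be sketched as follows. This is a simplified model, not picom's actual implementation: the type names, the field names, and the `raw + inserted` numbering scheme are all assumptions made for illustration. A duplicate msc from the driver is turned into a synthetic vblank, and the delivered msc is offset by the number of synthetic events so far, so that it never repeats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scheduler state; names are illustrative, not picom's. */
struct vblank_state {
    bool has_last;     /* false until the first event arrives */
    uint64_t last_raw; /* last msc reported by the driver */
    uint64_t inserted; /* synthetic vblanks inserted so far */
};

struct vblank_event {
    uint64_t msc;   /* msc handed on to trigger a new frame */
    bool synthetic; /* true if this event was made up by us */
};

/* Turn a raw driver msc into the event to deliver. */
static struct vblank_event on_raw_vblank(struct vblank_state *s, uint64_t raw_msc) {
    /* Assumption of this sketch: the driver never reports a smaller msc. */
    assert(!s->has_last || raw_msc >= s->last_raw);

    bool duplicate = s->has_last && raw_msc == s->last_raw;
    if (duplicate) {
        /* Instead of waiting again (which loops forever according to the
         * commit message), count a synthetic vblank so that a new frame
         * gets rendered. */
        s->inserted++;
    }
    s->has_last = true;
    s->last_raw = raw_msc;

    /* raw + inserted is strictly increasing: on every call either the raw
     * msc grew or the synthetic counter did. */
    return (struct vblank_event){
        .msc = raw_msc + s->inserted,
        .synthetic = duplicate,
    };
}
```

Fed the sequence 28826, 28826, 28826, 28827 (the pattern from the original report), this model delivers msc values 28826, 28827, 28828, 28829, with the middle two flagged as synthetic.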

yshui mentioned this issue Aug 6, 2024

Crash: log_get_level_tls: Assertion `tls_logger' failed #1306

Closed

yshui mentioned this issue Oct 17, 2024

High CPU usage on NVIDIA when screen is off with GLX backend, vsync and frame pacing enabled #1367

Closed

yshui closed this as completed in e65ebd7 Oct 17, 2024

bubbleguuum mentioned this issue Oct 17, 2024

Stuttery scrolling with NVIDIA and GLX backend after v12.3 (regression) #1368

Closed
