xpra desktop has psychotic update habits in a local terminal #4319

Closed
karlkleinpaste opened this issue Aug 9, 2024 · 15 comments
Labels
bug Something isn't working

Comments

@karlkleinpaste

Large local xpra desktop, managed by fvwm, within which I maintain a bunch of terminals for this and several other machines, where I do things like sysadmin tasks together.

Within that desktop, I have xpra-driven terminals for the other machines, but a local terminal started out of fvwm menus. So the terminal, on its own, has no special display properties that it wouldn't have if it were displayed on the native desktop.

Activity on that terminal shows deeply aberrant redisplay of old content as I perform routine tasks, in this case, a dnf update download. Attached .mp4 shows what happens.

Everything runs on Fedora 39 with xpra build 36448. Entirely local, no remote access.

  1. XPRA_LOG_PREFIX=[minor:28] xpra start-desktop :28 --attach=no --resize-display=5112x1362 --start-child=vfvwm
  2. XPRA_LOG_PREFIX=[minor:28] xpra attach :28 --border=turquoise,2
  3. Use fvwm menus to open a terminal. This terminal is not driven by xpra, other than living within the xpra-driven fvwm desktop.

Do stuff in the terminal. It will mis-"remember" old content. As the dnf command works, content that was in the terminal from an early point keeps coming back up. When the command ends, the final state is again garbage, but when I hit a few Enter keys, it suddenly reverts to the state that it should and does actually have.

Apologies for the fuzzy video; GitHub wouldn't accept the 12 MB original, so I downsized it with ffmpeg -vf "scale=iw/2:ih/2".

xpra-desktop-terminal-insane-half-size.mp4
@karlkleinpaste karlkleinpaste added the bug Something isn't working label Aug 9, 2024
@karlkleinpaste
Author

What the heck. The embedded video can't be played? I opened the bug report specifically to have a rendezvous point to show that video. As I see it, the video is entirely greyed out, no controls are active to my mouse.

@karlkleinpaste
Author

Use ftp to reach ftp.xiphos.org, where you will find the original .mkv in directory /pub/video.

@totaam
Collaborator

totaam commented Aug 10, 2024

I haven't looked at the video yet, but from your description this sounds an awful lot like #4201.
If so, then turning opengl off, or using --encodings=no-scroll, should work around it.
Normally, I would be thrilled to hear of a new symptom, as this would help me narrow down the bug. In this case, however, I have already spent weeks (on and off) chasing this particular bug, and by the look of things, I have made things worse.
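
The two workarounds suggested above could be applied to the attach command from the original report roughly as follows. This is only a sketch: `attach_cmd` is a hypothetical helper name (not part of xpra), and it prints the command rather than running it. The display number and border option are taken from the report.

```shell
#!/bin/sh
# Print (not run) the attach command from the report with one workaround added.
# "attach_cmd" is a hypothetical helper name, not an xpra command.
attach_cmd() {
    display="$1"      # e.g. :28, as in the original report
    workaround="$2"   # "opengl" or "scroll"
    case "$workaround" in
        opengl) flag="--opengl=no" ;;             # disable the OpenGL renderer
        scroll) flag="--encodings=no-scroll" ;;   # disable the scroll encoding
        *) echo "unknown workaround: $workaround" >&2; return 1 ;;
    esac
    echo "xpra attach $display --border=turquoise,2 $flag"
}

attach_cmd :28 opengl
attach_cmd :28 scroll
```

Either printed command can then be copied into a terminal once it looks right.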

@karlkleinpaste
Author

Just to be clear...
I know opengl=no is a client side option.
Is encodings=no-scroll for server or client side? Perhaps both? The man page does not mention much detail about encodings, simply cautioning against using it at all. When working through other bugs in the past, you have recommended things like encodings=-h264 to leave other options present, so I'm wondering about the effect of setting a single value like no-scroll.

@totaam
Collaborator

totaam commented Aug 10, 2024

> Is encodings=no-scroll for server or client side? Perhaps both?

Both. It should be listed as a server-driven option, meaning that the client cannot enable encodings that the server has not enabled.
In this case, using it on the client should be enough.
As for -h264 vs no-scroll, the no-XXX syntax was added recently.
I assume that you're running git master since you're hitting the scroll bug.

@karlkleinpaste
Author

I still have my 36439 build available, and installed that; the problem does not occur with 36439. So whatever has happened has been after 36439 and up to 36448 -- I hope that can help you determine where the cause is.
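
With one known-good build (36439) and one known-bad build (36448), the range could in principle be narrowed by bisection. The sketch below only illustrates the search itself: `is_bad` is a hypothetical stand-in for "build and install that revision, then check the terminal by hand", and the `FIRST_BAD` default is an invented placeholder with no bearing on the actual culprit.

```shell
#!/bin/sh
# Binary search for the first bad revision between a known-good and a
# known-bad build number.

# Hypothetical placeholder: in practice this step is manual (install the
# build, open a local terminal, watch for stale content).
# FIRST_BAD is invented purely so that the sketch can run.
is_bad() {
    [ "$1" -ge "${FIRST_BAD:-36445}" ]
}

first_bad_revision() {
    good="$1"   # revision known to work
    bad="$2"    # revision known to fail
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if is_bad "$mid"; then
            bad="$mid"
        else
            good="$mid"
        fi
    done
    echo "$bad"
}

first_bad_revision 36439 36448
```

For a gap of nine revisions this needs at most four manual checks.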

@karlkleinpaste
Author

Re-installed 36448, can confirm --opengl=no makes things OK again.

Also, I can't quantify or document this exactly, but I have an impression that, only in 36448, moving windows within the desktop, i.e. any of the several terminal windows there, is ... jerky. The windows don't move smoothly, they jump several pixels at a time as I grab a title bar and they are moved from one spot on the desktop to another. No idea if this is related.

@totaam
Collaborator

totaam commented Aug 11, 2024

@karlkleinpaste is 0cac56a enough to make things mostly work again?
Not perfectly until #4201 is properly fixed, but usable?


Looking at the recent opengl commits, eddc282 is the only other commit that is suspicious.

@karlkleinpaste
Author

I just rebuilt 36472 and restarted that desktop. Unfortunately, the moment I open the first local terminal and start writing commands, it's leaving behind garbage visual traces. The white blocks at the end of "noRecall" and beneath "Last login" should not be there, being apparent incomplete remnants of the block cursor that was briefly there.

screenshot1

As I do other random activity in the terminal, garbage spots like this clear themselves up, but they come back irregularly.

This is without opengl=no, of course.

@totaam
Collaborator

totaam commented Aug 13, 2024

Assuming that the revision scripts work exactly the same way on your systems as they do on mine:

Obtained by running:

git checkout eddc282c348b3c4bb5bae00588eb3d32b6dd66b6
rm xpra/src_info.py
./fs/bin/add_build_info.py | grep REVISION

Looking at these 9 commits, the only drawing related entries are:

  • ad25c8c only affects cairo (non-opengl) painting?
  • 4132b7b could be! but only if pointer-overlay or paint-debug are enabled
  • a64d4b7 also cairo
  • eddc282 is the "good" commit, but also suspicious
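
The three commands quoted earlier in this comment could be looped over the commits in the list above to print each one's build revision. This is a sketch, not a tested procedure: `run` and `revision_of` are hypothetical helper names, `DRY_RUN` is an assumed convention that defaults to printing the commands only, and `rm -f` is used in place of plain `rm` so that a missing file does not stop the loop.

```shell
#!/bin/sh
# Print the build revision for each commit named in the list above.
# DRY_RUN defaults to 1, so the commands are only printed;
# set DRY_RUN=0 inside an xpra git checkout to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

revision_of() {
    run git checkout --quiet "$1"
    # Remove the cached build info so that it gets regenerated.
    run rm -f xpra/src_info.py
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ ./fs/bin/add_build_info.py | grep REVISION"
    else
        ./fs/bin/add_build_info.py | grep REVISION
    fi
}

for commit in ad25c8c 4132b7b a64d4b7 eddc282; do
    revision_of "$commit"
done
```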

@totaam
Collaborator

totaam commented Aug 13, 2024

OK, so I managed to reproduce some very strange behaviour by using desktop scaling, and this allowed me to bisect #4324 (comment) - leading back to #2467, which is nowhere near the commits above... but, the same fix also worked for #4201
So, I am hoping that what you were seeing was a result of desktop scaling (fixed) or also related to FBO initialization somehow.
@karlkleinpaste can you try again please?

@karlkleinpaste
Author

I'll get to it later today.

@karlkleinpaste
Author

I think we're there. I'm not using --opengl=no and I'm not seeing mis-remembered garbage.
Thanx.

@totaam
Collaborator

totaam commented Aug 14, 2024

Unfortunately, I am still seeing occasional issues where the opengl renderer gets wedged. :(
It's as if the windows' opengl contexts are somehow interfering with each other.

Reproduced by just running glxgears from an xterm with the client using --opengl=force.

totaam added a commit that referenced this issue Aug 14, 2024
there are many different types of scaling..
the ones that matter here are:
* the "scale factor" given to us by the opengl backend, which usually comes from the OS and converts from the GTK window geometry to the actual geometry used on-screen,
* the ratio between the backing's size and the render size, this is normally set using the desktop-scaling option

I have no idea why we need to call glClear only when scaling, but not doing so produces garbage in the window, despite glBlitFramebuffer coming afterwards and covering the same area..
@totaam
Collaborator

totaam commented Aug 14, 2024

Well, that turned out to be a nightmare and I still don't know why the fix was needed: 775d435
glClear does something that is required when we use scaling with glBlitFramebuffer.
I found that out by just trying things at random out of sheer desperation.

(the geometry fixes aren't actually important since we always use double-buffered visuals and those always paint the whole window - so although the arguments were the wrong ones, the result was not)

@totaam totaam closed this as completed Aug 14, 2024