Sixels are too small in Windows Terminal #211

Closed
veltza opened this issue Aug 19, 2024 · 31 comments

Comments

@veltza

veltza commented Aug 19, 2024

I recently tried the patched version of chafa-test.zip on a nightly build of Windows Terminal Canary (V1.22.240816001-llm) and found that the -s option doesn't work properly. It prints a sixel that's way too small:

[image: chafa]

Even if you don't use the -s option, the sixels still stay small and don't fill the entire window like they should:

[image: chafafull]

@veltza
Author

veltza commented Aug 20, 2024

I just discovered that Windows Terminal uses a virtual cell concept (see PRs 17504 and 17421), where the cell size is always 10x20 from the sixel perspective. I'm pretty sure this is causing the size issues.

@hpjansson
Owner

Yes, I think that's it. And we don't have a way to probe the window's pixel dimensions on Windows (we use ioctl() for this on other platforms), so we're falling back to a bad default assumption of 8x8-pixel cells.

@j4james - what's the best client approach here, in your opinion? Is there a way to probe the dimensions passively? Also, is there something in the environment we can use to differentiate a sixel-capable Windows Terminal from an older version?

@j4james

j4james commented Aug 20, 2024

@hpjansson As veltza mentioned, the Windows Terminal cell size is always 10x20, and this would also be the case on a real VT340, as well as all the commercial terminal emulators that I'm aware of. So that would be my recommendation for the fallback size if you have no other way of querying the terminal.

But if you also want to support terminals that use a non-standard size, and which don't (or can't) set TIOCGSIZE, then I would recommend querying some combination of CSI 16 t (the cell size in pixels), or CSI 14 t and CSI 18 t (the window size in pixels and characters, from which you can estimate the cell size). CSI 16 t is more reliable, but less commonly supported (at least that was the case a few years back when I last tested).
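A minimal sketch of how a client might turn the replies to those queries into a cell size. It assumes the reply formats documented for xterm (CSI 6 ; height ; width t for CSI 16 t, CSI 4 ; height ; width t for CSI 14 t, and CSI 8 ; rows ; columns t for CSI 18 t). The function name and the 10x20 fallback constant are illustrative, not taken from Chafa.

```python
import re

# Reply formats as documented for xterm's window operations:
#   CSI 16 t -> CSI 6 ; cell_height ; cell_width t
#   CSI 14 t -> CSI 4 ; window_height_px ; window_width_px t
#   CSI 18 t -> CSI 8 ; rows ; columns t
_REPLY = re.compile(r"\x1b\[(\d+);(\d+);(\d+)t")

# Fallback suggested in this thread: 10 pixels wide, 20 pixels high.
FALLBACK_CELL = (10, 20)


def cell_size_from_replies(data):
    """Return (cell_width, cell_height) in pixels from collected reply text."""
    replies = {int(k): (int(a), int(b)) for k, a, b in _REPLY.findall(data)}

    # Prefer the direct cell-size report when the terminal sent one.
    if 6 in replies:
        height, width = replies[6]
        if width > 0 and height > 0:
            return (width, height)

    # Otherwise estimate from window size in pixels and in characters.
    if 4 in replies and 8 in replies:
        px_height, px_width = replies[4]
        rows, cols = replies[8]
        if rows > 0 and cols > 0 and px_width >= cols and px_height >= rows:
            return (px_width // cols, px_height // rows)

    return FALLBACK_CELL
```

The estimate from CSI 14 t and CSI 18 t uses integer division, so it can be off by a pixel when the window has padding; that is one reason the direct report is tried first.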

It's also worth mentioning that TIOCGSIZE won't always report zero for the pixel fields when they aren't known. I've often seen them reported as a random value > 32000. I haven't looked at your code, so maybe you're already accounting for that. Just wanted to let you know in case you weren't.
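A sketch of the kind of sanity check this implies, written against the TIOCGWINSZ ioctl that comes up later in this thread. The 32000 cutoff is an assumption based only on the values described above, and both function names are hypothetical; this is not Chafa's code.

```python
import fcntl
import struct
import termios

# ASSUMPTION: treat anything at or above this as garbage, based on the
# "random value > 32000" observation above. Not a documented limit.
PIXEL_LIMIT = 32000


def plausible_pixels(xpixel, ypixel, limit=PIXEL_LIMIT):
    """True if both pixel extents look like real measurements."""
    return 0 < xpixel < limit and 0 < ypixel < limit


def cell_size_from_ioctl(fd=1):
    """Return (cell_width, cell_height) in pixels, or None if unknown."""
    try:
        raw = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    except OSError:
        return None  # not a terminal, or the ioctl is unsupported
    # struct winsize: ws_row, ws_col, ws_xpixel, ws_ypixel (unsigned shorts)
    rows, cols, xpixel, ypixel = struct.unpack("HHHH", raw)
    if rows == 0 or cols == 0 or not plausible_pixels(xpixel, ypixel):
        return None
    return (xpixel // cols, ypixel // rows)
```

Returning None rather than a guessed size leaves the choice of fallback (for example 10x20) to the caller.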

@j4james

j4james commented Aug 20, 2024

Also, is there something in the environment we can use to differentiate a sixel-capable Windows Terminal from an older version?

The sixel-capable version will report the sixel feature (parameter value 4) in the primary device attributes report, the same as most other sixel-capable terminals.
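For illustration, a small sketch of checking a primary device attributes reply (the response to CSI c) for parameter 4. The helper name is hypothetical. It skips the first parameter because that one identifies the terminal class rather than a feature.

```python
import re

# A DA1 reply has the form: CSI ? Ps ; Ps ; ... c
_DA1 = re.compile(r"\x1b\[\?([\d;]*)c")


def da1_reports_sixel(reply):
    """True if the DA1 reply lists feature code 4 (sixel graphics)."""
    match = _DA1.search(reply)
    if match is None:
        return False
    params = [p for p in match.group(1).split(";") if p]
    # params[0] is the terminal class / conformance level; the remaining
    # parameters are the advertised features.
    return "4" in params[1:]
```

This is an active probe: the client has to write the query, then read and parse the reply from the terminal.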

@hpjansson
Owner

hpjansson commented Aug 20, 2024

@hpjansson As veltza mentioned, the Windows Terminal cell size is always 10x20, and this would also be the case on a real VT340, as well as all the commercial terminal emulators that I'm aware of. So that would be my recommendation for the fallback size if you have no other way of querying the terminal.

Yes. I think I'll change the default in the CLI tool (this affects non-sixel modes like symbols too -- but the platforms where square cells can be expected are in a very definite minority). The library will have to keep 8x8 as the default, since it's API.

But if you also want to support terminals that use a non-standard size, and which don't (or can't) set TIOCGSIZE, then I would recommend querying some combination of CSI 16 t (the cell size in pixels), or CSI 14 t and CSI 18 t (the window size in pixels and characters, from which you can estimate the cell size). CSI 16 t is more reliable, but less commonly supported (at least that was the case a few years back when I last tested).

Thanks, this is very useful information for when I implement active probing.

It's also worth mentioning that TIOCGSIZE won't always report zero for the pixel fields when they aren't known. I've often seen them reported as a random value > 32000. I haven't looked at your code, so maybe you're already accounting for that. Just wanted to let you know in case you weren't.

I am :-) But I don't think this is something people run into frequently with Chafa, fortunately.

The sixel-capable version will report the sixel feature (parameter value 4) in the primary device attributes report, the same as most other sixel-capable terminals.

I was hoping for something I can grab instantly/passively, like a version string in an environment variable, or a system call.

Tangentially: What do TIOCGWINSZ and tcgetwinsize() report on a Linux host when you're in an ssh session originating from Windows Terminal?

@j4james

j4james commented Aug 20, 2024

Tangentially: What do TIOCGWINSZ and tcgetwinsize() report on a Linux host when you're in an ssh session originating from Windows Terminal?

Windows Terminal doesn't have a built-in ssh client, and a standalone ssh client will likely just report 0, because it wouldn't have any more knowledge of the pixel size than any other app.

I know there are some terminals on Windows that do have a built-in ssh client, and I've seen at least one that reported meaningful pixel values in TIOCGWINSZ over ssh, but not all of them do.

@veltza
Author

veltza commented Aug 20, 2024

But if you also want to support terminals that use a non-standard size, and which don't (or can't) set TIOCGSIZE, then I would recommend querying some combination of CSI 16 t (the cell size in pixels), or CSI 14 t and CSI 18 t (the window size in pixels and characters, from which you can estimate the cell size). CSI 16 t is more reliable, but less commonly supported (at least that was the case a few years back when I last tested).

I don't want to go off topic too much, but do you plan to support XTSMGRAPHICS sequences as well, since they can also be used to query sixel properties?

For example, lsix uses them to query the number of sixel color registers and sixel geometry (or window size).

@j4james

j4james commented Aug 20, 2024

do you plan to support XTSMGRAPHICS sequences

No.

@j4james

j4james commented Aug 20, 2024

I am :-) But I don't think this is something people run into frequently with Chafa, fortunately.

@hpjansson I just looked at your code now, and I don't think that would catch the cases that I've seen (assuming I'm reading it right). For me the size has always been greater than 32000 but less than 32767. I've seen this on every terminal I've tested when connected to a telnet server on Ubuntu.

@hpjansson
Owner

Windows Terminal doesn't have a built-in ssh client, and a standalone ssh client will likely just report 0, because it wouldn't have any more knowledge of the pixel size than any other app.

Ok. I was asking because Linux openssh picks this information up on the client side and relays it to the host, so the ioctl reports the same dimensions in the remote session. Maybe it could do this in Windows Terminal too, if it could query the terminal device directly a la ioctl (I suspect the openssh maintainers would be reluctant to emit and parse control sequences for this purpose).

I know there are some terminals on Windows that do have a built-in ssh client, and I've seen at least one that reported meaningful pixel values in TIOCGWINSZ over ssh, but not all of them do.

I see. That reminds me I'll have to check how we fare in PuTTY sometime.

@hpjansson I just looked at your code now, and I don't think that would catch the cases that I've seen (assuming I'm reading it right). For me the size has always been greater than 32000 but less than 32767. I've seen this on every terminal I've tested when connected to a telnet server on Ubuntu.

Interesting. I'll lower PIXEL_EXTENT_MAX, then. The case where the user's terminal is maximized on a desktop more than four 8k monitors wide is likely rarer anyway :-)

@j4james

j4james commented Aug 20, 2024

I was asking because Linux openssh picks this information up on the client side and relays it to the host, so the ioctl reports the same dimensions in the remote session. Maybe it could do this in Windows Terminal too,

But openssh would pick up this information the same way you would when running locally. On Linux that's presumably achieved with ioctl, but Windows doesn't have that. On Windows they presumably get the char size with one of the win32 console APIs (maybe GetConsoleScreenBufferInfo), but there isn't an equivalent API for the pixel size. If there was, you could've used that too.

@hpjansson
Owner

Indeed. I guess I'm saying it's high on my wishlist. Thanks for the tips (and congrats on getting sixel support merged).

@hpjansson
Owner

Test archive including this fix and the one for #210: chafa-test-gh211.zip

Hope this solves the issue. Thanks for submitting it, @veltza!

@veltza
Author

veltza commented Aug 21, 2024

There is still something odd there. Look at sizes 30 and 40:

[image: chafa10to40]

I examined the sixels that Chafa produces and the widths seem to be correct:

> s=30; chafa -f sixels -s ${s}x${s} c13d4f6d.jpg | hexyl | head -n 4
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 1b 5b 3f 32 35 6c 1b 5b ┊ 3f 38 30 6c 1b 5b 3f 38 │•[?25l•[┊?80l•[?8│
│00000010│ 34 35 32 6c 1b 50 30 3b ┊ 30 3b 30 71 22 31 3b 31 │452l•P0;┊0;0q"1;1│
│00000020│ 3b 33 30 30 3b 31 35 36 ┊ 23 30 3b 32 3b 37 3b 36 │;300;156┊#0;2;7;6│
>
> s=40; chafa -f sixels -s ${s}x${s} c13d4f6d.jpg | hexyl | head -n 4
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 1b 5b 3f 32 35 6c 1b 5b ┊ 3f 38 30 6c 1b 5b 3f 38 │•[?25l•[┊?80l•[?8│
│00000010│ 34 35 32 6c 1b 50 30 3b ┊ 30 3b 30 71 22 31 3b 31 │452l•P0;┊0;0q"1;1│
│00000020│ 3b 34 30 30 3b 32 31 36 ┊ 23 30 3b 32 3b 37 3b 36 │;400;216┊#0;2;7;6│
>

So it looks like Chafa works properly, but the virtual cell system in Windows Terminal doesn't.
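The widths can also be read out programmatically rather than by eye. A sketch, assuming the standard sixel layout of a DCS introducer (ESC P P1;P2;P3 q) followed by raster attributes ("Pan;Pad;Ph;Pv); the function name and the returned dictionary keys are illustrative.

```python
import re

# ESC P <P1;P2;P3> q " <Pan;Pad;Ph;Pv>
_HEADER = re.compile(rb'\x1bP([\d;]*)q"(\d+);(\d+);(\d+);(\d+)')


def parse_sixel_header(data):
    """Return the DCS parameters and raster attributes, or None."""
    match = _HEADER.search(data)
    if match is None:
        return None
    # Omitted DCS parameters default to 0.
    params = [int(p) if p else 0 for p in match.group(1).split(b";")]
    params += [0] * (3 - len(params))
    pan, pad, width, height = (int(g) for g in match.groups()[1:])
    return {"p1": params[0], "p2": params[1], "p3": params[2],
            "pan": pan, "pad": pad, "width": width, "height": height}
```

Fed the bytes from the first dump above, this yields a width of 300 and a height of 156, with P2 set to 0.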

@hpjansson
Owner

What happens if you pass --stretch to Chafa?

@veltza
Author

veltza commented Aug 21, 2024

What happens if you pass --stretch to Chafa?

That actually fixed the issue! I've never needed or used that option on my linux machine, so I didn't even know it existed.

So everything works as expected. Good job!

@j4james

j4james commented Aug 21, 2024

@hpjansson You should be aware that if you're setting the sixel background size to something larger than the actual image content (as is the case in some of veltza's examples above), that can produce some weird effects, which you probably aren't expecting, when the user's background color scheme doesn't match the sixel background color.

@veltza
Author

veltza commented Aug 22, 2024

j4james is right, the sixels have black bars when the images are not stretched to cover every cell. I didn't even notice it because I was using the black background color on my terminal.

So apparently Windows Terminal fills the empty pixels with color0, while some other terminals use the terminal background color. For example, when using Chafa, this bird animation has a black background on Windows Terminal, but not on Foot terminal. I don't know which one is the right way, but if you ask the users, they will choose Foot's way.

But that's another issue, if that's an issue.

@j4james

j4james commented Aug 22, 2024

Was just coming here to post an example image demonstrating the problem. As veltza mentioned, not all terminals work the same way, but here's a screenshot showing Xterm, WezTerm, and Windows Terminal.

[image: screenshot comparing Xterm, WezTerm, and Windows Terminal]

The above test case was produced on Windows using a 1920x1080 image with this command line:

chafa -f sixel -s 30x30 c13d4f6d.jpg

I redirected that to a file so I could cat it on various terminals to make sure they were all testing the exact same content.

Edit: If you actually intended that area to be transparent, you should set the sixel P2 parameter to 1.

@veltza
Author

veltza commented Aug 22, 2024

Edit: If you actually intended that area to be transparent, you should set the sixel P2 parameter to 1.

The sixel transparency works fine in still images, but not in animated GIFs, where you can get something like the example below. This is why transparency is disabled in Chafa (maybe it could be enabled in still images and disabled in animated GIFs?). And here are some reasons why Foot chose their way.

So, here's how it looks on Windows Terminal when transparency is enabled in that animated gif to avoid the black background:

[image: wt-anim]

Maybe this issue needs to be reopened or maybe everything has already been said.

@j4james

j4james commented Aug 22, 2024

The sixel transparency works fine in still images but not in animated gifs

If you want a transparent animation, you'll need to clear the screen between frames. You can't just keep blitting transparent images on top of each other and expect that to work. But I'm not suggesting that needs to be fixed - the current gif implementation is clearly not intended to be transparent and that's fine.

My original point was that you were requesting a 30x30 size image in one of your tests above, and I would expect that to be 300 pixels wide at a cell size of 10x20. But what chafa produced was an image that was 277 pixels wide, padded out to 300 with a background fill. That seemed wrong to me, but if that was the intended behavior, that's also fine. If it wasn't the intended behavior, then it's worth raising another bug report.
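The arithmetic behind those numbers can be written out. The sketch below is a guess, not something confirmed in this thread or checked against Chafa's source: it assumes the 1920x1080 test image is fitted into the cell box with its aspect ratio preserved, that the height is then rounded down to whole 20-pixel cell rows, and then down again to whole 6-pixel sixel bands. Those assumed roundings happen to reproduce the 156 and 216 pixel heights in the dumps above and the 277 pixel content width.

```python
def guess_sixel_extent(src_w, src_h, cols, rows, cell_w=10, cell_h=20, band=6):
    """Return (box_width, content_width, content_height) in pixels.

    Hypothetical model of the rounding; every step marked ASSUMPTION is a
    guess that merely matches the figures reported in the thread.
    """
    box_w, box_h = cols * cell_w, rows * cell_h
    # Aspect-preserving fit of the source image into the pixel box.
    fit_h = min(box_h, box_w * src_h // src_w)
    # ASSUMPTION: height is rounded down to a whole number of cell rows...
    cell_aligned_h = (fit_h // cell_h) * cell_h
    # ASSUMPTION: ...and then down to a whole number of sixel bands.
    content_h = (cell_aligned_h // band) * band
    # Width that keeps the source aspect ratio at that height.
    content_w = min(box_w, content_h * src_w // src_h)
    return box_w, content_w, content_h
```

Under these assumptions a 30x30 request gives a 300 pixel box with 277x156 of content, leaving 23 columns of background fill on the right.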

hpjansson added a commit that referenced this issue Aug 23, 2024
We can clear the background in animation frames by setting the alpha
threshold now, like we do for Kitty.

Ref: #211 (GitHub).
Ref: #147 (GitHub).
hpjansson added a commit that referenced this issue Aug 23, 2024
Stills maintain their transparency (the default alpha threshold is
.5). This can be overridden with the -t flag.

Ref: #211 (GitHub).
Ref: #147 (GitHub).
@hpjansson
Owner

Edit: If you actually intended that area to be transparent, you should set the sixel P2 parameter to 1.

The sixel transparency works fine in still images, but not in animated GIFs, where you can get something like the example below. This is why transparency is disabled in Chafa (maybe it could be enabled in still images and disabled in animated GIFs?). And here are some reasons why Foot chose their way.

Yes - this was easy to improve; I went with the same solution as for Kitty, setting the default alpha threshold to 1 for animation frames (but not stills). It can be overridden with the -t argument.

@hpjansson
Owner

If you want a transparent animation, you'll need to clear the screen between frames. You can't just keep blitting transparent images on top of each other and expect that to work. But I'm not suggesting that needs to be fixed - the current gif implementation is clearly not intended to be transparent and that's fine.

Assuming it could be made to not flicker somehow, what's the best-practice way of clearing the space occupied by a sixel image to transparency?

@j4james

j4james commented Aug 23, 2024

what's the best-practice way of clearing the space occupied by a sixel image to transparency?

The simplest way would probably be with an erase to end of screen (i.e. \e[J). For a command line utility that's probably OK, because everything below your cursor position would typically be cleared anyway.

But if you want to make sure you're only erasing the actual area covered by the image, I guess a series of ECH sequences might be the next best thing. If you're in a position to check the capabilities of the terminal first, you could possibly optimise that with a macro. That could even include the common header of the sixel frames if you really want to make the most of it.

Another possible option is DECERA, but again that would require you checking the capabilities of the terminal first. It also requires that you know the current coordinates where your image is being output, so it's possibly not ideal for a command line utility.
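For reference, a sketch of what those three erase options look like as byte sequences. The helper names are made up for illustration. The ECH variant assumes the cursor sits at the image's top-left cell and that cursor-down (CSI B) is not blocked by the bottom margin.

```python
CSI = "\x1b["


def erase_below():
    """ED: erase from the cursor to the end of the screen."""
    return CSI + "J"


def erase_rect_ech(cols, rows):
    """Erase a cols x rows block at the cursor using one ECH per row."""
    out = []
    for i in range(rows):
        out.append(f"{CSI}{cols}X")      # ECH: erase `cols` characters
        if i < rows - 1:
            out.append(f"{CSI}B")        # CUD: down one row, same column
    if rows > 1:
        out.append(f"{CSI}{rows - 1}A")  # CUU: back to the starting row
    return "".join(out)


def erase_rect_decera(top, left, bottom, right):
    """DECERA: erase a rectangle given in absolute 1-based coordinates."""
    return f"{CSI}{top};{left};{bottom};{right}$z"
```

Only the DECERA form needs absolute coordinates, which matches the caveat above about knowing where the image is being output.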

Assuming it could be made to not flicker somehow

This should probably be OK if you can make sure to output everything in a single write, although that's maybe not guaranteed to work over a network connection.

But I should point out that this is potentially already an issue with your current use of the sixel background fill, because that's essentially doing the same thing on terminals that support progressive output. The background is first filled with the dimensions you specify, and the image content is later written over that. If the content is not all received at the same time, you can end up with part of the background fill showing through in some of the frames.

For a completely clean animation I would recommend double buffering with VT pages, but you're not going to find many terminals supporting that. And now that I think about it, I'm not sure there are that many terminals supporting transparent sixels either (at least that didn't use to be the case when I last tested, but maybe things have improved since then).

@hpjansson
Owner

hpjansson commented Aug 26, 2024

VT paging is an interesting option I hadn't considered, but perhaps most useful in a full-screen client. Maybe the synchronized output extension could be usable too.
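A sketch of what wrapping a frame in synchronized output could look like, assuming the extension meant here is the one commonly implemented as private mode 2026 (set with CSI ? 2026 h, reset with CSI ? 2026 l). Whether a given terminal honors the mode is not established in this thread, and the function names are illustrative.

```python
import sys

# ASSUMPTION: "synchronized output" refers to DEC private mode 2026.
BEGIN_SYNC = "\x1b[?2026h"
END_SYNC = "\x1b[?2026l"


def wrap_frame(frame):
    """Bracket one frame's output so a supporting terminal defers redraw."""
    return BEGIN_SYNC + frame + END_SYNC


def write_frame(frame, out=sys.stdout):
    """Emit the wrapped frame in a single write and flush it."""
    out.write(wrap_frame(frame))
    out.flush()
```

Terminals that do not recognize the mode are expected to ignore the set and reset sequences, so the frame content itself is unchanged.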

The other approaches are likely to result in flicker; there's no way to ensure big writes are atomic, and likewise no guarantee that the receiving terminal won't do smaller reads and interleave with visual updates. That's why I like the possibility of having a SRC image overwrite policy in addition to the traditional OVER. mlterm (probably wrongly) uses SRC regardless of P2 setting, while foot overwrites with the current ANSI background color when P2=0 or P2=2 (see @veltza's link above). If you define the default ANSI background color as transparency, then it's a viable SRC operation.

As a client-side implementer, I like foot's approach, since it allows for both OVER and SRC alpha operations, is easy to implement and doesn't rely on any hacks. I assume you've already thought about it and perhaps rejected it, but I'd be interested to know your rationale in any case. Feel free to link me to a previous response if you've already fielded this question :-)

@j4james

j4james commented Aug 26, 2024

My intention for Windows Terminal was for it to be an accurate emulation of the original DEC terminals, so I have no interest in any reinterpretation of the protocol that breaks backwards compatibility.

That said, if there was an agreement amongst other terminals on an extension that had real added value, but which didn't break backwards compatibility, I'd be open to considering it, but past experience suggests there's zero chance of that happening.

That's why I like the possibility of having a SRC image overwrite policy in addition to the traditional OVER.

I'm not sure I understand that. If by OVER you mean the 0 bits overwrite the underlying buffer, and SRC leaves 0 bit pixels unchanged, then sixel is always SRC - if it didn't work that way you wouldn't be able to build up a multicolor image. The only difference between P2=0 and P2=1 is whether an area of the buffer is prefilled before layering more content on top of it - it doesn't actually change the interpretation of 0 bits.

Now if you wanted to use sixel to prefill with the default text background color, you could potentially query that RGB value, and then set the sixel 0 color table entry with the same RGB (more or less). However that's not exactly the same thing. If the terminal has transparency enabled, then the default text background will show through whatever is behind the terminal window, but that wouldn't be the case for sixel pixels that merely shared the same RGB.
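A sketch of that idea, assuming the background is queried with OSC 11 (ESC ] 11 ; ? followed by a terminator) and that the reply uses the usual rgb:RRRR/GGGG/BBBB form. The sixel color definition takes RGB components as percentages, hence the rounding, which is part of why the match is only approximate. Function names are illustrative.

```python
import re

_OSC11 = re.compile(
    r"\x1b\]11;rgb:([0-9a-fA-F]{1,4})/([0-9a-fA-F]{1,4})/([0-9a-fA-F]{1,4})"
)


def parse_osc11(reply):
    """Return the background as (r, g, b) floats in 0..1, or None."""
    match = _OSC11.search(reply)
    if match is None:
        return None
    # Each component has 1 to 4 hex digits; scale by its own maximum.
    return tuple(int(c, 16) / (16 ** len(c) - 1) for c in match.groups())


def sixel_color_def(register, rgb):
    """Build a sixel color definition: # Pc ; 2 ; R% ; G% ; B%."""
    r, g, b = (round(c * 100) for c in rgb)
    return f"#{register};2;{r};{g};{b}"
```

For example, a reply of rgb:1e1e/1e1e/2e2e becomes #0;2;12;12;18 for register 0.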

In any event, if you're using P2=0, that is still going to have the potential to flicker, regardless of what color you're filling with, unless you also set the RA dimensions to something like 1x1 (which is almost the same thing as P2=1).
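To make the two knobs concrete, here is a sketch of building the start of a sixel sequence with a chosen P2 and raster attributes. The function name is illustrative. P1 and P3 are left at 0 and the aspect ratio at 1:1, matching the header bytes shown in the dumps earlier in this thread.

```python
def sixel_header(p2, width, height, pan=1, pad=1):
    """Return ESC P 0;P2;0 q followed by the raster attributes."""
    return f'\x1bP0;{p2};0q"{pan};{pad};{width};{height}'


# String terminator that ends the sixel sequence.
ST = "\x1b\\"
```

With these, `sixel_header(0, 300, 156)` describes a 300x156 area that may be prefilled, while `sixel_header(0, 1, 1)` and `sixel_header(1, 300, 156)` are the two ways mentioned above of avoiding a large fill.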

@hpjansson
Owner

My intention for Windows Terminal was for it to be an accurate emulation of the original DEC terminals, so I have no interest in any reinterpretation of the protocol that breaks backwards compatibility.

I don't want that either. The spec I'm looking at says this:

P2 selects how the terminal draws the background color. You can use one of three values.

P2                 Meaning
0 or 2 (default)   Pixel positions specified as 0 are set to the current background color.
1                  Pixel positions specified as 0 remain at their current color.

Unless there's a clarification to the spec I've missed, I think it's most reasonable to interpret "current background color" as the currently set ANSI BG color, which could be the default color where some terminals will substitute a gradient, wallpaper, window manager transparency, etc.

The foot author has explained this better than I can.

That's why I like the possibility of having a SRC image overwrite policy in addition to the traditional OVER.

I'm not sure I understand that. If by OVER you mean the 0 bits overwrite the underlying buffer, and SRC leaves 0 bit pixels unchanged, then sixel is always SRC - if it didn't work that way you wouldn't be able to build up a multicolor image. The only difference between P2=0 and P2=1 is whether an area of the buffer is prefilled before layering more content on top of it - it doesn't actually change the interpretation of 0 bits.

I'm just referencing Porter-Duff, so it's the other way around. SRC is a straight copy (including copying the alpha channel), while OVER blends SRC and DST. They're the most common image composition ops (e.g. pixman, ImageMagick). I'm referring to the pixels that are left transparent after all sixel bands are applied, in the resulting image buffer. Whenever the image buffer is composited on the terminal (due to scrolling, expose events, etc) or on another image buffer, the operation dictates what the transparent pixels do. I use SRC to mean the transparency is copied and results in transparency on DST. I use OVER to mean they're assigned DST's pixel values.

My conceptual model is this:

Sixels --decode--> Image Buffer (RGBA) --src/over--> Terminal Content (RGBA) --over--> Background (RGB)

...where Terminal Content is whatever's in the terminal's display area (at this point, the default BG color is 0x00000000, or full transparency), and Background is the default background, be it a solid color, gradient, picture, desktop, etc.
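A toy version of that model, with 1-bit alpha represented by None, may make the distinction easier to see. It is only an illustration of the two Porter-Duff operators as used in this comment, not code from Chafa or any terminal.

```python
TRANSPARENT = None


def composite_src(src, dst):
    """SRC: straight copy, so transparent pixels stay transparent."""
    return list(src)


def composite_over(src, dst):
    """OVER: transparent source pixels take the destination's value."""
    return [d if s is TRANSPARENT else s for s, d in zip(src, dst)]


def render(image, content, background, op):
    """Image --op--> terminal content --over--> background."""
    return composite_over(op(image, content), background)
```

With an image of one opaque pixel and one transparent pixel drawn over existing text, SRC lets the background show through the transparent pixel, while OVER leaves the old text there.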

Now if you wanted to use sixel to prefill with the default text background color, you could potentially query that RGB value, and then set the sixel 0 color table entry with the same RGB (more or less). However that's not exactly the same thing. If the terminal has transparency enabled, then the default text background will show through whatever is behind the terminal window, but that wouldn't be the case for sixel pixels that merely shared the same RGB.

Right.

In any event, if you're using P2=0, that is still going to have the potential to flicker, regardless of what color you're filling with, unless you also set the RA dimensions to something like 1x1 (which is almost the same thing as P2=1).

Makes sense - but I haven't come across this problem in practice, at least not yet. Do you know about configurations where progressive updates are or will be implemented, presumably leading to this issue?

@j4james

j4james commented Aug 28, 2024

Unless there's a clarification to the spec I've missed, I think it's most reasonable to interpret "current background color" as the currently set ANSI BG color

What you've missed is that we've tested this on a real DEC VT340, so we know how it's supposed to work, thereby avoiding the need to guess what is "reasonable". None of the DEC sixel terminals actually supported ANSI colors, but the VT340 does have a palette of 16 colors, and the text background and the sixel background fill will both typically use palette entry 0, so most of the time the sixel background fill will match the text background.

However, on modern terminals, the text palette and sixel palette are not usually the same thing, because the text palette on a modern terminal is ANSI compatible, while the sixel palette matches the VT340 (at least for the first 16 colors). So while the sixel palette entry 0 will be black by default, and in many color schemes the default text background will be black, that's not guaranteed (and technically isn't guaranteed on the VT340 either).

If you're unsure about any aspect of sixel, and genuinely want to know how it is supposed to work, I'd encourage you to raise an issue on the vt340test repo.

I'm just referencing Porter-Duff, so it's the other way around. SRC is a straight copy (including copying the alpha channel), while OVER blends SRC and DST.

OK, that makes a lot more sense now.

But I think the misunderstanding here is coming from your interpretation of sixel as an image format that gets decoded and then blitted onto the page. But it's really more of a drawing protocol that plots pixels individually. There's no alpha channel - there are just pixels that are plotted (the 1 bits) and pixels that are skipped (the 0 bits). And in the case of P2=0, there's potentially also a rectangular area that is filled.
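A small sketch of that plotting model. Each sixel data character encodes six vertical pixels as its character code minus 63, with the lowest bit at the top; 1 bits are drawn in the current color and 0 bits leave the canvas alone. The canvas representation and function name are illustrative.

```python
def plot_sixel_char(canvas, x, band_y, ch, color):
    """Plot one sixel data character into canvas[row][column]."""
    bits = ord(ch) - 63
    if not 0 <= bits <= 63:
        raise ValueError("not a sixel data character")
    for i in range(6):
        if bits & (1 << i):
            canvas[band_y + i][x] = color  # 1 bit: pixel is plotted
        # 0 bit: pixel is skipped, whatever was there stays


# A one-column, six-row canvas; "." marks untouched pixels.
canvas = [["."] for _ in range(6)]
plot_sixel_char(canvas, 0, 0, "~", 1)  # all six bits set, color 1
plot_sixel_char(canvas, 0, 0, "@", 2)  # only the top bit set, color 2
plot_sixel_char(canvas, 0, 0, "?", 3)  # no bits set, nothing changes
```

The second call overwrites only the top pixel, which is how a multicolor image is built up from several passes over the same band.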

Do you know about configurations where progressive updates are or will be implemented, presumably leading to this issue?

Yes. Windows Terminal supports progressive updates, as does Reflection Desktop, and IBM's Personal Communications terminal emulator.

@hpjansson
Owner

Was just coming here to post an example image demonstrating the problem. As veltza mentioned, not all terminals work the same way, but here's a screenshot showing Xterm, WezTerm, and Windows Terminal.

I think I have the blank space issue solved now; it looks ok in Windows Terminal canary here. Attached an x64 binary in #212 for testing.

@hpjansson
Owner

Edit: If you actually intended that area to be transparent, you should set the sixel P2 parameter to 1.

As per our discussion, I changed the code to always set P2 to 1 and to fill all pixels manually for animations. Hopefully this also addresses issues with progressive redraw, although I haven't actually tested this.

However, on modern terminals, the text palette and sixel palette are not usually the same thing, because the text palette on a modern terminal is ANSI compatible, while the sixel palette matches the VT340 (at least for the first 16 colors). So while the sixel palette entry 0 will be black by default, and in many color schemes the default text background will be black, that's not guaranteed (and technically isn't guaranteed on the VT340 either).

That's exactly what I needed to know. Thank you!

But I think the misunderstanding here is coming from your interpretation of sixel as an image format that gets decoded and then blitted onto the page. But it's really more of a drawing protocol that plots pixels individually. There's no alpha channel - there are just pixels that are plotted (the 1 bits) and pixels that are skipped (the 0 bits). And in the case of P2=0, there's potentially also a rectangular area that is filled.

I think that part is pretty clear, but the spec's "pixel set to the current background color" isn't enough to go on, and conceptualizing it as a compositing op with the background and 1-bit alpha helps explain why some modern TEs see it as a valid interpretation (e.g. when their current background color is some kind of wallpaper). This is, of course, moot when the test unit shows otherwise :-)

@j4james

j4james commented Sep 4, 2024

I think I have the blank space issue solved now; it looks ok in Windows Terminal canary here. Attached an x64 binary in #212 for testing.

Yep, that looks good to me. Thanks.

hpjansson added a commit that referenced this issue Sep 9, 2024
This makes sixels print at the correct size in terminals with a fixed
virtual cell size, like Windows Terminal and legacy DEC terminals, if
this can not be detected at runtime.

Fixes #211 (GitHub).
hpjansson added a commit that referenced this issue Sep 9, 2024
We can clear the background in animation frames by setting the alpha
threshold now, like we do for Kitty.

Ref: #211 (GitHub).
Ref: #147 (GitHub).
hpjansson added a commit that referenced this issue Sep 9, 2024
Stills maintain their transparency (the default alpha threshold is
.5). This can be overridden with the -t flag.

Ref: #211 (GitHub).
Ref: #147 (GitHub).