TextEmbed Widget Discussion #2
-
@AnonymouX47 OK, so about TextEmbed. Now that you've basically done the impossible, I'll tell you more about my use case and we'll see how closely the widget lines up with those requirements 😄

In Toot, Mastodon status posts are displayed using a custom widget - an urwid.Pile containing, among other things, an urwid.Text widget rendering the status content with multiple lines of styled text. Here's a good example. For now we're concerned with the urwid.Text widget rendering the area that's bracketed in red:

Mostly plain text, with styled hashtags. Also, some kind of weird glyph after #biz, but that might be some Windows Terminal weirdness so let's ignore that. There's a URL in the text as well. This example has no embedded custom server emoji :shortcodes:, which is typical - they are more frequently used in display names, but they sometimes do appear in status content text.

Ideally, what I'd like to do is this. I'd like to be able to drop in the TextEmbed widget in place of an urwid.Text widget and have it work like an urwid.Text widget (same options for color, format, alignment). But with TextEmbed, I can also specify placeholders in the text, and drop in other widgets to render in the placeholder's location. (What you have done with the { } syntax.) I can then use the UrwidImage widget to render custom server emoji in a placeholder location, and I can use URLText to render a hyperlink with OSC 8 codes in another placeholder location.

This means that I will be doing the following, for terminals that have pixel rendering capability:
For terminals that can only render to cells, I will do a subset of the above - just steps 1, 2, 4, 6, 7. :shortcodes: will be left as is.

Given that the number of :shortcode: images and URLs I need to embed will vary from one status post to another, it'd be easiest for me to provide a list of widgets to TextEmbed, rather than sending named parameters.
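To make the "placeholders plus a list of widgets" idea concrete, here is a minimal sketch using only the standard library. It is not the TextEmbed API: the `SHORTCODE` pattern and the `extract_shortcodes` helper are hypothetical names made up for illustration. It replaces each known :shortcode: with a positional `{}` placeholder and returns the shortcode names in order, so the caller could build one widget per name and pass them along as a list.

```python
import re

# Hypothetical pattern for Mastodon-style custom emoji shortcodes, e.g. ":blobcat:".
SHORTCODE = re.compile(r":([A-Za-z0-9_]+):")


def extract_shortcodes(text, available):
    """Replace known :shortcodes: with "{}" and return (template, names).

    `available` is the set of shortcode names for which an image exists.
    Unknown shortcodes are left in the text untouched.
    Note: literal braces already present in `text` are not escaped here.
    """
    names = []

    def substitute(match):
        name = match.group(1)
        if name not in available:
            return match.group(0)  # leave unknown shortcodes as they are
        names.append(name)
        return "{}"

    template = SHORTCODE.sub(substitute, text)
    return template, names


template, names = extract_shortcodes(
    "Hello :wave: world :unknown: bye :cat:", {"wave", "cat"}
)
print(template)  # Hello {} world :unknown: bye {}
print(names)     # ['wave', 'cat']
```

On a terminal that can only render to cells, passing an empty set for `available` leaves every :shortcode: as is, which matches the reduced set of steps described above.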
-
As for the name,
-
I've opened the PR, but note that it's only at an initial stage.
-
By the way, if you're planning an official release of term-image, I think it's reasonable to do a release without TextEmbed, then include TextEmbed in a future release. The existing UrwidImage widget (and the others) is worthy of a release on its own.
-
(rewriting my entire comment because I now understand that you intend for TextEmbed to be a separate entity from TermImage)
If you're willing to host TextEmbed and maintain it as an MIT-licensed library, we could incorporate it into Toot that way, and direct anyone who wants to change it to submit a PR to you. That way you would be able to get all changes under the MIT license from day one.
-
Hello! 😃 Sorry for keeping this waiting for sooo long. Happily, there's no need for excuses anymore... It's finally here!

Source: https://github.com/AnonymouX47/urwidgets/blob/main/src/urwidgets/text_embed.py

I ended up implementing something even beyond my initial imagination. Here's a little demo: simplescreenrecorder-2023-04-27_15.56.52.mp4

Lest I forget, this

Also, the test script I used in the demo above: https://pastebin.com/HFknitc0 (You'll need to install
-
Here goes something 😃: https://urwidgets.readthedocs.io/en/latest/api.html#urwidgets.parse_text
-
Had to simplify your example a bit to get it working in Toot for Mastodon statuses (there's no Markdown syntax available, so I'm just recognizing embedded URLs in plaintext via regex).

Here's an example of a URL surrounded by plaintext; the TextEmbed and Hyperlink widgets work correctly, the line wrapping works, and the OSC 8 functionality gives the correct behavior in Windows Terminal! I tried half a dozen different ways to make this work by hacking the Text widget; this is much, much better. 😁

There's more work to be done before this can be submitted as a PR. I need to make it play nicely with some other features. And then I need to rewrite it entirely once PR 348 is accepted. (Then I'll be rendering real HTML anchor tags in the terminal.)

Anyway, this is excellent work, thanks again!
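For readers following along, here is a rough standard-library sketch of the two ideas mentioned in this comment: recognizing URLs in plaintext with a regex, and wrapping a label in OSC 8 hyperlink escape sequences. It is not the code used in Toot and does not call parse_text or the Hyperlink widget; the `URL` pattern and the `url_markup` and `osc8` helpers are hypothetical names, and the pattern is deliberately simple (for instance, trailing punctuation ends up inside the match).

```python
import re

# Deliberately simple URL pattern (an assumption; real-world matching is messier).
URL = re.compile(r"https?://[^\s<>\"']+")


def url_markup(text, link_attr="link"):
    """Split text into a urwid-style markup list.

    Plain runs stay as strings; each URL becomes an (attribute, text) tuple.
    """
    markup = []
    pos = 0
    for match in URL.finditer(text):
        if match.start() > pos:
            markup.append(text[pos:match.start()])  # plain text before the URL
        markup.append((link_attr, match.group(0)))
        pos = match.end()
    if pos < len(text):
        markup.append(text[pos:])  # trailing plain text
    return markup


def osc8(uri, label):
    """Wrap `label` in OSC 8 escape sequences so a supporting terminal links it to `uri`."""
    return f"\x1b]8;;{uri}\x1b\\{label}\x1b]8;;\x1b\\"


print(url_markup("see https://example.com/a now"))
# ['see ', ('link', 'https://example.com/a'), ' now']
```

The `osc8` string is only what the terminal ultimately needs to receive; getting it through a widget's rendering path is a separate matter that this sketch does not cover.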
-
It's so close to being 100% identical in layout / formatting to the master branch now, with one issue that I'm having a hard time figuring out. Is it possible that TextEmbed is calculating a Unicode character's width in columns, and in this one case is doing so incorrectly?

Here's how the text is rendered using standard urwid.Text:

Here's how it looks using TextEmbed:

I should note that I'm using parse_text with a regex to detect the URL and build the markup list. With TextEmbed, everything looks fine with the camera emoji and the single space after it. The camera emoji is U+1F4F7, introduced in Unicode 6.0.

The Unicode strings are: '📷 https://www.windy.com/-Webcams/webcams/1577050386'

FWIW, Toot uses the wcwidth library to calculate Unicode character widths.
-
The above two strings are rendered as two separate

I should build a small test app to attempt to reproduce the problem outside the entire Toot application. Will try to do that after I take care of my responsibilities today.
-
I've tried it on my end and couldn't reproduce this behaviour. Though, there was one difference between your output and mine... the glyph for the U+1F5FA symbol occupied

It seems whatever is responsible for the issue is specific to

What exactly did you mean by this and in what way does it affect the output of

Or better still, this:

Thanks.
-
Re: wcwidth. Toot doesn't use it everywhere, just in our custom widgets. For example, when setting the width of our custom button widgets that include emoji in their titles, we use it to measure the width of the Unicode text properly, so the buttons display their titles at full width without line-wrapping.

There was a push to integrate wcwidth into Urwid itself; see this issue that didn't get resolved. That link shows a lot of references to Urwid issues where the width of certain Unicode characters - especially newer emoji, or combining characters - was being calculated differently by Urwid than the width the terminal app itself was using to draw them.

This is a nasty unsolved problem across terminal apps - basically, the low-level code that a lot of software uses to calculate Unicode character widths is out of date, hasn't been updated since Unicode 5.0 days, and doesn't support combining characters well. The TUI app and the terminal software have to agree on the character widths or things start getting out of alignment. wcwidth goes part of the way towards solving these problems; another requirement is to determine at runtime the Unicode version supported by the terminal in use, so wcwidth can return values that agree with the terminal's Unicode version.

Windows Terminal is pretty good with recent Unicode standards; I see some width problems with a few emoji, but not a lot. Other terminal programs like xterm.js have more problems because they've not been keeping up with Unicode standards.

Anyway, to bring it back around to the bug at hand - when I saw this I thought we might have some kind of emoji width calculation bug going on, which is why I brought up wcwidth. Based on your testing, this may not be the case. If you can't reproduce on your side, I'll try to do so here. This is really the only glitch I've seen so far in comparing output of the old Text vs new TextEmbed widgets.
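As a rough illustration of what "calculating the width in columns" involves, here is a sketch that uses only Python's standard `unicodedata` module. It is a stand-in for wcwidth, not a reimplementation of it, and it is not what Urwid or TextEmbed do internally; `estimate_width` is a hypothetical helper. The result depends on the Unicode data bundled with the interpreter, which is exactly the kind of version dependence described above.

```python
import unicodedata


def estimate_width(text):
    """Estimate how many terminal columns `text` occupies.

    Assumptions (a simplification of what wcwidth does):
    - combining marks and format characters take 0 columns,
    - East Asian Wide/Fullwidth characters take 2 columns,
    - everything else takes 1 column.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue  # zero-width: attaches to the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


print(estimate_width("abc"))       # 3
print(estimate_width("\U0001F4F7"))  # 2 (camera emoji, classified as Wide)
print(estimate_width("e\u0301"))   # 1 (e + combining acute accent)
```

A terminal that draws a given glyph with a different number of cells than this estimate will push the rest of the line out of alignment, whatever the application computes.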
-
I'm not sure this is even fixable by anyone other than the terminal emulator authors. So, I think the best thing to do is to live with it. It's not a bug in TextEmbed.

If it is possible to fix on the TUI side, it would probably require integrating wcwidth and ucs-detect into the Urwid canvas rendering code: ucs-detect would run at startup to determine the highest version of Unicode supported by the terminal, and wcwidth would provide the true width of each Unicode character as it will be rendered in the terminal.
-
Moving the discussion of TextEmbed here (from here, because it really shouldn't be in a PR)