Support Right to Left (RTL) text #538
Ubuntu doesn't refer to a particular terminal emulator. Some of its terminal emulators (e.g. konsole, pterm, mlterm) do some right-to-left rendering, and so does macOS's Terminal.app. And they all do it wrong. Just to give you an idea, let me mention here one of the many problems they all suffer from. They unconditionally "reverse" every Farsi/Arabic/Hebrew/etc. text. This fixes them in the command line and in the output of simple utilities, and at the same time, hopelessly, unfixably breaks them for more complex applications (e.g. Emacs). See my work at https://terminal-wg.pages.freedesktop.org/bidi/ (along with a WIP implementation in VTE / GNOME Terminal) for a more detailed description of the problems and a proposed specification for the desired behavior.
Added this issue in my gist to track: https://gist.github.com/XVilka/a0e49e1c65370ba11c17
Partial fix in #1873
(#1873)

This is a partial fix of #538. It does *not* change the Console RTL behavior; it does, however, fix an issue in the rendering.

Basically, DirectX expects the origin to be on the right if it is going to draw RTL text. This PR is a simple fix for that. Rather than draw with the left point and then move the origin rightwards, we check whether the run is RTL; if so, we move the origin rightwards immediately, and then draw. LTR rendering is unchanged.

This doesn't fix the underlying questions of RTL handling in the console. It's just a render bugfix. However, this render bugfix should still be a big help and solve the low-hanging issues.

## Validation Steps Performed

Behavior was tested. No changes were made to the underlying console. Three sample cases:

1. RTL text input
   Before: ![image](https://user-images.githubusercontent.com/16987694/60816422-6737e100-a1a2-11e9-9e14-c62323fd5b02.png)
   After: ![image](https://user-images.githubusercontent.com/16987694/60816395-5ab38880-a1a2-11e9-9f0a-17b03f8268ce.png)
2. Hebrew Output
   Before (the Hebrew text is all being drawn to the left of the screen, hence the phantom text): ![image](https://user-images.githubusercontent.com/16987694/60816527-93ebf880-a1a2-11e9-9ba3-d3ebb46cc404.png)
   After: ![image](https://user-images.githubusercontent.com/16987694/60816456-77e85700-a1a2-11e9-9783-9e69849f026d.png)
3. Mixed Output
   So, this is where this is partial. Due to inherent quirks of RTL behavior, it doesn't look perfect. But the rendering itself is no longer at fault.
   Before: ![image](https://user-images.githubusercontent.com/16987694/60816593-b5e57b00-a1a2-11e9-82be-0fcabb80f7d4.png)
   After: ![image](https://user-images.githubusercontent.com/16987694/60816607-bb42c580-a1a2-11e9-849a-12846ec4d5c0.png)
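The origin handling described in this PR can be sketched roughly as follows. This is not the actual renderer code: `run_origin_x` and `layout_runs` are hypothetical names, and each run is modeled as a plain `(width, is_rtl)` pair, assuming (as the PR text says) that the drawing API wants the right edge as the origin for RTL runs.

```python
def run_origin_x(left_x, run_width, is_rtl):
    """Return the x coordinate to pass to the draw call for one run."""
    # Assumption from the PR description: an RTL run is drawn from its
    # right edge, so the origin is moved rightwards *before* drawing.
    return left_x + run_width if is_rtl else left_x


def layout_runs(runs, start_x=0.0):
    """runs: list of (width, is_rtl). Returns the draw origin of each run."""
    origins = []
    x = start_x
    for width, is_rtl in runs:
        origins.append(run_origin_x(x, width, is_rtl))
        x += width  # the pen still advances rightwards by the run width
    return origins
```

For an LTR run nothing changes; only the RTL run gets its origin shifted to its right edge, which matches the "LTR rendering is unchanged" claim.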
Consecutive RTL GlyphRuns are drawn from the last to the first. References #538, #7149, all those issues asking for RTL closed as dupes. As @miniksa suggested in a comment on #7149 -- handle the thingy on the render side. If we have GlyphRuns abcdEFGh, where EFG are RTL, we draw them now in order abcdGFEh. This has ransom-noting, because I didn't touch the font scaling at all. This should fix the majority of RTL issues, except it *doesn't* fix issues with colors, because those get split in the TextBuffer phase in the renderer I think, so they show up separately by the GlyphRun phase.
(cherry picked from commit 60b44c8)
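A minimal sketch of the reordering this commit describes, assuming each GlyphRun is reduced to a `(text, is_rtl)` pair. `reorder_runs` is a hypothetical helper for illustration, not the function used in the renderer.

```python
def reorder_runs(runs):
    """Reverse each stretch of consecutive RTL runs; leave LTR runs in place.

    runs: list of (text, is_rtl) in logical order.
    Returns a new list in drawing (visual) order.
    """
    out = []
    pending_rtl = []  # consecutive RTL runs collected so far
    for run in runs:
        if run[1]:
            pending_rtl.append(run)
        else:
            # An LTR run ends the RTL stretch: emit it last-to-first.
            out.extend(reversed(pending_rtl))
            pending_rtl.clear()
            out.append(run)
    out.extend(reversed(pending_rtl))  # flush a trailing RTL stretch
    return out
```

With the example from the commit message, the runs `abcd`, `E`, `F`, `G`, `h` (where `E`, `F`, `G` are RTL) come out as `abcd`, `G`, `F`, `E`, `h`. Only the order of whole runs changes, so anything split earlier (such as the color runs mentioned above) is not merged back together.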
Using Terminal Preview, the RTL now works! 🥳 Although there are some unsightly joining gaps... it pretty much renders correctly!
But in Vim
EDIT: The joining in Vim can be fixed by either:
It does work in nano though...
How should the command prompt be handled with Right-to-Left support?
Is something like that what would be expected? I only speak English, so I have little experience of what would feel right or comfortable. English apps will probably still output Left to Right, unless they specifically add support for RTL. But if an app does add support, would input be right-aligned?
@adueck commented on Aug 20, 2020, 8:51 AM GMT+4:30:
The letters are still disjoint though. Any idea why that is?
The disjoint aspect is going to be because it is splitting GlyphRuns due to scaling. I expect that this shouldn't happen if you switch to a font that supports Arabic.
This commit is larger than it appears due to fanout from threading through bidi parameters. The main changes are:

* When clustering cells, add an additional phase to resolve embedding levels and further sub-divide a cluster based on the resolved bidi runs; this is where we get the direction for a run and this needs to be passed through to the shaper.
* When doing bidi, the forced cluster boundary hack that we use to de-ligature when cursoring through text needs to be disabled, otherwise the cursor appears to push/rotate the text in that cluster when moving through it! We'll need to find a different way to handle shading the cursor that eliminates the original cursor/ligature/black issue.
* In the shaper, the logic for coalescing unresolved runs for font fallback assumed LTR and needed to be adjusted to cluster RTL. That meant also computing a little index of codepoint lengths.
* Added `experimental_bidi` boolean option that defaults to false. When enabled, it activates the bidi processing phase in clustering with a strong hint that the paragraph is LTR.

This implementation is incomplete and/or wrong for a number of cases:

* The config option should probably allow specifying the paragraph direction hint to use by default.
* https://terminal-wg.pages.freedesktop.org/bidi/recommendation/paragraphs.html recommends that bidi be applied to logical lines, not physical lines (or really: ranges within physical lines), which is what we're doing at the moment.
* The paragraph direction hint should be overridden by cell attributes and other escapes; see 85a6b17 and probably others.

However, as of this commit, if you set `experimental_bidi=true` then

```
echo This is RTL -> عربي فارسی bidi
```

(that text was sourced from: microsoft/terminal#538 (comment))

then wezterm will display the text in the same order as the text renders in Chrome for that github comment.
```
; ./target/debug/wezterm --config experimental_bidi=false ls-fonts --text "عربي فارسی ->"
LeftToRight
 0 ع \u{639} x_adv=8 glyph=300 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 2 ر \u{631} x_adv=3.78125 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 4 ب \u{628} x_adv=4 glyph=244 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 6 ي \u{64a} x_adv=4 glyph=363 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 8   \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 9 ف \u{641} x_adv=11 glyph=328 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا \u{627} x_adv=4 glyph=240 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر \u{631} x_adv=3.78125 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س \u{633} x_adv=10 glyph=278 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
17 ی \u{6cc} x_adv=4 glyph=664 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
19   \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
20 - \u{2d} x_adv=8 glyph=276 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
21 > \u{3e} x_adv=8 glyph=338 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
```

```
; ./target/debug/wezterm --config experimental_bidi=true ls-fonts --text "عربي فارسی ->"
RightToLeft
17 ی \u{6cc} x_adv=9 glyph=906 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س \u{633} x_adv=10 glyph=277 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر \u{631} x_adv=4.78125 glyph=272 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا \u{627} x_adv=4 glyph=241 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 9 ف \u{641} x_adv=5 glyph=329 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 8   \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 6 ي \u{64a} x_adv=9 glyph=904 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 4 ب \u{628} x_adv=4 glyph=243 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 2 ر \u{631} x_adv=5 glyph=273 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 0 ع \u{639} x_adv=6 glyph=301 wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false}) /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
LeftToRight
 0   \u{20} x_adv=8 glyph=2 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 1 - \u{2d} x_adv=8 glyph=480 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 2 > \u{3e} x_adv=8 glyph=470 wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false}) /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
;
```

refs: #784
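To give a feel for what "sub-divide a cluster based on the resolved bidi runs" means, here is a deliberately simplified sketch using Python's standard `unicodedata` module. It is not wezterm's implementation and it is not the full Unicode Bidirectional Algorithm: `directional_runs` is a hypothetical helper that only looks at strong character classes and lets every neutral character inherit the direction of the text before it.

```python
import unicodedata

STRONG_RTL = {"R", "AL"}  # Hebrew-type and Arabic-type strong characters
STRONG_LTR = {"L"}


def directional_runs(text, paragraph_rtl=False):
    """Split text into (substring, is_rtl) runs in logical order.

    Simplification: neutrals and weak characters (spaces, punctuation,
    digits) inherit the direction of the preceding strong character, or
    the paragraph direction hint if there is none yet.
    """
    runs = []
    current = paragraph_rtl
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls in STRONG_RTL:
            current = True
        elif cls in STRONG_LTR:
            current = False
        if runs and runs[-1][1] == current:
            runs[-1][0] += ch  # same direction: extend the current run
        else:
            runs.append([ch, current])  # direction changed: start a new run
    return [(run_text, is_rtl) for run_text, is_rtl in runs]
```

Note the difference from the `ls-fonts` output above: there the trailing ` ->` ends up in a `LeftToRight` run, because the real algorithm resolves neutrals using the text on both sides and the paragraph direction. The sketch would instead attach ` ->` to the preceding RTL run, which is exactly the kind of case a full implementation has to get right.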
I'd like to add to the discussion that although RTL seems to render correctly, the modification/insertion of characters doesn't behave correctly. It looks like the character insertion happens in reverse order. That is, when I insert a character at the 3rd index (from the right), it gets inserted at the 3rd index (from the left). This behaviour is consistent in Windows Terminal regardless of the application open (Vim, bash, CMD, etc).
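One possible way to picture the mismatch reported here (an assumption on my part, not a confirmed diagnosis): once a run is drawn right-to-left, the column the cursor appears in and the index in the stored text no longer coincide. The hypothetical helper below maps a visual column within a single run back to its logical index.

```python
def visual_to_logical(visual_col, run_length, is_rtl):
    """Map a 0-based visual column (counted from the left edge of a run)
    to the logical index of the character displayed there."""
    if not 0 <= visual_col < run_length:
        raise IndexError("column outside the run")
    # In an RTL run the first logical character sits in the rightmost column.
    return run_length - 1 - visual_col if is_rtl else visual_col
```

If the renderer reorders the text but the application keeps using the cursor column directly as a buffer index, an edit would land at the mirrored position, which would look like the "from the right" versus "from the left" behaviour described above.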
As an Arabic speaker, I would prefer, when setting my terminal to an RTL mode (so everything is right-aligned), to have even LTR language output be right-aligned. This is because moving my eyes up and down is much easier than moving them diagonally. It is also a generally good UX principle to align things vertically to make them easier to read. So perhaps something like below would be a reasonable solution.
Mixing ANSI escape codes into the text breaks the alignment inside words. Related issue: rust-lang/rust#97020
For reference, this is what needs implementing, it seems: https://www.unicode.org/reports/tr9/
@zadjii-msft hey, would you be able to expand on this a bit (and link to other issues/responses explaining this): why has the Microsoft team decided to build a newer and better terminal on the "old grounds" when they are so outdated? (Not saying 'broken'; they were built in another age, so they're just obsolete now.)
Well, that also assumes that there is a right way to do RTL text in terminals, and that's a notoriously unsolved issue. Even if one terminal emulator tried to fix it on their own, it's not a problem that can be solved solely by the terminal emulator: it needs cooperation from the CLI application itself, too. As @egmontkob wrote here (which is possibly the definitive treatise on the topic)
On the bright side, by reusing the text buffer from conhost, we gain two main benefits:
What you're doing and what's happening: When writing some right-to-left characters (Farsi, Arabic, ...) after some left-to-right characters, the RTL text extends leftwards from the cursor position, so it gets mixed with the LTR text written before it.
Steps to reproduce: Open a Windows Terminal window. Write some English characters, like `abcdef`. Then write some Farsi/Arabic text. Notice that the Farsi text goes inside the English text.
Screenshot: