Support Right to Left (RTL) text #538

Open
MahdiGhiasi opened this issue May 7, 2019 · 22 comments
Labels
  • Area-Rendering: Text rendering, emoji, complex glyph & font-fallback issues
  • Help Wanted: We encourage anyone to jump in on these.
  • Issue-Feature: Complex enough to require an in depth planning process and actual budgeted, scheduled work.
  • Product-Terminal: The new Windows Terminal.

@MahdiGhiasi

MahdiGhiasi commented May 7, 2019

  • What you're doing and what's happening: When writing right-to-left characters (Farsi, Arabic, ...) after left-to-right characters, the RTL text extends leftwards from the cursor position, so it gets mixed with the LTR text written before it.

  • Steps to reproduce: Open a Windows Terminal window. Write some English characters, like abcdef. Then write some Farsi/Arabic text, like:

      سلام
    

    Notice that the Farsi text runs into the English text.

  • Screenshot:

image

  • What should be happening instead: As the console is generally left-to-right, the RTL text should start further to the right (the cursor should move right), and its end should sit exactly where the LTR text ended. See how Notepad (correctly) handles this situation:

image

  • Windows build number: 10.0.18362.86
@MahdiGhiasi MahdiGhiasi changed the title from "Right to left text direction in Windows Terminal" to "Right to left text getting mixed with left to right text in Windows Terminal" May 7, 2019
@DHowett-MSFT DHowett-MSFT added Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Issue-Bug It either shouldn't be doing this or needs an investigation. labels May 9, 2019
@miniksa miniksa added Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. and removed Issue-Bug It either shouldn't be doing this or needs an investigation. labels May 9, 2019

@ghost ghost added the Needs-Tag-Fix Doesn't match tag requirements label May 17, 2019
@miniksa miniksa added Product-Terminal The new Windows Terminal. and removed Mass-Chaos labels May 17, 2019
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label May 17, 2019
@zadjii-msft zadjii-msft added the Help Wanted We encourage anyone to jump in on these. label May 24, 2019
@zadjii-msft zadjii-msft added this to the Terminal Backlog milestone May 24, 2019
@mtlive

mtlive commented Jun 3, 2019

Old codebases aren't that bad 😁:
win98

@egmontkob

while Ubuntu and Mac can do

Ubuntu doesn't refer to a particular terminal emulator. Some of its terminal emulators (e.g. konsole, pterm, mlterm) do some right-to-left rendering, and so does macOS's Terminal.app.

And they all do it wrong.

Just to give you an idea, let me mention here one of the many problems they all suffer from. They unconditionally "reverse" every Farsi/Arabic/Hebrew/etc. text. This fixes them on the command line and in the output of simple utilities, and at the same time hopelessly, unfixably breaks them for more complex applications (e.g. Emacs).

See my work at https://terminal-wg.pages.freedesktop.org/bidi/ (along with a WIP implementation in VTE / GNOME Terminal) about a more detailed description of the problems and a proposed specification for the desired behavior.
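
The unconditional-reversal problem described above can be illustrated with a small sketch. This is not taken from any of the terminals mentioned; `is_rtl` and `naive_reverse_rtl` are hypothetical helpers, and the assumption is simply that the emulator reverses every maximal run of RTL characters it receives.

```python
import unicodedata


def is_rtl(ch: str) -> bool:
    # Bidi classes R (e.g. Hebrew) and AL (Arabic letters) are strongly right-to-left.
    return unicodedata.bidirectional(ch) in ("R", "AL")


def naive_reverse_rtl(line: str) -> str:
    """Reverse every maximal run of RTL characters (hypothetical emulator behavior)."""
    out, run = [], []
    for ch in line:
        if is_rtl(ch):
            run.append(ch)
        else:
            out.extend(reversed(run))
            run.clear()
            out.append(ch)
    out.extend(reversed(run))
    return "".join(out)


# Hebrew alef, bet, gimel in logical (typing) order, after some LTR text.
logical = "abc \u05d0\u05d1\u05d2"

# A simple utility sends logical order; one reversal puts the cells in visual order.
visual = naive_reverse_rtl(logical)

# An application that already lays the text out visually sends `visual`;
# the emulator reverses it again and the word ends up back in logical order on screen.
double = naive_reverse_rtl(visual)
```

In this model the second reversal undoes the first, which is one way to picture why applications that handle BiDi themselves cannot coexist with an emulator that always reverses.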

@miniksa miniksa changed the title from "Right to left text getting mixed with left to right text in Windows Terminal" to "Support Right to Left (RTL) text" Jun 27, 2019
@XVilka

XVilka commented Jul 2, 2019

Added this issue in my gist to track: https://gist.github.com/XVilka/a0e49e1c65370ba11c17

@schorrm
Contributor

schorrm commented Jul 8, 2019

Partial fix, in #1873

DHowett-MSFT pushed a commit that referenced this issue Jul 10, 2019
#1873)

This is a partial fix of #538. It does *not* change the Console RTL behavior; it does, however, fix an issue in the rendering. Basically, DirectX expects the origin to be on the right if it is going to draw RTL text. This PR is a simple fix for that: rather than drawing from the left point and then moving the origin rightwards, we check whether the run is RTL and, if so, move the origin rightwards first and then draw. LTR rendering is unchanged.
This doesn't fix underlying questions of RTL handling in the console. It's just a render bugfix. However, this render bugfix should still be a big help and solve the low-hanging issues.
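
The origin handling described above can be sketched roughly as follows. This is a simplified model, not the DirectWrite API or the code from the PR: `place_runs` is a hypothetical helper, and runs are assumed to be `(text, width, is_rtl)` tuples laid out on a single baseline.

```python
def place_runs(runs, start_x=0.0):
    """Return (text, origin_x) for each run, in the order the runs are given.

    LTR runs are drawn from their left edge, then the pen advances.
    RTL runs advance the pen first, so the origin is their right edge.
    """
    x = start_x
    placed = []
    for text, width, is_rtl in runs:
        if is_rtl:
            x += width                 # move the origin rightwards immediately
            placed.append((text, x))   # then draw from the right edge
        else:
            placed.append((text, x))   # draw from the left edge
            x += width                 # then move the origin rightwards
    return placed


layout = place_runs([("abc", 30.0, False), ("RTL", 20.0, True), ("d", 10.0, False)])
```

Either way the pen ends up in the same place after each run; only the point handed to the draw call differs for RTL runs.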

## Validation Steps Performed
Behavior was tested. No changes were made to underlying console.
Three sample cases:
1. RTL text input
Before:
![image](https://user-images.githubusercontent.com/16987694/60816422-6737e100-a1a2-11e9-9e14-c62323fd5b02.png)
After:
![image](https://user-images.githubusercontent.com/16987694/60816395-5ab38880-a1a2-11e9-9f0a-17b03f8268ce.png)
2. Hebrew Output
Before (the Hebrew text is all being drawn to the left of the screen, hence the phantom text):
![image](https://user-images.githubusercontent.com/16987694/60816527-93ebf880-a1a2-11e9-9ba3-d3ebb46cc404.png)
After:
![image](https://user-images.githubusercontent.com/16987694/60816456-77e85700-a1a2-11e9-9783-9e69849f026d.png)
3. Mixed Output
So, this is where the fix is partial. Because of how RTL behavior inherently works in the console, it doesn't look perfect. But the rendering itself is no longer at fault.
Before:
![image](https://user-images.githubusercontent.com/16987694/60816593-b5e57b00-a1a2-11e9-82be-0fcabb80f7d4.png)
After:
![image](https://user-images.githubusercontent.com/16987694/60816607-bb42c580-a1a2-11e9-849a-12846ec4d5c0.png)
mcpiroman pushed a commit to mcpiroman/terminal that referenced this issue Jul 23, 2019
DHowett pushed a commit that referenced this issue Aug 7, 2020
Consecutive RTL GlyphRuns are drawn from the last to the first.

References
#538, #7149, all those issues asking for RTL closed as dupes.

As @miniksa suggested in a comment on #7149 -- handle the thingy on the
render side.

If we have GlyphRuns abcdEFGh, where EFG are RTL, we draw them now in
order abcdGFEh.

This has ransom-noting, because I didn't touch the font scaling at all.
This should fix the majority of RTL issues, except it *doesn't* fix
issues with colors, because those get split in the TextBuffer phase in
the renderer I think, so they show up separately by the GlyphRun phase.
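
The reordering described in this commit message can be sketched as below. It is a minimal model, not the renderer's code: `visual_order` is a hypothetical helper, and each run is assumed to be a `(text, is_rtl)` pair given in logical order.

```python
def visual_order(runs):
    """Reverse each maximal stretch of consecutive RTL runs; leave LTR runs in place."""
    out, pending = [], []
    for text, is_rtl in runs:
        if is_rtl:
            pending.append(text)
        else:
            out.extend(reversed(pending))  # flush the RTL stretch, last run first
            pending.clear()
            out.append(text)
    out.extend(reversed(pending))
    return out


# Mirror the example from the commit message: uppercase letters stand for RTL runs.
runs = [(c, c.isupper()) for c in "abcdEFGh"]
drawn = "".join(visual_order(runs))
```

With the example from the message, `drawn` is `abcdGFEh`.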
DHowett pushed a commit that referenced this issue Aug 11, 2020
(cherry picked from commit 60b44c8)
@adueck

adueck commented Aug 20, 2020

Using terminal preview Version: 1.2.2234.0

The RTL now works! 🥳 Although there are some unsightly joining gaps... it pretty much renders correctly!

This is RTL -> عربي فارسی

rtl-working

But in Vim version 8.1.2269 (Ubuntu 20.04) it doesn't quite work. 😢 The RTL ordering shows up correctly, but unfortunately the joining is broken.

This is RTL in vim not joining properly -> عربي فارسی

rtl-not-joining

EDIT: The joining in Vim can be fixed by either:

  • setting termbidi to true (:set termbidi)
    • default is false but true for mlterm
  • setting arabicshape to false (:set noarabicshape)
    • default is true

It does work in nano though...

rtl-nano

@schorrm
Contributor

schorrm commented Aug 20, 2020

@adueck - yup, this would've been my fix in #7190.
It looks to me like Vim is controlling its own thing; I don't think there's a real issue here.

@mdtauk

mdtauk commented Jan 19, 2021

How should the command prompt be handled with Right-to-Left support?

|                                                  output  |
|                                                          |
|                                                  █ </:C  |
____________________________________________________________

Is something like that what would be expected? I only speak English, so I have little experience of what would feel right or comfortable.

English apps will probably still output left to right, unless they specifically add support for RTL.

But if they do add support, would input be right-aligned?

@NightMachinery

@adueck commented on Aug 20, 2020, 8:51 AM GMT+4:30:

It does work in nano though...

rtl-nano

The letters are still disjoint though. Any idea why that is?

@schorrm
Contributor

schorrm commented Apr 26, 2021

The disjoint aspect is going to be because it is splitting GlyphRuns due to scaling -- I expect that this shouldn't happen if you switch to a font that supports Arabic

@zadjii-msft zadjii-msft modified the milestones: Terminal Backlog, Backlog Jan 4, 2022
wez added a commit to wez/wezterm that referenced this issue Jan 25, 2022
This commit is larger than it appears due to fanout from threading
through bidi parameters.  The main changes are:

* When clustering cells, add an additional phase to resolve embedding
  levels and further sub-divide a cluster based on the resolved bidi
  runs; this is where we get the direction for a run and this needs
  to be passed through to the shaper.
* When doing bidi, the forced cluster boundary hack that we use to
  de-ligature when cursoring through text needs to be disabled,
  otherwise the cursor appears to push/rotate the text in that
  cluster when moving through it! We'll need to find a different
  way to handle shading the cursor that eliminates the original
  cursor/ligature/black issue.
* In the shaper, the logic for coalescing unresolved runs for font
  fallback assumed LTR and needed to be adjusted to cluster RTL.
  That meant also computing a little index of codepoint lengths.
* Added `experimental_bidi` boolean option that defaults to false.
  When enabled, it activates the bidi processing phase in clustering
  with a strong hint that the paragraph is LTR.

This implementation is incomplete and/or wrong for a number of cases:

* The config option should probably allow specifying the paragraph
  direction hint to use by default.
* https://terminal-wg.pages.freedesktop.org/bidi/recommendation/paragraphs.html
  recommends that bidi be applied to logical lines, not physical
  lines (or really: ranges within physical lines) that we're doing
  at the moment
* The paragraph direction hint should be overridden by cell attributes
  and other escapes; see 85a6b17

and probably others.

However, as of this commit, if you set `experimental_bidi=true` and run

```
echo This is RTL -> عربي فارسی bidi
```

(that text was sourced from:
microsoft/terminal#538 (comment))

then wezterm will display the text in the same order as the text
renders in Chrome for that github comment.

```
; ./target/debug/wezterm --config experimental_bidi=false ls-fonts --text "عربي فارسی ->"
LeftToRight
 0 ع    \u{639}      x_adv=8  glyph=300  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 2 ر    \u{631}      x_adv=3.78125 glyph=273  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 4 ب    \u{628}      x_adv=4  glyph=244  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 6 ي    \u{64a}      x_adv=4  glyph=363  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 8      \u{20}       x_adv=8  glyph=2    wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 9 ف    \u{641}      x_adv=11 glyph=328  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا    \u{627}      x_adv=4  glyph=240  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر    \u{631}      x_adv=3.78125 glyph=273  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س    \u{633}      x_adv=10 glyph=278  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
17 ی    \u{6cc}      x_adv=4  glyph=664  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
19      \u{20}       x_adv=8  glyph=2    wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
20 -    \u{2d}       x_adv=8  glyph=276  wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
21 >    \u{3e}       x_adv=8  glyph=338  wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
```

```
; ./target/debug/wezterm --config experimental_bidi=true ls-fonts --text "عربي فارسی ->"
RightToLeft
17 ی    \u{6cc}      x_adv=9  glyph=906  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
15 س    \u{633}      x_adv=10 glyph=277  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
13 ر    \u{631}      x_adv=4.78125 glyph=272  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
11 ا    \u{627}      x_adv=4  glyph=241  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 9 ف    \u{641}      x_adv=5  glyph=329  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 8      \u{20}       x_adv=8  glyph=2    wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 6 ي    \u{64a}      x_adv=9  glyph=904  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 4 ب    \u{628}      x_adv=4  glyph=243  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 2 ر    \u{631}      x_adv=5  glyph=273  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
 0 ع    \u{639}      x_adv=6  glyph=301  wezterm.font(".Geeza Pro Interface", {weight="Regular", stretch="Normal", italic=false})
                                      /System/Library/Fonts/GeezaPro.ttc index=2 variation=0, CoreText
LeftToRight
 0      \u{20}       x_adv=8  glyph=2    wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 1 -    \u{2d}       x_adv=8  glyph=480  wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
 2 >    \u{3e}       x_adv=8  glyph=470  wezterm.font("Operator Mono SSm Lig", {weight="DemiLight", stretch="Normal", italic=false})
                                      /Users/wez/.fonts/OperatorMonoSSmLig-Medium.otf, FontDirs
;
```
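
The run-splitting phase described in this commit can be approximated with a short sketch. It is not wezterm's implementation and it is far from the full Unicode Bidirectional Algorithm: `strong_direction` and `split_bidi_runs` are hypothetical helpers, only strong character classes are considered, and neutral or weak characters take the surrounding direction when both sides agree and the paragraph hint otherwise.

```python
import unicodedata
from itertools import groupby


def strong_direction(ch):
    """Return "RTL", "LTR", or None for characters without a strong direction."""
    cls = unicodedata.bidirectional(ch)
    if cls in ("R", "AL"):
        return "RTL"
    if cls == "L":
        return "LTR"
    return None


def split_bidi_runs(text, base="LTR"):
    """Split text into (direction, substring) runs, kept in logical order."""
    strong = [strong_direction(ch) for ch in text]
    resolved = []
    prev = base
    i = 0
    while i < len(strong):
        if strong[i] is not None:
            prev = strong[i]
            resolved.append(prev)
            i += 1
            continue
        # Find the end of this stretch of neutral/weak characters.
        j = i
        while j < len(strong) and strong[j] is None:
            j += 1
        following = strong[j] if j < len(strong) else base
        # Simplification: agree with both neighbors, else fall back to the base direction.
        fill = prev if prev == following else base
        resolved.extend([fill] * (j - i))
        i = j
    runs, pos = [], 0
    for direction, group in groupby(resolved):
        length = len(list(group))
        runs.append((direction, text[pos:pos + length]))
        pos += length
    return runs


# The sample text from the ls-fonts invocation, written with escapes.
sample = "\u0639\u0631\u0628\u064a \u0641\u0627\u0631\u0633\u06cc ->"
runs = split_bidi_runs(sample)
```

For the sample string this gives one RTL run of ten characters followed by an LTR run for ` ->`, the same split that the `experimental_bidi=true` output above reports. Each run would then be handed to the shaper together with its direction.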

refs: #784
@othmanalikhan

I'd like to add to the discussion that, although RTL text now seems to render correctly, modifying or inserting characters doesn't behave correctly. Using version 1.11.3471.0, see the behaviour below:

2022-03-26_11-17

It looks like character insertion happens in reverse order. That is, inserting a character at the 3rd index (from the right) instead inserts it at the 3rd index (from the left). This behaviour is consistent in Windows Terminal regardless of the application open (Vim, bash, CMD, etc.).
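
One way to picture the symptom reported above is as a mismatch between logical and visual positions. The sketch below only illustrates the report and is not a diagnosis of the Terminal code: `visual_column` and `insert` are hypothetical helpers, and the line is assumed to be purely RTL with one cell per character.

```python
def visual_column(logical_index: int, length: int) -> int:
    """Column, counted from the left, where a character of a purely RTL line is shown."""
    return length - 1 - logical_index


def insert(text: str, logical_index: int, ch: str) -> str:
    """Insert ch at a logical (typing-order) position."""
    return text[:logical_index] + ch + text[logical_index:]


word = "\u05d0\u05d1\u05d2\u05d3\u05d4"  # five Hebrew letters in logical order

# The user places the cursor two cells in from the right edge, i.e. logical position 2.
expected = insert(word, 2, "X")

# If that position is counted from the left instead, the edit lands at the mirrored spot.
mirrored = insert(word, len(word) - 2, "X")
```

Under these assumptions the first typed letter sits in the rightmost column (`visual_column(0, 5)` is `4`), so a cursor position counted from the left points at the mirrored logical position.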

@othmanalikhan

othmanalikhan commented Mar 26, 2022

How should the command prompt be handled with Right-to-Left support?

|                                                  output  |
|                                                          |
|                                                  █ </:C  |
____________________________________________________________

Is something like that what would be expected? I only speak English, so I have little experience of what would feel right or comfortable.

English apps will probably still output left to right, unless they specifically add support for RTL.

But if they do add support, would input be right-aligned?

As an Arabic speaker, I would prefer, when setting my terminal to an RTL mode (so everything is right-aligned), that even LTR-language output be right-aligned. This is because moving my eyes up and down is much easier than moving them diagonally. Aligning things vertically to make them easier to read is also a generally good UX principle.

So perhaps something like the following would be a reasonable solution.

|                                                                output  |
|                                                                        |
|                             ls طعام*txt.bk █ <Arduino/Program Files/:C  |
|                                                            طعاام.txt.bk  |
|                                                                        |
__________________________________________________________________________
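
The right-aligned layout proposed above could be sketched like this. It is only an illustration of the alignment, under the assumption that every character occupies exactly one cell (which does not hold for wide or combining characters); `right_align` is a hypothetical helper.

```python
def right_align(lines, width):
    """Pad each line on the left so that its last character sits in the last column."""
    return [line.rjust(width) for line in lines]


screen = right_align(["output", "C:\\>"], 12)
for row in screen:
    print("|" + row + "|")
```

Reordering the characters within each line for display is a separate step; this sketch only moves the start of each line.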

@Araxeus

Araxeus commented May 13, 2022

Mixing in ANSI color codes breaks the alignment inside words.

for example, take أهلا or שלום and color one of the letters in a different color
image

Related issue: rust-lang/rust#97020
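
A rough sketch of why coloring a single letter can matter: once the text is split at color changes, a word no longer reaches the renderer as one piece. This assumes, as an earlier commit message in this thread suggests, that differently colored text ends up in separate runs; `split_on_sgr` is a hypothetical helper and the regular expression only covers SGR (color/style) sequences.

```python
import re

# Matches SGR escape sequences such as "\x1b[31m" (red) or "\x1b[0m" (reset).
SGR = re.compile(r"\x1b\[[0-9;]*m")


def split_on_sgr(s):
    """Return the text segments between SGR sequences, dropping empty ones."""
    return [seg for seg in SGR.split(s) if seg]


word = "\u05e9\u05dc\u05d5\u05dd"  # the Hebrew word from the comment, in logical order
colored = word[0] + "\x1b[31m" + word[1] + "\x1b[0m" + word[2:]
segments = split_on_sgr(colored)
```

Here one four-letter word becomes three segments, each of which would be positioned on its own if the segments are laid out independently.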

@pyglot

pyglot commented Jun 26, 2022

For reference, this is what needs implementing, it seems: https://www.unicode.org/reports/tr9/
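
As a small taste of that specification, here is a sketch of how the paragraph embedding level is chosen from the first strong character (rules P2 and P3). It is a simplification: text inside directional isolates, which the standard skips at this step, is not handled, and `paragraph_level` is a hypothetical helper name.

```python
import unicodedata


def paragraph_level(text: str) -> int:
    """Return 1 (RTL) if the first strong character is R or AL, else 0 (LTR)."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return 0
        if cls in ("R", "AL"):
            return 1
    return 0  # no strong character found: default to left-to-right


ltr_first = paragraph_level("abc \u0633\u0644\u0627\u0645")
rtl_first = paragraph_level("\u0633\u0644\u0627\u0645 abc")
```

The remaining rules of the algorithm (explicit embeddings, weak and neutral resolution, reordering, mirroring) build on this level.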

@TheBestPessimist

TheBestPessimist commented May 3, 2023

Much of the new Terminal's codebase is directly shared with the old console codebase, and there are lots of places in that code that assume the text is all LTR. Any fixes for the Terminal would need to make sure to not break existing console behavior in this regard.

@zadjii-msft hey, would you be able to expand on this a bit (and link to other issues/responses explaining it): why did the Microsoft team decide to build a newer and better terminal on the "old grounds" when they are so outdated? (I'm not saying "broken"; they were built in another age, so they're just obsolete now.)
Why not start from scratch? Wouldn't that have made this a non-issue?

@zadjii-msft
Member

Well, that also assumes that there is a right way to do RTL text in terminals, and that's a notoriously unsolved issue. Even if one terminal emulator tried to fix it on their own, it's not a problem that can be solved solely by the terminal emulator - it needs cooperation from the CLI application itself, too.

As @egmontkob wrote here (which is possibly the definitive treatise on the topic):

With graphical applications, it’s the responsibility of one single application to do BiDi rendering, i.e. to convert the external data it handles (e.g. document, web page) along with its own UI to the pixel-by-pixel user-visible representation. In case of the terminal emulator, it’s the joint responsibility of two components: the emulator, and the application inside. The exact responsibility of each party and the interface between them needs to be well thought out.

On the bright side, by reusing the text buffer from conhost, we gain two main benefits:

  • Quicker initial prototyping. We didn't have to start from scratch. This probably saved months of work initially getting the Terminal out the door
  • Improvements to the terminal flow back to the inbox console as well. So, any perf improvements, rendering improvements, etc.: the vintage console gets those too. This includes a lot of the work that's been done for [Epic] Text Buffer rewrite #8000, which will make this more possible in the years to come.
