Support for RTL languages (such as Arabic / Hebrew / Persian etc.) #86667
Comments
@MicrosoftSam, can you please review this issue? |
@tomerm PR welcome. If anything special needs to be done, it could be done in Our goal with the monaco editor and vscode is to be a code editor, i.e. plain-text editing is not a priority. Regarding programming language editing, have you actually tried these scenarios in the monaco-editor? I am not an expert, but we do support RTL and bi-di to quite some extent:
|
@alexandrudima thanks a lot for your quick reply. The examples you provided are rendered properly. However, this is just one use case: you used exclusively Hebrew characters (no numbers, no English text in the middle of Hebrew text, etc.). In other words, I totally agree that Monaco editor provides correct basic reordering (taking text stored in the buffer and transforming it for presentation on the screen). If you start using mixed (English + Hebrew) text in comments / constants (not to mention using Hebrew in variable names, class names, etc.), the display becomes considerably less readable. It will be very similar to the display we get in Notepad. However, while Notepad is not aware of any programming language syntax (and thus is not expected to enforce it), Monaco is aware of such syntax and should ensure it is preserved on the display layer. |
Thank you for explaining the issue. Indeed, it would be better to interpret syntax characters as Left-To-Right instead of Neutral when rendering source code. Here is a quick test I did for (notice how github doesn't get this right either):
var ת = "מיותר קודמות צ'ט של, אם לשון העברית שינויים ויש, אם";
Editors tested: Monaco Editor, Ace Editor, CodeMirror, Word, Notepad, a browser textarea, SublimeText (no support at all). And the only one that appears to get it right? Visual Studio |
@alexandrudima, several examples of editors in which we do get expected behavior: Orion; Eclipse; ACE; non-code editors |
@alexandrudima , I have tested this fix using the latest code. |
@amirbrans Thank you for verifying. I have FF 50.1.0 on Windows 10 and it appears to work there. Does your test URL contain |
@alexandrudima, actually yes, it does contain the editor=dev parameter. I'm using FF ESR 45.6 |
Following up, document with Bidi issues discovered on Monaco editor: |
@alexandrudima "STT" mentioned in @amirbrans report above refers to "STructured Text". It is a general term we use to describe a text with internal structure which in general is not preserved by UBA (Unicode Bidi Algorithm) in case it includes bidi text. Please let us know if you need any help with fixing issues in the report above. I just want to make sure we don't duplicate our efforts if you have plans to address those issues yourself. Many thanks in advance for your attention and help. |
We're looking for a great code editor for our content authors (using Jekyll, YAML, HTML, CSS, ...) and I am interested in VS Code, or maybe even just in Monaco. Since the content is in Hebrew, Arabic and English, this RTL issue is critical for us. |
Thank you for the extra testing. The root cause is a bit silly: the theme uses the same colors for identifiers and syntactical text. If the theme gave them even a slightly different color, the text would be split into multiple tokens and the "dir" trick might work... |
@alexandrudima I would highly appreciate an update on how this is going on the VS Code team's side. |
@tomerm |
Unfortunately, VS Code does not have this feature. I hope to see it soon. The best editor I have found for bidirectional editing of scripts written in RTL is Emacs. So the trick, for the time being, is to edit RTL scripts in Emacs and then load them into VS Code |
@tomerm @alexdima hello, I would be VERY grateful for and interested in your replies. I also need to be able to type a line of code which contains Hebrew + English + lots of parentheses without the order of the parentheses, the Hebrew letters, or the words being swapped (as is the case in most editors). This is what I want (written with Textmate): This is what one gets now in vscode (not correct; when you copy-paste the correct line above into vscode, it ruins the order of the parentheses): This is what is displayed when you copy-paste the same line of code from Textmate (the correct one, which I want to achieve) here on GitHub (not correct either):
At the end of the day, @tomerm, could you please tell me which editor/platform (one that enables everything you mentioned in your first post) you advise me to use (i.e. one that works well and is as nice/strong as vscode)? תודה רבה / thank you @alexdima |
Hello guys
You could also put a button in the tab bar, or somewhere else, that inserts this character. I also suggest showing a marker for the character in the text, so that the user knows whether the character is there. As others have mentioned, it is very useful for LaTeX and related tools. |
Is this issue anywhere near fixed? We're still seeing broken rendering in RTL languages (mine is Arabic). The problem mainly occurs when a line starts with RTL text such as Arabic letters. The image has 5 examples.
unicode-bidi: plaintext;
unicode-bidi: bidi-override; direction: rtl;
<span dir="auto"></span>
I suppose the second modification would require some detection, while the first one potentially breaks other stuff? The third option seems too easy to implement. Comments also seem to be fixed with the HTML solution and the |
I can confirm the issue is not resolved in master for Hebrew either (@KL13NT); so I suspect there is no progress. |
A way I tried to implement this (by adding this style rule in the Chrome debugger interface):
/* Add per-line support for RTL without alignment change */
.view-line>span:before {
content: "\200f";
}
.view-line>span:after {
content: "\200f";
}
This has the advantage of not affecting the line's content directly (it is applied as a style) and can be implemented as a per-line solution. |
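In plain-string terms, the rule above surrounds each rendered span with a RIGHT-TO-LEFT MARK (U+200F) while the stored text stays untouched. A minimal sketch of that idea follows; the helper name `decorateForDisplay` is hypothetical and is not part of Monaco or VS Code.

```typescript
const RLM = "\u200f"; // RIGHT-TO-LEFT MARK, the same character as CSS "\200f"

// Hypothetical helper: build the display copy of one span's text.
// The input string itself is never modified.
function decorateForDisplay(spanText: string): string {
  return RLM + spanText + RLM;
}

// The buffer keeps the original text; only the rendered copy carries the marks.
const bufferLine = "x = 1;";
const displayedLine = decorateForDisplay(bufferLine);
```

Because the marks exist only in the display copy, saving or copying from the buffer would not pick them up, which matches the "applied as a style" advantage described in the comment.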
I believe this is a duplicate of #83365, although this has more info |
Indeed, text lines that combine Arabic and English text are a huge problem for VS Code. Even copy-paste doesn't work as one would expect. |
This is a must for any language/user that can work with RTL (or bi-di) |
Any updates? |
Christ, it's been 9 years, Microsoft; it shouldn't be this hard to implement. |
My name is Tomer Mahlin. I lead a development team at IBM named Bidi Development Lab. We have specialized for more than 20 years in developing support for languages with bidirectional scripts (or "bidi lang." for short).
We recently ran a sniff assessment of Monaco's capabilities with respect to bidi lang. display. We believe there are several functional areas which require improvement (please see more details below).
My team can work on the necessary modifications and propose them via a separate pull request, assuming the community is interested in addressing the requirements detailed below.
Plain text editing
Possible values should be:
AND / OR (in case there are no toolbars for any new buttons)
Programming lang. editing
As opposed to plain text, a programming language has a well-defined syntax. Part of this syntax is visualized via the color scheme used for coloring different elements of the language (i.e. comments vs. variables, etc.). It is critical to enforce the visual appearance associated with the syntax regardless of the natural language used in those elements. If this is not done, it becomes virtually impossible to work with the code when bidi text is used. Simple English example:
a = b + c; // hello world
If bidi characters are used instead you would expect to see:
A = B + C; // DLROW OLLEH
Instead at the moment you see:
DLROW OLLEH // ;C+B=A
The more complex the example, the less intuitive the display becomes.
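One possible way to get the expected display is to isolate every run of bidi characters, so that operators and punctuation between them resolve to the left-to-right base direction. The sketch below does this with the Unicode isolate characters RLI (U+2067) and PDI (U+2069). The function names and the character ranges are assumptions made for illustration (a rough approximation of Hebrew and Arabic blocks, not the full Unicode bidi class data), and nothing here is taken from Monaco's implementation.

```typescript
const RLI = "\u2067"; // RIGHT-TO-LEFT ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

// Rough approximation (assumption): Hebrew, Arabic and neighbouring BMP blocks.
function isRtlChar(code: number): boolean {
  return (
    (code >= 0x0590 && code <= 0x08ff) ||
    (code >= 0xfb1d && code <= 0xfdff) ||
    (code >= 0xfe70 && code <= 0xfefc)
  );
}

// Wrap each maximal run of RTL characters in RLI ... PDI.
// Everything else (operators, spaces, Latin text) is copied unchanged.
function isolateRtlRuns(code: string): string {
  let out = "";
  let i = 0;
  while (i < code.length) {
    if (isRtlChar(code.charCodeAt(i))) {
      let j = i;
      while (j < code.length && isRtlChar(code.charCodeAt(j))) {
        j++;
      }
      out += RLI + code.slice(i, j) + PDI;
      i = j;
    } else {
      out += code.charAt(i);
      i++;
    }
  }
  return out;
}
```

For a line such as `A = B + C;` with bidi identifiers, a renderer that implements the isolates of the Unicode Bidirectional Algorithm (UAX #9) should then keep `=`, `+` and `;` in their logical positions. The isolates are meant for a display copy only; inserting them into the saved source would change the file.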
A special case is comments and/or constants. These are where bidi characters usually appear (or at least much more frequently than in variable names, for example). It is thus preferable to display text in those contexts using the natural text direction for bidi languages (which is RTL). We can't store text direction information with the text (a source code file is still a plain text file, which can't include any meta information about the text such as font size, color, direction, etc.). Consequently, we should be able to make a smart choice while displaying the text, relying just on the text itself. The most straightforward approach is to enforce auto (aka contextual or first strong) direction of text for each paragraph included in comments.
For example, currently the display of sample text is as follows:
res = var1 + var2; // SI EMAN YM tomer !!!
If we enforce auto text direction on the comment we will see:
res = var1 + var2; // !!! tomer SI EMAN YM
Namely, the text of the comment will appear in actual RTL direction, which is the natural one for bidi lang.
Display of text with natural text direction makes it considerably more readable and thus should greatly enhance user experience for bidi users.
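The "auto (first strong)" rule for comments described above can be sketched as follows: scan the comment text and take the direction of the first character with a strong direction. The function names and the character ranges are assumptions for illustration only; a real implementation would use the full Unicode bidi class data rather than these approximate ranges.

```typescript
type Direction = "ltr" | "rtl" | "neutral";

// Rough approximation (assumption): Hebrew, Arabic and neighbouring BMP blocks.
function isStrongRtl(code: number): boolean {
  return (
    (code >= 0x0590 && code <= 0x08ff) ||
    (code >= 0xfb1d && code <= 0xfdff) ||
    (code >= 0xfe70 && code <= 0xfefc)
  );
}

// Rough approximation (assumption): only ASCII letters count as strong LTR.
function isStrongLtr(code: number): boolean {
  return (code >= 0x41 && code <= 0x5a) || (code >= 0x61 && code <= 0x7a);
}

// Return the direction of the first strong character, or "neutral"
// when the text has only digits, punctuation and whitespace.
function firstStrongDirection(text: string): Direction {
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (isStrongRtl(code)) {
      return "rtl";
    }
    if (isStrongLtr(code)) {
      return "ltr";
    }
  }
  return "neutral";
}
```

In the sample above, the comment text begins with bidi characters in the buffer, so this rule would pick RTL for it, while a comment that begins with an English word would stay LTR.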
Relevant requests
At some point support for bidi lang. was requested in vscode via #4994