Bi-directional text rendering issues in translated version of book #1413
Comments
My Pull request at: |
No, please don't dynamically change |
Thanks @moaminsharifi for raising this. We are probably the first project to test out the brand new right-to-left functionality in mdbook, so your feedback is valuable for finding and fixing problems. Note that the blocks of English text you highlighted appear because the translation is incomplete. Would the page look correct otherwise? Also, you should connect with the people in #671 to discuss the problems: a small group of volunteers showed up recently and started working hard on the translation. Thanks @Manishearth for jumping in here! I have near-zero knowledge of this, so I need help from others here. What I can explain is how the translation system works:
This probably gives us limited options to apply any special tags, and it especially limits the ways we can apply tags from within a translation (unless the "tag" can be encoded as a Unicode character in the replacement text?). The I hope this helps a bit with the overall flow; otherwise I'm happy to explain more. |
<html ... dir="{{ text_direction }}"> This is what I do to support BIDI: I just add dir="auto" to the beginning of each tag, like this: And the issue also works the other way around: putting RTL text inside LTR text raises the same issue. |
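A minimal sketch of the idea in this comment, written as a Python post-processing step over an HTML string rather than as a template or browser change. The tag list and the `add_dir_auto` helper are assumptions for illustration and are not part of mdBook; the regex is only meant for simple, well-formed markup.

```python
import re

# Block-level tags mentioned later in this thread; the exact set is an assumption.
BLOCK_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6", "p", "details", "ul", "li")

# Matches an opening tag such as <p> or <li class="x">, but not <pre> or <link>.
_OPEN_TAG = re.compile(r"<(%s)\b([^>]*)>" % "|".join(BLOCK_TAGS), re.IGNORECASE)


def add_dir_auto(html: str) -> str:
    """Add dir="auto" to selected opening tags that have no dir attribute yet."""

    def repl(match: re.Match) -> str:
        name, attrs = match.group(1), match.group(2)
        # Leave tags alone when the author already chose a direction.
        if re.search(r"(?<![\w-])dir\s*=", attrs, re.IGNORECASE):
            return match.group(0)
        return f'<{name} dir="auto"{attrs}>'

    return _OPEN_TAG.sub(repl, html)
```

An explicit `dir` set by the author is left untouched, so auto-detection only applies where nothing else was specified.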
I think we can also create a renderer without changing mdBook, as specified in https://rust-lang.github.io/mdBook/format/configuration/renderers.html. Then, if the language is RTL and this text isn't translated, add But the whole point of my PR #1414 was to make it as simple as possible with some JS and CSS in the client browser. |
I mean, if you don't want to change mdBook, using straight up divs and spans should work just fine |
If I understand your comment correctly, conflicting content can also occur in h1-h6, p, details, ul, and li tags. |
Yes, you can stick dir attrs on spans or divs within them. The core thing is that base directionality is a function of the author's intent. Using auto directionality on all these elements works somewhat, but it is fraught when you e.g. have a bullet point that starts with a word (probably a proper noun) in the Latin script but is supposed to be an overall sentence in an RTL sentence. Like the sentence "Rust َچہَ١ ہے", which should be RTL but will be detected as LTR instead because it happens to start with "Rust". The author actually has to have the ability to signal this intent. Fortunately, they do: Markdown allows HTML inside it. It should ideally be rare to need a directionality shift. The one thing that may not work is where you need to affect the block layout of an item.
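To illustrate the pitfall, here is a rough Python approximation of the first-strong heuristic that `dir="auto"` relies on: the first character with a strong bidi class (L, R, or AL) decides the direction. The function name is hypothetical, and the sketch ignores details of the real browser algorithm, such as which child elements are skipped.

```python
import unicodedata


def first_strong_direction(text: str) -> str:
    """Return 'ltr' or 'rtl' based on the first strongly directional character.

    Falls back to 'ltr' when the text has no strong character at all.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "ltr"


# A sentence that is meant to be RTL but starts with the Latin word "Rust".
print(first_strong_direction("Rust \u06c1\u06d2"))  # ltr
```

The detected direction follows the leading Latin word, not the author's intent, which is why an explicit `dir` is needed in such cases.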
I mean, it's not, changing the |
Small note: we don't have spans or divs in the Markdown (of course). Any solution should hopefully preserve the Markdown files the way they look now: as fairly regular Markdown files without a lot of HTML. The best place to inject anything would be during translation (I believe). We do know the source and target languages and we have full control over the Markdown AST at this point. So we could for example wrap untranslated text with We would have to be careful about this, though, since we are still working in the Markdown layer. We're talking about transforming
- foo
- bar
- baz
into
- Translated foo
<div dir="...">
- bar
</div>
- Translated baz
This is not a completely faithful translation, since the div will break the list.
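One possible way around the broken list, sketched in Python under the assumption that an inline `span` inside the list item is acceptable instead of a `div` around it. `render_list` and its arguments are hypothetical and do not reflect the actual mdbook-i18n-helpers API, which works on the Markdown AST rather than on strings.

```python
def render_list(items, translations, source_dir="ltr"):
    """Render a Markdown bullet list, marking untranslated items inline.

    items: source-language strings, in order.
    translations: dict mapping a source string to its translation.
    source_dir: direction of the source language (assumed 'ltr' here).
    """
    lines = []
    for item in items:
        if item in translations:
            lines.append(f"- {translations[item]}")
        else:
            # Keep the bullet itself in Markdown so the list is not split.
            lines.append(f'- <span dir="{source_dir}">{item}</span>')
    return "\n".join(lines)


print(render_list(["foo", "bar", "baz"],
                  {"foo": "Translated foo", "baz": "Translated baz"}))
```

The list stays a single list, but an inline span presumably only affects the text inside the item and not the block layout of the item itself.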
This is probably where the disconnect is: the translation pipeline (https://github.com/google/mdbook-i18n-helpers) does not give translators a full Markdown file at a time. Instead it extracts text from the Markdown AST and replaces this with a translation (still in the AST). The translator also doesn't know or "intend" to do a directionality shift: the shifts today are there because the translation is very incomplete. @moaminsharifi, I tried asking this above: would the page look okay (or nearly okay) if all paragraphs and list items were translated? Could you perhaps try experimenting with a smaller page? For example, is 1.2. میانبرهای صفحه کلی looking correct? If the fully-translated pages are usable, then the problem is much smaller: it will in some sense fix itself as the translation progresses. If the fully-translated pages are also broken, well then we have a bigger problem and I'll need the help of you both to fix it. |
@mgeisler, it seems okay: not great, but okay. At this point in the conversation, I would suggest that we invest in translating the book first, and then we can inject RTL text into the LTR version after we have a fully translated version. My point with this issue was to mention that, as a web developer, I can see how important it is to make multi-language versions of websites (in this case a book) and how the bidi issue makes it hard for readers of other languages to follow. Now we know how this looks from different perspectives and how to get around issues where they arise. For now it's better to archive this issue and PR; the next person who wants to translate into any RTL language can just check them out, and as you said (@mgeisler), the problem is smaller than what I showed.
Thanks to @mgeisler and @Manishearth for your contributions to the conversation. |
Thanks for confirming!
If you like, we could add a note about the problem of mixing the two directionalities to the translation instructions. The note could point to this issue and then perhaps someone can help us improve the situation down the line. I'm blind to the issues myself, so I appreciate you and @Manishearth looking at them. I can mostly help you by explaining the mechanism we use to do the translations — which we've mostly built ourselves so we have the option of modifying it to an extent. I'll close this for now and then we can revisit later if new information appears. |
Hm? Markdown supports embedding most HTML tags
Yeah that would be nice. In some cases it is ideal to do it on the wrapping tag.
This is the assumption I have been operating off of: this ought to be a rare problem, and when it crops up we should use the existing HTML solutions for embedding languages.
Well, they will in some cases; especially around code.
No, I understand, but within a markdown chunk you can still use HTML tags, yes? |
FWIW a thing that turned out useful in the Rust Website translation was that I hooked up a custom function called ENGLISH that would set the |
Sorry, I meant to say that we try to avoid HTML in our English source files today. I was afraid you wanted us to mark up things with spans in the Markdown — which would be doable, but costly in terms of readability and maintenance.
I see, you're right then. I was thinking of the case where one list item is in English because the translation is still in progress.
Yes, that is correct! The translators can inject arbitrary HTML into the translation. So maybe that is what is needed? Could translators translate the example on Code Samples like this:
#: src/cargo/code-samples.md:13
msgid ""
"```rust,editable\n"
"fn main() {\n"
"    println!(\"Edit me!\");\n"
"}\n"
"```"
msgstr ""
"<div dir=\"...\">\n"
"\n"
"```rust,editable\n"
"fn main() {\n"
"    println!(\"Edit me!\");\n"
"}\n"
"```\n"
"\n"
"</div>"
That is, wrap the Markdown code block in a div element with If there is a common correct way to do this, then we could talk about detecting such code blocks in |
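As a rough sketch of what such a wrapping step could look like if it were automated, here is a small Python helper. `wrap_in_dir_div` is a hypothetical name, and the direction is left to the caller since the thread does not settle which value is right.

```python
def wrap_in_dir_div(markdown_block: str, direction: str) -> str:
    """Wrap a Markdown block in a div with an explicit dir attribute.

    The blank lines after the opening tag and before the closing tag are
    deliberate: they end the HTML block, so the content in between is
    still parsed as Markdown.
    """
    if direction not in ("ltr", "rtl", "auto"):
        raise ValueError(f"unsupported direction: {direction!r}")
    return f'<div dir="{direction}">\n\n{markdown_block}\n\n</div>'
```

The result has the same shape as the msgstr above: opening tag, blank line, the original block, blank line, closing tag.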
I checked it out. It seems that because the ``` block is not converted purely to HTML, and we have some JavaScript which renders it, it's not working at all: |
Well for me code blocks are already forced-LTR, which is desired behavior anyway. I was thinking about other block level elements.
No, it's not because of the JS, it's because the code block CSS has a But yeah, the code block HTML is tricky and you'd need to make the Ace editor support RTL as well, which would take a while; it has a lot of tricky layout bits. |
cc: @mgeisler |
There are a number of issues with the rendering of bi-directional text in RTL languages. These issues can cause text to be displayed incorrectly, making it difficult or impossible to read.
Some of the most common issues include:
Issue description
As an example, when you compile the latest version of the repo in Persian, this is what the first page looks like:
It seems right to someone who doesn't know RTL languages; however, if you are learning in your native language, it can be confusing.
This is how it must be rendered in RTL languages:
Proposed solution for the bi-directional issue
Why not fix it directly in mdBook?
My first step, before opening a GitHub issue here, was to check the Contributing to mdBook document:
The current PR backlog is beyond what we can process at this time. Only issues that have an E-Help-wanted or Feature accepted label will likely receive reviews.
mdBook/CONTRIBUTING.md
As far as I can tell, supporting BiDi is not a priority for them at the moment.
My Proposed method:
The way I fix the bidi issue for the HTML tags is to add the dir="auto" HTML attribute with JS just after rendering, and to add "unicode-bidi:embed;" to those classes.
References:
#671
mdBook issue#1486 Support for Right to Left