Multi-language support #3249

Open
fmrico opened this issue Dec 11, 2022 · 18 comments

@fmrico
Contributor

fmrico commented Dec 11, 2022

Hi all,

I would like to push for multi-language support in the documentation, so that the official ROS 2 documentation is available in Spanish, Portuguese, Japanese, French, and so on. I am convinced that language is an entry barrier for many potential ROS users. I am aware that, at least in the local Spanish group, considerable effort already goes into translating documents. I wonder if we could support that from this repo.

I have seen that the documentation is in https://docs.ros.org/en/, so it is reasonable to have a https://docs.ros.org/es/, https://docs.ros.org/pt/, https://docs.ros.org/fr/..., isn't it? We could initially fill the non-English versions with the English version, and let people, even non-technical people, contribute with their translations.

Do you think it is reasonable? Is it technically viable?

Thanks
Francisco

@christophebedard
Member

Unless a very small subset of the documentation is translated (e.g., installation instructions and some basic tutorials), I think that the translations would quickly become unmanageable and outdated or would just generally lag behind.

To complement the subset of manually-translated pages, we could perhaps rely on automatic translations. I just looked at Google's translation of the main Ubuntu installation instructions in French. It's not great (and of course doesn't translate text in images and some text in code blocks), but it's not terrible either.

@fmrico
Contributor Author

fmrico commented Dec 16, 2022

Yes, we would need some tooling to track outdated pages. It might be enough to compare timestamps and automatically open an issue saying that the page needs revision.
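As a rough illustration of the timestamp check, here is a minimal Python sketch. It assumes (hypothetically) that the English sources and a translation live in two directories with the same relative layout, and it compares file modification times. The function name and layout are made up for the example; in a real fork, commit timestamps from git would likely be more reliable than file mtimes, which a fresh clone resets.

```python
from pathlib import Path


def find_outdated(english_dir: Path, translated_dir: Path) -> list[str]:
    """Return relative paths of pages whose English source is newer than the translation."""
    outdated = []
    for en_page in sorted(english_dir.rglob("*.rst")):
        rel = en_page.relative_to(english_dir)
        tr_page = translated_dir / rel
        # A page with no translated counterpart is untranslated, not outdated.
        if not tr_page.exists():
            continue
        # Hypothetical rule: the translation is stale if the English file changed after it.
        if en_page.stat().st_mtime > tr_page.stat().st_mtime:
            outdated.append(rel.as_posix())
    return outdated
```

Each returned path could then be turned into an automatically opened issue asking for a revision of that page.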

I don't know if this repo should contain the docs in other languages. ROS has always followed a federated development model. Maybe interested local user groups in specific languages could maintain a fork of this repo, in which they would start translating the documents. Each local group would be responsible for updating the documents.

Maybe we could move this discussion to ROS Discourse. I will make a post there...

@fmrico
Contributor Author

fmrico commented Dec 20, 2022

I have made some progress here. To address the concern @christophebedard raised, I have developed https://github.com/fmrico/sync-docs, a GitHub action that:

The action is in the testing stage, but it could be helpful for forked repos for language translations.

@ros-discourse

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-documentation-in-other-languages/28811/4

@fujitatomoya
Collaborator

IMO, this is going to be good for the local ROS communities; I second this approach.

Unless a very small subset of the documentation is translated (e.g., installation instructions and some basic tutorials), I think that the translations would quickly become unmanageable and outdated or would just generally lag behind.

This is true. I am skeptical about having multi-language support in mainline. Local, community-based support would probably be better.

I don't know if this repo should contain the docs in other languages.

Probably not, for the same reason as above.

But if we do take multi-language support into this repo, I would request the following dependency rules:

  • mainline doc WILL NOT depend on any multiple language contents.
  • Only multiple language contents can refer to mainline doc.

@fmrico
Contributor Author

fmrico commented Dec 20, 2022

Agree completely, @fujitatomoya.

This repo (English version) is the reference. Any local group should maintain a fork under its own organization/user. I have provided a GitHub action to help keep the forked repos in sync with it.

We would only need help generating and linking the docs under https://docs.ros.org/*/

@ros-discourse

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/llamada-para-traductores-de-la-documentacion-oficial-de-ros-2/28882/1

@clalancette clalancette added the help wanted Extra attention is needed label Jan 5, 2023
@olivier-stasse

olivier-stasse commented Jan 7, 2023

Hi @fmrico,
I quickly tried out Sphinx's multilingual support, and it produces a small set of files for each language.
This was done on the rolling branch of ros2_documentation:

cd sources
ln -s ../conf.py .
ln -s ../favicon.ico .
sphinx-build -b gettext . ../build/gettext

The output in build/gettext is

(ros2_doc) ➜  gettext git:(rolling) ✗ ls
Citations.pot        Contact.pot          How-To-Guides.pot    Related-Projects.pot The-ROS2-Project.pot index.pot
Concepts.pot         Glossary.pot         Installation.pot     Releases.pot         Tutorials.pot

Each string has an identifier, so it seems very easy to maintain.

The only caveat is that any modification to the upstream documentation will modify the msgid (which might be seen as a feature).

What are your thoughts on this approach?

@olivier-stasse

We could even maybe use https://github.com/SekouD/potranslator to generate a first translation...

@fmrico
Contributor Author

fmrico commented Jan 7, 2023

Hi @olivier-stasse

Over Christmas, some local user groups have implemented a more straightforward solution: maintaining a fork, with the help of GitHub actions to synchronize changes from upstream. These groups are then in charge of maintaining the translations, which avoids increasing complexity upstream. We still have to finish discussing how to generate and link these repos from the official ROS documentation web page.

IMHO the solution that you propose is a big change in the structure and work done in this repo. Let's see how the current approach works before changing the course.

Let's continue discussing this here, or in TSC.

Best

@olivier-stasse

Hi @fmrico,
Sorry for my lack of precision: we have also created a local French group, with a fork here:
https://github.com/ROS-French-Users-Group/ros2_documentation
following more or less the Spanish group organization.

We are also trying to use the GitHub action to synchronize changes from upstream.

We have been evaluating the work to be done. After reading the Sphinx documentation, I am wondering whether the forked group could be organized around the Sphinx po files.

IMHO it was more interesting to share the discussion here rather than in the fork.

Best.

@olivier-stasse

olivier-stasse commented Jan 11, 2023

Hi, a quick update on my experiment with the technical solution I mentioned previously, in response to @christophebedard's comments on automatic translation.

Brief conclusion

The burden of this technical solution is justified only if you have a significant number of non-technical users willing to help with translation. Otherwise, the solution suggested by @fmrico is probably more efficient for people used to GitHub.

Detailed explanations

All the tests were done on the French fork.

The Sphinx multilingual support works as follows.
First you need to generate intermediate pot files with:

sphinx-build -b gettext . _build/gettext

From these pot files, it is possible to generate po files for a specific language. In my case I tried French:

sphinx-intl update -p _build/gettext -l fr

On my local branch this generated:

locale
└── fr
   └── LC_MESSAGES
       └── index.po

The po files use the string in the reference language (here en) as the identifier.
For instance, for Related-Projects.rst you have:

#: ../../source/Related-Projects.rst:3
msgid "Related Projects"
msgstr ""
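To make the msgid/msgstr structure concrete, here is a minimal Python sketch that reads entries like the one above and lists those still lacking a translation. It is an illustrative parser only: it assumes simple entries (no plural forms, no msgctxt, escape sequences left untouched), and the function names are hypothetical. A dedicated PO library would be more robust for real use.

```python
def _unquote(s: str) -> str:
    # Strip the surrounding double quotes; escape sequences are left as-is.
    return s.strip()[1:-1]


def parse_po(text: str) -> dict[str, str]:
    """Map each msgid to its msgstr for simple (non-plural) PO entries."""
    entries: dict[str, str] = {}
    msgid = None
    msgstr = ""
    current = None  # field that a quoted continuation line belongs to
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            if msgid is not None:
                entries[msgid] = msgstr  # store the previous entry
            msgid, msgstr, current = _unquote(line[6:]), "", "msgid"
        elif line.startswith("msgstr "):
            msgstr, current = _unquote(line[7:]), "msgstr"
        elif line.startswith('"') and current == "msgid":
            msgid += _unquote(line)
        elif line.startswith('"') and current == "msgstr":
            msgstr += _unquote(line)
        # Comments (#...) and blank lines are ignored.
    if msgid is not None:
        entries[msgid] = msgstr
    return entries


def untranslated(entries: dict[str, str]) -> list[str]:
    """Return msgids with an empty msgstr, skipping the header entry (empty msgid)."""
    return [msgid for msgid, msgstr in entries.items() if msgid and not msgstr]
```

Run on a freshly generated po file, `untranslated` would return every string, since `sphinx-intl update` leaves each msgstr empty.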

From this point, the Sphinx documentation offers two choices, but there is a third one that answers @christophebedard's question on automatic translation. The Sphinx choices are to either edit the po files manually or use Transifex. Editing the po files manually is straightforward but adds nothing over @fmrico's solution. The second choice only makes sense if there is a strong base of non-technical users. Unfortunately, the call on the French local user group did not bring in many volunteers.

The third solution is to use an automatic tool such as the one provided by the following repo: https://github.com/SekouD/potranslator (N.B. the author has a new GitHub account, but the pip install of the package points to this repo). It has gone two years without an update and relies on google_translate. I had to update the google_translate Python package in the Python virtual env to a newer version, and it worked, but in a rather unsatisfactory way: the connection to the Google service was very slow and broke often. I managed to get the files translated, but had to run potranslator once per directory. An automatic translation would therefore need some effort to make this tool more robust. From Stack Overflow threads, it looks like Google breaks this API from time to time, so I am not sure it is worth it. In addition, the translation needs to be checked, with some reformulation. It can be a good starting point though, which is the main reason I went all the way through to this lengthy comment.

But I am not sure this is a valid alternative to @fmrico's proposal for a GitHub action.

@ros-discourse

This issue has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/documentation-en-francais/28896/14

@fmrico
Contributor Author

fmrico commented Jan 11, 2023

Hi @olivier-stasse

Thanks for pushing this forward by evaluating more alternatives. My concern about using po files is losing the direct connection between the original documentation in English and our translated pages. The fork approach is easier to maintain and does not force anybody to make any modifications to the official documentation.

Translating is a best-effort task to help people who prefer to read in their own language, or who cannot (or cannot easily) read English. A translated page is good to have, but any page that is not translated will simply be in English.

As you said, it is challenging to recruit volunteers for this. Maybe today we have volunteers, and tomorrow we don't. In any case, I assume that they are technical people, so git shouldn't be a barrier.

@olivier-stasse

Hi @fmrico, just to clarify one point: the po files can be created from the official documentation without modifying it. The official documentation itself serves as the identifier of each string to be translated. The newly created files DO NOT HAVE to be included in the main repo. So you do not lose the reference to the official documentation, and you do not add anything to the main repo.

It just adds one layer of complexity, and thus is justified only if one wants to use efficient third-party tools.

Again thanks for launching this initiative, this is an interesting exercise.

@fmrico
Contributor Author

fmrico commented Jan 14, 2023

Thanks @olivier-stasse

Let me dive into your work about po files. Maybe we shouldn't discard it.

@fmrico
Contributor Author

fmrico commented Jan 28, 2023

Hi @clalancette

I want to pick this thread back up after the TSC discussion.

We have translated 57/258 (22%) of the documentation in the Spanish version, and it is online at the GitHub Pages URL. @olivier-stasse has probably also made progress on this task. I think we can start thinking about how to link it to the official documentation. I see two options:

  1. (preferred) When the official documentation is built from the official repo and linked under https://docs.ros.org/en/, do exactly the same for a list of translated repos, and link them under https://docs.ros.org/es/, https://docs.ros.org/fr/, etc...
  2. Delegate the generation and URL to the Local Users Groups in their repos, as is currently done, but add a section in the official documentation with links to them. This could be done pretty fast.

Maybe we could start with 2 while 1 is implemented. What do you think?
Francisco

@olivier-stasse

olivier-stasse commented Feb 11, 2023

Hi,
Some follow up on the automation of the translation.

Brief feedback

I succeeded in applying google_translate to the whole ros2_documentation using potranslator3, which is a fork of potranslator.
The result is here: https://ros-french-users-group.github.io/ros2_documentation/

Overall, the Google translation is a very good starting point. I finally came back to this solution because modifying the rst files is rather tedious, and overwhelming for the few interested volunteers.

Technical description

The overall process

As specified in the Sphinx internationalization documentation, you need to:

  1. Modify the conf.py file to specify the language and the locale_dirs.
  2. Generate .pot files from .rst files using make gettext. The resulting files are in build/gettext by default.
  3. Generate .po files from .pot files for a specific language. In my case it was fr. The .po files have msgid and msgstr fields.
  4. Translate the .po files.
  5. Build the .mo files.
  6. Build the html files.

The fr example.

  1. The conf.py file was modified by setting the language variable to the locale fr, and the following lines were added:
locale_dirs = ['locales/']   # path is an example, but this is the recommended path
gettext_compact = False      # optional
  2. Generate the .pot files with:
make gettext

which generates the whole directory tree in build/gettext.
  3. Generate the translated .po files using google_translate. For this, potranslator3 was used:
potranslator update -d source/locales -l fr -p build/gettext
This generates the .po files in ./source/locales/fr/LC_MESSAGES/, following the same layout as the documentation.
  4. Generate the .mo files. This was done with a script:

#!/bin/zsh
# Compile every regular .po file below the current directory into a .mo file next to it.
for afile in ./**/*.po(.)
do
  targetfile=${afile:r}.mo
  echo "msgfmt $afile -o $targetfile"
  msgfmt $afile -o $targetfile
done
  5. Build the html files:
make html

Based on the language variable in conf.py and the .mo files located next to the .po files in source/locales/fr/LC_MESSAGES/, Sphinx generates the translated documentation.
  6. Copy the generated html files to the GitHub branch of the local user group.
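For groups that would rather not depend on zsh globbing, the .mo compilation loop above could be sketched in Python as follows. This is only an illustrative equivalent: the function names are hypothetical, and it assumes the `msgfmt` tool from GNU gettext is available on the PATH, as in the original script.

```python
import subprocess
from pathlib import Path


def msgfmt_commands(locale_root: Path) -> list[list[str]]:
    """Build one 'msgfmt <file>.po -o <file>.mo' command per .po file under locale_root."""
    commands = []
    for po in sorted(locale_root.rglob("*.po")):
        if not po.is_file():
            continue  # mirror the zsh (.) qualifier: regular files only
        mo = po.with_suffix(".mo")
        commands.append(["msgfmt", str(po), "-o", str(mo)])
    return commands


def compile_catalogs(locale_root: Path) -> None:
    """Run the msgfmt commands, stopping at the first failure."""
    for cmd in msgfmt_commands(locale_root):
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `compile_catalogs(Path("source/locales"))` would then write each .mo file next to its .po file. Keeping the command construction separate from execution makes the loop easy to inspect before running it.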

What remains to be done ?

  • Fix some wrong translations. This can be done by proposing PRs against the translated .po files in the branch where they are located.
  • Check whether msgid modifications in the reference repository, i.e. ros2/ros2_documentation, can be detected, and possibly offer a first translation.
  • Automate the process with a GitHub action.
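One possible way to approach the second point, detecting modified msgids and offering a first translation, is fuzzy string matching. The Python sketch below uses the standard library's `difflib` to carry an old translation over to a slightly changed msgid and flag it for review. The function name, the 0.8 cutoff, and the dictionary-based catalog are assumptions made for illustration. GNU gettext's `msgmerge` performs a comparable fuzzy match and may be the better tool in practice.

```python
import difflib


def carry_over(old: dict[str, str], new_msgids: list[str],
               cutoff: float = 0.8) -> dict[str, tuple[str, bool]]:
    """For each new msgid, return (translation, needs_review).

    old maps previous msgids to their translations (empty string if untranslated).
    """
    # Only msgids that already have a translation are worth matching against.
    translated = [msgid for msgid, msgstr in old.items() if msgstr]
    result = {}
    for msgid in new_msgids:
        if old.get(msgid):
            # Unchanged upstream string: reuse the translation as-is.
            result[msgid] = (old[msgid], False)
            continue
        close = difflib.get_close_matches(msgid, translated, n=1, cutoff=cutoff)
        if close:
            # Upstream text changed slightly: suggest the old translation, flagged for review.
            result[msgid] = (old[close[0]], True)
        else:
            # New or heavily rewritten string: nothing to suggest.
            result[msgid] = ("", True)
    return result
```

Entries flagged for review could then be listed in an automatically opened issue, or passed to a machine translation step for a first draft.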
