jitsi-roomnames-translation

Quick Python scripts to translate Jitsi's dictionaries with the Google Cloud Translate API.

The words were taken from Jitsi's js-utils repo.

The function that sends requests to the Google Cloud API was taken from the Google documentation.

You will need an API key and a project with the Translate API enabled to run this code. The steps to get one can be found here
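As a rough illustration of what such a request looks like, here is a minimal sketch that assumes the `google-cloud-translate` package and its `translate_v2` client. The helper names `translate_words` and `make_google_client` are hypothetical and are not taken from `translation.py`; the client is passed in as a parameter so the function can be tried with a stand-in object before spending API quota.

```python
from typing import Iterable, List


def translate_words(words: Iterable[str], target_language: str, client) -> List[str]:
    """Translate each word and return the translated strings in input order.

    `client` is assumed to expose translate(text, target_language=...) and to
    return a dict with a "translatedText" key, as
    google.cloud.translate_v2.Client does for a single string.
    """
    translated = []
    for word in words:
        result = client.translate(word, target_language=target_language)
        translated.append(result["translatedText"])
    return translated


def make_google_client():
    """Build a real client (hypothetical helper).

    Needs the google-cloud-translate package and valid credentials, so the
    import is kept inside the function.
    """
    from google.cloud import translate_v2 as translate

    return translate.Client()
```

With credentials configured, `translate_words(["happy", "quickly"], "fr", make_google_client())` would send one request per word.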

Repo structure

.
├── LICENSE                     # Just took same license as Jitsi js-utils repo
├── README.md
├── data_en                     # Jitsi dictionaries of words to generate room names
│   ├── adjectives.en.json
│   ├── adverbs.en.json
│   ├── pluralnouns.en.json
│   └── verbs.en.json
├── data_<lang>                 # The output folder that will contain the translated words (gitignored)
│   ├── adjectives.<lang>.json  # (gitignored)
│   ├── adverbs.<lang>.json     # (gitignored)
│   ├── pluralnouns.<lang>.json # (gitignored)
│   └── verbs.<lang>.json       # (gitignored)
├── data_managment.py           # Functions to handle JSON files
├── google-api-private-key.json # Your API key (gitignored)
└── translation.py              # Main function
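The mapping between input and output files in the tree above can be sketched as follows. This assumes each dictionary file is a JSON array of strings. The function names (`load_words`, `output_path`, `save_words`) are hypothetical and may not match what `data_managment.py` actually defines.

```python
import json
from pathlib import Path
from typing import List


def load_words(path: Path) -> List[str]:
    """Read a word list, assuming the file holds a JSON array of strings."""
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def output_path(input_path: Path, lang: str, root: Path = Path(".")) -> Path:
    """Map e.g. data_en/adjectives.en.json to data_<lang>/adjectives.<lang>.json."""
    category = input_path.name.split(".")[0]
    return root / f"data_{lang}" / f"{category}.{lang}.json"


def save_words(words: List[str], path: Path) -> None:
    """Write a word list, creating the output folder if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        # json.dump defaults to ensure_ascii=True, so "ç" is written as \u00e7.
        json.dump(words, f, indent=2)
```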

Env

It is advised to install the pip modules in a virtualenv, to avoid interfering with your other projects.

Installing virtualenv:

pip3 install --user virtualenv

or

python3 -m pip install --user virtualenv

To create a virtualenv:

virtualenv env

Depending on your working environment, you may not have entered your env automatically, so do:

source env/bin/activate

If you leave and come back later, you will need to re-enter the virtualenv with the same command.

While in the env, install the requirements:

env/bin/pip3 install -r requirements.txt

Launching

You will need to add the path to your Google Cloud API key in the environment variables. To do it only for your terminal session (you will need to re-enter the command after you close the CLI), do:

export GOOGLE_APPLICATION_CREDENTIALS='/path/to/the/json/key'
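Since a missing or mistyped variable only shows up once the first request fails, a small check before starting can save time. The sketch below is an optional addition, not part of the repository; `credentials_path` is a hypothetical helper name.

```python
import os
from pathlib import Path


def credentials_path(environ=os.environ) -> Path:
    """Return the key file path, or raise if the variable is unset or wrong."""
    value = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value)
    if not path.is_file():
        raise RuntimeError(f"Key file not found: {path}")
    return path
```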

To execute a non-default behaviour (testing on a small set, comparing single-threaded and multithreaded execution times...), edit the translation.py file and set test_number (default -1) to the int value you want.

You still need to have the virtualenv activated to launch the following commands.
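For readers who want to see what a single-threaded versus multithreaded comparison can look like, here is a minimal sketch using the standard library's `ThreadPoolExecutor`. It is not the code from `translation.py`: the function names and the `max_workers` value are assumptions, and `translate_one` stands for any callable that translates a single word.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def translate_sequential(words: Sequence[str], translate_one: Callable[[str], str]) -> List[str]:
    """Translate the words one after another."""
    return [translate_one(word) for word in words]


def translate_threaded(
    words: Sequence[str], translate_one: Callable[[str], str], max_workers: int = 8
) -> List[str]:
    """Translate the words in a thread pool; pool.map keeps the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(translate_one, words))


def timed(fn: Callable, *args, **kwargs) -> Tuple[object, float]:
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Because each request mostly waits on the network, threads tend to help here even under Python's GIL. Calling `timed(translate_threaded, words, translate_one)` and `timed(translate_sequential, words, translate_one)` on the same small word list gives a direct comparison.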

Help:

python3 translation.py -h

Default behaviour:
Destination language: French, Input folder: data_en, Output folder: data_fr, use multithreading

python3 translation.py

Post processing

Depending on what you want to do with the words afterwards, you may need to refine the translation or remove some of the words.

For instance, this project was created to generate a French dictionary for the jitsi-box project. For this reason, I had to remove all words containing special characters, which I did with sed commands such as sed -i '' '/\u00e7/d' ./data_fr/*.json (this form works on macOS; with GNU sed, use sed -i '/\u00e7/d' ./data_fr/*.json). I also had to check that the word forms (verbs, adverbs, nouns) were respected.
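The same clean-up can also be done in Python instead of running one sed command per character. The sketch below assumes each output file is a JSON array of strings and drops every word containing a non-ASCII character (hyphens and apostrophes are ASCII and are kept). The function names are hypothetical, and the files are rewritten in place, so keep a copy if you need the unfiltered lists.

```python
import json
from pathlib import Path
from typing import List


def keep_ascii_words(words: List[str]) -> List[str]:
    """Keep only the words made entirely of ASCII characters."""
    return [word for word in words if word.isascii()]


def filter_folder(folder: Path) -> None:
    """Rewrite every *.json word list in `folder` without non-ASCII words."""
    for path in sorted(folder.glob("*.json")):
        words = json.loads(path.read_text(encoding="utf-8"))
        filtered = keep_ascii_words(words)
        path.write_text(json.dumps(filtered, indent=2), encoding="utf-8")
```

For example, `filter_folder(Path("data_fr"))` would process the four translated files.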

If you want to use an AI or another translation API that can check this by itself, feel free to do anything you want :)