Bad keys in translation files #5849
Comments
Looks like the translator has translated the variable name, which is a big no-no. Anything within Would be most excellent if you could fix it: https://wiki.pioneerspacesim.net/wiki/Translations
Files are in EDIT: Note: the changes should be done in transifex. |
OK. That's easy for the French translations, but I will have to compare the {} keys in the *.json files for the other languages. |
Well, variable names are always in English, so if you see any in French, that needs fixing. The simplest approach is to identify when and by whom the error was made, and check those changes. I don't know if that feature exists in Transifex, but either way, you can run git log on the French files and identify when the change was made (recently, I'd venture to guess). You can see all commits that transfer changes from Transifex to master: https://github.com/pioneerspacesim/pioneer/commits?author=pioneer-transifex Also, we intend to do a release within a few days, so if you have time to play around with this in the not too distant future, that would be great. |
For anyone tagging along in the conversation but not interested in running the command in the OP, the result is below. Since we have 31 languages, all occurrence counts should be multiples of 31, unless there's a parsing error (which some lines appear to be, where the whole sentence has been included). E.g. I was curious about
going to line 1460 in en.json:
Output from command in OP:
|
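The "multiples of 31" check above can be automated. The following is only a sketch: the function name and the example counts are made up for illustration, and it takes whatever placeholder counts the command in the OP (or an equivalent) produces.

```python
from collections import Counter


def suspicious_placeholders(counts, n_languages):
    """Return placeholders whose total count is not a multiple of n_languages."""
    return {name: n for name, n in counts.items() if n % n_languages != 0}


# Made-up counts for illustration, with 31 languages as in the thread.
example = Counter({"{system}": 62, "{système}": 2, "{cash}": 30})
print(suspicious_placeholders(example, 31))
```

Here `{system}` passes while `{système}` and `{cash}` are flagged. The check is only a heuristic: a key that is wrong in a way that still adds up to a multiple of 31 would slip through.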
OK, there are 149 files that have lua-variable translation errors (I'll update this post when I have the C++ ones as well):
Files with errors
KEYS in files with error
And for each of the above files, this is what it barfed on (I've included the English for each, so the lines alternate between the error and the correct string):
|
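A per-file check along these lines could look like the sketch below. This is not the script used in this thread; it assumes each language file maps a key to an object with a "message" field, and it only handles the lua-style {name} placeholders, not the C++ % ones. The names `load_messages` and `placeholder_errors` are hypothetical.

```python
import json
import re
from pathlib import Path

# Lua-style {name} placeholders; C++-style % variables are not handled here.
PLACEHOLDER = re.compile(r"\{[^{}]*\}")


def load_messages(path):
    """Map KEY -> message text, assuming {"KEY": {"message": "..."}} entries."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        key: entry["message"]
        for key, entry in data.items()
        if isinstance(entry, dict) and isinstance(entry.get("message"), str)
    }


def placeholder_errors(english, translated):
    """Return (key, english_message, translated_message) where placeholders differ.

    Placeholders are compared as sorted lists, so a changed order alone is
    not reported, but a renamed, missing, or extra placeholder is.
    """
    errors = []
    for key, source in english.items():
        target = translated.get(key)
        if target is None:
            continue  # untranslated key, nothing to compare
        if sorted(PLACEHOLDER.findall(source)) != sorted(PLACEHOLDER.findall(target)):
            errors.append((key, source, target))
    return errors
```

Calling `placeholder_errors(load_messages("en.json"), load_messages("fr.json"))` inside one lang subdirectory would give the English/translated pairs for that language.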
nice work. |
@JeromeChignoli I wrote a python script this morning. I should be able to extend it to just fix all the errors, and we'll push it to transifex directly, so you don't need to do anything. |
Nice. |
Right, sorry! I should have read the whole thread, obviously, before manually fixing strings. I solved some 4-5 cases before noticing this part. I hope this doesn't create any problems for the script. |
@zonkmachine no worries. You might have been on to something...
@JeromeChignoli Indeed. See below:
1. Where we stand
I've done some more tinkering, looked more at the output, and reached the conclusion: there be complications. The TL;DR: I could either compile a list of errors from each language, send that out to the translators, and "pretty please fix" them, or we just brute-force the fixes and nuke 95% of the errors, by my guesstimation.
2. Edge cases
Below are some edge cases that make it challenging to (correctly) automate the fixing in a script.
2.1 "Not even wrong"
...to quote Wolfgang Pauli. This is going to be tough to fix in a script
or here
or missing closing bracket all together:
Maybe the above warrants some regex magic to detect? (All three examples are from Polish.)
2.2 False positives?
Below, the percentage sign is lacking a space, and is thus interpreted as a C-variable string by my script (variables from C++ code:
2.3 Assume there be order
Below is probably fine (thus the script only considers it a "warning"), but if the variables had also been wrong (e.g. translated), then I would replace them assuming the order to be the same as in the English string. This is not an issue when the string only has a single
(Seems like German (to no surprise) and Russian often swap the order of variables.)
2.4 Can't even handle the error
Below, the number of strings is mismatched between the English and the translated version. This might be because the string is
3 Conclusion
I think 2.3 is a very (imagined?) edge case that we could just ignore. I think the script would do the right thing 95% of the time, and the translations would be less wrongier than before (but now the error might be so subtle as to not be obviously noticeable in game play, making the translation "bug" more deceptive?). One could also measure some kind of "distance" (character Hamming distance, word2vec embedding space, or an LLM) between the wrong and the correct variable; in most cases the wrong and correct ones are quite close, either because they're:
As teased in the Introduction, we could ask the translators to do the work for us, and see where that gets us. (Also: I have yet to figure out how to implement the regex string substitution from the list of correct variables.) |
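One possible way to do the regex substitution from a list of correct variables is to pass a function to `re.sub`, which is then called once per match, left to right. The sketch below is an illustration, not the script discussed here: `fix_by_position` is a hypothetical name, it only covers {name} placeholders, and it relies on the "same order as in English" assumption from 2.3.

```python
import re

PLACEHOLDER = re.compile(r"\{[^{}]*\}")


def fix_by_position(english, translated):
    """Replace the i-th placeholder in translated with the i-th English one.

    Returns the fixed string, or None when the placeholder counts differ
    and a positional mapping would be ambiguous (the 2.4 case).
    """
    correct = PLACEHOLDER.findall(english)
    found = PLACEHOLDER.findall(translated)
    if sorted(found) == sorted(correct):
        return translated  # already correct, possibly reordered: leave alone
    if len(found) != len(correct):
        return None
    replacements = iter(correct)
    # re.sub calls the function once per match, left to right.
    return PLACEHOLDER.sub(lambda match: next(replacements), translated)
```

Strings that are both reordered and translated would still be fixed wrongly by this, which is the risk described in 2.3.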
Said differently: only about 5% of the messages need a human correction. |
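The "distance" idea from the conclusion above could be tried with the standard library alone. The sketch below uses `difflib.SequenceMatcher` as a stand-in for the Hamming distance mentioned there (Hamming distance needs equal-length strings, which translated variable names rarely are); the function name is hypothetical and no threshold has been tuned.

```python
from difflib import SequenceMatcher


def match_by_similarity(wrong, candidates):
    """Return (best_candidate, score) for the candidate most similar to wrong.

    The score is SequenceMatcher's ratio, between 0.0 and 1.0.
    """
    def score(name):
        return SequenceMatcher(None, wrong.lower(), name.lower()).ratio()

    best = max(candidates, key=score)
    return best, score(best)
```

For instance, `match_by_similarity("système", ["system", "cash"])` picks `"system"`. A low score could be used to route a string to a human instead of fixing it automatically.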
I would rather inform the translators. |
This mostly fixes pioneerspacesim#5849 (translators getting the placeholders in strings wrong), BUT:
- I assume we don't want to merge this PR, but push branch to transifex
- Ignoring lang/core/ for now, due to some parsing issue with % and it getting close to my bed time (I'll fix tomorrow), that will add < 10 more files with corrections
- As mentioned in the issue above, this only solves for strings where the number of placeholders in translated and original strings is the same / unambiguous
Maybe it is too late, but I wrote a bash script (sorry, I'm not Python-aware). EDIT: Oops, wrong buggy script. Updated. |
This mostly fixes pioneerspacesim#5849 (translators getting the placeholders in strings wrong), BUT:
- I assume we don't want to merge this PR, but push branch to transifex
- As mentioned in the issue above, this only solves for strings where the number of placeholders in translated and original strings is the same / unambiguous
@JeromeChignoli I've now applied the changes I proposed. Please feel free to add any additional fixes you can identify. I've briefly tried your script, and where, e.g., a translation is missing one or several keys, it's hard to know the correct answer unless you speak the language in question. I have those strings identified, and I'm thinking of just giving them to the translators and hoping they'll fix them. |
Hello.
I was getting nil values in the trade computer.
Searching the source code, I found bad keys in lang/ui.core/fr.json:
I found {système} keys in the TRADING_FROM and TRADING_TO sections.
Assuming it was an auto-translation issue (as système is system), I was curious to see if it was the only one...
In the lang directory:
find -name '*.json' -exec sed -n '/[}{]/{s/^[^{]*//;s/[^}]*$//;s/}[^{]*{/}\n{/g;p}' {} \;|sort|uniq -c|sort -n
Interesting results with my 20240314 (8b6ae92) Linux version: a lot of auto-translated keys and some typos.
The real work will be to find the files... maybe by grep'ing the bad entries
EDIT: by "auto-translation" I mean "automatic translation". Tools are great but stupid.
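For anyone who would rather not decode the sed pipeline above, a rough Python equivalent might look like this. It is a sketch and not a drop-in replacement: it parses the JSON and counts every {…} group in every string value (descriptions included, if the files have them), whereas the sed version works line by line, so the counts may differ slightly. The function names are made up.

```python
import json
import re
from collections import Counter
from pathlib import Path

# Matches {name} placeholders; assumes no nested braces.
PLACEHOLDER = re.compile(r"\{[^{}]*\}")


def iter_strings(node):
    """Yield every string value found anywhere in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def count_placeholders(lang_dir):
    """Count each {placeholder} across all *.json files below lang_dir."""
    counts = Counter()
    for path in sorted(Path(lang_dir).rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        for text in iter_strings(data):
            counts.update(PLACEHOLDER.findall(text))
    return counts
```

Sorting `count_placeholders(".").items()` by count, lowest first, gives the same kind of listing as `sort | uniq -c | sort -n`, with the rare (and therefore suspect) keys at the top.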