Multi-language support for CLIs? #1134

Open
thomasgloe opened this issue Jun 11, 2020 · 19 comments

Labels
good-first-issue Issues that a new contributor could make a PR for

Comments

@thomasgloe

I would like to support multiple languages for my CLI using cobra. Implementation for commands is no problem, but is it correct that there is currently no support for the text output generated by cobra itself (e.g., "Usage", "Flags", "Use "mycmd [command] --help" for more information about a command.")?

@BunnyBrewery

Are you asking whether Cobra supports multiple languages for its default help message?

@thomasgloe
Author

Yes. I've already checked the source code, where the strings are encapsulated in the UsageTemplate. The way to go seems to be changing the usage template with SetUsageTemplate.

If I have enough time, would it be of interest to include a small example in the docs?

@github-actions

This issue is being marked as stale due to a long period of inactivity

@jharshman self-assigned this on Sep 24, 2020
@jharshman
Collaborator

@thomasgloe I'd be interested to see a PR for this if you wanted to take a shot at it.

@jharshman added the good-first-issue and waiting-user-response labels and removed the kind/stale label on Sep 24, 2020

@hitzhangjie

SetUsageTemplate only affects the template. If we want to support multiple languages, we also need to consider the descriptions of commands and flags.

I use go-i18n to support multiple languages in my cobra cli.

@Goutte

Goutte commented Apr 2, 2023

I've reviewed cobra these past days (using it for git spend), and I've come to the conclusion that cobra itself should have some form of i18n for default content.

I'm glad you're not against it for arcane reasons :) – it's just work, and this I understand.

There are some decisions that are best discussed beforehand, though.

Choosing a translation file format

I'm partial to toml in our case, since we won't really need the tree structure of yaml.
The other formats are just not human-friendly enough, and even though there are really nice GUIs for translation, I prefer keeping the translation files as readable as possible.

Embedding toml translation files

This appears to be the easy way of handling i18n.

Embedding (go:embed) ALL translations may add some kilobytes to cobra.
I'm okay with it, personally, but some of y'all may know ways (I don't) to distribute "lightweight" versions of cobra (with only the english file), along with the fully translated one, for people who desperately need lightweight.

Embedding also kind of slightly breaks the philosophy of package managers (.deb), since each cobra-based cli app will end up with its own translation files for the internals of cobra, whereas they could be shared, ideally. This is tricky.

Of note: go:embed requires Go 1.16 or later.

goi18n extract or not?

Using goi18n extract requires writing the translation-fetching code very verbosely and putting the English default right there in the code. This would make cobra a bit harder to read and much more verbose, but I see ways to mitigate that: create a function for each translation string and "hide" the verbose fetch in those, keeping the rest of cobra free of the clutter.

The alternative is to do something trivial like locale.T("HelpTemplate") everywhere, which goi18n extract won't understand. ('tis what I've done in git-spend)

I prefer the solution where we'd support goi18n extract, especially because then we could more easily disable the whole embedding of translation files and still have english working as fallback.


Currently trying to devise a PoC for this so we have a more concrete example to decide upon.

@Goutte

Goutte commented Apr 3, 2023

Here's a draft of what it would look like:

localizer.go

package cobra

import (
	"embed"
	"fmt"
	"github.com/BurntSushi/toml"
	"github.com/nicksnyder/go-i18n/v2/i18n"
	"golang.org/x/text/language"
)

var defaultLanguage = language.English

// localeFS points to an embedded filesystem of TOML translation files
//
//go:embed translations/*.toml
var localeFS embed.FS

// Localizer can be used to fetch localized messages
var localizer *i18n.Localizer

func i18nError() string {
	return localizeMessage(&i18n.Message{
		ID:          "Error",
		Description: "prefix of error messages",
		Other:       "Error",
	})
}

func i18nExclusiveFlagsValidationError() string {
	return localizeMessage(&i18n.Message{
		ID:          "ExclusiveFlagsValidationError",
		Description: "error shown when multiple exclusive flags are provided (group flags, offending flags)",
		Other:       "if any flags in the group [%v] are set none of the others can be; %v were all set",
	})
}

// … lots more translations here

func localizeMessage(message *i18n.Message) string {
	localizedValue, err := localizer.Localize(&i18n.LocalizeConfig{
		DefaultMessage: message,
	})
	if err != nil {
		return message.Other
	}

	return localizedValue
}

func loadTranslationFiles(bundle *i18n.Bundle, langs []string) {
	for _, lang := range langs {
		_, _ = bundle.LoadMessageFileFS(localeFS, fmt.Sprintf("translations/main.%s.toml", lang))
	}
}

func init() {
	bundle := i18n.NewBundle(defaultLanguage)
	bundle.RegisterUnmarshalFunc("toml", toml.Unmarshal)

	// FIXME: detect lang(s) from env (LANGUAGE > LC_ALL > LANG)
	detectedLangs := []string{
		"fr",
		"en",
	}

	loadTranslationFiles(bundle, detectedLangs)
	localizer = i18n.NewLocalizer(bundle, detectedLangs...)
}

It uses init(), as I'm not yet intimate enough with cobra to know where to properly hook initialization.

@Goutte

Goutte commented Apr 4, 2023

Draft continues in the feat-i18n branch.

I'm not fond of how I added i18n in the command Usage template, but my goal is to keep backwards compatibility.

Used composition, at the cost of a runtime copy of a Command instance, but we keep the same API in the template and don't have to expose an additional property in Command.

@Goutte

Goutte commented Apr 4, 2023

Well… works for me! I've opened a draft PR.

There's a bunch of things I'm not comfortable with, let's discuss those in #1944

@phw

phw commented Dec 9, 2023

> I prefer the solution where we'd support goi18n extract, especially because then we could more easily disable the whole embedding of translation files and still have english working as fallback.

What about using gotext instead? It does not require code as verbose as go-i18n's, and you can use a simple wrapper function like the one you have shown while gotext still manages to extract the texts. Translation files are JSON, not TOML, though. But I think they are still rather easy to handle.

@Goutte

Goutte commented Dec 10, 2023

Thanks for the suggestion, @phw .
I remember, at the time of choosing, I saw JSON, facepalmed, sighed, and went on my way.

Here's what the goi18n lib says it provides :

  1. Supports pluralized strings for all 200+ languages in the Unicode Common Locale Data Repository (CLDR).

    We don't use this, I believe.

  2. Supports strings with named variables using text/template syntax.

    This is very handy when injected words ought to be in different order in some translations. But we can perhaps do without, for simplicity's sake.

  3. Supports message files of any format (e.g. JSON, TOML, YAML).

    I profoundly dislike having to edit JSON by hand. TOML feels nice, but it's not even the best option. gettext files (PO, MO) would be my preferred choice.
    There's a (quite new) go-i18n lib that promises to do just this, but it does not look like it is finished yet.


All in all, I don't mind ditching the goi18n lib, but :

  • pretty please, no JSON, it's not made for humans
  • gettext would be nice

@phw

phw commented Dec 10, 2023

Just to avoid confusion further down (@Goutte understood me correctly): I was referring to golang.org/x/text/message with the golang.org/x/text/cmd/gotext CLI utility to extract text

golang.org/x/text does support both pluralization and changing variable order. Actually it has one of the nicest implementations for this where the developer basically does not need to think about it. If you have a translatable string like this:

printer.Sprintf("%s copied %d files to %s", user, count, dest)

There will be a translation string like "{User} copied {Count} files to {Dest}". The translator can reorder the placeholders however they see fit and it will be used correctly.
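
To illustrate why reordering is possible at all, here is a minimal sketch using plain fmt rather than golang.org/x/text: Go format strings accept explicit argument indexes such as %[2]d, so a translated format can consume the same arguments in a different order. The translations map and the tr helper are hypothetical and only mimic the idea; x/text itself works with named placeholders in its JSON catalogs.

```go
package main

import "fmt"

// translations maps the source format string to a French format string.
// Explicit argument indexes (%[n]...) let the translation reorder arguments.
var translations = map[string]string{
	"%s copied %d files to %s": "%[2]d fichiers copiés vers %[3]s par %[1]s",
}

// tr formats key with args, using the translated format when one exists
// and the source format otherwise.
func tr(key string, args ...interface{}) string {
	if f, ok := translations[key]; ok {
		return fmt.Sprintf(f, args...)
	}
	return fmt.Sprintf(key, args...)
}

func main() {
	fmt.Println(tr("%s copied %d files to %s", "alice", 3, "/tmp"))
}
```

The call site stays identical to the English original; only the catalog entry decides the word order.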

It is limited to the JSON format though, and it also has a somewhat convoluted concept of separate out.gotext.json and messages.gotext.json files. But a tool like Weblate can actually deal with both of those issues.

If you want to go the gettext route have a look at https://github.com/leonelquinteros/gotext . This is also a gettext implementation. Pure go, so no actual dependency on gettext libraries. It also provides an extraction tool github.com/leonelquinteros/gotext/cli/xgotext .

I haven't used it yet, but it looks nice. What originally discouraged me from using it was that it does not directly provide the option to load the translation files from go:embed. But according to the discussion at leonelquinteros/gotext#52 the library's API is flexible enough to allow this.

With gettext you definitely get the best tooling for translators.

What I really dislike about github.com/nicksnyder/go-i18n is the verbosity it requires for each translatable string without the ability to add an abstraction over this that fits your application (at least not without breaking string extraction, which I consider mandatory to have).

@phw

phw commented Dec 10, 2023

> 3. There's a (quite new) go-i18n lib that promises to do just this, but it does not look like it is finished yet.

Just saw that this is actually using github.com/leonelquinteros/gotext, but adds the ability to embed the translation files on top.

@Goutte

Goutte commented Dec 12, 2023

Thanks @phw for the clarifications !

I think you're right, it's worth implementing this your way.

I'll start another branch with the ubuntu lib, unless you want to hack around and kickstart things.


One thing I really don't understand about x/text is that it requires x/tools and in turn the net, crypto and goldmark packages. Insofar as I understand, they are used for the CLI (dev) utilities ; it feels wrong to add those to cobra just for i18n.

If I understand correctly, those are essentially removed at compile-time since nothing will link to them, but still... Does not feel right.

@Goutte

Goutte commented Dec 12, 2023

Ooops.

The ubuntu lib requires Go 1.20 ; embedding requires 1.16 I believe. Cobra is 1.15 right now.

I'll try to shoot straight for https://github.com/leonelquinteros/gotext and some glue for embedded PO files.

@Goutte

Goutte commented Dec 13, 2023

A few notes after hacking around with gotext :

  1. There's no way to describe a translation string to help translators, which is something nice, but not mandatory given the low amount of translations that we have. Furthermore, there are contexts in gotext that aim to solve this. Even though I find descriptors more elegant and humane, we can live with this.
  2. The xgotext CLI only detects gotext.Get(…) and not, for example, GetLocale.Get(…), which means we cannot lazy-load the locale; we have to initialize it before it is used anywhere, so probably in init(). I'm told usage of init() is frowned upon in Go.

@Goutte

Goutte commented Dec 14, 2023

Made a draft in #2090 @phw 🚀


6 participants