Dragoman is an AI-powered tool for translating structured documents such as JSON, XML, and YAML. Its key feature is that it maintains the document's structure during translation, keeping elements such as JSON keys and placeholders intact.
Dragoman is available as both a CLI tool and a Go library. This means you can use it directly from your terminal for one-off tasks, or integrate it into your Go applications for more complex use cases.
If you're looking for a version of Dragoman that leverages conventional translation services like Google Translate or DeepL, check out the freeze branch of this repository. The previous implementation manually extracted texts from the input files, translated them using DeepL or Google Translate, and reinserted the translated pieces back into the original documents.
Dragoman can be installed directly using Go's built-in package manager:
go install github.com/modernice/dragoman/cmd/dragoman@latest
To add Dragoman to your Go project, install it using go get:
go get github.com/modernice/dragoman
The basic usage of Dragoman is as follows:
dragoman translate source.json
This command will translate the content of source.json to English and print the translated document to stdout. The source language is automatically detected by default, but if you want to specify the source or target languages, you need to use the --from or --to option.
-f or --from
The source language of the document. It can be specified in any format that a human would understand (like 'English', 'German', 'French', etc.). If not provided, it defaults to 'auto', meaning the language is automatically detected.
dragoman translate source.json --from English
-t or --to
The target language to which the document will be translated. It can be specified in any format that a human would understand (like 'English', 'German', 'French', etc.). If not provided, it defaults to 'English'.
dragoman translate source.json --to French
-o or --out
The path to the output file where the translated content will be saved. If this option is not provided, the translated content will be printed to stdout.
dragoman translate source.json --out target.json
--split-chunks
Split the source document into chunks before translating. This can help to fit the documents into the context size of OpenAI's models. Each line that starts with one of the provided prefixes will create a new chunk.
Example: Split a Markdown file into chunks when encountering H2 and H3 headings:
dragoman translate source.md --split-chunks "## " --split-chunks "### "
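The prefix-based splitting described for --split-chunks can be pictured with a small Go sketch. It is not Dragoman's implementation: splitChunks and hasAnyPrefix are hypothetical helpers that simply start a new chunk at every line beginning with one of the given prefixes.

```go
package main

import (
	"fmt"
	"strings"
)

// splitChunks is a hypothetical helper (not part of Dragoman's API).
// It starts a new chunk at every line that begins with one of the prefixes.
func splitChunks(doc string, prefixes []string) []string {
	var chunks []string
	var current []string

	for _, line := range strings.Split(doc, "\n") {
		// Close the running chunk before a line that opens a new one.
		if hasAnyPrefix(line, prefixes) && len(current) > 0 {
			chunks = append(chunks, strings.Join(current, "\n"))
			current = nil
		}
		current = append(current, line)
	}
	if len(current) > 0 {
		chunks = append(chunks, strings.Join(current, "\n"))
	}
	return chunks
}

// hasAnyPrefix reports whether line starts with at least one prefix.
func hasAnyPrefix(line string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(line, p) {
			return true
		}
	}
	return false
}

func main() {
	doc := "# Title\nIntro\n## Setup\nSteps\n### Details\nMore\n## Usage\nRun it"
	for i, chunk := range splitChunks(doc, []string{"## ", "### "}) {
		fmt.Printf("chunk %d: %q\n", i, chunk)
	}
}
```

Because the trailing space is part of each prefix, "## " does not match an H3 line such as "### Details"; that line only opens a chunk because "### " is passed as well.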
-u or --update
Enable this option to translate only the fields of the source file that are missing in the output file. This option requires both the source and output files to be JSON.
dragoman translate source.json --out target.json --update
When you add new translations to your JSON source file, you can use the --update option to only translate the newly added fields and merge them into the output file.
// en.json
{
"hello": "Hello, world!",
"contact": {
"email": "hello@example.com",
"response": "Thank you for your message."
}
}
// de.json
{
"hello": "Hallo, Welt!",
"contact": {
"email": "hallo@example.com"
}
}
dragoman translate en.json --out de.json --update
Result:
// de.json
{
"hello": "Hallo, Welt!",
"contact": {
"email": "hallo@example.com",
"response": "Vielen Dank für deine Nachricht."
}
}
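Conceptually, the --update workflow amounts to a recursive diff followed by a merge. The following Go sketch illustrates that idea under the assumption that both files are plain nested JSON objects. missingFields and merge are hypothetical names, not Dragoman's API, and the translation step that would sit between them is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// missingFields returns the parts of src that have no counterpart in dst.
// Hypothetical helper for illustration; not Dragoman's actual code.
func missingFields(src, dst map[string]any) map[string]any {
	missing := map[string]any{}
	for key, srcVal := range src {
		dstVal, ok := dst[key]
		if !ok {
			missing[key] = srcVal
			continue
		}
		// Descend only when both sides hold a nested object.
		srcObj, srcIsObj := srcVal.(map[string]any)
		dstObj, dstIsObj := dstVal.(map[string]any)
		if srcIsObj && dstIsObj {
			if nested := missingFields(srcObj, dstObj); len(nested) > 0 {
				missing[key] = nested
			}
		}
	}
	return missing
}

// merge copies every field of patch into dst, descending into nested objects.
func merge(dst, patch map[string]any) {
	for key, val := range patch {
		patchObj, patchIsObj := val.(map[string]any)
		dstObj, dstIsObj := dst[key].(map[string]any)
		if patchIsObj && dstIsObj {
			merge(dstObj, patchObj)
			continue
		}
		dst[key] = val
	}
}

func main() {
	en := []byte(`{"hello": "Hello, world!", "contact": {"email": "hello@example.com", "response": "Thank you for your message."}}`)
	de := []byte(`{"hello": "Hallo, Welt!", "contact": {"email": "hallo@example.com"}}`)

	var src, dst map[string]any
	if err := json.Unmarshal(en, &src); err != nil {
		panic(err)
	}
	if err := json.Unmarshal(de, &dst); err != nil {
		panic(err)
	}

	// Only these fields would need to be sent for translation.
	missing := missingFields(src, dst)
	out, _ := json.Marshal(missing)
	fmt.Println(string(out))

	// In a real run the values in missing would be translated first.
	merge(dst, missing)
}
```

For the en.json and de.json files shown above, only contact.response is reported as missing, so existing translations such as the German email address are left untouched by the merge.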
-p or --preserve
A comma-separated list of words or phrases that should remain unchanged during translation. This is useful for terms that carry meaning in their original form or appear in specific contexts, such as code, trademarks, or names. The terms are preserved whether they appear in isolation or as part of larger strings, for example when they are embedded in HTML tags. For instance, --preserve Dragoman keeps the term Dragoman in its original form after translation. Note that the effectiveness of this option may vary depending on the language model used; it is optimized for OpenAI's GPT models.
dragoman translate source.json --preserve Dragoman
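Since preservation depends on the language model, it can be worth checking the output afterwards. The Go sketch below is one possible check, not a Dragoman feature: alteredTerms is a hypothetical helper that reports every term occurring less often in the translation than in the source.

```go
package main

import (
	"fmt"
	"strings"
)

// alteredTerms reports the terms that occur less often in translated than in
// source. Hypothetical post-translation check; not part of Dragoman.
func alteredTerms(source, translated string, terms []string) []string {
	var altered []string
	for _, term := range terms {
		if term == "" {
			continue // an empty term would match everywhere
		}
		if strings.Count(translated, term) < strings.Count(source, term) {
			altered = append(altered, term)
		}
	}
	return altered
}

func main() {
	source := `{"title": "Welcome to <b>Dragoman</b>"}`
	translated := `{"title": "Willkommen bei <b>Dragoman</b>"}`

	if altered := alteredTerms(source, translated, []string{"Dragoman"}); len(altered) > 0 {
		fmt.Println("terms changed during translation:", altered)
	} else {
		fmt.Println("all preserved terms are intact")
	}
}
```

The comparison is a plain substring count, so it also covers terms embedded in larger strings such as HTML tags, but it is case-sensitive and cannot tell whether a term was moved to a different field.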
-v or --verbose
A flag that makes the CLI print more detailed output about the translation process and its result.
dragoman translate source.json --verbose
-h or --help
A flag that displays a help message describing the command and its options.
dragoman --help
Besides the CLI tool, Dragoman can also be used as a Go library in your own applications. This allows you to build the Dragoman translation capabilities directly into your own Go programs.
In this example, we load a JSON file and translate its content using the default source and target languages (automatic detection and English, respectively).
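package main
import (
	"context"
	"fmt"
	"log"
	"os"
	"github.com/modernice/dragoman"
	"github.com/modernice/dragoman/openai"
)
func main() {
	// Read the document to translate.
	content, err := os.ReadFile("source.json")
	if err != nil {
		log.Fatal(err)
	}
	// Set up the OpenAI-backed translator.
	service := openai.New()
	translator := dragoman.New(service)
	// Translate with the defaults: auto-detected source, English target.
	translated, err := translator.Translate(context.TODO(), string(content))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(translated)
}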
In this example, we translate a JSON file, specifying some preserved words that should not be translated.
package main
import (
	"context"
	"fmt"
	"log"
	"os"
	"github.com/modernice/dragoman"
	"github.com/modernice/dragoman/openai"
)
func main() {
	// Read the document to translate.
	content, err := os.ReadFile("source.json")
	if err != nil {
		log.Fatal(err)
	}
	service := openai.New()
	translator := dragoman.New(service)
	// Keep the listed terms unchanged in the translated output.
	translated, err := translator.Translate(
		context.TODO(),
		string(content),
		dragoman.Preserve([]string{"Dragoman", "OpenAI"}),
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(translated)
}
In this example, we translate a JSON file from English to French, specifying the source and target languages.
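package main
import (
	"context"
	"fmt"
	"log"
	"os"
	"github.com/modernice/dragoman"
	"github.com/modernice/dragoman/openai"
)
func main() {
	// Read the document to translate.
	content, err := os.ReadFile("source.json")
	if err != nil {
		log.Fatal(err)
	}
	service := openai.New()
	translator := dragoman.New(service)
	// Set the source and target languages explicitly instead of using the defaults.
	translated, err := translator.Translate(
		context.TODO(),
		string(content),
		dragoman.Source("English"),
		dragoman.Target("French"),
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(translated)
}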