TextRecognitionDataGenerator

A synthetic data generator for text recognition

What is it for?

Generating text image samples to train OCR software. Non-Latin text is now supported! For a more thorough tutorial, see the official documentation.

What do I need to make it work?

Install the package

pip install git+https://github.com/voun7/TextRecognitionDataGenerator.git

Afterward, you can use trdg from the CLI. I recommend installing into a virtualenv rather than installing as root.

If you want to add another language, clone the repository instead, then run pip install -r requirements.txt inside it.

New

Add --stroke_width argument to set the width of the text stroke (Thank you, @SunHaozhe)
Add --stroke_fill argument to set the color of the text contour if stroke > 0 (Thank you, @SunHaozhe)
Add --word_split argument to split on word instead of per-character. This is useful for ligature-based languages
Add --dict argument to specify a custom dictionary (Thank you, @luh0907)
Add --font_dir argument to specify the fonts to use
Add --output_mask to output character-level mask for each image
Add --character_spacing to control space between characters (in pixels)
Add python module
Add --font to use only one font for all the generated images (Thank you, @JulienCoutault!)
Add --fit and --margins for finer layout control
Change the text orientation using the -or parameter
Specify text color range using -tc '#000000,#FFFFFF', please note that the quotes are necessary
Add support for Simplified and Traditional Chinese
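
The -tc value in the list above is a pair of hex colors separated by a comma. As a rough illustration of how such a range could be read, the sketch below parses the string and draws a random color channel by channel between the two bounds. The helper names (hex_to_rgb, parse_color_range, random_color) are hypothetical, and the per-channel sampling is an assumption about the behavior, not a copy of what trdg does internally.

```python
import random


def hex_to_rgb(value):
    """Convert a '#RRGGBB' string to an (r, g, b) tuple of ints."""
    value = value.strip().lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {value!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def parse_color_range(text_color):
    """Split '#000000,#FFFFFF' into two RGB tuples (hypothetical helper)."""
    parts = text_color.split(",")
    if len(parts) == 1:
        # A single color is treated as a range of width zero.
        parts = parts * 2
    if len(parts) != 2:
        raise ValueError("expected one color or two colors separated by ','")
    return hex_to_rgb(parts[0]), hex_to_rgb(parts[1])


def random_color(text_color, rng=random):
    """Pick each channel uniformly between the two bounds (assumed behavior)."""
    low, high = parse_color_range(text_color)
    return tuple(rng.randint(min(a, b), max(a, b)) for a, b in zip(low, high))
```

On the command line the quotes around the value matter because most shells treat an unquoted # as the start of a comment.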

How does it work?

Words will be randomly chosen from a dictionary of a specific language. Then an image of those words will be generated by using font, background, and modifications (skewing, blurring, etc.) as specified.
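
The first step described above, picking words at random from a dictionary, can be sketched in a few lines. This is only an illustration under stated assumptions: make_text and the tiny word list are hypothetical, and the rendering step (font, background, skewing, blurring) is left out entirely.

```python
import random


def make_text(dictionary, word_count, rng=random):
    """Join `word_count` randomly chosen dictionary words with spaces."""
    if not dictionary:
        raise ValueError("the dictionary must contain at least one word")
    return " ".join(rng.choice(dictionary) for _ in range(word_count))


# Hypothetical stand-in for the contents of a dictionary file.
words = ["alpha", "beta", "gamma", "delta"]
sample = make_text(words, 5)
print(sample)
```

Words are drawn with replacement here, so the same word can appear more than once in a sample.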

Basic (Python module)

The usage as a Python module is very similar to the CLI, but it is more flexible if you want to include it directly in your training pipeline, and will consume less space and memory. There are 4 generators that can be used.

from trdg.generators import (
    GeneratorFromDict,
    GeneratorFromRandom,
    GeneratorFromStrings,
)

# The generators use the same arguments as the CLI, only as parameters
generator = GeneratorFromStrings(
    ['Test1', 'Test2', 'Test3'],
    blur=2,
    random_blur=True
)

for img, lbl in generator:
    print(img, lbl)  # Do something with the pillow images here.

You can see the full class definition here:

Basic (CLI)

trdg -c 1000 -w 5 -f 64

You get 1,000 randomly generated images with random text on them.

By default, they will be generated to out/ in the current working directory.

Text skewing

What if you want random skewing? Add -k and -rk (trdg -c 1000 -w 5 -f 64 -k 5 -rk)

Text distortion

You can also add distortion to the generated text with -d and -do

Text blurring

But scanned documents usually aren't that clear, are they? Add -bl and -rbl to apply a Gaussian blur with a user-defined radius to the generated image (for example 0, 1, 2, or 4).

Background

Maybe you want another background? Add -b to select one of the four available backgrounds: Gaussian noise (0), plain white (1), quasicrystal (2), or image (3).

When using the image background (3), an image from the images/ folder is selected at random and the text is written on it.
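
If you drive the tool from a script, naming the background codes can make the calls easier to read. The sketch below simply mirrors the four values listed above; the Background enum and background_args helper are hypothetical names and are not part of trdg.

```python
from enum import IntEnum


class Background(IntEnum):
    """Values accepted by -b, as listed in this README."""
    GAUSSIAN_NOISE = 0
    PLAIN_WHITE = 1
    QUASICRYSTAL = 2
    IMAGE = 3


def background_args(background):
    """Return the CLI arguments for a background; rejects unknown codes."""
    return ["-b", str(Background(background).value)]
```

Passing a value outside 0 to 3 raises ValueError before any command is run.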

Dictionary

The text is chosen at random in a dictionary file (that can be found in the dicts folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]_[index].jpg
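
Because the label is encoded in the file name, a training pipeline can recover it without a separate annotation file. A minimal sketch, assuming the [text]_[index].jpg pattern described above; parse_sample_name is a hypothetical helper, and it splits on the last underscore so that underscores inside the text survive.

```python
from pathlib import Path


def parse_sample_name(path):
    """Split '<text>_<index>.jpg' into (text, index)."""
    stem = Path(path).stem
    text, sep, index = stem.rpartition("_")
    if not sep or not index.isdigit():
        raise ValueError(f"unexpected sample name: {path!r}")
    return text, int(index)


print(parse_sample_name("out/hello world_12.jpg"))
```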

There are a lot of parameters that you can tune to get the results you want, therefore I recommend checking out trdg -h for more information.
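
When generating several datasets with different settings, it can help to assemble the command line programmatically. The sketch below only uses flags that appear in this README (-c, -w, -f, -l, -k, -rk, -bl, -rbl, -b); build_trdg_command and its parameter names are hypothetical, and the function builds the argument list without running anything.

```python
import shlex


def build_trdg_command(count, words, size, language=None, skew=None,
                       random_skew=False, blur=None, random_blur=False,
                       background=None):
    """Assemble a trdg argument list from the flags shown in this README."""
    # `size` is the value passed to -f, as in the examples (64).
    cmd = ["trdg", "-c", str(count), "-w", str(words), "-f", str(size)]
    if language is not None:
        cmd += ["-l", language]
    if skew is not None:
        cmd += ["-k", str(skew)]
    if random_skew:
        cmd.append("-rk")
    if blur is not None:
        cmd += ["-bl", str(blur)]
    if random_blur:
        cmd.append("-rbl")
    if background is not None:
        cmd += ["-b", str(background)]
    return cmd


command = build_trdg_command(1000, 5, 64, skew=5, random_skew=True)
print(shlex.join(command))  # trdg -c 1000 -w 5 -f 64 -k 5 -rk
```

To actually run it, the list can be handed to subprocess.run(command, check=True), assuming trdg is installed and on your PATH.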

Create images with Chinese text

It is simple! Just do trdg -l cn -c 1000 -w 5!

Generated texts come both in simplified and traditional Chinese scripts.

Traditional:

Simplified:

Create images with Japanese text

It is simple! Just do trdg -l ja -c 1000 -w 5!

Output

Add new fonts

The script picks a font at random from the fonts directory.

Directory	Languages
fonts/latin	English, French, Spanish, German
fonts/cn	Chinese
fonts/ko	Korean
fonts/ja	Japanese
fonts/th	Thai

Simply add/remove fonts until you get the desired output.

If you want to add a new non-Latin language, the amount of work is minimal.

Create a new folder named after your language's two-letter code
Add a .ttf font to it
Edit run.py to add an if statement in load_fonts()
Add a text file in dicts with the same two-letter code
Run the tool as you normally would, but add -l with your two-letter code

It only supports .ttf for now.
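
Before running the tool with a new language code, a quick check of the files can save a failed run. The sketch below is a hypothetical helper, not part of trdg. It assumes the new folder lives under fonts/ (as in the table above) and that the dictionary file is named <code>.txt inside dicts; adjust both if your checkout is laid out differently. It does not check the load_fonts() edit.

```python
import tempfile
from pathlib import Path


def check_language_setup(root, code):
    """Return a list of problems found for a two-letter language code."""
    root = Path(root)
    problems = []
    font_dir = root / "fonts" / code
    if not font_dir.is_dir():
        problems.append(f"missing font folder: {font_dir}")
    elif not any(p.suffix.lower() == ".ttf" for p in font_dir.iterdir()):
        problems.append(f"no .ttf font in {font_dir}")
    dict_file = root / "dicts" / f"{code}.txt"  # assumed file name
    if not dict_file.is_file():
        problems.append(f"missing dictionary file: {dict_file}")
    return problems


# Demonstration in a throwaway directory with a made-up code "xx".
with tempfile.TemporaryDirectory() as tmp:
    before = check_language_setup(tmp, "xx")
    (Path(tmp) / "fonts" / "xx").mkdir(parents=True)
    (Path(tmp) / "fonts" / "xx" / "demo.ttf").touch()
    (Path(tmp) / "dicts").mkdir()
    (Path(tmp) / "dicts" / "xx.txt").write_text("hello\n", encoding="utf-8")
    after = check_language_setup(tmp, "xx")

print(len(before), after)
```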
