This project is the codebase for the paper "Multi-domain OCR with Meta Self-Learning" (https://arxiv.org/abs/2401.00971). The code is built on MMOCR.
The environment setup is the same as for MMOCR. Alternatively, you can use the setup.sh script to install the environment:
# (Optional) Create a conda environment
conda create -n multi-domain-ocr python=3.10 -y
conda activate multi-domain-ocr
# Set up the environment
bash setup.sh
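The exact contents of setup.sh belong to this repository; as a rough sketch, a standard MMOCR-style installation (which setup.sh is assumed to wrap) looks like the following. The PyTorch build and package versions below are placeholders and should match your CUDA setup:
# Assumed MMOCR-style installation steps (not necessarily identical to setup.sh)
pip install torch torchvision      # choose the build matching your CUDA version
pip install -U openmim             # OpenMMLab package manager
mim install mmengine
mim install mmcv
mim install mmdet
pip install -v -e .                # install this repo (an MMOCR fork) in editable mode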
The dataset used for evaluation is the open-sourced MSDA dataset (Multi-Source Domain Adaptation dataset for text recognition). Please refer to the dataset homepage for the download link. The dataset is distributed as tar files. Extract the files and then use tools/dataset_converters/textrecog/lmdb_converter.py to convert the dataset to LMDB format. Assuming the dataset is stored in data/Meta-SelfLearning and extracted to data/cache/, the following is an example of converting the syn dataset to LMDB format:
# Extract the dataset
mkdir -p data/cache/
tar -xvf data/Meta-SelfLearning/syn/test_imgs.tar -C data/cache/
# Convert the dataset to lmdb format
python tools/dataset_converters/textrecog/lmdb_converter.py data/Meta-SelfLearning/syn/test_label.txt data/Meta-SelfLearning/LMDB/syn/test_imgs.lmdb -i data/cache/Meta-SelfLearning/root/data/TextRecognitionDatasets/IMG/syn/test_imgs/ --label-format txt
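The other MSDA domains can be converted the same way. As a sketch, assuming the docu domain (and any further domains you substitute in) follows the same layout as syn, i.e. a test_imgs.tar plus a test_label.txt extracting to the analogous path under data/cache/, the conversion can be looped:
# Batch conversion sketch; verify the directory layout for each domain first.
for domain in syn docu; do
  tar -xvf data/Meta-SelfLearning/${domain}/test_imgs.tar -C data/cache/
  python tools/dataset_converters/textrecog/lmdb_converter.py \
      data/Meta-SelfLearning/${domain}/test_label.txt \
      data/Meta-SelfLearning/LMDB/${domain}/test_imgs.lmdb \
      -i data/cache/Meta-SelfLearning/root/data/TextRecognitionDatasets/IMG/${domain}/test_imgs/ \
      --label-format txt
done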
The training script is tools/train.py. The following is an example of training the model on the syn dataset:
python tools/train.py configs/path/to/config.py
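tools/train.py inherits the standard MMOCR/MMEngine command-line options, so (assuming this fork has not changed them) you can, for example, set an output directory and resume from the latest checkpoint:
# Standard MMEngine-style options (assumed unchanged in this fork):
# --work-dir sets where logs and checkpoints are written;
# --resume resumes from the latest checkpoint in that directory.
python tools/train.py configs/path/to/config.py \
    --work-dir work_dirs/my_experiment \
    --resume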
If you want to use multiple GPUs for training, use tools/dist_train.sh:
tools/dist_train.sh configs/path/to/config.py 8 --auto-scale-lr --amp
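tools/dist_train.sh is the usual MMOCR distributed-launch wrapper; assuming it keeps the standard behaviour, the GPUs and the rendezvous port can be chosen through environment variables, which is useful when several jobs run on one machine:
# Assumed standard MMOCR launcher behaviour: pick GPUs and a free port explicitly.
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29501 \
    tools/dist_train.sh configs/path/to/config.py 4 --auto-scale-lr --amp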
The config files to reproduce the results in the paper are in configs/. The following is an example of training the backbone on the docu dataset:
tools/dist_train.sh configs/textrecog/adapter/backbone_docu.py 8 --auto-scale-lr --amp
The following is an example of training the adapter on the syn dataset:
tools/dist_train.sh configs/textrecog/adapter/adapter_docu_adapter_syn.py 8 --auto-scale-lr --amp
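By default, MMEngine writes checkpoints to work_dirs/<config name>/. Assuming the adapter config does not already hard-code the backbone weights, you can point adapter training at a previously trained backbone checkpoint via --cfg-options; the checkpoint filename below is a placeholder:
# Hypothetical example: initialise adapter training from a docu-trained backbone.
tools/dist_train.sh configs/textrecog/adapter/adapter_docu_adapter_syn.py 8 \
    --auto-scale-lr --amp \
    --cfg-options load_from=work_dirs/backbone_docu/epoch_20.pth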