This is the official PyTorch implementation of RE2: Region-Aware Relation Extraction from Visually Rich Documents. [arXiv]
For CUDA 11.0:
```bash
conda create -n re2 python=3.7
conda activate re2
git clone ###to insert git repo url#####
pip install -r requirements.txt
pip install -e .
```
For CUDA 11.X:
```bash
conda create -n re2 python=3.7
conda activate re2
git clone ###to insert git repo url#####
pip install -r requirements_2.txt
pip install -e .
```
Alternatively, check your installed Detectron2/PyTorch versions and modify the requirements.txt file accordingly.
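After installation, you can sanity-check the environment with a minimal, illustrative snippet (not part of this repo):

```python
# Quick environment sanity check (illustrative, not part of the repo).
import torch
import detectron2

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Detectron2:", detectron2.__version__)
```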
In this repository, we provide the fine-tuning code for FUNSD, XFUND, and our proposed dataset DiverseForm.
For FUNSD, create a directory `FUNSD` and create two directories, `funsd_test` and `funsd_train`, under it. Each directory has an `annotations` folder and an `images` folder:
```
├── FUNSD
│   ├── funsd_test
│   │   ├── annotations
│   │   ├── images
│   ├── funsd_train
│   │   ├── annotations
│   │   ├── images
```
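If you prefer to create this layout programmatically, a minimal sketch using only the standard library (run it from the directory where `FUNSD` should live):

```python
from pathlib import Path

# Create the expected FUNSD directory layout.
root = Path("FUNSD")
for split in ("funsd_train", "funsd_test"):
    for sub in ("annotations", "images"):
        (root / split / sub).mkdir(parents=True, exist_ok=True)
```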
Dataset | Download |
---|---|
FUNSD | Download link |
XFUND | Download link |
DiverseForm | Download link |
Put the unzipped files into the RE2 folder.
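For example, extracting a downloaded archive with the standard library (the archive name below is a placeholder for whichever file you downloaded, and the target directory assumes you run this from the parent of the RE2 checkout):

```python
import zipfile

# Extract a downloaded dataset archive into the RE2 folder
# (the archive filename is a placeholder, not a fixed name).
with zipfile.ZipFile("FUNSD.zip") as zf:
    zf.extractall("RE2")
```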
To create paragraph-level regions:
```bash
python region_extraction/easyOCR.py \
    --para true \
    --ip_dir /path/to/folder/with/images \
    --op_dir /path/to/folder/for/output
```
To create line-level regions:
```bash
python region_extraction/easyOCR.py \
    --para false \
    --ip_dir /path/to/folder/with/images \
    --op_dir /path/to/folder/for/output
```
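The `--para` flag presumably maps to EasyOCR's paragraph option; a minimal sketch of that call (assuming English images and an installed `easyocr` package; this is not the repo script itself):

```python
import easyocr

# Illustrative EasyOCR call showing the paragraph/line distinction.
reader = easyocr.Reader(["en"])

# paragraph=True groups nearby text lines into paragraph-level regions;
# paragraph=False returns individual line-level boxes (with confidences).
paragraph_regions = reader.readtext("sample_form.png", paragraph=True)
line_regions = reader.readtext("sample_form.png", paragraph=False)

for box, text in paragraph_regions:
    print(box, text)
```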
To create tabular regions:
```bash
python region_extraction/table.py \
    --ip_dir /path/to/folder/with/images \
    --op_dir /path/to/folder/for/output
```
To combine the paragraph-, line-, and tabular-level outputs:
```bash
python region_extraction/combine.py \
    --lang {en,pt...} \
    --ip_path /path/to/folder/with/images \
    --table /path/to/table.json \
    --easy_para /path/to/easy_para.json \
    --easy_line /path/to/easy_line.json \
    --op_path /path/to/output/folder
```
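If you want to drive the whole region-extraction pipeline from Python, here is a sketch that uses only the CLI flags documented above (all paths and output file names are placeholders; point the combine step at the files actually produced by the previous steps):

```python
import subprocess

IMAGES = "/path/to/folder/with/images"
OUT = "/path/to/folder/for/output"

# Paragraph-level, line-level, and tabular regions.
subprocess.run(["python", "region_extraction/easyOCR.py",
                "--para", "true", "--ip_dir", IMAGES, "--op_dir", OUT], check=True)
subprocess.run(["python", "region_extraction/easyOCR.py",
                "--para", "false", "--ip_dir", IMAGES, "--op_dir", OUT], check=True)
subprocess.run(["python", "region_extraction/table.py",
                "--ip_dir", IMAGES, "--op_dir", OUT], check=True)

# Combine the three outputs (the JSON file names here are assumptions).
subprocess.run(["python", "region_extraction/combine.py",
                "--lang", "en",
                "--ip_path", IMAGES,
                "--table", f"{OUT}/table.json",
                "--easy_para", f"{OUT}/easy_para.json",
                "--easy_line", f"{OUT}/easy_line.json",
                "--op_path", OUT], check=True)
```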
Use either the paragraph-level regions alone or the combined regions, whichever fits your data. Rename the resulting file to `xfun_custom.json` and put it in the `re2` folder.
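Copying the chosen file into place can be done with the standard library (the source path is a placeholder for whichever region file you chose):

```python
import shutil

# Put the chosen region file into the repo under the expected name.
shutil.copy("/path/to/output/folder/combined.json", "re2/xfun_custom.json")
```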
Dataset (Language) | Download |
---|---|
FUNSD (EN) | Download link |
XFUND (ZH) | Download link |
XFUND (JA) | Download link |
XFUND (ES) | Download link |
XFUND (FR) | Download link |
XFUND (IT) | Download link |
XFUND (DE) | Download link |
XFUND (PT) | Download link |
DiverseForm | Download link |
Fine-tuning on XFUND (here with `--lang zh`; change the language as needed):
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_xfun_re_inf.py \
    --model_name_or_path microsoft/layoutxlm-base \
    --output_dir path/to/output/directory \
    --do_train \
    --do_eval \
    --lang zh \
    --max_steps 5000 \
    --per_device_train_batch_size 4 \
    --warmup_ratio 0.1 \
    --fp16
```
Fine-tuning with an explicit dataset path via `--path_to_dataset` (here with `--lang en`):
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_xfun_re_inf.py \
    --model_name_or_path microsoft/layoutxlm-base \
    --output_dir path/to/output/directory \
    --do_train \
    --do_eval \
    --lang en \
    --path_to_dataset /path/to/dataset \
    --max_steps 5000 \
    --per_device_train_batch_size 4 \
    --warmup_ratio 0.1 \
    --fp16
```
Fine-tuning on a custom dataset (`--custom True`):
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_xfun_re_inf.py \
    --model_name_or_path microsoft/layoutxlm-base \
    --output_dir path/to/output/directory \
    --do_train \
    --do_eval \
    --lang en \
    --custom True \
    --path_to_dataset /path/to/dataset \
    --max_steps 5000 \
    --per_device_train_batch_size 4 \
    --warmup_ratio 0.1 \
    --fp16
```
Multilingual fine-tuning with additional languages (`--additional_langs`):
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_xfun_re_inf.py \
    --model_name_or_path microsoft/layoutxlm-base \
    --output_dir path/to/output/directory \
    --do_train \
    --do_eval \
    --lang zh \
    --additional_langs en+de+es+pt+it+fr+ja \
    --max_steps 5000 \
    --per_device_train_batch_size 4 \
    --warmup_ratio 0.1 \
    --fp16
```
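For reference, the training hyperparameters in the commands above correspond to standard Hugging Face TrainingArguments fields; a sketch of that mapping (assuming the script parses them with HfArgumentParser, as the LayoutLM example scripts typically do):

```python
from transformers import TrainingArguments

# The command-line flags above expressed as TrainingArguments
# (illustrative; the training script itself builds these from argv).
training_args = TrainingArguments(
    output_dir="path/to/output/directory",
    do_train=True,
    do_eval=True,
    max_steps=5000,
    per_device_train_batch_size=4,
    warmup_ratio=0.1,
    fp16=True,
)
print(training_args)
```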
To run inference with a fine-tuned checkpoint:
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 examples/run_xfun_re_inf.py \
    --model_name_or_path /path/to/model_checkpoint \
    --path_to_dataset /path/to/dataset \
    --output_dir path/to/output/directory \
    --do_predict
```
The expected directory structure for the data to run inference on is as follows:
```
├── DATA
│   ├── annotations
│   ├── images
```
Refer to the `sample_inference` folder for a sample of the required data.
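A quick way to check that every annotation has a matching image before running inference (a sketch; the annotation extension is assumed to be .json, adjust as needed):

```python
from pathlib import Path

# Verify that each annotation file in DATA/annotations has a matching image.
data = Path("DATA")
image_stems = {p.stem for p in (data / "images").iterdir()}
for ann in (data / "annotations").glob("*.json"):
    if ann.stem not in image_stems:
        print("Missing image for", ann.name)
```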