# v1.0 - CoRT models

@OrigamiDream OrigamiDream released this 25 Nov 12:20

## CoRT TensorFlow Models

I trained the CoRT models on the Rhetorical Tagging Dataset for 8,000 steps of pre-training and 1 epoch of fine-tuning, using A100 and V100 GPUs respectively.
For the models to run properly, the exact hyperparameter setups below are required.

| Model | Macro F1-score | Accuracy | `model_name` | `repr_size` | `repr_classifier` | `repr_act` | `concat_hidden_states` |
|---|---|---|---|---|---|---|---|
| CoRT-KorSciBERT | 90.42 | 90.25 | korscibert | 1,024 | seq_cls | tanh | 2 |
| CoRT-RoBERTa | 90.50 | 90.17 | klue/roberta-base | 1,024 | bi_lstm | tanh | 2 |

## Usage Examples

### Inference

There are two modes for inference: inference mode, which evaluates the whole dataset at once, and interactive mode, which lets you inspect results one by one.
For example, the following command runs inference mode.

```shell
# Add --interactive=True to activate interactive mode instead.
python run_inference.py \
       --checkpoint_path=./CoRT-KorSciBERT/ckpt-0 \
       --model_name=korscibert \
       --tfrecord_path=./data/tfrecords/{model_name}/valid.fold-1-of-10.tfrecord \
       --concat_hidden_states=2 \
       --repr_act=tanh \
       --repr_classifier=seq_cls \
       --repr_size=1024 \
       --batch_size=32
```
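
The same script should also work for CoRT-RoBERTa by substituting the hyperparameters from the table above. A sketch of such an invocation follows; the checkpoint path is an assumption and depends on where you extracted the released model.

```shell
# Sketch for CoRT-RoBERTa: flag values follow the hyperparameter table;
# --checkpoint_path is a placeholder for your local checkpoint location.
python run_inference.py \
       --checkpoint_path=./CoRT-RoBERTa/ckpt-0 \
       --model_name=klue/roberta-base \
       --tfrecord_path=./data/tfrecords/{model_name}/valid.fold-1-of-10.tfrecord \
       --concat_hidden_states=2 \
       --repr_act=tanh \
       --repr_classifier=bi_lstm \
       --repr_size=1024 \
       --batch_size=32
```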