For the isometric SLT task, two neural MT baseline systems are considered:
- WeakBaseline: a standard transformer model trained in a constrained setting.
- StrongBaseline: a similar transformer model trained in an unconstrained setting, implementing previously proposed isometric translation approaches.
The steps below show the procedure for training and evaluating the baseline systems. For model training and inference, we used the Sockeye MT toolkit.
Install the following requirements for data preprocessing and evaluation; see setup-env.sh for more on environment setup.
#git clone https://github.com/moses-smt/mosesdecoder.git ../scripts/mosesdecoder
pip install sacremoses==0.0.53
pip install sentencepiece==0.1.96
pip install sacrebleu==2.0.0
pip install bert-score==0.3.11
As described in the official call, datasets are organized according to the two baseline model training settings.
Constrained Task:
- Download the MuST-C v1.2 corpus following the dataset release instructions.
Unconstrained Task:
- In addition to MuST-C, we collected WMT datasets for large model training.
# Download parallel WMT training and test data for En-De/Fr/Es lang pairs
for TL in de fr es; do ./download-wmt-data.sh $TL; done
Before preprocessing, check that both the MuST-C and WMT data have been downloaded and are present in the ../datasets directory.
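One way to run this check is sketched below. The function name `check_datasets` and the subdirectory names in the usage comment (`MuST-C`, `wmt`) are hypothetical placeholders, not the repository's actual layout; substitute whatever the download steps produced.

```shell
#!/usr/bin/env bash
# check_datasets ROOT NAME [NAME...]
# Reports, for each NAME, whether ROOT/NAME exists and is non-empty.
# Returns 0 if every directory is present, 1 otherwise.
check_datasets() {
  local root="$1"; shift
  local missing=0
  local name
  for name in "$@"; do
    # A directory counts as present only if it exists and contains something.
    if [ -d "$root/$name" ] && [ -n "$(ls -A "$root/$name" 2>/dev/null)" ]; then
      echo "ok: $root/$name"
    else
      echo "missing: $root/$name"
      missing=1
    fi
  done
  return "$missing"
}

# Example (directory names are assumptions for illustration):
# check_datasets ../datasets MuST-C wmt || echo "download the missing data first"
```

The function only tests for non-empty directories; it does not verify that the corpora are complete or uncorrupted.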
Weak Baseline
# Preprocessing for WeakBaseline without isometric feature
for TL in de fr es; do ./scripts/preprocess-baseline-sockeye.sh $TL --constrained; done
Strong Baseline
# Preprocessing for StrongBaseline without isometric feature
for TL in de fr es; do ./scripts/preprocess-baseline-sockeye.sh $TL --unconstrained; done
# Preprocess the corresponding data with isometric features
for TL in de fr es; do ./scripts/preprocess-isometric-sockeye.sh $TL --unconstrained; done
Weak Baseline
# Train the weak baseline model (shown for en-de; extend the loop with fr and es for the other pairs)
for TL in de; do ./scripts/train-sockeye.sh $TL constrained baseline; done
Strong Baseline
# Train the strong baseline model (shown for en-de; extend the loop with fr and es for the other pairs)
for TL in de; do ./scripts/train-sockeye.sh $TL unconstrained isometric; done
Note: before evaluation, fine-tune the strong baseline isometric model on the in-domain MuST-C data.
For more on fine-tuning, see the train-sockeye.sh script.
As given in the task description, evaluation of an isometric model considers both translation quality and the isometric level of the translation (i.e. how close the hypothesis length is to the input length).
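To make the length criterion concrete, the sketch below computes a corpus-level length-compliance figure: the share of hypotheses whose character count falls within a tolerance of the source's character count. The function name `length_compliance`, the default ±10% tolerance, and the file names in the usage comment are assumptions for illustration. The figures reported by inference-stat-sockeye.sh may be computed differently (for instance in detokenization or character counting).

```shell
#!/usr/bin/env bash
# length_compliance SRC HYP [TOL]
# Prints the percentage of line pairs with |len(hyp) - len(src)| <= TOL * len(src).
# TOL defaults to 0.10 (an assumed tolerance, not taken from the scripts above).
# Lengths come from awk's length(): characters with gawk in a UTF-8 locale,
# bytes in some other awk implementations, so non-ASCII text may count differently.
# Input lines must not contain tab characters, since paste joins on tabs.
length_compliance() {
  local src="$1" hyp="$2" tol="${3:-0.10}"
  paste "$src" "$hyp" | awk -F'\t' -v tol="$tol" '
    {
      s = length($1); h = length($2)
      d = h - s; if (d < 0) d = -d
      if (d <= tol * s) ok++
      n++
    }
    END { if (n > 0) printf "%.2f\n", 100 * ok / n; else print "0.00" }'
}

# Example (file names are placeholders):
# length_compliance test.en hyp.de        # default tolerance
# length_compliance test.en hyp.de 0.20   # looser tolerance
```

Both files should be detokenized plain text with one sentence per line, so that the counts reflect what a viewer would actually read.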
Weak Baseline
# inference for en-de pair
TL=de
INPUT=datasets/en-$TL/constrained/preprocessed/test.bpe.en-$TL.en
REF=datasets/en-$TL/constrained/preprocessed/test.en-$TL.$TL
SPM=datasets/en-$TL/constrained/preprocessed/sentencepiece.bpe.model
MODELDIR=experiments/en-$TL/constrained/baseline
./scripts/inference-stat-sockeye.sh $TL $INPUT $REF $SPM $MODELDIR &
# note: outputs the evaluation stat for translation quality and length compliance
Strong Baseline (with isometric translation feature)
# inference for en-de pair with isometric feature
TL=de
INPUT=datasets/en-$TL/unconstrained/preprocessed/test.bpe.en-$TL.en
REF=datasets/en-$TL/unconstrained/preprocessed/test.en-$TL.$TL
SPM=datasets/en-$TL/unconstrained/preprocessed/sentencepiece.bpe.model
MODELDIR=experiments/en-$TL/unconstrained/isometric
./scripts/inference-stat-sockeye.sh $TL $INPUT $REF $SPM $MODELDIR --isometric &
Strong Baseline (with isometric translation feature, and N-best generation and re-ranking)
# inference for en-de pair with isometric feature and nbest list re-ranking
# leverages Sockeye's hypotheses reranking module
TL=de
INPUT=datasets/en-$TL/unconstrained/preprocessed/test.bpe.en-$TL.en
REF=datasets/en-$TL/unconstrained/preprocessed/test.en-$TL.$TL
SPM=datasets/en-$TL/unconstrained/preprocessed/sentencepiece.bpe.model
MODELDIR=experiments/en-$TL/unconstrained/isometric
# for re-ranking nbest-list see https://github.com/awslabs/sockeye/blob/main/sockeye/arguments.py#L298
./scripts/inference-stat-sockeye.sh $TL $INPUT $REF $SPM $MODELDIR --isometric isometric-lc &
Note: see ./scripts/inference-stat-sockeye.sh for modifying the inference, re-ranking, and evaluation procedures.
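The inference recipes above differ only in the language pair, the data setting, the model directory, and the trailing flags, so the path construction can be factored into a helper. The sketch below is an illustration rather than part of the repository: the function name `inference_args` is hypothetical, and the paths simply mirror the ones used in the commands above.

```shell
#!/usr/bin/env bash
set -u  # abort on unset variables instead of expanding them to empty strings

# inference_args TL SETTING MODEL
# Prints, one per line, the five positional arguments passed to
# inference-stat-sockeye.sh: target language, input, reference,
# SentencePiece model, and model directory.
inference_args() {
  local tl="$1" setting="$2" model="$3"
  local data="datasets/en-$tl/$setting/preprocessed"
  printf '%s\n' \
    "$tl" \
    "$data/test.bpe.en-$tl.en" \
    "$data/test.en-$tl.$tl" \
    "$data/sentencepiece.bpe.model" \
    "experiments/en-$tl/$setting/$model"
}

# Example: evaluate the weak baseline for each language pair in turn.
# The unquoted command substitution relies on the paths containing no spaces.
# for TL in de fr es; do
#   ./scripts/inference-stat-sockeye.sh $(inference_args "$TL" constrained baseline)
# done
```

With `set -u`, a mistyped variable name stops the script immediately, which is easier to diagnose than a run against a path such as `datasets/en-/...`.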
Results for the baseline systems are reported in the shared task evaluation paper. Section 8 (Isometric SLT) describes the task; results are given in Table 35, alongside the human evaluation results and a discussion of the isometric SLT use case.