(correspond to Tables 1, 2, A7 and A8 in the NMTScore paper)
Monolingual Paraphrase Identification

| | en | ru | fi | sv | de | es | fr | ja | zh | pawsx-avg. | macro-avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (prism) | 72.6 | 84.1 | 72.4 | 70.6 | 73.9 | 73.5 | 75.7 | 66.4 | 68.9 | 71.7 | 74.3 |
| score_pivot (prism) | 72.1 | 84.9 | 70.3 | 70.9 | 77.4 | 76.2 | 76.9 | 68.4 | 70.8 | 74.0 | 74.4 |
| score_cross_likelihood (prism) | 71.7 | 86.6 | 71.2 | 72.4 | 76.6 | 75.1 | 75.6 | 65.8 | 70.5 | 72.7 | 74.9 |
| score_direct (m2m100_418M) | 72.0 | 83.2 | 71.1 | 71.1 | 71.1 | 69.0 | 72.3 | 61.9 | 65.4 | 67.9 | 73.1 |
| score_pivot (m2m100_418M) | 72.4 | 84.2 | 68.2 | 70.3 | 73.2 | 72.0 | 72.2 | 64.0 | 67.9 | 69.9 | 73.0 |
| score_cross_likelihood (m2m100_418M) | 72.1 | 85.1 | 69.7 | 71.5 | 71.6 | 72.6 | 72.2 | 63.1 | 66.7 | 69.2 | 73.5 |
| score_direct (m2m100_1.2B) | 72.9 | 84.0 | 71.4 | 71.2 | 73.0 | 70.2 | 72.4 | 62.4 | 66.4 | 68.9 | 73.7 |
| score_pivot (m2m100_1.2B) | 74.1 | 84.5 | 69.1 | 69.6 | 75.1 | 73.0 | 73.3 | 65.8 | 70.2 | 71.5 | 73.8 |
| score_cross_likelihood (m2m100_1.2B) | 72.8 | 85.0 | 70.0 | 71.0 | 74.1 | 73.0 | 73.3 | 66.2 | 69.5 | 71.2 | 74.0 |
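The monolingual table does not spell out how its two average columns are defined. The reported numbers are consistent with one reading, which is an inference from the values and not something stated in this section: `pawsx-avg.` is the unweighted mean of the de, es, fr, ja and zh columns, and `macro-avg.` is the unweighted mean of en, ru, fi, sv and `pawsx-avg.`. The sketch below checks that reading against the score_direct (prism) row; the helper names are hypothetical and not part of the NMTScore library.

```python
# Scores copied from the score_direct (prism) row of the monolingual table.
row = {
    "en": 72.6, "ru": 84.1, "fi": 72.4, "sv": 70.6,
    "de": 73.9, "es": 73.5, "fr": 75.7, "ja": 66.4, "zh": 68.9,
}
reported = {"pawsx-avg.": 71.7, "macro-avg.": 74.3}

# Assumption (inferred from the numbers, not stated in the table):
# these five columns feed pawsx-avg., the other four enter macro-avg. directly.
PAWSX_LANGS = ["de", "es", "fr", "ja", "zh"]
OTHER_LANGS = ["en", "ru", "fi", "sv"]


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def pawsx_avg(scores):
    """Mean over the columns assumed to make up pawsx-avg."""
    return mean(scores[lang] for lang in PAWSX_LANGS)


def macro_avg(scores):
    """Mean over the four remaining columns plus the pawsx average."""
    return mean([scores[lang] for lang in OTHER_LANGS] + [pawsx_avg(scores)])


# The table reports one decimal, so compare after rounding to one decimal.
print(round(pawsx_avg(row), 1), reported["pawsx-avg."])
print(round(macro_avg(row), 1), reported["macro-avg."])
```

Because the table entries are themselves rounded, recomputed averages can differ from the reported ones in the last digit for other rows; a small tolerance is safer than an exact comparison.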

Cross-lingual Paraphrase Identification

| | en-es | en-fr | en-ja | en-zh | de-en | de-es | de-fr | de-ja | de-zh | es-fr | es-ja | es-zh | fr-ja | fr-zh | ja-zh | avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (prism) | 76.4 | 76.1 | 68.6 | 68.8 | 76.4 | 73.3 | 74.5 | 66.0 | 66.9 | 74.3 | 66.7 | 66.8 | 66.8 | 67.4 | 64.4 | 70.2 |
| score_pivot (prism) | 76.9 | 77.3 | 68.9 | 70.7 | 77.4 | 75.0 | 76.0 | 67.0 | 69.5 | 75.5 | 67.6 | 69.5 | 67.5 | 69.9 | 66.5 | 71.7 |
| score_cross_likelihood (prism) | 75.9 | 75.9 | 65.2 | 66.0 | 76.0 | 74.5 | 75.2 | 64.8 | 65.8 | 74.2 | 64.6 | 66.2 | 64.4 | 65.7 | 65.3 | 69.3 |
| score_direct (m2m100_418M) | 70.9 | 72.2 | 63.3 | 65.0 | 72.5 | 67.9 | 69.0 | 61.0 | 63.1 | 68.4 | 60.6 | 62.0 | 61.4 | 63.2 | 60.6 | 65.4 |
| score_pivot (m2m100_418M) | 73.1 | 73.6 | 63.9 | 65.4 | 73.8 | 71.9 | 70.5 | 63.0 | 63.8 | 70.1 | 62.8 | 64.3 | 62.7 | 63.5 | 62.0 | 67.0 |
| score_cross_likelihood (m2m100_418M) | 72.3 | 72.2 | 61.9 | 62.3 | 72.8 | 69.7 | 69.0 | 60.2 | 63.2 | 69.8 | 61.5 | 61.5 | 60.5 | 61.7 | 61.9 | 65.4 |
| score_direct (m2m100_1.2B) | 72.4 | 73.0 | 64.8 | 67.0 | 75.0 | 71.4 | 71.7 | 62.7 | 65.1 | 69.8 | 61.5 | 63.4 | 62.7 | 65.1 | 62.6 | 67.2 |
| score_pivot (m2m100_1.2B) | 74.0 | 74.4 | 66.4 | 67.5 | 75.9 | 72.3 | 72.4 | 64.4 | 66.5 | 70.9 | 63.4 | 65.6 | 64.1 | 64.9 | 63.4 | 68.4 |
| score_cross_likelihood (m2m100_1.2B) | 74.1 | 73.8 | 63.0 | 63.6 | 74.8 | 70.9 | 70.5 | 61.4 | 63.5 | 71.3 | 61.6 | 62.5 | 61.8 | 63.8 | 63.3 | 66.7 |
Additional results (v0.3.2)
Monolingual Paraphrase Identification

| | en | ru | fi | sv | de | es | fr | ja | zh | pawsx-avg. | macro-avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (small100, fp16) | 72.5 | 82.9 | 71.2 | 71.3 | 72.0 | 69.0 | 69.3 | 61.1 | 65.5 | 67.4 | 73.0 |
| score_pivot (small100, fp16) | 73.2 | 84.1 | 68.4 | 69.9 | 73.3 | 73.5 | 73.0 | 62.9 | 69.4 | 70.4 | 73.2 |
| score_cross_likelihood (small100, fp16) | 73.1 | 84.1 | 69.8 | 71.8 | 73.4 | 72.9 | 74.0 | 62.1 | 67.8 | 70.0 | 73.8 |
| score_direct (m2m100_418M, fp16) | 72.0 | 83.2 | 71.1 | 71.1 | 70.8 | 68.9 | 72.2 | 61.9 | 65.4 | 67.8 | 73.1 |
| score_pivot (m2m100_418M, fp16) | 72.5 | 84.1 | 68.2 | 70.3 | 73.1 | 72.2 | 72.1 | 64.0 | 67.8 | 69.8 | 73.0 |
| score_cross_likelihood (m2m100_418M, fp16) | 72.1 | 85.1 | 69.7 | 71.5 | 71.6 | 72.5 | 72.2 | 63.6 | 65.9 | 69.1 | 73.5 |

Cross-lingual Paraphrase Identification

| | en-es | en-fr | en-ja | en-zh | de-en | de-es | de-fr | de-ja | de-zh | es-fr | es-ja | es-zh | fr-ja | fr-zh | ja-zh | avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| score_direct (small100, fp16) | 72.3 | 72.1 | 62.1 | 64.6 | 73.1 | 69.1 | 69.2 | 61.2 | 63.4 | 69.1 | 60.4 | 63.4 | 61.0 | 63.4 | 60.9 | 65.7 |
| score_pivot (small100, fp16) | 74.2 | 73.6 | 64.2 | 66.2 | 73.8 | 71.0 | 70.8 | 63.2 | 64.5 | 71.3 | 62.2 | 63.9 | 62.7 | 64.4 | 62.1 | 67.2 |
| score_cross_likelihood (small100, fp16) | 72.9 | 71.9 | 60.2 | 63.3 | 72.6 | 69.8 | 70.0 | 61.3 | 62.7 | 70.9 | 61.9 | 62.3 | 61.3 | 62.0 | 62.3 | 65.7 |
| score_direct (m2m100_418M, fp16) | 71.1 | 72.2 | 63.3 | 65.0 | 72.5 | 67.9 | 69.0 | 61.0 | 63.1 | 68.4 | 60.7 | 61.9 | 61.3 | 63.4 | 60.6 | 65.4 |
| score_pivot (m2m100_418M, fp16) | 73.2 | 73.7 | 63.9 | 65.5 | 73.7 | 71.8 | 70.3 | 63.1 | 63.9 | 70.1 | 62.9 | 64.1 | 62.7 | 63.5 | 62.0 | 67.0 |
| score_cross_likelihood (m2m100_418M, fp16) | 72.1 | 72.2 | 61.9 | 62.4 | 72.8 | 69.7 | 68.9 | 60.2 | 63.3 | 69.7 | 61.5 | 61.7 | 60.5 | 61.8 | 61.9 | 65.4 |

| | score_direct | score_pivot | score_cross_likelihood |
| --- | --- | --- | --- |
| prism | 21 | 138 | 69 |
| m2m100_418M | 21 | 1309 | 663 |
| – fp16 | 8 | 1100 | 690 |
| small100 | 15 | 762 | 342 |
| – fp16 | 6 | 439 | 247 |