variable number of gold annotations #69

danyaljj · 2020-03-30T20:00:06Z

The example in the introduction assumes that there is a fixed number of gold annotations for each generated text:

import sacrebleu
refs = [['The dog bit the man.', 'It was not unexpected.', 'The man bit him first.'],
        ['The dog had bit the man.', 'No one was surprised.', 'The man had bitten the dog.']]
sys = ['The dog bit the man.', "It wasn't surprising.", 'The man had just bitten him.']
bleu = sacrebleu.corpus_bleu(sys, refs)
print(bleu.score)

Is there a call that accepts a variable number of gold texts for each generated text?

martinpopel · 2020-03-30T20:34:18Z

Yes, it is expected that each reference (gold) translation is available for all sentences. If this is not the case, e.g. a third reference is missing for some sentences, you can fill the gaps with another reference for the same sentence (duplicating a reference should have no effect on BLEU):

refs = [['The dog bit the man.', 'It was not unexpected.', 'The man bit him first.'],
        ['The dog had bit the man.', 'No one was surprised.', 'The man had bitten the dog.'],
        ['This is the 3rd reference, which is available only for the first sentence.', 'No one was surprised.', 'The man had bitten the dog.'],]
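
The padding described above can also be done programmatically. The following is a minimal sketch, not part of sacrebleu: `pad_references` is a hypothetical helper name, and it assumes the references start out grouped per sentence (one list of varying length per generated text). It fills missing slots with each sentence's first reference, whereas the hand-written example above reuses the second one; if duplicating a reference has no effect on BLEU, as stated above, the choice should not matter.

```python
from typing import List


def pad_references(per_sentence_refs: List[List[str]]) -> List[List[str]]:
    """Convert per-sentence reference lists of varying length into
    equal-length reference streams (one list per reference set).

    Hypothetical helper, not a sacrebleu function. Missing slots are
    filled by repeating the sentence's first reference.
    """
    if not per_sentence_refs:
        return []
    if any(len(refs) == 0 for refs in per_sentence_refs):
        raise ValueError("every sentence needs at least one reference")

    # The number of streams is the largest reference count of any sentence.
    n_streams = max(len(refs) for refs in per_sentence_refs)

    streams = []
    for k in range(n_streams):
        # Take the k-th reference where it exists, else fall back to the first.
        streams.append([refs[k] if k < len(refs) else refs[0]
                        for refs in per_sentence_refs])
    return streams


# Only the first sentence has a third reference.
per_sentence = [
    ['The dog bit the man.', 'The dog had bit the man.',
     'This is the 3rd reference, which is available only for the first sentence.'],
    ['It was not unexpected.', 'No one was surprised.'],
    ['The man bit him first.', 'The man had bitten the dog.'],
]

refs = pad_references(per_sentence)
for stream in refs:
    print(stream)
```

The resulting `refs` has the same layout as the `refs` list in the original example (one inner list per reference set, each as long as `sys`), so it can be passed to `sacrebleu.corpus_bleu(sys, refs)` in the same way.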

mjpost closed this as completed Apr 24, 2020

martinpopel mentioned this issue Nov 27, 2020

Refactoring ideas #125

Closed

martinpopel mentioned this issue Jan 8, 2021

BLEU with a variable number of references #130

Closed
