PubAnnotation Evaluator

It is an evaluation tool for annotations in PubAnnotation JSON format. It compares a set of annotations under evaluation (we call it study annotations) against another set of annotations which we assume trustible (reference annotations), and reports the difference in precision, recall, and f-score. It also shows false positives and false negatives.

Requirement

Please use it with ruby version 3.2 or above. If your system does not already have an installation of ruby, you need to install it. Using rvm is generally a recommended way to install ruby in your system.

Install

It is released as a gemfile, which is a standard package distribution system of the ruby programming language. Please use the following ruby command to install it:

> gem install pubannotation_evaluator

Execuete

This package includes an excutable command, pubannotation-eval. Executing it without any argument will show you how to use it.

> pubannotation-eval
Usage: pubannotation-eval.rb [options] annotation_file(s)
    -r, --rdir=DIR                   specifies the path to the directory of reference annotation_file(s).
    -c, --soft-match-characters=INT  specifies the number of characters to allow for boundary mismatch (default=20).
    -w, --soft-match-words=INT       specifies the number of words to allow for boundary mismatch (default=2).
    -D, --denotation-type-match=TEXT specifies a ruby block to determine type match of two denotations (defalut='study_type == reference_type ? 1 : 0').
    -R, --relation-type-match=TEXT   specifies a ruby block to determine type match of two denotations (defalut='study_type == reference_type ? 1 : 0').
    -v, --verbose                    tells it to report false positives and false negatives.
    -h, --help                       displays this screen.

Suppose you have your study annotations in the directory, sdir, and your reference annotations in rdir, the simplest way to evaluate your study annotations is as follows:

> pubannotation-eval -r rdir sdir/*.json

Soft boundary matching

It is generally believed that requiring strict boundary matching for entity recognition does not make much sense, and evaluation with relaxed boudnary matching is practically much more useful. You can control the softness factor through two optional parameters, c and w. For example, the following command compares annotations on the basis of strict boudnary matching.

> pubannotation-eval -r rdir -c 0 -w 0 sdir/*.json

By default, c = 20 and w = 2, which means boundary mismatch by 2 words (within 20 characters) will be considered fine.

Custom matching

By default, the type of denotations and relations are compared strictly. However, you can control it yourself by supplying a ruby block for it. For example, below is the default ruby block for exact label matching:

study_type == reference_type ? 1 : 0

It means if the type of a study annotation is the same as the type of a reference annotation, it will return 1. Otherwise, it will return 0.

For a little bit complex example, the following block will allow mismatch between 'gene' and 'protein':

if study_type == reference_type
    1
elsif (study_type == 'gene' && reference_type == 'protein') || (study_type == 'protein' && reference_type == 'gene')
    1
else
    0
end

Or, you can even give different the scores for different matching:

if study_type == reference_type
    1
elsif (study_type == 'gene' && reference_type == 'protein') || (study_type == 'protein' && reference_type == 'gene')
    0.5
else
    0
end

Then, the result will what we call weighted precision / recall / f-score.

Output

It reports

count of annotations
precision/recall/fscore
list of false positives and false negatives

Author

Jin-Dong Kim (jdkim@dbcls.rois.ac.jp)

License

Released under the MIT license.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
bin		bin
lib		lib
.gitignore		.gitignore
Gemfile		Gemfile
Gemfile.lock		Gemfile.lock
README.md		README.md
pubannotation_evaluator.gemspec		pubannotation_evaluator.gemspec

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PubAnnotation Evaluator

Requirement

Install

Execuete

Soft boundary matching

Custom matching

Output

Author

License

About

Releases

Packages

Contributors 2

Languages

pubannotation/pubannotation_evaluator

Folders and files

Latest commit

History

Repository files navigation

PubAnnotation Evaluator

Requirement

Install

Execuete

Soft boundary matching

Custom matching

Output

Author

License

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages