Score API is a collection of endpoints for calculating standard AI performance metrics on user-provided inputs.
Available scores include hLEPOR, TER, COMET, COMET-QE (comet-mtqe), and LaBSE.
To compare machine translation with a reference translation, send a POST request to https://api.inten.to/evaluate/score.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some other text"
],
"reference": [
"Not a sample text",
"Some other context"
],
"lang": "en"
},
"scores": [
{
"name": "hlepor",
"ignore_errors": true
},
{
"name": "ter",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
Specify the request payload:
- items: results of the machine translation
- reference: ground-truth translation
- lang: language of the translation
- source: optional parameter for the COMET score (see below for details)
- scores: list of scoring metrics to calculate
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
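Since each item is scored against the reference at the same index, it helps to check locally that the two lists are parallel before submitting. A minimal Python sketch (the helper name is illustrative, not part of the API; the payload mirrors the curl example above):

```python
import json

def build_score_payload(items, reference, lang, scores, itemize=False):
    """Assemble a /evaluate/score request body, checking list alignment."""
    if len(items) != len(reference):
        raise ValueError("items and reference must have the same length")
    return {
        "data": {"items": items, "reference": reference, "lang": lang},
        "scores": [{"name": s, "ignore_errors": True} for s in scores],
        "itemize": itemize,
        "async": True,  # evaluation endpoints are async-only
    }

payload = build_score_payload(
    ["A sample text", "Some other text"],
    ["Not a sample text", "Some other context"],
    "en",
    ["hlepor", "ter"],
)
print(json.dumps(payload, indent=2))
```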
The response contains the id of the operation:
{ "id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda" }
Wait for processing to complete. To retrieve the result of the operation, make a GET request to https://api.inten.to/evaluate/score/YOUR_OPERATION_ID. The TTL of the resource is 30 days.
{
"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda",
"response": {
"results": {
"scores": [
{
"name": "hlepor",
"value": 0.755
},
{
"name": "ter",
"value": 28.571
}
]
}
},
"error": null,
"done": true
}
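The submit-then-poll flow can be sketched as below. `fetch_operation` is a stand-in for the GET request to /evaluate/score/YOUR_OPERATION_ID; here it is stubbed with canned responses so the control flow is runnable offline (function and variable names are illustrative, not part of the API):

```python
import time

def poll_operation(fetch_operation, op_id, interval=2.0, timeout=600.0):
    """Poll until the operation reports done=True, then return its response."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        op = fetch_operation(op_id)
        if op.get("done"):
            if op.get("error"):
                raise RuntimeError(op["error"])
            return op["response"]
        time.sleep(interval)
    raise TimeoutError(f"operation {op_id} not done after {timeout}s")

# Stubbed fetch: the first call is still pending, the second is finished.
_responses = iter([
    {"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda", "done": False},
    {"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda", "done": True, "error": None,
     "response": {"results": {"scores": [{"name": "ter", "value": 28.571}]}}},
])
result = poll_operation(lambda op_id: next(_responses),
                        "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda", interval=0.0)
print(result["results"]["scores"])
```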
To get scores for each pair of items (machine translation and reference), set the itemize flag to true. In this case the results would be:
{
"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda",
"response": {
"results": {
"scores": [
{
"value": 0.770527098111673,
"name": "hlepor"
},
{
"value": 0.740740740740741,
"name": "hlepor"
},
{
"value": 25.0,
"name": "ter"
},
{
"value": 33.33333333333333,
"name": "ter"
}
]
}
},
"error": null,
"done": true
}
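With itemize set to true, the scores list is flat: in the sample above, each metric appears once per segment, with segments in order. Grouping the flat list by metric name (a sketch using the sample values):

```python
from collections import defaultdict

# Flat itemized scores, copied from the sample response above.
itemized = [
    {"value": 0.770527098111673, "name": "hlepor"},
    {"value": 0.740740740740741, "name": "hlepor"},
    {"value": 25.0, "name": "ter"},
    {"value": 33.33333333333333, "name": "ter"},
]

# Map metric name -> per-segment values, preserving segment order.
by_metric = defaultdict(list)
for entry in itemized:
    by_metric[entry["name"]].append(entry["value"])

print(dict(by_metric))
```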
Crosslingual Optimized Metric for Evaluation of Translation (COMET) is a metric that achieves high levels of correlation with different types of human judgements. COMET takes 4 parameters as input:
- source: a list of source sentences
- items: results of the machine translation
- reference: ground-truth translation
- lang: language of the translation
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some other text"
],
"reference": [
"Not a sample text",
"Some other context"
],
"source": [
"Un texto de muestra",
"Algún otro texto"
],
"lang": "en"
},
"scores": [
{
"name": "comet",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
The response contains the id of the operation:
{ "id": "c74934b3-89e9-463e-b358-335c7c717f02" }
Wait for processing to complete.
{
"id": "c74934b3-89e9-463e-b358-335c7c717f02",
"done": true,
"response": {
"results": {
"scores": [
{
"value": {
"segment_scores": [
0.3271389603614807,
0.08198270201683044
],
"corpus_scores": [
0.20456083118915558
],
"return_hash": "wmt20-comet-da"
},
"name": "comet"
}
]
},
"type": "scores"
},
"meta": {},
"error": null
}
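In this sample, the corpus_scores value is the arithmetic mean of segment_scores. That appears to be how the corpus-level COMET score is aggregated here (an observation from the sample output, not a documented guarantee):

```python
import math

# Values copied from the sample COMET response above.
segment_scores = [0.3271389603614807, 0.08198270201683044]
corpus_score = 0.20456083118915558

mean = sum(segment_scores) / len(segment_scores)
assert math.isclose(mean, corpus_score)
print(round(mean, 5))  # 0.20456
```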
COMET-QE is a reference-free variant of COMET that estimates translation quality from the source text and the machine translation alone. Intento uses the wmt20-comet-qe-da model.
COMET-QE takes 3 parameters as input:
- items: results of the machine translation
- source: a list of source sentences
- lang: language of the translation
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some text"
],
"reference": [],
"source": [
"Un texto de muestra",
"Algún otro texto"
],
"lang": "en"
},
"scores": [
{
"name": "comet-mtqe",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
Note: the reference field is required but can be an empty list.
The response contains the id of the operation:
{ "id": "6f409125-a8c2-4ffb-b42d-99f2d9d14f7a" }
Wait for processing to complete.
{
"id": "6f409125-a8c2-4ffb-b42d-99f2d9d14f7a",
"done": true,
"response": {
"results": {
"scores": [
{
"value": {
"segment_scores": [
0.5931146740913391,
4.1349710954818875e-05
],
"corpus_scores": [
0.29657801190114697
],
"return_hash": "wmt20-comet-qe-da"
},
"name": "comet-mtqe"
}
]
},
"type": "scores"
},
"meta": {},
"error": null
}
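Segment-level QE scores can be used to flag translations that likely need review; in the sample above, the second segment scores near zero. A sketch with an arbitrary cutoff (the 0.1 threshold is illustrative, not an API recommendation):

```python
# Values copied from the sample COMET-QE response above.
segment_scores = [0.5931146740913391, 4.1349710954818875e-05]
items = ["A sample text", "Some text"]

THRESHOLD = 0.1  # illustrative cutoff; tune per language pair and model
flagged = [(i, text)
           for i, (text, score) in enumerate(zip(items, segment_scores))
           if score < THRESHOLD]
print(flagged)  # [(1, 'Some text')]
```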
The language-agnostic BERT sentence embedding (LaBSE) model encodes text into high-dimensional vectors. The model is trained and optimized to produce similar representations exclusively for bilingual sentence pairs that are translations of each other, so it can be used for mining translations of a sentence in a larger corpus. Intento uses a port of the LaBSE model to PyTorch from Hugging Face.
LaBSE takes 3 parameters as input:
- items: results of the machine translation
- source: a list of source sentences
- lang: language of the translation
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some text"
],
"reference": [],
"source": [
"Un texto de muestra",
"Algún otro texto"
],
"lang": "en"
},
"scores": [
{
"name": "labse",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
Note: the reference field is required but can be an empty list.
The response contains the id of the operation:
{ "id": "8ab6da9a-c218-405d-88d6-a7d48b507c54" }
Wait for processing to complete.
{
"id": "8ab6da9a-c218-405d-88d6-a7d48b507c54",
"done": true,
"response": {
"results": {
"scores": [
{
"value": 0.922886312007904,
"name": "LaBSE"
},
{
"value": 0.7924383878707886,
"name": "LaBSE"
}
]
},
"type": "scores"
},
"error": null,
"meta": {}
}
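The per-segment LaBSE values above are consistent with cosine similarity between the embedding of each source sentence and the embedding of its translation (an assumption; the API does not spell out the formula). For reference, cosine similarity itself is straightforward:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for 768-dimensional LaBSE embeddings.
src_vec = [0.2, 0.7, 0.1]
tgt_vec = [0.25, 0.65, 0.15]
print(round(cosine_similarity(src_vec, tgt_vec), 4))
```

Similar sentence pairs yield values near 1.0, unrelated pairs values near 0.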