Score API is a collection of endpoints for calculating standard AI performance metrics on user-provided inputs.
Available scores include hLEPOR, TER, COMET, COMET-QE (comet-mtqe), and LaBSE.
To compare machine translation with a reference translation, send a POST request to https://api.inten.to/evaluate/score.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some other text"
],
"reference": [
"Not a sample text",
"Some other context"
],
"lang": "en"
},
"scores": [
{
"name": "hlepor",
"ignore_errors": true
},
{
"name": "ter",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
Specify the request payload:
- items: results of the machine translation
- reference: ground-truth translation
- lang: language of the translation
- source: optional parameter for the COMET score (see below for details)
- scores: list of scoring metrics to calculate
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
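Since each item is scored against the reference at the same index, it helps to check locally that the two lists are parallel before submitting. A minimal Python sketch (the helper name is illustrative, not part of the API; the payload mirrors the curl example above):

```python
import json

def build_score_payload(items, reference, lang, scores, itemize=False):
    """Assemble a /evaluate/score request body, checking list alignment."""
    if len(items) != len(reference):
        raise ValueError("items and reference must have the same length")
    return {
        "data": {"items": items, "reference": reference, "lang": lang},
        "scores": [{"name": s, "ignore_errors": True} for s in scores],
        "itemize": itemize,
        "async": True,  # evaluation endpoints are async-only
    }

payload = build_score_payload(
    ["A sample text", "Some other text"],
    ["Not a sample text", "Some other context"],
    "en",
    ["hlepor", "ter"],
)
print(json.dumps(payload, indent=2))
```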
The response contains the id of the operation:
{ "id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda" }
Wait for processing to complete. To retrieve the result of the operation, make a GET request to https://api.inten.to/evaluate/score/YOUR_OPERATION_ID. The TTL of the resource is 30 days.
{
"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda",
"response": {
"results": {
"scores": [
{
"name": "hlepor",
"value": 0.755
},
{
"name": "ter",
"value": 28.571
}
]
}
},
"error": null,
"done": true
}
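The submit-then-poll flow can be sketched as below. `fetch_operation` is a stand-in for the GET request to /evaluate/score/YOUR_OPERATION_ID; here it is stubbed with canned responses so the control flow is runnable offline (function and variable names are illustrative, not part of the API):

```python
import time

def poll_operation(fetch_operation, op_id, interval=2.0, timeout=600.0):
    """Poll until the operation reports done=True, then return its response."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        op = fetch_operation(op_id)
        if op.get("done"):
            if op.get("error"):
                raise RuntimeError(op["error"])
            return op["response"]
        time.sleep(interval)
    raise TimeoutError(f"operation {op_id} not done after {timeout}s")

# Stubbed fetch: the first call is still pending, the second is finished.
_responses = iter([
    {"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda", "done": False},
    {"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda", "done": True, "error": None,
     "response": {"results": {"scores": [{"name": "ter", "value": 28.571}]}}},
])
result = poll_operation(lambda op_id: next(_responses),
                        "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda", interval=0.0)
print(result["results"]["scores"])
```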
To get scores for each pair of items (machine translation and reference), set the itemize flag to true. In this case the results would be:
{
"id": "ea1684f1-4ec7-431d-9b7e-bfbe98cf0bda",
"response": {
"results": {
"scores": [
{
"value": 0.770527098111673,
"name": "hlepor"
},
{
"value": 0.740740740740741,
"name": "hlepor"
},
{
"value": 25.0,
"name": "ter"
},
{
"value": 33.33333333333333,
"name": "ter"
}
]
}
},
"error": null,
"done": true
}
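With itemize set to true, the scores list is flat: in the sample above, each metric appears once per segment, with segments in order. Grouping the flat list by metric name (a sketch using the sample values):

```python
from collections import defaultdict

# Flat itemized scores, copied from the sample response above.
itemized = [
    {"value": 0.770527098111673, "name": "hlepor"},
    {"value": 0.740740740740741, "name": "hlepor"},
    {"value": 25.0, "name": "ter"},
    {"value": 33.33333333333333, "name": "ter"},
]

# Map metric name -> per-segment values, preserving segment order.
by_metric = defaultdict(list)
for entry in itemized:
    by_metric[entry["name"]].append(entry["value"])

print(dict(by_metric))
```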
Crosslingual Optimized Metric for Evaluation of Translation (COMET) is a metric that achieves high levels of correlation with different types of human judgements. COMET takes 4 parameters as input:
- source: a list of source sentences
- items: results of the machine translation
- reference: ground-truth translation
- lang: language of the translation
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some other text"
],
"reference": [
"Not a sample text",
"Some other context"
],
"source": [
"Un texto de muestra",
"Algún otro texto"
],
"lang": "en"
},
"scores": [
{
"name": "comet",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
The response contains the id of the operation:
{ "id": "c74934b3-89e9-463e-b358-335c7c717f02" }
Wait for processing to complete.
{
"id": "c74934b3-89e9-463e-b358-335c7c717f02",
"done": true,
"response": {
"results": {
"scores": [
{
"value": {
"segment_scores": [
0.3271389603614807,
0.08198270201683044
],
"corpus_scores": [
0.20456083118915558
],
"return_hash": "wmt20-comet-da"
},
"name": "comet"
}
]
},
"type": "scores"
},
"meta": {},
"error": null
}
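In this sample, the corpus_scores value is the arithmetic mean of segment_scores. That appears to be how the corpus-level COMET score is aggregated here (an observation from the sample output, not a documented guarantee):

```python
import math

# Values copied from the sample COMET response above.
segment_scores = [0.3271389603614807, 0.08198270201683044]
corpus_score = 0.20456083118915558

mean = sum(segment_scores) / len(segment_scores)
assert math.isclose(mean, corpus_score)
print(round(mean, 5))  # 0.20456
```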
COMET-QE is a reference-free variant of COMET that estimates translation quality from the source text and the machine translation alone. Intento uses the wmt20-comet-qe-da model.
COMET-QE takes 3 parameters as input:
- items: results of the machine translation
- source: a list of source sentences
- lang: language of the translation
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some text"
],
"reference": [],
"source": [
"Un texto de muestra",
"Algún otro texto"
],
"lang": "en"
},
"scores": [
{
"name": "comet-mtqe",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
Note: the reference field is required but can be an empty list.
The response contains the id of the operation:
{ "id": "6f409125-a8c2-4ffb-b42d-99f2d9d14f7a" }
Wait for processing to complete.
{
"id": "6f409125-a8c2-4ffb-b42d-99f2d9d14f7a",
"done": true,
"response": {
"results": {
"scores": [
{
"value": {
"segment_scores": [
0.5931146740913391,
4.1349710954818875e-05
],
"corpus_scores": [
0.29657801190114697
],
"return_hash": "wmt20-comet-qe-da"
},
"name": "comet-mtqe"
}
]
},
"type": "scores"
},
"meta": {},
"error": null
}
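Segment-level QE scores can be used to flag translations that likely need review; in the sample above, the second segment scores near zero. A sketch with an arbitrary cutoff (the 0.1 threshold is illustrative, not an API recommendation):

```python
# Values copied from the sample COMET-QE response above.
segment_scores = [0.5931146740913391, 4.1349710954818875e-05]
items = ["A sample text", "Some text"]

THRESHOLD = 0.1  # illustrative cutoff; tune per language pair and model
flagged = [(i, text)
           for i, (text, score) in enumerate(zip(items, segment_scores))
           if score < THRESHOLD]
print(flagged)  # [(1, 'Some text')]
```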
The language-agnostic BERT sentence embedding (LaBSE) model encodes text into high-dimensional vectors. The model is trained and optimized to produce similar representations exclusively for bilingual sentence pairs that are translations of each other, so it can be used for mining translations of a sentence in a larger corpus. Intento uses a port of the LaBSE model to PyTorch from Hugging Face.
LaBSE takes 3 parameters as input:
- items: results of the machine translation
- source: a list of source sentences
- lang: language of the translation
Setting ignore_errors to true in the scores array makes the API return results even if some scores fail to compute.
Note: evaluation endpoints are supported only in async mode.
curl -XPOST -H 'apikey: YOUR_API_KEY' 'https://api.inten.to/evaluate/score' -d '{
"data": {
"items": [
"A sample text",
"Some text"
],
"reference": [],
"source": [
"Un texto de muestra",
"Algún otro texto"
],
"lang": "en"
},
"scores": [
{
"name": "labse",
"ignore_errors": true
}
],
"itemize": false,
"async": true
}'
Note: the reference field is required but can be an empty list.
The response contains the id of the operation:
{ "id": "8ab6da9a-c218-405d-88d6-a7d48b507c54" }
Wait for processing to complete.
{
"id": "8ab6da9a-c218-405d-88d6-a7d48b507c54",
"done": true,
"response": {
"results": {
"scores": [
{
"value": 0.922886312007904,
"name": "LaBSE"
},
{
"value": 0.7924383878707886,
"name": "LaBSE"
}
]
},
"type": "scores"
},
"error": null,
"meta": {}
}
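The per-segment LaBSE values above are consistent with cosine similarity between the embedding of each source sentence and the embedding of its translation (an assumption; the API does not spell out the formula). For reference, cosine similarity itself is straightforward:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for 768-dimensional LaBSE embeddings.
src_vec = [0.2, 0.7, 0.1]
tgt_vec = [0.25, 0.65, 0.15]
print(round(cosine_similarity(src_vec, tgt_vec), 4))
```

Similar sentence pairs yield values near 1.0, unrelated pairs values near 0.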