Add REST API method batch-suggest #664

Merged
juhoinkinen merged 19 commits into main from issue579-batch-suggest-operation-rest-api
Mar 31, 2023

Conversation

juhoinkinen
Member

@juhoinkinen juhoinkinen commented Jan 30, 2023

Adds a new /v1/projects/{project_id}/suggest-batch REST API method. Based on the branch of PR #663, implements the REST API part of #579.

The method accepts a JSON body containing the documents (at most 32), each with an optional document_id field:

{
  "documents": [
    {
      "document_id": "doc-1234",
      "text": "A quick brown fox jumped over the lazy dog."
    }
  ]
}

The limit, threshold and language parameters are optional, as in the regular suggest method, and can be given as URL query parameters:

POST /projects/yso-tfidf-en/suggest-batch?limit=10&threshold=0.2
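
As a rough illustration, a client could build such a request with Python's standard library. This is only a sketch, not part of the PR: the helper name build_suggest_batch_request is hypothetical, and the base URL assumes a locally running Annif server.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumed base URL of a locally running Annif server.
BASE_URL = "http://127.0.0.1:5000/v1"


def build_suggest_batch_request(project_id, documents, **params):
    """Build (but do not send) a POST request for the suggest-batch method."""
    # Leave out parameters that were not given, so they stay optional.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{BASE_URL}/projects/{quote(project_id, safe='')}/suggest-batch"
    if query:
        url += "?" + query
    body = json.dumps({"documents": documents}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_suggest_batch_request(
    "yso-tfidf-en",
    [{"document_id": "doc-1234",
      "text": "A quick brown fox jumped over the lazy dog."}],
    limit=10,
    threshold=0.2,
)
```

Passing the resulting object to urllib.request.urlopen() would send it; that step is left out here so the sketch runs without a server.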

An example response is:

[
  {
    "results": [
      {
        "label": "Archaeology",
        "notation": "42.42",
        "score": 0.85,
        "uri": "http://example.org/subject1"
      }
    ],
    "document_id": "doc-1234"
  }
]

The document_id is null in the response if the document in the request does not have one. It is similar to the external_id field of the MonkeyLearn classifier and to the index field of AWS Comprehend BatchDetectKeyPhrases.
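
A client might use document_id to match results back to its own documents. The sketch below is an assumption-laden example: the helper index_results is hypothetical, the second response item (with a null document_id) is made up for illustration, and falling back to the list position assumes that the response keeps the order of the request, which is not stated here.

```python
import json

# The example response, plus one made-up item without a document_id.
response_text = """
[
  {
    "results": [
      {"label": "Archaeology", "notation": "42.42", "score": 0.85,
       "uri": "http://example.org/subject1"}
    ],
    "document_id": "doc-1234"
  },
  {"results": [], "document_id": null}
]
"""


def index_results(batch):
    """Map each document to its results, by document_id or by position."""
    indexed = {}
    for position, item in enumerate(batch):
        # Assumption: fall back to the position when document_id is null.
        key = item["document_id"]
        if key is None:
            key = position
        indexed[key] = item["results"]
    return indexed


results = index_results(json.loads(response_text))
```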

@juhoinkinen juhoinkinen added this to the Short term milestone Jan 30, 2023
@codecov

codecov bot commented Jan 30, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (38ec228) 99.57% compared to head (ca158d8) 99.58%.

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #664    +/-   ##
========================================
  Coverage   99.57%   99.58%            
========================================
  Files          88       89     +1     
  Lines        6146     6268   +122     
========================================
+ Hits         6120     6242   +122     
  Misses         26       26            
Impacted Files Coverage Δ
annif/__init__.py 90.32% <100.00%> (+0.66%) ⬆️
annif/openapi/validation.py 100.00% <100.00%> (ø)
annif/rest.py 97.53% <100.00%> (+0.97%) ⬆️
tests/conftest.py 100.00% <100.00%> (ø)
tests/test_openapi.py 100.00% <100.00%> (ø)
tests/test_rest.py 100.00% <100.00%> (ø)


@osma
Member

osma commented Jan 30, 2023

Would it make sense to limit the number of documents in a single REST call to the minibatch size (currently 32)? After all, there would be little benefit (except maybe avoiding some HTTP overhead) in processing more than one minibatch on the backend side.
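
With such a limit, a client that has more documents would need to split them into several calls itself. A minimal sketch of that, assuming the limit stays at 32 (the helper name batches is hypothetical):

```python
from itertools import islice

# Maximum number of documents per request, as proposed above.
MAX_DOCUMENTS = 32


def batches(documents, size=MAX_DOCUMENTS):
    """Yield successive lists of at most `size` documents."""
    iterator = iter(documents)
    # islice() takes the next `size` items; an empty list ends the loop.
    while chunk := list(islice(iterator, size)):
        yield chunk


documents = [{"text": f"Document {i}"} for i in range(70)]
sizes = [len(chunk) for chunk in batches(documents)]
```

Each chunk would then go into its own suggest-batch request.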

@juhoinkinen juhoinkinen modified the milestones: Short term, 0.61 Mar 20, 2023
@juhoinkinen juhoinkinen changed the title Add REST API method for suggestions for a batch of documents Add REST API method batch-suggest Mar 20, 2023
@juhoinkinen
Member Author

juhoinkinen commented Mar 20, 2023

#682 introduced Schemathesis to automate testing of the actual API, which was previously done with manually written tests. Manually written tests made it possible to test specific things, e.g. which error code is returned for which (malformed) request. For example, there is now a limit of 32 documents for /suggest-batch, but no test for it.

Schemathesis uses the examples from the OpenAPI specification and some random inputs in path and query parameters. The requests to /v1/projects/<proj-id>/suggest-batch can be seen by running a local Annif server and Schemathesis from the command line:

st run annif/openapi/annif.yaml -E suggest-batch$ --base-url http://127.0.0.1:5000/v1
Long list of request logs
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/dummy-fi/suggest-batch HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/%3F/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/󠀟%40/suggest-batch?language=Æ𲂇%7B HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/𫞠/suggest-batch?limit=9602 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/%0F©𭵡õ%3Fí%1F%11÷¢/suggest-batch?language=%1C&threshold=6.103515625e-05 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/󰾽2kr7C/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/󂝮/suggest-batch?limit=8389248&language=Ȁ&threshold=0.5008241237401309 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/ÄÒ%7D/suggest-batch?limit=25733 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/󎣐%17/suggest-batch?language=&limit=79&threshold=0.2344360919705793 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/Î/suggest-batch?limit=25503 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/JÖ𪾻¥򻠊/suggest-batch?threshold=1.1754943508222875e-38 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/0/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/鼠ä/suggest-batch?language=øx&threshold=1.1754943508222875e-38&limit=21298 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/y/suggest-batch?limit=3844&threshold=0.25238021155102036&language=² HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/󶓅𣫛Y𺰏%60𕖆%5EÎ%3F/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/S򷝦/suggest-batch?threshold=1.0&limit=43&language=ç¯%05'􏃲aÿ򧐅'%3A񷡍n HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] "POST /v1/projects/򀱟/suggest-batch?threshold=1e-05 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:21] "POST /v1/projects/E/suggest-batch?limit=127&language=%3C񢖬򵦫w񹬠&threshold=0.99999 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:21] "POST /v1/projects/¡򿹤HËýÄüÕ/suggest-batch?limit=3512496318697642229&threshold=0.23034777681204327&language= HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:21] "POST /v1/projects/%04õ/suggest-batch?threshold=1e-05&limit=10162&language= HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:21] "POST /v1/projects/𽺾/suggest-batch?threshold=1.192092896e-07 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:21] "POST /v1/projects/¬ÿ÷򐁖DZ𤂷_¿𦥋7ñ/suggest-batch?threshold=1.1125369292536007e-308&language=Ù»îÛÂ&limit=29145 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/𹳂ÑÒ𑺾&threshold=1.175494351e-38&limit=21033 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/oèÁ/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/㼀/suggest-batch?threshold=0.0 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/􉑟/sugge%1D/suggest-batch HTTP/1.1" 404 -0&language='TTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/𥯆%5CÏ1򺊒%7B/suggest-batch?limit=15&language=󓝮 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/񿄌â%7B¥oÇ_%5D/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/𧂰jÀ9򋇺0/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/Ý%0FVv°P/suggest-batch?limit=118 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:22] "POST /v1/projects/a%05F/suggest-batch?language=ª½M§&threshold=1e-05 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:23] "POST /v1/projects/%5D%098+Ó¡Wò/suggest-batch?limit=13346 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:23] "POST /v1/projects/Ý°%5Dò%1B/suggest-batch?language=򤴩nû£񈾌ÖÍ򯥦&limit=3513076579650466351 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:23] "POST /v1/projects/򄝛%05󡹻uggest-batch?threshold=0.9999999999999999&language=򹡺© HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:23] "POST /v1/projects/󅽉𼖿õ&limit=22244 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:23] "POST /v1/projects/%03/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:23] "POST /v1/projects/" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:23] "POST /v1/projects/%5B%0AT/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:24] "POST /v1/projects/򑼿Ã2/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:24] "POST /v1/projects/²B%07%5C/suggest-batch?threshold=2.2250738585072014e-308&limit=105&language=W HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:24] "POST /v1/projects/𜹥/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:24] "POST /v1/projects/𜹥/suggest-batch?limit=847249536&threshold=2.225073858507203e-309&language=Ý%0Ca𦅹a񯔎󮖛 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:24] "POST /v1/projects/°Ò𑫽»%1D񺚋󏍁/suggest-batch?language=%1Dò򥔃󋉩&limit=120 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:24] "POST /v1/projects/%18ú/suggest-batch?limit=61 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:25] "POST /v1/projects/򲦍Æ6򘮔/suggest-batch?limit=10787 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:25] "POST /v1/projects/󣥐򂖃ζ¸/suggest-batch?language=ñ𔩆%11x©𜈈&limit=22711&threshold=1.175494351e-38 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:25] "POST /v1/projects/񄺮R/suggest-batch?limit=107&language=񹸴󗩙&threshold=1.0 HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:25] "POST /v1/projects/%60¥􋱐/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:25] "POST /v1/projects/Q/suggest-batch HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:26] "POST /v1/projects/%18/suggest-batch HTTP/1.1" 404 -

But none of the requests results in a 400 response, which is the code returned for language_not_supported_error or for too many documents.
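
One way to check this without reading the whole log is to tally the status codes. The following is a small sketch using three of the log lines above; the helper tally_status_codes and the regular expression are my own and not part of Annif or Schemathesis.

```python
import re
from collections import Counter

# Matches the status code at the end of a werkzeug access log line.
STATUS_RE = re.compile(r'" (\d{3}) -\s*$')


def tally_status_codes(lines):
    """Count HTTP status codes in werkzeug-style access log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


log_lines = [
    'INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] '
    '"POST /v1/projects/dummy-fi/suggest-batch HTTP/1.1" 200 -',
    'INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:20] '
    '"POST /v1/projects/%3F/suggest-batch HTTP/1.1" 404 -',
    'INFO:werkzeug:127.0.0.1 - - [20/Mar/2023 16:06:25] '
    '"POST /v1/projects/Q/suggest-batch HTTP/1.1" 404 -',
]
counts = tally_status_codes(log_lines)
```

A Counter returns 0 for missing keys, so counts[400] can be checked directly.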

@juhoinkinen juhoinkinen marked this pull request as ready for review March 20, 2023 14:28
@juhoinkinen juhoinkinen force-pushed the issue579-batch-suggest-operation-rest-api branch from fcba876 to 6f9f944 Compare March 23, 2023 13:16
@juhoinkinen juhoinkinen changed the base branch from main to batching-in-nn-ensemble March 23, 2023 13:17
@juhoinkinen juhoinkinen changed the base branch from batching-in-nn-ensemble to main March 23, 2023 13:17
@juhoinkinen juhoinkinen changed the base branch from main to batching-in-nn-ensemble March 23, 2023 13:25
@juhoinkinen juhoinkinen changed the base branch from batching-in-nn-ensemble to main March 23, 2023 13:25
@juhoinkinen juhoinkinen force-pushed the issue579-batch-suggest-operation-rest-api branch from afad237 to d819246 Compare March 23, 2023 13:57
@juhoinkinen juhoinkinen force-pushed the issue579-batch-suggest-operation-rest-api branch from d819246 to 74980c0 Compare March 23, 2023 17:27
@juhoinkinen
Member Author

I finally managed to drop the unnecessary merge and revert commits. I recreated the PR branch from the current main and then cherry-picked the good commits from a backup PR branch. I don't know why rebasing did not work: the PR always ended up including all the commits made to main, no matter what I tried. Changing the base branch back and forth did not help, as it usually does.

But now there is a problem installing a just-released version of a dependency in GH Actions (but not on my laptop):

Unable to find installation candidates for libclang (16.0.0)

@juhoinkinen
Member Author

But now there is a problem installing a just-released version of a dependency in GH Actions (but not on my laptop):

Unable to find installation candidates for libclang (16.0.0)

This seemed to be caused by issue sighingnow/libclang#46, which got fixed.

@juhoinkinen juhoinkinen requested a review from osma March 24, 2023 08:03
Member

@osma osma left a comment

I tested this locally and it seems to work well. A couple of points to consider:

  1. I suggested a little change to the wording of the method summary
  2. I verified that the method fails if given more than 32 documents, as it should. But the error message is a bit confusing:
{
  "detail": "[{'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'plaa'}] is too long - 'documents'",
  "status": 400,
  "title": "Bad Request",
  "type": "about:blank"
}

Basically the "detail" field includes the whole request, which could be extremely long; and it ends with "is too long - 'documents'" which is not that helpful. Is this something we could change or is this coming directly from Connexion so we can't do anything about it? I would like to see a more helpful message, which wouldn't include the whole request body, just a message stating that there were too many documents.

  3. Related to the above, can we write (for example using schemathesis) unit tests that verify that the API accepts 32 documents, but doesn't accept 33 documents? I'm worried that if we change the implementation later, the limit of 32 will be dropped and it could lead to problems on the backend side.

If you can fix at least some of the above it would be great, but if points 2 and/or 3 are too difficult, I think it's also OK to merge this as it is.

@juhoinkinen
Member Author

  1. I suggested a little change to the wording of the method summary

Done.

  2. I verified that the method fails if given more than 32 documents, as it should. But the error message is a bit confusing:
{
  "detail": "[{'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'Olipa kerran'}, {'text': 'plaa'}] is too long - 'documents'",
  "status": 400,
  "title": "Bad Request",
  "type": "about:blank"
}

Basically the "detail" field includes the whole request, which could be extremely long; and it ends with "is too long - 'documents'" which is not that helpful. Is this something we could change or is this coming directly from Connexion so we can't do anything about it? I would like to see a more helpful message, which wouldn't include the whole request body, just a message stating that there were too many documents.

I added CustomRequestBodyValidator to the annif/openapi/validation.py module. It is a subclass of the Connexion RequestBodyValidator and overrides the default validate_schema() method so that the "detail" field contains only the validation error: too many items - 'documents'.

This seems to be the recommended way to modify the validation, found it via spec-first/connexion#558.
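
The idea can be sketched without Connexion installed. The example below is not the code in annif/openapi/validation.py: FakeValidationError is a hypothetical stand-in carrying the message, validator, schema_path and path attributes that jsonschema.ValidationError provides, and short_detail is a hypothetical helper showing how the message could be shortened before it is put into "detail".

```python
from dataclasses import dataclass, field


@dataclass
class FakeValidationError:
    """Hypothetical stand-in for jsonschema.ValidationError."""

    message: str
    validator: str
    schema_path: list = field(default_factory=list)
    path: list = field(default_factory=list)


def short_detail(error):
    """Build a "detail" string that does not echo the request body."""
    message = error.message
    # Only the maxItems check on "documents" embeds the whole array.
    if error.validator == "maxItems" and list(error.schema_path) == [
        "properties",
        "documents",
        "maxItems",
    ]:
        message = "too many items"
    # Append the failing property, as in the error shown earlier.
    if error.path:
        message += " - '{}'".format(".".join(str(p) for p in error.path))
    return message


error = FakeValidationError(
    message="[{'text': 'Olipa kerran'}, {'text': 'plaa'}] is too long",
    validator="maxItems",
    schema_path=["properties", "documents", "maxItems"],
    path=["documents"],
)
detail = short_detail(error)
```

Other validation errors pass through unchanged, so only the message that would repeat the request body is replaced.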

  3. Related to the above, can we write (for example using schemathesis) unit tests that verify that the API accepts 32 documents, but doesn't accept 33 documents? I'm worried that if we change the implementation later, the limit of 32 will be dropped and it could lead to problems on the backend side.

I added tests for the cases of 32 and 33 documents, and restored the manually written Swagger/OpenAPI tests that I removed in #682. These tests do not rely on Schemathesis but on the app_client fixture, which makes it possible to check which error code is returned for which error.
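
The shape of such boundary tests is roughly as follows. This sketch does not use the real app_client fixture, which is not reproduced here: fake_suggest_batch is a hypothetical stand-in for the endpoint that returns only a status code, and make_body is a hypothetical helper.

```python
MAX_DOCUMENTS = 32  # limit stated in this PR


def fake_suggest_batch(body):
    """Hypothetical stand-in for the endpoint: return the status code."""
    if len(body["documents"]) > MAX_DOCUMENTS:
        return 400
    return 200


def make_body(count):
    """Build a request body with `count` identical documents."""
    return {"documents": [{"text": "A quick brown fox"}] * count}


def test_accepts_32_documents():
    # Exactly at the limit: the request should be accepted.
    assert fake_suggest_batch(make_body(32)) == 200


def test_rejects_33_documents():
    # One over the limit: the request should be rejected.
    assert fake_suggest_batch(make_body(33)) == 400


test_accepts_32_documents()
test_rejects_33_documents()
```

Testing both sides of the boundary catches an off-by-one change to the limit as well as its accidental removal.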

Required as jsonschema is now imported directly
        """Validate the request body against the schema."""

        if self.is_null_value_valid and is_null(data):
            return None
Member

This line is not covered by unit tests - can we do something about it? If nothing else, mark it with a # noqa annotation so it doesn't show up in coverage reports...

Member Author

I just marked the line with # noqa. The validation mechanism changes in the soon-to-be-released Connexion 3 (spec-first/connexion#1610), so this needs to be addressed again when/if upgrading to Connexion 3.

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information
0.0% Duplication

@juhoinkinen juhoinkinen merged commit 693ab21 into main Mar 31, 2023
@juhoinkinen juhoinkinen deleted the issue579-batch-suggest-operation-rest-api branch March 31, 2023 09:19