Handler for Instruction Embedding models (and a typo fix) #2431

sidharthrajaram · 2023-06-25T00:39:55Z

Description

A simple handler for serving Instructor Embedding models with TorchServe, supporting both single and batch inference. Instructor Embedding models require both a sentence to embed and an instruction that contextualizes the resulting embedding. The PR includes a README that shows how to serve Instructor Embedding models via TorchServe, along with some example use cases.

This PR also includes a small documentation fix for a stray anchor link on the documentation page about TorchServe's internals. Under the description of the components of TorchServe's Python backend, the link to arg_parser.py led to a seemingly random line in the file instead of the top-of-file description and the ArgParser class. With this change, the link points to arg_parser.py itself rather than to an anchor at a random line within it.

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
This change requires a documentation update

Feature/Issue validation/testing

Successful single inference on Instructor-XL model:

$ curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"inputs": ["Represent the Science title:", "3D ActionSLAM: wearable person tracking in multi-floor environments"]}' \
http://127.0.0.1:8080/predictions/instructor_xl_batch
[
  [
    0.010738605633378029,
    0.02038838528096676,
    ...
    -0.003638878930360079,
    0.10961630195379257
  ]
]
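For reference, the single-inference curl call above could be issued from Python roughly as follows. This is a minimal sketch using only the standard library; the helper name `build_request` is hypothetical and not part of this PR, and the URL is simply the one from the example.

```python
import json
import urllib.request


def build_request(url, inputs):
    """Build (but do not send) a JSON POST request for a TorchServe endpoint.

    `build_request` is a hypothetical helper for illustration only.
    """
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "http://127.0.0.1:8080/predictions/instructor_xl_batch",
    [
        "Represent the Science title:",
        "3D ActionSLAM: wearable person tracking in multi-floor environments",
    ],
)

# Sending requires a running server, so the call is left commented out:
# with urllib.request.urlopen(request) as response:
#     embeddings = json.loads(response.read())
```

Passing a list of `[instruction, sentence]` pairs as `inputs` would produce the batch form of the request.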

Successful batch inference on Instructor-XL model:

$ curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"inputs": [["Represent the Science title:", "3D ActionSLAM: wearable person tracking in multi-floor environments"],["Represent the Medicine sentence for retrieving a duplicate sentence:", "Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear."]]}' \
http://127.0.0.1:8080/predictions/instructor_xl
[
  [
    0.010738605633378029,
    ...
    0.10961630195379257
  ],
  [
    0.014582153409719467,
    ...
    0.08006688207387924
  ]
]
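The two requests above use different shapes for `inputs`: a single `[instruction, sentence]` pair, or a list of such pairs. A handler that accepts both has to normalize them before encoding. The sketch below shows one way to do that; `normalize_inputs` is a hypothetical helper written for illustration and is not taken from the handler in this PR.

```python
def normalize_inputs(payload):
    """Return a list of (instruction, sentence) pairs from a request body.

    Hypothetical helper: accepts either one pair or a list of pairs
    under the "inputs" key, mirroring the two requests shown above.
    """
    inputs = payload["inputs"]
    # A single pair is a flat list of exactly two strings.
    if len(inputs) == 2 and all(isinstance(item, str) for item in inputs):
        return [(inputs[0], inputs[1])]
    pairs = []
    for item in inputs:
        if not (isinstance(item, (list, tuple)) and len(item) == 2):
            raise ValueError("each input must be an [instruction, sentence] pair")
        pairs.append((item[0], item[1]))
    return pairs


single = normalize_inputs(
    {"inputs": ["Represent the Science title:", "Some title"]}
)
batch = normalize_inputs(
    {"inputs": [["Instruction A:", "Sentence A"], ["Instruction B:", "Sentence B"]]}
)
```

With this normalization, a single request becomes a batch of one, which matches the single-inference response above being a list containing one vector.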

Checklist:

Did you have fun?
Have you made corresponding changes to the documentation?

codecov · 2023-06-26T01:15:33Z

Codecov Report

Merging #2431 (d713ab4) into master (9833774) will not change coverage.
The diff coverage is n/a.

❗ Current head d713ab4 differs from pull request most recent head b4f3321. Consider uploading reports for the commit b4f3321 to get more accurate results

@@           Coverage Diff           @@
##           master    #2431   +/-   ##
=======================================
  Coverage   71.89%   71.89%           
=======================================
  Files          78       78           
  Lines        3654     3654           
  Branches       58       58           
=======================================
  Hits         2627     2627           
  Misses       1023     1023           
  Partials        4        4

msaroufim

Nice, this was fun to read.

A few minor nits:

Make sure to run pre-commit; there's a linting issue in your Python file.
Could you please explain what the output means exactly in the batch inference section? Not sure I entirely follow.

sidharthrajaram · 2023-06-26T21:54:10Z

Make sure to run pre-commit there's a linting issue in your python file

@msaroufim - Ah, got it. Just ran pre-commit to fix those linting issues and pushed. Thanks.

Could you please explain what the output means exactly in batch inference section, not sure I entirely follow

The output of the batch inference request is two embedding vectors corresponding to the two input pairs (instruction, sentence):

The first input was:
["Represent the Science title:", "3D ActionSLAM: wearable person tracking in multi-floor environments"]

and the second input was:
["Represent the Medicine sentence for retrieving a duplicate sentence:", "Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear."]

The response was a list of 2 embedding vectors (NumPy arrays converted with `.tolist()` so that they are JSON serializable), one for each of those inputs. The output vectors were quite long, so I used ellipses there.
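The `.tolist()` conversion mentioned above matters because the standard `json` module cannot serialize NumPy arrays directly. A minimal sketch, using small stand-in arrays rather than real model output:

```python
import json

import numpy as np

# Stand-in embeddings: two vectors of length 4 (real ones are much longer).
embeddings = np.array(
    [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]], dtype=np.float32
)

# Serializing the array directly fails with a TypeError.
try:
    json.dumps(embeddings)
    direct_ok = True
except TypeError:
    direct_ok = False

# Converting to nested Python lists first makes it JSON serializable.
response_body = json.dumps(embeddings.tolist())
decoded = json.loads(response_body)
```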

msaroufim · 2023-06-26T22:33:03Z

Thanks @sidharthrajaram! On the output explanation I was hoping we could put that in the nice README you created. This is the first time I've seen an instruction embedding model so what do I do after getting a vector? (I'm OK if the embedding isn't used in your handler example and if this update is doc only)

sidharthrajaram · 2023-06-27T00:00:14Z

@msaroufim thanks for the feedback! Added those explanations to the README.
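The README text is not reproduced in this thread. As general background on the question of what to do with a returned vector, one common use of embeddings is to compare them by cosine similarity, for example to rank candidate texts against a query. The sketch below uses toy vectors in place of real embeddings; all names in it are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors given as lists."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings returned by the endpoint.
query = [1.0, 0.0, 1.0]
candidates = {"doc_a": [1.0, 0.0, 1.0], "doc_b": [0.0, 1.0, 0.0]}

# Rank candidate names from most to least similar to the query.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
```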

msaroufim

Cool! Thanks! cc @agunapal for a second stamp

msaroufim · 2023-06-27T03:15:10Z

@sidharthrajaram one more thing, could you please add

ActionSLAM
statins

to https://github.com/pytorch/serve/blob/master/ts_scripts/spellcheck_conf/wordlist.txt

and quote `tolist` using backticks

sidharthrajaram · 2023-06-27T06:29:43Z

@msaroufim - sounds good, just pushed those changes.

msaroufim · 2023-06-27T18:58:03Z

The regression test OOM seems like a flake, so merging this.

sidharthrajaram added 2 commits June 24, 2023 17:20

fixed arg parser link

0653fd4

handler for instruction embedding models

0f581e0

sidharthrajaram changed the title ~~fixed arg parser link~~ Handler for Instruction Embedding models (and a typo fix) Jun 25, 2023

msaroufim requested changes Jun 26, 2023

fixed some formatting, pylint fixes

bdf4ebc

sidharthrajaram requested a review from msaroufim June 26, 2023 21:55

explain output and what to do with it

1e48eef

msaroufim approved these changes Jun 27, 2023

agunapal approved these changes Jun 27, 2023

spellcheck, formatting

b4f3321

msaroufim merged commit ec3b992 into pytorch:master Jun 27, 2023
