Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handler for Instruction Embedding models (and a typo fix) #2431

Merged
merged 5 commits into from
Jun 27, 2023

Conversation

sidharthrajaram
Copy link
Contributor

@sidharthrajaram sidharthrajaram commented Jun 25, 2023

Description

A simple handler that one can use to serve Instructor Embedding models with TorchServe, supporting both single inference and batch inference. Instructor Embedding models require both a sentence to embed as well as an instruction to contextualize the resultant embedding. Included in the PR is a README that shows how to serve Instruction Embedding models via torchserve as well as some example use cases.

Also in this PR is a simple documentation fix -- Fixed a strange anchor link in the documentation page about TorchServe's internals. Under the description of the different components of TorchServe's backend (Python), the link to arg_parser.py leads to a seemingly random line in the file instead of the top line description of the file and the ArgParser class. With this change, the link points to arg_parser.py instead of the anchor at a random line within arg_parser.py.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Feature/Issue validation/testing

Successful single inference on Instructor-XL model:

$ curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"inputs": ["Represent the Science title:", "3D ActionSLAM: wearable person tracking in multi-floor environments"]}' \                        
http://127.0.0.1:8080/predictions/instructor_xl_batch
[
  [
    0.010738605633378029,
    0.02038838528096676,
    ...
    -0.003638878930360079,
    0.10961630195379257
  ]
]

Successful batch inference on Instructor-XL model:

$ curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"inputs": [["Represent the Science title:", "3D ActionSLAM: wearable person tracking in multi-floor environments"],["Represent the Medicine sentence for retrieving a duplicate sentence:", "Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear."]]}' \
http://127.0.0.1:8080/predictions/instructor_xl
[
  [
    0.010738605633378029,
    ...
    0.10961630195379257
  ],
  [
    0.014582153409719467,
    ...
    0.08006688207387924
  ]
]

Checklist:

  • Did you have fun?
  • Have you made corresponding changes to the documentation?

@sidharthrajaram sidharthrajaram changed the title fixed arg parser link Handler for Instruction Embedding models (and a typo fix) Jun 25, 2023
@codecov
Copy link

codecov bot commented Jun 26, 2023

Codecov Report

Merging #2431 (d713ab4) into master (9833774) will not change coverage.
The diff coverage is n/a.

❗ Current head d713ab4 differs from pull request most recent head b4f3321. Consider uploading reports for the commit b4f3321 to get more accurate results

@@           Coverage Diff           @@
##           master    #2431   +/-   ##
=======================================
  Coverage   71.89%   71.89%           
=======================================
  Files          78       78           
  Lines        3654     3654           
  Branches       58       58           
=======================================
  Hits         2627     2627           
  Misses       1023     1023           
  Partials        4        4           

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@msaroufim msaroufim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice this was fun to read

a few minor nits

  1. Make sure to run pre-commit there's a linting issue in your python file
  2. Could you please explain what the output means exactly in batch inference section, not sure I entirely follow

@sidharthrajaram
Copy link
Contributor Author

sidharthrajaram commented Jun 26, 2023

  1. Make sure to run pre-commit there's a linting issue in your python file

@msaroufim - Ah, got it. Just ran pre-commit to fix those linting issues and pushed. Thanks.

  1. Could you please explain what the output means exactly in batch inference section, not sure I entirely follow

The output of the batch inference request is two embedding vectors corresponding to the two input pairs (instruction, sentence):

The first input was:
["Represent the Science title:", "3D ActionSLAM: wearable person tracking in multi-floor environments"]

and the second input was:
["Represent the Medicine sentence for retrieving a duplicate sentence:", "Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear."]

The response was a list of 2 embedding vectors (numpy arrays converted .tolist() to ensure they were JSON serializable) corresponding to each of those inputs. The output vectors were quite long so I used ellipses there.

@msaroufim
Copy link
Member

Thanks @sidharthrajaram! On the output explanation I was hoping we could put that in the nice README you created. This is the first time I've seen an instruction embedding model so what do I do after getting a vector? (I'm OK if the embedding isn't used in your handler example and if this update is doc only)

@sidharthrajaram
Copy link
Contributor Author

@msaroufim thanks for the feedback! Added those explanations to the README.

Copy link
Member

@msaroufim msaroufim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! THanks! cc @agunapal for a second stamp

@msaroufim
Copy link
Member

msaroufim commented Jun 27, 2023

@sidharthrajaram one more thing, could you please add

ActionSLAM
statins

to https://github.com/pytorch/serve/blob/master/ts_scripts/spellcheck_conf/wordlist.txt

and quote using `` tolist

@sidharthrajaram
Copy link
Contributor Author

@msaroufim - sounds good, just pushed those changes.

@msaroufim msaroufim merged commit ec3b992 into pytorch:master Jun 27, 2023
@msaroufim
Copy link
Member

Regression test OOM seems like a flake so merging this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants