Handler for Instruction Embedding models (and a typo fix) #2431
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2431 +/- ##
=======================================
Coverage 71.89% 71.89%
=======================================
Files 78 78
Lines 3654 3654
Branches 58 58
=======================================
Hits 2627 2627
Misses 1023 1023
Partials 4 4
Nice, this was fun to read. A few minor nits:
- Make sure to run pre-commit; there's a linting issue in your Python file.
- Could you please explain what the output means exactly in the batch inference section? Not sure I entirely follow.
@msaroufim - Ah, got it. Just ran pre-commit to fix those linting issues and pushed. Thanks.
The output of the batch inference request is two embedding vectors corresponding to the two input (instruction, sentence) pairs. The first input was: and the second input was: The response was a list of 2 embedding vectors (numpy arrays converted with .tolist() to ensure they were JSON serializable) corresponding to each of those inputs. The output vectors were quite long, so I used ellipses there.
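The request/response shape described here can be sketched in plain Python. The (instruction, sentence) pairs below are illustrative stand-ins (the actual inputs from the PR screenshots are not reproduced), and the embeddings are dummy values standing in for the model's numpy output:

```python
import json

# Illustrative (instruction, sentence) pairs -- assumptions, not the PR's actual inputs.
batch = [
    {"instruction": "Represent the science sentence:",
     "sentence": "Parton energy loss in QCD matter"},
    {"instruction": "Represent the financial statement:",
     "sentence": "The company reported record quarterly revenue"},
]

# Each request body sent to the serving endpoint is one JSON object.
request_bodies = [json.dumps(pair) for pair in batch]

# The model returns one embedding per input pair; numpy arrays are not JSON
# serializable, so the handler calls .tolist() on each before responding.
# These short vectors are stand-ins for the (much longer) real embeddings.
fake_embeddings = [[0.12, -0.05, 0.33], [0.41, 0.08, -0.27]]
response = json.dumps(fake_embeddings)

decoded = json.loads(response)
print(len(decoded))  # one embedding vector per (instruction, sentence) pair
```

The key point is the one-to-one correspondence: a batch of N input pairs yields a JSON list of N embedding vectors.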
Thanks @sidharthrajaram! On the output explanation I was hoping we could put that in the nice README you created. This is the first time I've seen an instruction embedding model so what do I do after getting a vector? (I'm OK if the embedding isn't used in your handler example and if this update is doc only)
@msaroufim thanks for the feedback! Added those explanations to the README. |
Cool! Thanks! cc @agunapal for a second stamp
@sidharthrajaram one more thing, could you please add `tolist` to https://github.com/pytorch/serve/blob/master/ts_scripts/spellcheck_conf/wordlist.txt and quote it using backticks?
@msaroufim - sounds good, just pushed those changes. |
Regression test OOM seems like a flake, so merging this
Description
A simple handler that one can use to serve Instructor Embedding models with TorchServe, supporting both single inference and batch inference. Instructor Embedding models require both a sentence to embed as well as an instruction to contextualize the resultant embedding. Included in the PR is a README that shows how to serve Instructor Embedding models via torchserve, as well as some example use cases.

Also in this PR is a simple documentation fix: fixed a strange anchor link in the documentation page about TorchServe's internals. Under the description of the different components of TorchServe's backend (Python), the link to arg_parser.py leads to a seemingly random line in the file instead of the top-line description of the file and the ArgParser class. With this change, the link points to arg_parser.py itself instead of the anchor at a random line within arg_parser.py.

Type of change
Please delete options that are not relevant.
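The handler described above follows TorchServe's preprocess/inference/postprocess lifecycle. Below is a stand-alone sketch of that shape, not the PR's actual code: a real handler would subclass `ts.torch_handler.base_handler.BaseHandler` and load the INSTRUCTOR model (e.g. `INSTRUCTOR("hkunlp/instructor-xl")` from the InstructorEmbedding package) in `initialize()`. Here the model is stubbed so the flow is runnable anywhere:

```python
import json

class InstructorEmbeddingHandlerSketch:
    """Sketch of a TorchServe-style handler for instruction embeddings (stubbed model)."""

    def initialize(self, context=None):
        # Real version would load the model, e.g.:
        #   self.model = INSTRUCTOR("hkunlp/instructor-xl")
        # Stub: returns a fixed-length zero vector per input pair.
        self.model = lambda pairs: [[0.0] * 4 for _ in pairs]

    def preprocess(self, requests):
        # Each request carries a JSON body with an instruction and a sentence.
        pairs = []
        for req in requests:
            body = req["body"]
            if isinstance(body, (str, bytes)):
                body = json.loads(body)
            pairs.append([body["instruction"], body["sentence"]])
        return pairs

    def inference(self, pairs):
        # Real version: self.model.encode(pairs) -> numpy array of embeddings.
        return self.model(pairs)

    def postprocess(self, embeddings):
        # numpy arrays need .tolist() to be JSON serializable; lists pass through.
        return [list(vec) for vec in embeddings]

handler = InstructorEmbeddingHandlerSketch()
handler.initialize()
out = handler.postprocess(handler.inference(handler.preprocess([
    {"body": {"instruction": "Represent the sentence:", "sentence": "hello"}},
    {"body": {"instruction": "Represent the sentence:", "sentence": "world"}},
])))
print(len(out))  # 2 -> one embedding per (instruction, sentence) pair
```

Because preprocess accepts a list of requests, the same code path serves both single inference (a one-element batch) and batch inference.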
Feature/Issue validation/testing
Successful single inference on Instructor-XL model:
Successful batch inference on Instructor-XL model:
Checklist: