[doc] Add LMI Text Embedding Inference user guide #2022

Merged: 3 commits, Jun 5, 2024


xyang16 (Contributor) commented Jun 4, 2024

Description

Brief description of what this PR is about

  • If this change is a backward incompatible change, why must this change be made?
  • Interesting edge cases to note here

@xyang16 xyang16 requested review from zachgk, frankfliu and a team as code owners June 4, 2024 21:26
serving/docs/lmi/user_guides/embedding-user-guide.md (outdated, resolved)

You can leverage LMI Text Embedding inference using the following starter configurations:

### serving.properties
Contributor


We should not promote this usage. Instead, we should guide users to:

  1. Use a djl:// model if the model already exists
  2. Use HF_MODEL_ID to convert the model at runtime
  3. Manually import the model into DJL model format with our djl_convert/djl_import tool
  4. And finally, fully customize with serving.properties
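
As a rough illustration of the first two options in the list above, the sketch below classifies a model reference and builds the matching environment. `classify_model_ref` and `build_env` are hypothetical helpers, not part of DJL Serving; `HF_MODEL_ID` and the `djl://` prefix are taken from this thread, and treating any other reference as a Hugging Face model id is an assumption.

```python
def classify_model_ref(model_ref: str) -> str:
    """Label which of the first two options a model reference falls under."""
    if not model_ref:
        raise ValueError("model_ref must be a non-empty string")
    # Option 1: the model already exists in the DJL model zoo.
    if model_ref.startswith("djl://"):
        return "model_zoo"
    # Option 2 (assumption): anything else is a Hugging Face model id
    # that gets converted at runtime.
    return "runtime_conversion"


def build_env(model_ref: str) -> dict:
    """Build container environment variables for options 1 and 2.

    Only HF_MODEL_ID is set. The thread says OPTION_ENGINE is not needed
    for model zoo models; it does not say whether other ids need it, so
    this sketch leaves it out in both cases.
    """
    classify_model_ref(model_ref)  # validates the input
    return {"HF_MODEL_ID": model_ref}


zoo_ref = "djl://ai.djl.huggingface.onnxruntime/BAAI/bge-base-en-v1.5"
print(classify_model_ref(zoo_ref), build_env(zoo_ref))
print(classify_model_ref("BAAI/bge-base-en-v1.5"))
```

Options 3 and 4 involve offline tooling and a hand-written serving.properties, so they are left out of the sketch.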

Contributor Author


If the user gives HF_MODEL_ID, the code will automatically convert the model for them.

@xyang16 xyang16 force-pushed the docs branch 3 times, most recently from a457f74 to 73f18e7 Compare June 4, 2024 23:34

```
OPTION_ENGINE=OnnxRuntime
MODEL_URL=<your model url>
```

Contributor

Suggested change:

```
- MODEL_URL=<your model url>
+ HF_MODEL_ID=djl://ai.djl.huggingface.onnxruntime/BAAI/bge-base-en-v1.5
```

You can specify the djl:// model url to load a model from the DJL model zoo.

```
OPTION_ENGINE=OnnxRuntime
```

Contributor

We don't need OPTION_ENGINE for the DJL model zoo.
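
To make the shape of the `djl://` URL from the suggested change concrete, here is a minimal sketch that splits it with Python's standard `urllib.parse`. `split_djl_url` is a hypothetical helper; reading the host part as a model zoo group and the remaining path as the model name is an assumption about the URL layout, not something this thread confirms.

```python
from urllib.parse import urlsplit


def split_djl_url(url: str) -> tuple:
    """Split a djl:// URL into (group, model_path).

    Assumption: the part after 'djl://' up to the first '/' identifies
    the model zoo group, and the rest identifies the model.
    """
    parts = urlsplit(url)
    if parts.scheme != "djl":
        raise ValueError(f"not a djl:// URL: {url!r}")
    return parts.netloc, parts.path.lstrip("/")


group, model_path = split_djl_url(
    "djl://ai.djl.huggingface.onnxruntime/BAAI/bge-base-en-v1.5"
)
print(group)
print(model_path)
```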

1.1664978,
0.79496926,
0.28931668,
1.2245488,
Contributor


This is too long. Could we trim it to ... in the middle? And also show the expected shape.
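
One way to produce the trimmed form requested above is sketched below. `summarize_embedding` is a hypothetical helper, and the four values are only the fragment visible in the diff, not a full model output; the real embedding dimension is not stated in this thread.

```python
def summarize_embedding(vector, head=2, tail=2):
    """Render a vector as '[first, ..., last]' followed by its shape."""
    n = len(vector)
    if n <= head + tail:
        # Short vectors are shown in full.
        shown = ", ".join(str(v) for v in vector)
    else:
        first = ", ".join(str(v) for v in vector[:head])
        # Slice from n - tail so that tail=0 yields an empty tail.
        last = ", ".join(str(v) for v in vector[n - tail:])
        shown = ", ".join(p for p in (first, "...", last) if p)
    return f"[{shown}]  # shape: ({n},)"


fragment = [1.1664978, 0.79496926, 0.28931668, 1.2245488]
print(summarize_embedding(fragment, head=1, tail=1))
```

For the fragment above this prints `[1.1664978, ..., 1.2245488]  # shape: (4,)`.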

@xyang16 xyang16 merged commit 419b889 into deepjavalibrary:master Jun 5, 2024
2 checks passed
sindhuvahinis pushed a commit to sindhuvahinis/djl-serving that referenced this pull request Jun 12, 2024
Co-authored-by: Frank Liu <frankfliu2000@gmail.com>
3 participants