VLLM Sampling params #171

Closed
HRashidi opened this issue Sep 6, 2024 · 1 comment · Fixed by #174

Comments

HRashidi (Contributor) commented Sep 6, 2024

Feature Summary

  • Add support for setting other values for vLLM sampling params.

Justification/Rationale

  • The user should be able to set all available parameters supported by vLLM, such as repetition_penalty.

Proposed Implementation (if any)

  • Use the vllm.sampling_params model directly as input instead of declaring SamplingParams in the SDK (though this may expose too many arguments to the user).
  • Add a kw_dict for setting other values inside SamplingParams (see the sketch after this list).
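
A minimal sketch of the kw_dict idea (the class and converter names here are hypothetical, not the SDK's actual API; it assumes Pydantic on the SDK side and vLLM installed):

```python
from typing import Any, Optional

from pydantic import BaseModel, Field


class SamplingParams(BaseModel):
    """SDK-side sampling parameters shared across text generation backends."""

    temperature: float = 1.0
    top_p: float = 1.0
    max_tokens: Optional[int] = None
    # Catch-all for backend-specific extras, e.g. {"repetition_penalty": 1.2}.
    kw_dict: dict[str, Any] = Field(default_factory=dict)


def to_vllm_params(params: SamplingParams):
    """Convert the SDK model into vLLM's own SamplingParams."""
    from vllm import SamplingParams as VLLMSamplingParams

    return VLLMSamplingParams(
        temperature=params.temperature,
        top_p=params.top_p,
        max_tokens=params.max_tokens,
        **params.kw_dict,  # vLLM-only options pass through unchanged
    )
```

With this, `SamplingParams(temperature=0.7, kw_dict={"repetition_penalty": 1.2})` keeps the endpoint signature small while still reaching any vLLM-only option.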
movchan74 (Contributor) commented:

Did we get a request for extra sampling parameters from someone?

Just to clarify why we are not using the sampling params model from vLLM (vllm.sampling_params.SamplingParams):

  1. vLLM uses msgspec for SamplingParams, not Pydantic, so it will not work well as a parameter to the endpoints (see the snippet after this list).
  2. We want to support the same SamplingParams across different text generation deployments. Right now we support vLLM and HuggingFace Transformers, and soon HQQ text generation models, so we don't want to be dependent on vLLM.
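
A quick way to check point 1 (a sketch, not from the issue; it assumes a vLLM version where SamplingParams is msgspec-based, as described above):

```python
import msgspec
from pydantic import BaseModel
from vllm import SamplingParams

# vLLM's SamplingParams subclasses msgspec.Struct rather than pydantic.BaseModel,
# so FastAPI/Pydantic endpoints cannot validate or document it natively.
print(issubclass(SamplingParams, msgspec.Struct))  # True on msgspec-based vLLM
print(issubclass(SamplingParams, BaseModel))       # False
```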
