VLLM Sampling params #171

Closed
HRashidi opened this issue Sep 6, 2024 · 1 comment · Fixed by #174

Comments

HRashidi (Contributor) commented Sep 6, 2024

Feature Summary

  • Add support for setting other values for vLLM sampling params.

Justification/Rationale

  • The user should be able to set all available parameters supported by vLLM, such as repetition_penalty.

Proposed Implementation (if any)

  • Use the vllm.sampling_params model directly as input instead of declaring SamplingParams in the SDK (though this may expose too many arguments to the user).
  • Add a kw_dict for setting other values inside SamplingParams (see the sketch after this list).
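
A minimal sketch of the kw_dict idea (the class and converter names here are hypothetical, not the SDK's actual API; it assumes Pydantic on the SDK side and vLLM installed):

```python
from typing import Any, Optional

from pydantic import BaseModel, Field


class SamplingParams(BaseModel):
    """SDK-side sampling parameters shared across text generation backends."""

    temperature: float = 1.0
    top_p: float = 1.0
    max_tokens: Optional[int] = None
    # Catch-all for backend-specific extras, e.g. {"repetition_penalty": 1.2}.
    kw_dict: dict[str, Any] = Field(default_factory=dict)


def to_vllm_params(params: SamplingParams):
    """Convert the SDK model into vLLM's own SamplingParams."""
    from vllm import SamplingParams as VLLMSamplingParams

    return VLLMSamplingParams(
        temperature=params.temperature,
        top_p=params.top_p,
        max_tokens=params.max_tokens,
        **params.kw_dict,  # vLLM-only options pass through unchanged
    )
```

With this, `SamplingParams(temperature=0.7, kw_dict={"repetition_penalty": 1.2})` keeps the endpoint signature small while still reaching any vLLM-only option.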
movchan74 (Contributor) commented:

Did we get a request for extra sampling parameters from someone?

Just to clarify why we are not using the sampling params model from vLLM (vllm.sampling_params.SamplingParams):

  1. vLLM uses msgspec for SamplingParams, not Pydantic, so it will not work well as a parameter to the endpoints (see the snippet after this list).
  2. We want to support the same SamplingParams across different text generation deployments. Right now we support vLLM and HuggingFace Transformers, and soon HQQ text generation models, so we don't want to be dependent on vLLM.
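
A quick way to check point 1 (a sketch, not from the issue; it assumes a vLLM version where SamplingParams is msgspec-based, as described above):

```python
import msgspec
from pydantic import BaseModel
from vllm import SamplingParams

# vLLM's SamplingParams subclasses msgspec.Struct rather than pydantic.BaseModel,
# so FastAPI/Pydantic endpoints cannot validate or document it natively.
print(issubclass(SamplingParams, msgspec.Struct))  # True on msgspec-based vLLM
print(issubclass(SamplingParams, BaseModel))       # False
```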
