Support top n sigma sampling #2192

Closed

@Snowdar Snowdar commented Nov 26, 2024

Motivation

Support a more complete set of sampling options for decoding, especially for the OpenAI-compatible API.

Modifications

  1. Add the recent top-n-sigma sampling as a decoding option. When top_n_sigma > 0, it is applied to the logits after temperature scaling. Batch processing is implemented.
  2. Add the top_k, min_p, and top_n_sigma params to the OpenAI-compatible API server. (Note: when using the OpenAI SDK, pass these three params via extra_body.)
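
For readers unfamiliar with the method, the filtering step in item 1 can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code in this PR: the function name top_n_sigma_filter is hypothetical, and it assumes the usual top-n-sigma rule (keep tokens whose temperature-scaled logit is within n standard deviations of the row maximum) and a population standard deviation. The real implementation operates on batched tensors in sampler.py and may differ in these details.

```python
import math
from typing import List


def top_n_sigma_filter(
    logits: List[List[float]],
    top_n_sigma: float,
    temperature: float = 1.0,
) -> List[List[float]]:
    """Hypothetical helper: mask logits below max - n * std, row by row."""
    filtered = []
    for row in logits:
        # Temperature is applied first, as described in item 1.
        scaled = [x / temperature for x in row]
        if top_n_sigma > 0:
            mean = sum(scaled) / len(scaled)
            # Assumption: population standard deviation (divide by N).
            std = math.sqrt(sum((x - mean) ** 2 for x in scaled) / len(scaled))
            threshold = max(scaled) - top_n_sigma * std
            # Tokens below the threshold get -inf, so softmax gives them 0.
            scaled = [x if x >= threshold else -math.inf for x in scaled]
        filtered.append(scaled)
    return filtered


batch = [[4.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
masked = top_n_sigma_filter(batch, top_n_sigma=1.0)
```

In this sketch a row with one dominant logit keeps only that token, while a flat row (zero standard deviation) keeps every token. With top_n_sigma <= 0 the filter is skipped and only temperature scaling is applied.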

client example:

from openai import OpenAI

client = OpenAI(api_key='no key', base_url='http://localhost:30000/v1')
response = client.chat.completions.create(
    model='llama',
    # messages must be a list of message dicts, not a single dict
    messages=[dict(role='user', content='hi')],
    temperature=0.3,
    max_tokens=128,
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # params outside the standard OpenAI schema go through extra_body
    extra_body=dict(
        min_p=0.0,
        top_k=-1,
        top_n_sigma=1,  # enable it
    ),
)

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

Contributor

@merrymercy merrymercy left a comment

Thanks for the contribution. I left a few comments.

python/sglang/srt/openai_api/protocol.py (outdated, resolved)
python/sglang/srt/layers/sampler.py (outdated, resolved)
python/sglang/srt/layers/sampler.py (outdated, resolved)
@Snowdar Snowdar requested a review from merrymercy December 2, 2024 02:09
@Snowdar
Author

Snowdar commented Dec 2, 2024

> Thanks for the contribution. I left a few comments.

OK. All of these have been addressed. Could you please review it again?

@merrymercy
Contributor

top-k and min-p are fixed in #2499.
However, top-n-sigma is still not a common sampling strategy, so we may not want to maintain it in the core right now.
For these things, it is better to support them via a custom logit processor interface. Can you help review this issue? #2291

I will close this for now.

@merrymercy merrymercy closed this Dec 17, 2024
@Snowdar
Author

Snowdar commented Dec 18, 2024

> top-k and min-p are fixed in #2499. However, top n sigma is still not a common sampling strategy, we may not want to maintain it in the core right now. For these things, it is better to support them via a custom logit processor interface. Can you help review this issue? #2291
>
> I will close this for now.

So once the custom logit processor is supported, no other sampling code will be merged into sglang? (I am thinking of sharing custom sampling methods with other people through sglang itself.)
