[InferenceClient] Manually updating chat_completion()'s params types #2682

Merged · 2 commits · Nov 28, 2024
docs/source/en/package_reference/inference_types.md (6 changes: 3 additions & 3 deletions)

@@ -65,7 +65,9 @@ This part of the lib is still under development and will be improved in future r
 
 [[autodoc]] huggingface_hub.ChatCompletionInputStreamOptions
 
-[[autodoc]] huggingface_hub.ChatCompletionInputToolType
+[[autodoc]] huggingface_hub.ChatCompletionInputTool
+
+[[autodoc]] huggingface_hub.ChatCompletionInputToolChoiceClass
 
 [[autodoc]] huggingface_hub.ChatCompletionInputURL
 
@@ -105,8 +107,6 @@ This part of the lib is still under development and will be improved in future r
 
 [[autodoc]] huggingface_hub.ChatCompletionStreamOutputUsage
 
-[[autodoc]] huggingface_hub.ToolElement
-
 
 
 ## depth_estimation
docs/source/ko/package_reference/inference_types.md (6 changes: 3 additions & 3 deletions)

@@ -64,7 +64,9 @@ rendered properly in your Markdown viewer.
 
 [[autodoc]] huggingface_hub.ChatCompletionInputStreamOptions
 
-[[autodoc]] huggingface_hub.ChatCompletionInputToolType
+[[autodoc]] huggingface_hub.ChatCompletionInputTool
+
+[[autodoc]] huggingface_hub.ChatCompletionInputToolChoiceClass
 
 [[autodoc]] huggingface_hub.ChatCompletionInputURL
 
@@ -104,8 +106,6 @@ rendered properly in your Markdown viewer.
 
 [[autodoc]] huggingface_hub.ChatCompletionStreamOutputUsage
 
-[[autodoc]] huggingface_hub.ToolElement
-
 
 
 ## depth_estimation[[huggingface_hub.DepthEstimationInput]]
src/huggingface_hub/__init__.py (10 changes: 6 additions & 4 deletions)

@@ -297,7 +297,9 @@
         "ChatCompletionInputMessageChunk",
         "ChatCompletionInputMessageChunkType",
         "ChatCompletionInputStreamOptions",
-        "ChatCompletionInputToolType",
+        "ChatCompletionInputTool",
+        "ChatCompletionInputToolChoiceClass",
+        "ChatCompletionInputToolChoiceEnum",
         "ChatCompletionInputURL",
         "ChatCompletionOutput",
         "ChatCompletionOutputComplete",
@@ -400,7 +402,6 @@
         "TokenClassificationInput",
         "TokenClassificationOutputElement",
         "TokenClassificationParameters",
-        "ToolElement",
         "TranslationInput",
         "TranslationOutput",
         "TranslationParameters",
@@ -827,7 +828,9 @@ def __dir__():
     ChatCompletionInputMessageChunk,  # noqa: F401
     ChatCompletionInputMessageChunkType,  # noqa: F401
     ChatCompletionInputStreamOptions,  # noqa: F401
-    ChatCompletionInputToolType,  # noqa: F401
+    ChatCompletionInputTool,  # noqa: F401
+    ChatCompletionInputToolChoiceClass,  # noqa: F401
+    ChatCompletionInputToolChoiceEnum,  # noqa: F401
     ChatCompletionInputURL,  # noqa: F401
     ChatCompletionOutput,  # noqa: F401
     ChatCompletionOutputComplete,  # noqa: F401
@@ -930,7 +933,6 @@ def __dir__():
     TokenClassificationInput,  # noqa: F401
     TokenClassificationOutputElement,  # noqa: F401
     TokenClassificationParameters,  # noqa: F401
-    ToolElement,  # noqa: F401
     TranslationInput,  # noqa: F401
     TranslationOutput,  # noqa: F401
     TranslationParameters,  # noqa: F401
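For code that imported the old names, the export changes above imply a short migration. A sketch, assuming the new names are re-exported at the package root as the `__all__` changes suggest:

```python
# Migration sketch: names taken from the __init__.py diff above.
# `from huggingface_hub import ToolElement` no longer works after this PR.
from huggingface_hub import (
    ChatCompletionInputTool,  # replaces ToolElement
    ChatCompletionInputToolChoiceClass,  # structured tool_choice
    ChatCompletionInputToolChoiceEnum,  # replaces the old ToolType/str union
)
```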
src/huggingface_hub/inference/_client.py (62 changes: 26 additions & 36 deletions)

@@ -70,7 +70,9 @@
     AutomaticSpeechRecognitionOutput,
     ChatCompletionInputGrammarType,
     ChatCompletionInputStreamOptions,
-    ChatCompletionInputToolType,
+    ChatCompletionInputTool,
+    ChatCompletionInputToolChoiceClass,
+    ChatCompletionInputToolChoiceEnum,
     ChatCompletionOutput,
     ChatCompletionStreamOutput,
     DocumentQuestionAnsweringOutputElement,
@@ -79,6 +81,7 @@
     ImageClassificationOutputTransform,
     ImageSegmentationOutputElement,
     ImageSegmentationSubtask,
+    ImageToImageTargetSize,
     ImageToTextOutput,
     ObjectDetectionOutputElement,
     QuestionAnsweringOutputElement,
@@ -94,7 +97,6 @@
     TextToSpeechEarlyStoppingEnum,
     TokenClassificationAggregationStrategy,
     TokenClassificationOutputElement,
-    ToolElement,
     TranslationOutput,
     TranslationTruncationStrategy,
     VisualQuestionAnsweringOutputElement,
@@ -473,9 +475,9 @@ def chat_completion(  # type: ignore
         stop: Optional[List[str]] = None,
         stream_options: Optional[ChatCompletionInputStreamOptions] = None,
         temperature: Optional[float] = None,
-        tool_choice: Optional[Union[ChatCompletionInputToolType, str]] = None,
+        tool_choice: Optional[Union[ChatCompletionInputToolChoiceClass, "ChatCompletionInputToolChoiceEnum"]] = None,
         tool_prompt: Optional[str] = None,
-        tools: Optional[List[ToolElement]] = None,
+        tools: Optional[List[ChatCompletionInputTool]] = None,
         top_logprobs: Optional[int] = None,
         top_p: Optional[float] = None,
     ) -> ChatCompletionOutput: ...
@@ -498,9 +500,9 @@ def chat_completion(  # type: ignore
         stop: Optional[List[str]] = None,
         stream_options: Optional[ChatCompletionInputStreamOptions] = None,
         temperature: Optional[float] = None,
-        tool_choice: Optional[Union[ChatCompletionInputToolType, str]] = None,
+        tool_choice: Optional[Union[ChatCompletionInputToolChoiceClass, "ChatCompletionInputToolChoiceEnum"]] = None,
         tool_prompt: Optional[str] = None,
-        tools: Optional[List[ToolElement]] = None,
+        tools: Optional[List[ChatCompletionInputTool]] = None,
         top_logprobs: Optional[int] = None,
         top_p: Optional[float] = None,
     ) -> Iterable[ChatCompletionStreamOutput]: ...
@@ -523,9 +525,9 @@ def chat_completion(
         stop: Optional[List[str]] = None,
         stream_options: Optional[ChatCompletionInputStreamOptions] = None,
         temperature: Optional[float] = None,
-        tool_choice: Optional[Union[ChatCompletionInputToolType, str]] = None,
+        tool_choice: Optional[Union[ChatCompletionInputToolChoiceClass, "ChatCompletionInputToolChoiceEnum"]] = None,
         tool_prompt: Optional[str] = None,
-        tools: Optional[List[ToolElement]] = None,
+        tools: Optional[List[ChatCompletionInputTool]] = None,
         top_logprobs: Optional[int] = None,
         top_p: Optional[float] = None,
     ) -> Union[ChatCompletionOutput, Iterable[ChatCompletionStreamOutput]]: ...
@@ -548,9 +550,9 @@ def chat_completion(
         stop: Optional[List[str]] = None,
         stream_options: Optional[ChatCompletionInputStreamOptions] = None,
         temperature: Optional[float] = None,
-        tool_choice: Optional[Union[ChatCompletionInputToolType, str]] = None,
+        tool_choice: Optional[Union[ChatCompletionInputToolChoiceClass, "ChatCompletionInputToolChoiceEnum"]] = None,
         tool_prompt: Optional[str] = None,
-        tools: Optional[List[ToolElement]] = None,
+        tools: Optional[List[ChatCompletionInputTool]] = None,
         top_logprobs: Optional[int] = None,
         top_p: Optional[float] = None,
     ) -> Union[ChatCompletionOutput, Iterable[ChatCompletionStreamOutput]]:
@@ -616,11 +618,11 @@ def chat_completion(
             top_p (`float`, *optional*):
                 Fraction of the most likely next words to sample from.
                 Must be between 0 and 1. Defaults to 1.0.
-            tool_choice ([`ChatCompletionInputToolType`] or `str`, *optional*):
+            tool_choice ([`ChatCompletionInputToolChoiceClass`] or [`ChatCompletionInputToolChoiceEnum`], *optional*):
                 The tool to use for the completion. Defaults to "auto".
             tool_prompt (`str`, *optional*):
                 A prompt to be appended before the tools.
-            tools (List of [`ToolElement`], *optional*):
+            tools (List of [`ChatCompletionInputTool`], *optional*):
                 A list of tools the model may call. Currently, only functions are supported as a tool. Use this to
                 provide a list of functions the model may generate JSON inputs for.
 
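To make the retyped parameters concrete, a minimal usage sketch follows; the model ID and the tool schema are illustrative assumptions, not something this diff prescribes:

```python
# Hedged sketch of chat_completion() with the retyped `tools`/`tool_choice`
# parameters. The model ID and the weather tool below are illustrative.
from huggingface_hub import InferenceClient

client = InferenceClient()

# OpenAI-style function schema; plain dicts are serialized into the
# ChatCompletionInputTool payload shape.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",  # hypothetical tool
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat_completion(
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # any chat model; assumption
    tools=tools,
    tool_choice="auto",  # a ChatCompletionInputToolChoiceEnum literal
)
print(response.choices[0].message)
```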
@@ -1224,12 +1226,11 @@ def image_to_image(
         image: ContentT,
         prompt: Optional[str] = None,
         *,
-        negative_prompt: Optional[str] = None,
-        height: Optional[int] = None,
-        width: Optional[int] = None,
+        negative_prompt: Optional[List[str]] = None,
         num_inference_steps: Optional[int] = None,
         guidance_scale: Optional[float] = None,
         model: Optional[str] = None,
+        target_size: Optional[ImageToImageTargetSize] = None,
         **kwargs,
     ) -> "Image":
         """

Review comment on lines +1229 to +1233:

Contributor: (nit) worth mentioning this breaking change in the release notes

Contributor (author): btw, I did not add a proper deprecation given the possible limited usage of InferenceClient.image_to_image(), but of course, this will be mentioned in the release notes!

Contributor: yep, no need for a deprecation cycle here 👍
@@ -1246,21 +1247,19 @@ def image_to_image(
                 The input image for translation. It can be raw bytes, an image file, or a URL to an online image.
             prompt (`str`, *optional*):
                 The text prompt to guide the image generation.
-            negative_prompt (`str`, *optional*):
-                A negative prompt to guide the translation process.
-            height (`int`, *optional*):
-                The height in pixels of the generated image.
-            width (`int`, *optional*):
-                The width in pixels of the generated image.
+            negative_prompt (`List[str]`, *optional*):
+                One or several prompt to guide what NOT to include in image generation.
             num_inference_steps (`int`, *optional*):
-                The number of denoising steps. More denoising steps usually lead to a higher quality image at the
-                expense of slower inference.
+                For diffusion models. The number of denoising steps. More denoising steps usually lead to a higher
+                quality image at the expense of slower inference.
             guidance_scale (`float`, *optional*):
-                Higher guidance scale encourages to generate images that are closely linked to the text `prompt`,
-                usually at the expense of lower image quality.
+                For diffusion models. A higher guidance scale value encourages the model to generate images closely
+                linked to the text prompt at the expense of lower image quality.
             model (`str`, *optional*):
                 The model to use for inference. Can be a model ID hosted on the Hugging Face Hub or a URL to a deployed
                 Inference Endpoint. This parameter overrides the model defined at the instance level. Defaults to None.
+            target_size (`ImageToImageTargetSize`, *optional*):
+                The size in pixel of the output image.
 
         Returns:
             `Image`: The translated image.
@@ -1282,8 +1281,7 @@ def image_to_image(
         parameters = {
             "prompt": prompt,
             "negative_prompt": negative_prompt,
-            "height": height,
-            "width": width,
+            "target_size": target_size,
             "num_inference_steps": num_inference_steps,
             "guidance_scale": guidance_scale,
             **kwargs,
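Since the review thread above flags this signature change as breaking, a before/after sketch may help downstream users. The input file, prompts, sizes, and model ID are illustrative assumptions, and `ImageToImageTargetSize` is assumed to be re-exported at the package root like the other generated types:

```python
# Hedged sketch of the new image_to_image() surface: `target_size` replaces
# the old `height`/`width` arguments, and `negative_prompt` is now a list.
from huggingface_hub import InferenceClient
from huggingface_hub import ImageToImageTargetSize  # assumed root re-export

client = InferenceClient()

image = client.image_to_image(
    "cat.png",  # illustrative input image (path, raw bytes, or URL)
    prompt="a cat wearing a top hat",
    negative_prompt=["blurry", "low quality"],  # previously a single str
    target_size=ImageToImageTargetSize(width=512, height=512),  # previously height=/width=
    num_inference_steps=30,
    guidance_scale=7.5,
    model="timbrooks/instruct-pix2pix",  # illustrative model ID
)
image.save("cat_with_hat.png")  # returns a PIL Image
```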
@@ -2469,21 +2467,13 @@ def text_to_speech(
                 Defaults to None.
             do_sample (`bool`, *optional*):
                 Whether to use sampling instead of greedy decoding when generating new tokens.
-            early_stopping (`Union[bool, "TextToSpeechEarlyStoppingEnum"`, *optional*):
+            early_stopping (`Union[bool, "TextToSpeechEarlyStoppingEnum"]`, *optional*):
                 Controls the stopping condition for beam-based methods.
             epsilon_cutoff (`float`, *optional*):
-                If set to float strictly between 0 and 1, only tokens with a conditional probability greater than
-                epsilon_cutoff will be sampled. In the paper, suggested values range from 3e-4 to 9e-4, depending on
-                the size of the model. See [Truncation Sampling as Language Model
-                Desmoothing](https://hf.co/papers/2210.15191) for more details.
-            eta_cutoff (`float`, *optional*):
-                Eta sampling is a hybrid of locally typical sampling and epsilon sampling. If set to float strictly
-                between 0 and 1, a token is only considered if it is greater than either eta_cutoff or sqrt(eta_cutoff)
-                * exp(-entropy(softmax(next_token_logits))). The latter term is intuitively the expected next token
-                probability, scaled by sqrt(eta_cutoff). In the paper, suggested values range from 3e-4 to 2e-3,
-                depending on the size of the model. See [Truncation Sampling as Language Model
-                Desmoothing](https://hf.co/papers/2210.15191) for more details.
+                If set to float strictly between 0 and 1, only tokens with a conditional probability greater than
+            eta_cutoff (`float`, *optional*):
+                Eta sampling is a hybrid of locally typical sampling and epsilon sampling. If set to float strictly
+                between 0 and 1, a token is only considered if it is greater than either eta_cutoff or sqrt(eta_cutoff)
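As an aside, the eta-sampling rule quoted in this docstring reduces to a single threshold test: exceeding "either" of two thresholds means exceeding the smaller one. An illustrative NumPy sketch (not the server implementation):

```python
# Illustrative sketch of the eta-sampling criterion from the docstring:
# keep a token iff p(token) > min(eta_cutoff, sqrt(eta_cutoff) * exp(-H)),
# where H is the entropy of softmax(next_token_logits).
import numpy as np

def eta_sampling_mask(next_token_logits: np.ndarray, eta_cutoff: float) -> np.ndarray:
    """Boolean mask over the vocabulary of tokens that eta sampling keeps."""
    shifted = next_token_logits - next_token_logits.max()  # stabilize exp()
    probs = np.exp(shifted) / np.exp(shifted).sum()  # softmax(next_token_logits)
    entropy = -np.sum(probs * np.log(probs + 1e-12))  # H(softmax(...))
    threshold = min(eta_cutoff, np.sqrt(eta_cutoff) * np.exp(-entropy))
    return probs > threshold

# e.g. keep-mask over a 32k vocabulary, using the docstring's upper suggested value
mask = eta_sampling_mask(np.random.randn(32000), eta_cutoff=2e-3)
```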