Allow image uploads to gr.load_chat #10345

Merged
merged 19 commits into main from load_chat_image_upload on Feb 3, 2025

Conversation

Collaborator

@aliabid94 aliabid94 commented Jan 13, 2025

This adds the ability to upload images with gr.load_chat. I added an api_media parameter to gr.load_chat. In the future we can add "image_generation" and "file_assistant", which use very different api structures but are common model types.

Test with

import gradio as gr

demo = gr.load_chat(base_url="https://api.openai.com/v1",
                    model="gpt-4o",
                    token="...",
                    file_types="image")

if __name__ == "__main__":
    demo.launch()

@aliabid94 aliabid94 requested a review from abidlabs January 13, 2025 17:28
@gradio-pr-bot
Collaborator

gradio-pr-bot commented Jan 13, 2025

🪼 branch checks and previews

Name | Status | URL
Spaces | ready! | Spaces preview
Website | ready! | Website preview
🦄 Changes detected! | Details

Install Gradio from this PR

pip install https://gradio-pypi-previews.s3.amazonaws.com/033de375f2c99a7f0cbe0a8064a15f5bcf9cc72e/gradio-5.14.0-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@033de375f2c99a7f0cbe0a8064a15f5bcf9cc72e#subdirectory=client/python"

Install Gradio JS Client from this PR

npm install https://gradio-npm-previews.s3.amazonaws.com/033de375f2c99a7f0cbe0a8064a15f5bcf9cc72e/gradio-client-1.10.0.tgz

Use Lite from this PR

<script type="module" src="https://gradio-lite-previews.s3.amazonaws.com/033de375f2c99a7f0cbe0a8064a15f5bcf9cc72e/dist/lite.js"></script>

@gradio-pr-bot
Collaborator

gradio-pr-bot commented Jan 13, 2025

🦄 change detected

This Pull Request includes changes to the following packages.

Package | Version
gradio | minor
  • Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Allow image uploads to gr.load_chat

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@abidlabs
Member

"image_generation" and "file_assistant", which use very different api structures but are common model types

where are these "model types" coming from?

Could we just use our existing parameter multimodal=True, which is already familiar to developers?

@aliabid94
Collaborator Author

aliabid94 commented Jan 13, 2025

where are these "model types" coming from?

So the OpenAI API supports these endpoints, which are all distinct endpoints in the client:

  1. regular chat (including image uploads)
  2. image generation
  3. file upload and chat
  … and some other specific endpoints

So, for example, you cannot have a chat endpoint that supports both image generation and image uploads, or both image uploads and file uploads. And the client API is very distinct for each of these.

Now most non-OpenAI endpoints only implement the regular chat endpoint. Even within that, many models only support text, not image uploads. So, for example, with Anthropic you can't actually upload a CSV - you have to include the file as text in the prompt itself.

For this reason, I'm starting off with just text and image_upload. Images and non-image files do not use the same API, and non-image files are not supported by most providers, so we'll support those when they become more popular. Non-image binary files also require tool use to process them.

multimodal would imply any type of file, and since the API behaviours differ for image and non-image files, that's not the right arg name.

@aliabid94
Collaborator Author

Ready for re-review. Changed the API to use file_types, which can support "text_files" (any text-encoded file, which is added to the prompt) and "image" (which embeds the image as base64):

if "text_files" in file_types:
supported_extensions += TEXT_FILE_EXTENSIONS
if "images" in file_types:
supported_extensions += IMAGE_FILE_EXTENSIONS
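
For context, embedding an image as base64 in an OpenAI-style chat message usually looks something like the sketch below (the helper name and exact structure are illustrative, not this PR's actual code):

import base64
import mimetypes

def image_message(prompt: str, image_path: str) -> dict:
    # Illustrative helper (not the PR's implementation): read the file, base64-encode it,
    # and attach it as a data URL alongside the text prompt.
    mime_type, _ = mimetypes.guess_type(image_path)
    mime_type = mime_type or "image/png"  # fallback if the extension is unknown
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }
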
Member

@abidlabs abidlabs Jan 15, 2025

You can just set file_types="image", which covers all of the image formats that have the image mimetype (for example, your list above is missing .webp)
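
(For illustration only - in Gradio components, file_types accepts a type keyword as well as explicit extensions; the extension list below is just an example:)

import gradio as gr

# "image" matches any upload with an image/* mimetype (so .webp is included),
# whereas an explicit extension list only matches what it enumerates.
by_type = gr.MultimodalTextbox(file_types=["image"])
by_extension = gr.MultimodalTextbox(file_types=[".png", ".jpg", ".jpeg", ".gif"])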

Collaborator Author

The OpenAI image API only supports a subset of the image formats covered by "image/*", so it's necessary to specify them explicitly.

@document()
def load_chat(
    base_url: str,
    model: str,
    token: str | None = None,
    *,
    file_types: Literal["text_files", "images"]
Member

It's not intuitive that "text_files" includes any text-encoded file; I would have expected just .txt files. Perhaps "text_encoded" is better. I would also rename "images" -> "image", as that's consistent with how images are specified in the file_types param of other components.

@abidlabs
Member

I tested this demo:

import gradio as gr

demo = gr.load_chat(base_url="https://api.openai.com/v1",
                    model="gpt-4o-mini",
                    token="sk-...",
                    file_types=["images"])

if __name__ == "__main__":
    demo.launch()

The first message (with an image attachment) went through and I got a good response, but with the second message (no image attachment), I got this error:

openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid type for 'messages[0].content[0]': expected an object, but got a string instead.", 'type': 'invalid_request_error', 'param': 'messages[0].content[0]', 'code': 'invalid_type'}}
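
(For context, this 400 typically means a message's content was sent in list form but with entries that weren't wrapped as typed objects; a rough sketch of the shape the API expects, not this PR's actual fix:)

# When `content` is the list form, every entry must be a typed object; a bare
# string inside the list triggers the "expected an object, but got a string" error.
ok = {
    "role": "user",
    "content": [{"type": "text", "text": "What is in this image?"}],
}
bad = {
    "role": "user",
    "content": ["What is in this image?"],  # bare string in the list -> 400
}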

Overall implementation lgtm, I would add some tests & expand the docs (maybe with a dedicated guide?) now that gr.load_chat() is getting more complex

@aliabid94
Collaborator Author

The first message (with an image attachment) went through and I got a good response, but with the second message (no image attachment), I got this error:

Fixed

@aliabid94
Collaborator Author

Overall implementation lgtm, I would add some tests & expand the docs

Tests are a bit tricky - what "always available" endpoint can I connect gr.load_chat to for testing?

gr.load_chat is really quite simple; it was already covered in the guide, but I linked to the docs from the guides as well and expanded the docs a bit.

@abidlabs
Member

Tests are a bit tricky - what "always available" endpoint can I connect gr.load_chat to for testing?

Just mock one so that we don’t break any core functionality in a future PR
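
One possible shape for such a test (a rough sketch only - the patch target, the fake response shape, and the assumption that gr.load_chat builds a gr.ChatInterface on top of the OpenAI Python client are all assumptions, not the tests actually added in this PR):

from unittest.mock import MagicMock, patch

import gradio as gr

def test_load_chat_constructs_chat_interface():
    # Fake a chat.completions.create response in the shape the OpenAI client returns.
    fake_response = MagicMock()
    fake_response.choices = [MagicMock(message=MagicMock(content="Hello!"))]

    # Assumes load_chat resolves the client via openai.OpenAI at call time.
    with patch("openai.OpenAI") as mock_client_cls:
        mock_client_cls.return_value.chat.completions.create.return_value = fake_response
        demo = gr.load_chat(
            base_url="http://localhost:11111/v1",  # never actually contacted
            model="fake-model",
            token="fake-token",
        )
        assert isinstance(demo, gr.ChatInterface)  # assumption: load_chat returns a ChatInterface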

gradio/external.py (outdated review comment, resolved)
@abidlabs
Member

The first message (with an image attachment) went through and I got a good response, but with the second message (no image attachment), I got this error:

I'm still seeing this error (see video below). The error I see in the terminal is:

openai.InternalServerError: Error code: 500 - {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID req_5a9164a17a39d924958d5e0aed0e8dad in your email.)', 'type': 'server_error', 'param': None, 'code': None}}
[video attachment: Screen.Recording.2025-01-17.at.8.59.02.AM.mov]

@aliabid94
Collaborator Author

I'm still seeing this error (see video below). The error I see in the terminal is:

Fixed.

gradio/external.py (outdated review comment, resolved)
Comment on lines +814 to +817
    multimodal=bool(file_types),
    textbox=gradio.MultimodalTextbox(file_types=supported_extensions)
    if file_types
    else None,
Member

Just FYI, this only allows uploading a single image at a time, whereas the API can support multiple images, I believe. To change this, set file_count="multiple" in gr.MultimodalTextbox.
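
A minimal sketch of that suggested change applied to the snippet above (illustrative only):

    multimodal=bool(file_types),
    textbox=gradio.MultimodalTextbox(
        file_types=supported_extensions,
        file_count="multiple",  # allow attaching several files per message
    )
    if file_types
    else None,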

Member

@abidlabs abidlabs left a comment

See note about file count above. I would also add a test that mocks responses from an openai server just so we don't accidentally break things, but otherwise lgtm!

@abidlabs
Member

abidlabs commented Feb 3, 2025

Added a couple of tests, will merge this in once tests pass.

@abidlabs abidlabs enabled auto-merge (squash) February 3, 2025 19:51
@abidlabs abidlabs merged commit 39f0c23 into main Feb 3, 2025
22 checks passed
@abidlabs abidlabs deleted the load_chat_image_upload branch February 3, 2025 20:01