Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Image input causes failure when mixing visual and non-visual LLMs (Dify 0.10) #9750

Closed
5 tasks done
hejuntt1014 opened this issue Oct 24, 2024 · 8 comments · Fixed by #9790
Closed
5 tasks done

Image input causes failure when mixing visual and non-visual LLMs (Dify 0.10) #9750

hejuntt1014 opened this issue Oct 24, 2024 · 8 comments · Fixed by #9790
Assignees
Labels
🐞 bug Something isn't working

Comments

@hejuntt1014
Copy link

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues search for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (我已阅读并同意 Language Policy).
  • [FOR CHINESE USERS] 请务必使用英文提交 Issue,否则会被关闭。谢谢!:)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.10.1

Cloud or Self Hosted

Cloud

Steps to reproduce

Steps to Reproduce the Issue:

  1. Create a flow in Dify 0.10 that includes:

    • A visual-enabled LLM (e.g., qwen-vl-plus).
    • A non-visual LLM (e.g., qwen-long).
  2. Configure a logic condition node to allow flexible selection between the visual-enabled LLM and the non-visual LLM.

  3. Run the flow and upload an image as input.

  4. In the next round of conversation, use the logic condition to switch and respond with the non-visual LLM model.

image

✔️ Expected Behavior

The model should correctly return the response content.

❌ Actual Behavior

When the conversation history contains an image, using a non-visual model results in an error. This worked correctly in versions prior to 0.10 without any errors.

@dosubot dosubot bot added the 🐞 bug Something isn't working label Oct 24, 2024
@nguyenphan
Copy link

I am having the same issue. Prior to 0.10.0, all my workflows work perfectly fine. Now it keeps return this kind of error, even for different model (like gemini-pro or gpt-4o). It seems like the way dify handle the image and send it to the LLM changes and it bugged.

I don't think it is due to the mixture. Even with a Text-generation app have this issue.

Please look into it soon.

@hjlarry
Copy link
Contributor

hjlarry commented Oct 24, 2024

image

I can't reproduce this in the latest branch

@nguyenphan
Copy link

nguyenphan commented Oct 24, 2024

@hjlarry I can use normally with the "Upload from Computer" option, but if you try to use "Paste a link", it won't work. The issue is on both the GUI and API (transfer_method="remote_url"). I don't want to reupload the image to Dify where it is already available on my s3.

like this:
Screenshot 2024-10-22 at 5 08 12 PM

@nguyenphan
Copy link

@hjlarry please note that, when using API, the error is like this:

  • Google: [google] Server Unavailable Error, 500 An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting
  • GPT4o: [openai] Bad Request Error, Error code: 400 - {'error': {'message': 'Invalid MIME type. Only image types are supported.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_image_format'}}

This makes me suspect that, some changes in 0.10 has malform the image that got sent to the llm. I can assure you the app I made has been running flawlessly prior to 0.10

@hjlarry
Copy link
Contributor

hjlarry commented Oct 24, 2024

@nguyenphan the issue you encounterd seems not related to this issue, I think you can open another issue to decribe how to reproduct it

@Copilotes
Copy link

same bug with #9738 , right?

@nguyenphan
Copy link

@Copilotes not so sure, seems like the same issue, but per @hjlarry request, I will file another bug with more elaborated steps and details.

@nguyenphan
Copy link

@Copilotes you can check out my issue here #9776

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐞 bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants