feat: add support for multimodal in Vertex #1338
@ArthurGoupil This is awesome. Will the same multimodal principles apply to OpenAI endpoints too? I am looking to develop multimodal capability for OpenAI endpoints.
Hi @adhishthite, I think you can reuse the generic parts of the implementation in #1021, but the final data structure (with |
Hi, thanks for the contribution, looks great!
Would it be possible to also update the docs here to show how to use multimodal models with Vertex?
Other than that LGTM!
Tested locally, it's working, I'm able to upload an image and the model is using it to answer.
README.md
Outdated
"maxWidth": 2000,
"maxHeight": number;
}
}"
Suggested change:
- }"
+ }
Hi @ArthurGoupil, I'm not super familiar with the Vertex endpoint, but wdyt of the comments on the review from @pocman? Feel free to approve or not and then I'll merge this 😄
Hi @nsarrazin, yeah, I need to apply them once I have some time, probably in September :/
@@ -73,7 +92,8 @@ export function endpointVertex(input: z.input<typeof endpointVertexParametersSch
     stopSequences: parameters?.stop,
     temperature: parameters?.temperature ?? 1,
   },
-  tools,
+  // tools and multimodal are mutually exclusive
+  tools: !multimodal ? tools : undefined,
The condition should not be on `!multimodal` (otherwise a multimodal model will never use the other tools, even when no picture is provided) but on `files.length == 0`.
If no files are provided for this specific query to Vertex, then tools can be provided.
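The condition suggested in this review comment could look roughly like the sketch below. It is only an illustration: the `Tool` and `MessageFile` types and the `selectTools` helper are hypothetical names, not taken from the chat-ui codebase, and the real endpoint may collect files differently.

```typescript
// Hypothetical, simplified types for illustration only.
type Tool = { name: string };
type MessageFile = { mime: string; value: string };

// Decide which tools to send with a single request.
// Assumption: tools and image inputs are mutually exclusive per request,
// so the check is on this request's files, not on the model's multimodal flag.
function selectTools(
  tools: Tool[] | undefined,
  files: MessageFile[]
): Tool[] | undefined {
  return files.length === 0 ? tools : undefined;
}
```

With this shape, a multimodal model still receives its tools whenever the current query carries no files, and they are dropped only for queries that include at least one file.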
@nsarrazin I believe this is ready to be merged. Can you have a look?
Thanks a lot @ArthurGoupil @pocman 🔥 Merging this now
* feat: add support for multimodal in Vertex
* Nit changes and remove tools if multimodal
* revert model name change
* Fix tools/multimodal condition
* chores(lint): fix formatting

---------

Co-authored-by: Thomas <thomas.poc@gmail.com>
Co-authored-by: Nathan Sarrazin <sarrazin.nathan@gmail.com>
Following #1021 (comment)
See https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/image-understanding#supported_models