You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I searched existing ideas and did not find a similar one
I added a very descriptive title
I've clearly described the feature request and motivation for it
Feature request
I propose adding native support for reading PDF files in the Anthropic and Gemini models via their respective APIs (Anthropic API and Vertex AI). This feature would allow users to upload a PDF file directly for processing, enabling the models to extract both text and visual elements, such as images.
Expected functionality:
Direct upload of a PDF file to the API.
Automated processing of the PDF content, including both text and images.
Structured output containing the extracted text, metadata, and visual details.
Motivation
The ability to work with PDFs natively is essential for a wide range of use cases, including legal document analysis, technical reports, academic studies, and any context involving a combination of text and images.
Currently, users need to preprocess PDFs manually before sending them to the models, which adds complexity, time, and potential errors to the workflow. Implementing native support would streamline the process, improve efficiency, and enhance the versatility of the APIs.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
Checked
Feature request
I propose adding native support for reading PDF files in the Anthropic and Gemini models via their respective APIs (Anthropic API and Vertex AI). This feature would allow users to upload a PDF file directly for processing, enabling the models to extract both text and visual elements, such as images.
Expected functionality:
Motivation
The ability to work with PDFs natively is essential for a wide range of use cases, including legal document analysis, technical reports, academic studies, and any context involving a combination of text and images.
Currently, users need to preprocess PDFs manually before sending them to the models, which adds complexity, time, and potential errors to the workflow. Implementing native support would streamline the process, improve efficiency, and enhance the versatility of the APIs.
Proposal (If applicable)
No response
Beta Was this translation helpful? Give feedback.
All reactions