Native Support for PDF Reading in Anthropic and Gemini Models (API and Vertex) #7243

Abel1011 · 2024-11-22T19:08:47Z

Abel1011
Nov 22, 2024

Checked

I searched existing ideas and did not find a similar one
I added a very descriptive title
I've clearly described the feature request and motivation for it

Feature request

I propose adding native support for reading PDF files in the Anthropic and Gemini models via their respective APIs (Anthropic API and Vertex AI). This feature would allow users to upload a PDF file directly for processing, enabling the models to extract both text and visual elements, such as images.

Expected functionality:

Direct upload of a PDF file to the API.
Automated processing of the PDF content, including both text and images.
Structured output containing the extracted text, metadata, and visual details.

Motivation

The ability to work with PDFs natively is essential for a wide range of use cases, including legal document analysis, technical reports, academic studies, and any context involving a combination of text and images.

Currently, users need to preprocess PDFs manually before sending them to the models, which adds complexity, time, and potential errors to the workflow. Implementing native support would streamline the process, improve efficiency, and enhance the versatility of the APIs.

Proposal (If applicable)

No response

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Native Support for PDF Reading in Anthropic and Gemini Models (API and Vertex) #7243

{{title}}

Replies: 0 comments

Select a reply

Native Support for PDF Reading in Anthropic and Gemini Models (API and Vertex) #7243

Abel1011 Nov 22, 2024

Checked

Feature request

Motivation

Proposal (If applicable)

Replies: 0 comments

Abel1011
Nov 22, 2024