added fitz and unstructuredio pdf loader py evadb script #1340

Open
wants to merge 2 commits into
base: staging
from

Conversation

seansru

@seansru seansru commented Nov 4, 2023

No description provided.

Contributor

@github-actions github-actions bot left a comment

👋 Hello @Stru17, thanks for submitting an EVA DB PR 🙏 To allow your work to be integrated as seamlessly as possible, we advise you to:

  • ✅ Verify that your PR is up-to-date with georgia-tech-db/eva master branch. If your PR is behind you can update your code by clicking the 'Update branch' button or by running git pull and git merge master locally.
  • ✅ Verify that all EVA DB Continuous Integration (CI) checks are passing.
  • ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition.

@xzdandy
Collaborator

xzdandy commented Nov 6, 2023

Hi @Stru17, how should the PDFReader be used by EvaDB?

@jarulraj
Member

jarulraj commented Nov 6, 2023

The idea is to add an additional PDF reader backed by Unstructured IO, not to replace the default PDF reader.

@jarulraj
Member

jarulraj commented Nov 7, 2023

@Stru17 The output and structure of this UnstructuredIOPDFReader class should match that of the original PDFReader class. Currently, it is more of a Python script.
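
For illustration only, here is a minimal sketch of what "a class rather than a script" could look like. It is not the code from this PR. It assumes the default PDFReader yields one dict per paragraph with the keys `page`, `paragraph`, and `data`; those names should be checked against the real class. `Element` and `fake_partition` are hypothetical stand-ins for whatever the Unstructured IO extraction step returns, so the example runs without any PDF library installed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, Optional


@dataclass
class Element:
    """Hypothetical stand-in for one extracted text element."""

    text: str
    page_number: Optional[int] = None


class UnstructuredIOPDFReader:
    """Sketch of a reader that yields rows instead of printing them.

    Assumption: the default PDFReader yields dicts with the keys
    "page", "paragraph", and "data". Verify against the real class.
    """

    def __init__(
        self,
        file_url: str,
        partition: Callable[[str], Iterable[Element]],
    ) -> None:
        self.file_url = file_url
        # The extraction backend is injected so it can be swapped or stubbed.
        self._partition = partition

    def _read(self) -> Iterator[Dict]:
        current_page = None
        paragraph_no = 0
        for element in self._partition(self.file_url):
            text = element.text.strip()
            if not text:
                continue  # skip whitespace-only elements
            # Treat a missing page number as page 1 (an assumption).
            page_no = element.page_number or 1
            if page_no != current_page:
                # Restart paragraph numbering on each new page.
                current_page = page_no
                paragraph_no = 0
            paragraph_no += 1
            yield {"page": page_no, "paragraph": paragraph_no, "data": text}


def fake_partition(path: str) -> Iterable[Element]:
    """Hypothetical extraction backend used only for this example."""
    return [
        Element("Title", 1),
        Element("   ", 1),
        Element("Body text.", 1),
        Element("Next page.", 2),
    ]


reader = UnstructuredIOPDFReader("example.pdf", fake_partition)
rows = list(reader._read())
```

Keeping the extraction call behind a single injected function means the row shape stays the same regardless of which library produces the text, which is the property requested above.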

@seansru
Author

seansru commented Nov 20, 2023

I have provided an updated script. Please take a look and let me know whether I should change or update it further. I have also messaged the professor on Slack and would appreciate a response as soon as possible!

4 participants