feat: add simple pdf extraction service #1070

Merged: 11 commits, May 9, 2023

tangopapatime
Contributor

Description

Added a local PDF extraction service in Python/Flask.
Added the tera-drag-n-drop component to the project as a v0 of the setup for user uploads.

Resolves #1020

import_resource.webm
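
For reviewers less familiar with the Python side, a service of this kind can be sketched roughly as below. This is not the code from this PR: the `/extract` route, the `file` form field, the `create_app` factory, and the use of the `pypdf` package are all assumptions made for illustration. The extractor is passed in as a parameter so the route can be exercised without a real PDF.

```python
from typing import BinaryIO, Callable, List

from flask import Flask, jsonify, request


def extract_pages_with_pypdf(stream: BinaryIO) -> List[str]:
    """Return the text layer of each page (assumes the pypdf package is installed)."""
    # Imported lazily so the app can be created even when pypdf is absent.
    from pypdf import PdfReader

    reader = PdfReader(stream)
    # Scanned pages without a text layer yield an empty string.
    return [page.extract_text() or "" for page in reader.pages]


def create_app(
    extract_pages: Callable[[BinaryIO], List[str]] = extract_pages_with_pypdf,
) -> Flask:
    """Build a Flask app exposing one upload endpoint (hypothetical layout)."""
    app = Flask(__name__)

    @app.route("/extract", methods=["POST"])  # hypothetical route name
    def extract():
        uploaded = request.files.get("file")  # hypothetical form field name
        if uploaded is None:
            return jsonify(error="expected a multipart form field named 'file'"), 400
        try:
            pages = extract_pages(uploaded.stream)
        except Exception as exc:  # malformed or unreadable upload
            return jsonify(error=f"could not read PDF: {exc}"), 422
        return jsonify(filename=uploaded.filename, pages=pages)

    return app
```

Under these assumptions, `create_app().run(port=5000)` would start the service locally, and a client would POST the PDF as multipart form data to `/extract`.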

@tangopapatime tangopapatime linked an issue May 4, 2023 that may be closed by this pull request
@tangopapatime
Contributor Author

I haven't yet resolved the multipart pass-through issue, but I have found a workaround for now. I also need to add more documentation.

@tangopapatime tangopapatime changed the title feat: Add simple pdf extraction service feat: add simple pdf extraction service May 4, 2023
@tangopapatime
Contributor Author

The major outstanding follow-up task is to integrate this service into the orchestration, especially if we actually want this to work on staging. It's definitely not my forte, but I can give it a shot if someone can outline the major steps so that I don't forget anything.

Collaborator

@dvince2 dvince2 left a comment

The Java side looks good, with just a few small comments to address before this is finalized. I took a quick look at the front-end changes, but someone else may want to comment there, and I'm not sure about the Python 😂

@tangopapatime tangopapatime requested review from Tom-Szendrey and echl May 5, 2023 16:19
@tangopapatime tangopapatime requested review from YohannParis and removed request for YohannParis May 9, 2023 11:25
@tangopapatime
Contributor Author

I still need a check, @YohannParis. Take a peek at this and see if it's a-ok.

@YohannParis
Member

> I still need a check, @YohannParis. Take a peek at this and see if it's a-ok.

After a careful review of those 1000+ lines of code, I approve.

@tangopapatime tangopapatime merged commit ef1c38c into main May 9, 2023
@tangopapatime tangopapatime deleted the 1020-extraction-text-layer-from-pdf branch May 9, 2023 19:03
Linked issue: extraction text layer from PDF
3 participants