feat: add simple pdf extraction service #1070

tangopapatime · 2023-05-04T18:57:59Z

Description

Added a local PDF extraction service written in Python/Flask.
Added the tera-drag-n-drop component to the project as a v0 of the setup for user uploads.

Resolves #1020

import_resource.webm

tangopapatime · 2023-05-04T18:59:13Z

I haven't yet resolved the multipart pass-through issue, but I have found a workaround for now. I need to add additional documentation as well.

packages/client/hmi-client/src/types/common.ts

tangopapatime · 2023-05-04T19:43:25Z

The major outstanding follow-up task is to integrate this service as part of the orchestration, especially if we actually want this to work on staging. It's definitely not my forte, but I can give it a shot if someone can outline the major steps so that I don't forget anything.

dvince2

Java side looks good, just a few small comments to address before this is finalized. I took a quick look at the front-end stuff, but someone else may want to comment there, and I'm not sure about the Python 😂

...are/uncharted/terarium/hmiserver/proxies/pdfextractionservice/PDFExtractionServiceProxy.java

...i-server/src/main/java/software/uncharted/terarium/hmiserver/resources/DownloadResource.java

tangopapatime · 2023-05-09T17:15:24Z

I still need a check, @YohannParis. Take a peek at this and see if it's a-ok.

YohannParis · 2023-05-09T17:31:44Z

> I still need a check, @YohannParis. Take a peek at this and see if it's a-ok.

After a careful review of those 1000+ lines of code, I approve.

Thanh Pham added 6 commits April 30, 2023 19:37

added PDF extraction - first commit

f0c39d3

Merge branch 'main' into 1020-extraction-text-layer-from-pdf

5a4aafa

hooked up extraction to code->model

e759f5e

Merge branch 'main' into 1020-extraction-text-layer-from-pdf

6092735

refactor

f7aced9

Merge branch 'main' into 1020-extraction-text-layer-from-pdf

c6d8a7c

tangopapatime requested review from dgauldie, dvince2, YohannParis and mwdchang as code owners May 4, 2023 18:58

tangopapatime linked an issue May 4, 2023 that may be closed by this pull request

extraction text layer from PDF #1020

Closed

tangopapatime requested a review from chris-dickson as a code owner May 4, 2023 18:58

tangopapatime changed the title ~~feat: Add simple pdf extraction service~~ feat: add simple pdf extraction service May 4, 2023

oops

64b4af9

dvince2 reviewed May 4, 2023

packages/client/hmi-client/src/types/common.ts

dvince2 approved these changes May 4, 2023

tangopapatime mentioned this pull request May 5, 2023

[TASK]: Add the pdf extraction service to the orchestration #1074

Closed

2 tasks

Thanh Pham added 2 commits May 5, 2023 10:44

replace underscores with dashes

472c566

URL

5b5d010

tangopapatime requested review from Tom-Szendrey and echl May 5, 2023 16:19

missed one

4a72f61

tangopapatime requested review from YohannParis and removed request for YohannParis May 9, 2023 11:25

YohannParis approved these changes May 9, 2023

added hmi-extraction-server to the hcl file

524b3ef

tangopapatime merged commit ef1c38c into main May 9, 2023

tangopapatime deleted the 1020-extraction-text-layer-from-pdf branch May 9, 2023 19:03
