[MDS-6105] Permit Condition Extraction Improvements #3230
Merged
Changes from 12 commits
Commits
75e89f3  [MDS-6086] Added support for async permit condition extraction using … (simensma-fresh)
e11ba7d  MDS-6086 Tweaks after PR feedback (simensma-fresh)
4c0a294  MDS-6986 Fixed tests + added github action to run permit service tests (simensma-fresh)
a742cbf  MDS-6086 Moved tests folder (simensma-fresh)
21777a3  MDS-6086 Run elasticsearch using https locally (simensma-fresh)
90e39ee  MDS-6086 Fixed celery setup to accept certs (simensma-fresh)
fb3cf22  Fixed cert job startup issue (simensma-fresh)
3e8b7b5  Fixes (simensma-fresh)
88cf462  Permit condition extraction using azure document intelligence (simensma-fresh)
5bb55d6  MDS-6086 Fixed tests (simensma-fresh)
267bb81  Added more tests, cleanup (simensma-fresh)
b9fa122  Added more tests, cleanup, plug in GPT4 to answer questions (simensma-fresh)
088b03e  MDS-6086 Added missing tests, cleanup (simensma-fresh)
92c7727  Tweak sonar-project (simensma-fresh)
6c8e26b  Update to reportPaths (simensma-fresh)
f7f1b49  Updated tests (simensma-fresh)
431d4f1  Add missing test (simensma-fresh)
3aaf28c  MDS-6086 Added tests for permit condition pipeline (simensma-fresh)
8233995  MDS-6086 Fix sonarcloud issues (simensma-fresh)
@@ -8,6 +8,7 @@
 import os
 from time import sleep
 
+import oauthlib
 from app.compare_extraction_results import validate_condition
 from dotenv import find_dotenv, load_dotenv
 from oauthlib.oauth2 import BackendApplicationClient
@@ -42,12 +43,31 @@ def authenticate_with_oauth():
     )
     return oauth_session
 
+def refresh_token(oauth_session):
+    oauth_session.fetch_token(
+        TOKEN_URL,
+        client_secret=PERMITS_CLIENT_SECRET,
+    )
+
+    return oauth_session
+
+def request(oauth_session, url, method, **kwargs):
+    try:
+        response = getattr(oauth_session, method)(url, **kwargs)
+        response.raise_for_status()
+    except oauthlib.oauth2.rfc6749.errors.TokenExpiredError:
+        print('Token expired. Refreshing token...')
+        oauth_session = refresh_token(oauth_session)
+        response = getattr(oauth_session, method)(url, **kwargs)
+        response.raise_for_status()
+
+    return response
+
 def extract_conditions_from_pdf(pdf_path, oauth_session):
     # Kick off the permit conditions extraction process
     with open(pdf_path, "rb") as pdf_file:
         files = {"file": (os.path.basename(pdf_path), pdf_file, "application/pdf")}
-        response = oauth_session.post(f"{PERMIT_SERVICE_ENDPOINT}/permit_conditions", files=files)
+        response = request(oauth_session, f"{PERMIT_SERVICE_ENDPOINT}/permit_conditions", 'post', files=files)
         response.raise_for_status()
 
     task_id = response.json().get('id')

Review comment on refresh_token(oauth_session): Added automatic refreshing of the auth token here, as it would sometimes time out because of how long the extraction process takes.
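The refresh-and-retry pattern in `request` can be tried in isolation. The sketch below is a minimal illustration, not the PR's code: `TokenExpiredError` is a local stand-in for `oauthlib.oauth2.rfc6749.errors.TokenExpiredError`, and `FakeSession` and `request_with_refresh` are hypothetical names introduced only so the example runs without a network or an OAuth server.

```python
class TokenExpiredError(Exception):
    """Local stand-in for oauthlib's TokenExpiredError (assumption)."""


class FakeSession:
    """Hypothetical session whose token stays expired until fetch_token() is called."""

    def __init__(self):
        self.token_valid = False
        self.calls = 0

    def fetch_token(self):
        # Pretend a fresh token was obtained from the token endpoint.
        self.token_valid = True

    def get(self, url, **kwargs):
        self.calls += 1
        if not self.token_valid:
            raise TokenExpiredError()
        return {"url": url, "status": "PENDING"}


def request_with_refresh(session, url, method, **kwargs):
    """Call session.<method>(url); on an expired token, refresh once and retry."""
    try:
        return getattr(session, method)(url, **kwargs)
    except TokenExpiredError:
        session.fetch_token()
        # A second expiry here propagates to the caller instead of looping.
        return getattr(session, method)(url, **kwargs)


session = FakeSession()
result = request_with_refresh(session, "https://example.invalid/status", "get")
```

Retrying exactly once keeps the wrapper from looping forever if the refreshed token is also rejected.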
@@ -60,12 +80,11 @@ def extract_conditions_from_pdf(pdf_path, oauth_session):
     # Poll the status endpoint until the task is complete
     while status not in ("SUCCESS", "FAILURE"):
         sleep(3)
-        status_response = oauth_session.get(f"{PERMIT_SERVICE_ENDPOINT}/permit_conditions/status", params={"task_id": task_id})
+        status_response = request(oauth_session, f"{PERMIT_SERVICE_ENDPOINT}/permit_conditions/status", 'get', params={"task_id": task_id})
         status_response.raise_for_status()
 
         status = status_response.json().get('status')
 
         print(json.dumps(status_response.json(), indent=2))
 
     if status != "SUCCESS":
         raise Exception(f"Failed to extract conditions from PDF. Task status: {status}")
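The polling loop above runs until the task reports `SUCCESS` or `FAILURE`. As a rough sketch of the same idea with an upper bound on waiting, the helper below takes a status callable and gives up after a deadline. The `timeout` parameter and the names `poll_until_done`, `get_status`, `sleep_fn` and `clock` are assumptions added for illustration and are not part of the PR.

```python
import time


def poll_until_done(get_status, interval=3.0, timeout=600.0,
                    sleep_fn=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("SUCCESS", "FAILURE"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Task still in status {status!r} after {timeout} seconds")
        sleep_fn(interval)


# Usage with canned statuses instead of a real status endpoint.
statuses = iter(["PENDING", "STARTED", "SUCCESS"])
final_status = poll_until_done(lambda: next(statuses), interval=0, sleep_fn=lambda _: None)
```

Passing `sleep_fn` and `clock` as parameters is only there to make the loop easy to exercise in tests without real delays.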
Just included a couple of tweaks to the report generation to include the extracted meta dict in the report HTML
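The report changes themselves are not shown in this view. As a rough idea of what rendering a metadata dict into report HTML can look like, here is a minimal sketch; `render_meta_table` and the sample key are hypothetical and say nothing about the PR's actual template or the real shape of the extracted meta dict.

```python
from html import escape


def render_meta_table(meta):
    """Render a flat dict as an HTML table, escaping keys and values."""
    rows = "".join(
        f"<tr><th>{escape(str(key))}</th><td>{escape(str(value))}</td></tr>"
        for key, value in meta.items()
    )
    return f"<table>{rows}</table>"


# Hypothetical metadata; real keys depend on the extraction output.
meta_html = render_meta_table({"permit_no": "M-1 <draft>"})
```

Escaping matters here because extracted text comes from arbitrary PDF content and may contain characters that are significant in HTML.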