EVA-3696 - Processing via a scanner and new brokering method #232
Great work, also very useful for me because it just makes the orchestration more concrete and easier for me to understand.
For testing, I wouldn't mind including some integration tests against our submission-ws and Biosamples in dev. As long as we clean up the dev dbs and the tests aren't too long (or are run selectively, e.g. manually triggered or on tags only), I think it's fine.
We can also just mock the submission-ws and write unit tests. Incidentally, I think this is also easier if we use a client object as per my suggestion, since it means just one thing to mock rather than patching all over the place.
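To illustrate the "one thing to mock" point, here is a minimal sketch. The `SubmissionWsClient` class, its `get_submissions_by_status` method, the `Scanner` stand-in and the `submissionId` field are all hypothetical names made up for this example, not the actual code in this PR.

```python
from unittest.mock import Mock


class SubmissionWsClient:
    """Hypothetical client that would wrap every HTTP call to submission-ws."""

    def __init__(self, base_url, auth):
        self.base_url = base_url
        self.auth = auth

    def get_submissions_by_status(self, status):
        # A real implementation would issue an authenticated HTTP request here.
        raise NotImplementedError


class Scanner:
    """Hypothetical stand-in for the code under test; it only talks to the client."""

    def __init__(self, client):
        self.client = client

    def scan(self, statuses):
        found = []
        for status in statuses:
            found.extend(self.client.get_submissions_by_status(status))
        return found


def test_scan_uses_client():
    # One mock replaces the whole web service; no patching of individual requests calls.
    client = Mock(spec=SubmissionWsClient)
    client.get_submissions_by_status.return_value = [{'submissionId': 'sub1'}]

    scanner = Scanner(client)

    assert scanner.scan(['UPLOADED']) == [{'submissionId': 'sub1'}]
    client.get_submissions_by_status.assert_called_once_with('UPLOADED')
```

Using `Mock(spec=...)` also means the test fails if the scanner calls a method the client does not define.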
PROCESSING_STATUS = [READY_FOR_PROCESSING, FAILURE, SUCCESS, RUNNING, ON_HOLD]


def sub_ws_auth():
Do you think it's worth extracting the submission WS client into common-pyutils, so it can be used in both eva-sub-cli and eva-submission? It's some extra refactoring, but I think it could be beneficial in the long run to keep python interactions with the submission WS in one place.
Yes that would be useful. I'll create a new ticket for it.
def scan(self):
def _scan_per_status(self):
I was initially very confused about why we needed to scan both tables, before I realised that this scan is (I think) only used to add the first processing step. If that's the case, then maybe it could be a bit less generic and used only in that specific situation - even if we used it for other operations (e.g. scanning for cancelled submissions to clean up the db or something), I don't think creating a SubmissionStep for validation would make sense in those cases.
I agree that it is a bit confusing.
The way it is currently implemented, we have a generic scanner that can find submissions with different processing steps and processing statuses.
Subclasses then define what these steps and statuses are.
Only the new submission scanner was implemented, but I've added the other ones now.
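Roughly, the design looks like the sketch below. This is a simplified illustration and not the code in the PR: the in-memory lists standing in for the two tables, the dictionary keys, the `(step, status)` pair format of `step_statuses`, and the `FailedValidationScanner` subclass are assumptions made for the example.

```python
class SubmissionScanner:
    # Subclasses override these two attributes to say what they look for.
    statuses = []        # submission statuses of interest
    step_statuses = []   # assumed format: (processing step, processing status) pairs

    def __init__(self, submissions, steps):
        # Hypothetical in-memory stand-ins for the submission and step tables.
        self.submissions = submissions
        self.steps = steps

    def scan(self):
        by_status = [s['submission_id'] for s in self.submissions
                     if s['status'] in self.statuses]
        by_step = [s['submission_id'] for s in self.steps
                   if (s['step'], s['status']) in self.step_statuses]
        return by_status + by_step


class NewSubmissionScanner(SubmissionScanner):
    statuses = ['UPLOADED']
    step_statuses = []


class FailedValidationScanner(SubmissionScanner):
    # Hypothetical second scanner, only here to show the pattern.
    statuses = []
    step_statuses = [('VALIDATION', 'FAILURE')]
```

With this layout, each scanner is just a declaration of which statuses it cares about, and the query logic lives once in the base class.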
I get that, I just don't see what other Scanners would use _scan_for_new_per_submission_status besides the new submission scanner, so maybe that logic could be exclusive to that scanner. It's not a big deal though, thanks for adding the other scanners too.
pretty_print(header, lines)


class NewSubmissionScanner(SubmissionScanner):

    statuses = ['UPLOADED']
    step_statuses = []
This is mostly a matter of taste, but I think I would prefer the different scanning tasks as methods rather than classes. So we would have just one SubmissionScanner with the generic _scan_per_step_status helper method, and find_new_submissions, find_completed_submission_steps, etc. that call that method with the relevant statuses.
On the other hand, maybe there's more functionality that would go into these subclasses that I'm not thinking of, in which case having the extra classes makes sense.
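As a rough sketch of what I mean by methods rather than classes: the row layout, the helper's keyword arguments, and the 'PROCESSING', 'VALIDATION' and 'SUCCESS' values used below are assumptions for illustration only, and the real helper would query the database instead of filtering a list.

```python
class SubmissionScanner:
    def __init__(self, rows):
        # Hypothetical in-memory stand-in for the query results. Each row is a dict
        # with 'submission_id', 'status', 'step' and 'step_status' (assumed layout).
        self.rows = rows

    def _scan_per_step_status(self, statuses=(), step_statuses=()):
        # Generic helper: match on submission status or on (step, step status) pairs.
        return [
            row['submission_id'] for row in self.rows
            if row['status'] in statuses
            or (row['step'], row['step_status']) in step_statuses
        ]

    def find_new_submissions(self):
        return self._scan_per_step_status(statuses=['UPLOADED'])

    def find_completed_submission_steps(self):
        return self._scan_per_step_status(step_statuses=[('VALIDATION', 'SUCCESS')])
```

Each scanning task is then a one-line method, and anything specific to new submissions (such as creating the first step) can live in `find_new_submissions` alone.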
That definitely could be refactored that way. I guess I was worried that this would make the SubmissionScanner class too big, but that might not be a valid concern.
No description provided.