-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Bugfix: properly handle timeout when copying project #2655
🐛 Bugfix: properly handle timeout when copying project #2655
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2655 +/- ##
========================================
- Coverage 78.1% 77.6% -0.6%
========================================
Files 635 635
Lines 25918 25944 +26
Branches 2507 2513 +6
========================================
- Hits 20259 20145 -114
- Misses 5000 5140 +140
Partials 659 659
Flags with carried forward coverage won't be shown. Click here to find out more.
|
services/web/server/src/simcore_service_webserver/projects/projects_utils.py
Outdated
Show resolved
Hide resolved
66f964f
to
c76c7bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more to come
db = app[APP_PROJECT_DBAPI] | ||
await director_v2_api.delete_pipeline(app, user_id, UUID(project_uuid)) | ||
await db.delete_user_project(user_id, project_uuid) | ||
# requests storage to delete all project's stored data | ||
await delete_data_folders_of_project(app, project_uuid, user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GitHK please notice this change since it points one of the main source of the problem
- First, the function ends with
_from_db
and this call triggers the storage API (which takes very long to repond). - Second, the same function was again called via
delete_project_data
in delete_project ... which means that storage gets two heavy delete calls
@@ -6887,6 +6887,12 @@ paths: | |||
schema: | |||
type: string | |||
description: 'Option to create a template from existing project: as_template={study_uuid}' | |||
- name: copy_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this option should help implementing ITISFoundation/osparc-issues#560
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is a first step
async def update_project_without_enforcing_checks( | ||
self, project_data: Dict, project_uuid: str | ||
async def update_project_without_checking_permissions( | ||
self, project_data: Dict, project_uuid: str, hidden: Optional[bool] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MINOR:
- for readability on the caller, i would enforce using key in opptionals, i.e.
- Don't we have a ProjectDict somewhere?
async def update_project_without_checking_permissions(
self,
project_data: ProjectDict,
project_uuid: ProjectIDStr,
*,
hidden: Optional[bool] = None,
):
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
services/web/server/src/simcore_service_webserver/projects/projects_db.py
Show resolved
Hide resolved
prj_states: ProjectState = await get_project_states_for_user( | ||
user_id, project_uuid, app | ||
) | ||
log.error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIP: for errors (specially those we need to debug with support) we need good info. Having variable names is more handy and easily achievable using new f-strings in 3.8
log.error(
"Project %s already locked in state %s. Please check with support.",
f"{project_uuid=}",
f"{prj_states.locked.status=}",
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm not sure I see the point here. I will get the name of the variable on top?
well I think the message already gives it out right? But I keep that formating for later. I find the f-string on top of the '%s' thing a bit much. looking forward to passing f-strings directly into the log message instead.
services/web/server/src/simcore_service_webserver/projects/projects_api.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 great fix. added some comments/thoughts around it that hope are useful. I added @GitHK
and you on some comments to underline some fixes so avoid these issues in future designs
thx! :-)
|
||
await clone_data_coro | ||
else: | ||
await clone_data_coro |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QUESTION? why dont' you lock with notification when not as_template
? I guess this is creating a normal project so the user in principle does not see in the UI the thumbnail of the dashboard. Nonetheless, logically it should also lock? what would happen in one day the GUI changes? that will lead to an inconsistent state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the lock is here to prevent people from modifying a project while it is accessed.
in the case of from_template
we copy from a template (which is a read-only snapshot per se) to a new study. Per construction the template cannot be modified, therefore there is no need to lock it.
Now that I think of it, we could actually lock the new study until it is complete... maybe something to think about.
else: | ||
await clone_data_coro | ||
# unhide the project if needed since it is now complete | ||
if not new_project_was_hidden_before_data_was_copied: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THOUGHT: perhaps features like these should be better managed with some kind of guards implemented with context managers. Upon enter we can store some fields that can be ensured upon exit ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed.. here I felt a bit bad... the function did already too much before, but I was a bit reluctant of taking the time to rework it completely.
else: | ||
raise web.HTTPCreated(text=json.dumps(project), content_type="application/json") | ||
log.debug("project created successfuly") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIP: debug messages should be content richer and not only plain text. Use the trick i mentioned above with f-strings to display the value of variables in the log e.g.
log.debug("project %s for %s created successfuly with %s",
f"{project_id=}",
f"{user_id=}",
f"{as_template=}",
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question.. what if the project_id does not exist?
@@ -339,7 +371,7 @@ async def replace_project(request: web.Request): | |||
) | |||
|
|||
try: | |||
projects_api.validate_project(request.app, new_project) | |||
await projects_api.validate_project(request.app, new_project) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THOUGHT: did you by any chance check the speed improvement here? I guess the biggest will happen when listing projects. I think there is a validation before responding ... BTW did you change it tehre as well? That would definitively make a difference when e.g.e the test users gets too many failing studies (have just removed around 300+ in staging today)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm.. not sure we still validate there
services/web/server/src/simcore_service_webserver/projects/projects_utils.py
Outdated
Show resolved
Hide resolved
@@ -563,7 +568,7 @@ async def remove_all_projects_for_user(app: web.Application, user_id: int) -> No | |||
"The project can be removed as is not shared with write access with other users" | |||
) | |||
try: | |||
await delete_project_from_db(app, project_uuid, user_id) | |||
await delete_project(app, project_uuid, user_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GitHK @sanderegg the new naming makes clear that the entire project is now deleted
9fdfcaa
to
4f937d6
Compare
What do these changes do?
When creating a project FROM a TEMPLATE or AS a TEMPLATE FROM a Study, the webserver does the following:
In case 2. fails or times-out (due to the webserver or more often due to the webbrowser timing out), the project is still listed for the user, although it is not valid and in an undefined state.
This PR aims to fix this issue by creating the new project as hidden by default UNTIL all the data was completely copied. In case of time out or of disconnection, the handler is cancelled and that cancellation is now handled. This will in turn delete the project, thus cleaning up.
As users still need something akin to copying only the skeleton an additional optional query parameter has been added to the creation project endpoint such as:
POST /projects now takes as query parameters the following:
Bonus:
now lock the study when creating a template
Related issue/s
How to test
Checklist