🐛 Fix file-picker downstream service notification issues #3058
Codecov Report
@@ Coverage Diff @@
## master #3058 +/- ##
========================================
+ Coverage 75.0% 80.8% +5.7%
========================================
Files 715 716 +1
Lines 30849 30903 +54
Branches 4024 4032 +8
========================================
+ Hits 23166 24974 +1808
+ Misses 6820 5061 -1759
- Partials 863 868 +5
…ed issue with parallel inputs downloads
great, but I have some doubts on a few things, please have a look, we can discuss these.
@@ -170,6 +170,76 @@ async def _get_data_from_port(port: Port) -> Tuple[Port, ItemConcreteValue]:
    return (port, ret)


async def _download_files(
    target_path: Path, download_tasks: Deque[Coroutine[Any, int, Any]]
) -> Tuple[dict[str, Any], ByteSize]:
- Could you create a `TypedDict` at least, to know what is returned?
- Check if you can use `deque` instead of `Deque`.
- You can use `tuple` instead of `Tuple`.
- If you bother putting a deque here, why not just pass a `Sequence` type? I mean, the best and most secure/fast way is to pass a tuple, because then you also convey that the function will not modify the argument, plus a tuple is faster than a deque. I also doubt a bit that going from list to deque makes much of a difference, especially when downloading GBs.
- I've found out we already have something similar defined: `OutputsDict`. Will be using that.
- Sorry, I don't understand the suggestion.
- 👍
- There is no life-changing improvement. The deque is just better suited for appending data than a list. The change from list to deque brings no real-life benefit. I do not agree on the use of tuples, since they are immutable and when constructing the sequence you need to append data to it.
`deque` is usable for typing
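The two positions in this thread are compatible. Here is a minimal sketch, assuming Python 3.9+ (where `collections.deque` and the builtin `tuple` can be subscripted directly): build the collection with a deque, then hand the consumer an immutable snapshot typed as `Sequence`. The names `total_size`, `pending` and `snapshot` are made up for illustration and do not come from the PR.

```python
from collections import deque
from collections.abc import Sequence


def total_size(chunks: Sequence[int]) -> int:
    # Accepting Sequence documents that the function only reads its argument;
    # tuple, list and deque all satisfy it.
    return sum(chunks)


# Build incrementally with a deque (cheap appends) ...
pending: deque[int] = deque()
for size in (10, 20, 30):
    pending.append(size)

# ... then pass an immutable snapshot to signal "will not be modified".
snapshot: tuple[int, ...] = tuple(pending)
result = total_size(snapshot)
```

Whether the extra copy into a tuple is worth it is a judgment call; as noted in the thread, the difference is unlikely to matter next to multi-GB downloads.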
services/dynamic-sidecar/src/simcore_service_dynamic_sidecar/modules/nodeports.py
    return data, ByteSize(transferred_bytes)


@run_sequentially_in_context()
async def download_target_ports(
    port_type_name: PortTypeName, target_path: Path, port_keys: List[str]
use list[str]
I will decide that if I remove `Dict` or `List` from a function in a module from now on, I will also remove it from the entire module.
@pcrespov could we agree on this?
services/web/server/src/simcore_service_webserver/projects/projects_db.py
def _cast_outputs_store(dict_data: dict[str, Any]) -> None:
    for data in dict_data.get("outputs", {}).values():
        if "store" in data:
            data["store"] = f"{data['store']}"
Minor: actually, I think I would prefer casting to int rather than to string, especially since the day it is fixed it will be an int for sure.
That can work as well. String was selected since it is the safer type cast of the two.
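For comparison, here is a sketch of the int-casting variant suggested above. It normalizes `store` so that an old payload carrying `"0"` and a new one carrying `0` compare equal. The function name and the sample payloads are hypothetical. The trade-off raised in the reply still applies: `int(...)` raises `ValueError` on a non-numeric value, while the string cast never fails.

```python
from typing import Any


def cast_outputs_store_to_int(node: dict[str, Any]) -> None:
    # Mutates the node in place: every output carrying a "store" gets it as int.
    for output in node.get("outputs", {}).values():
        if isinstance(output, dict) and "store" in output:
            output["store"] = int(output["store"])


# Hypothetical payloads: the same output, once with a str and once with an int store.
old = {"outputs": {"outFile": {"store": "0", "path": "a/b.txt"}}}
new = {"outputs": {"outFile": {"store": 0, "path": "a/b.txt"}}}

cast_outputs_store_to_int(old)
cast_outputs_store_to_int(new)
same = old == new
```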
@@ -724,6 +811,10 @@ def _update_workbench(
    .returning(literal_column("*"))
)
project: RowProxy = await result.fetchone()

for node_update_task in nodes_update_tasks:
    fire_and_forget_task(node_update_task)
did you check what happens if you have computational services running and stuff like that? you do not end up in a loop?
Yes, I've checked, and sorry for the bad naming. This is only for frontend services, and for now the file picker is the only frontend service that will use it. I've changed the names to reflect this.
Computational services are not influenced by this.
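To illustrate the fire-and-forget pattern used in the snippet under review, here is a minimal sketch built on plain `asyncio`. It is not the `servicelib.utils.fire_and_forget_task` implementation; `fire_and_forget`, `notify_downstream` and the node ids are hypothetical stand-ins.

```python
import asyncio

# Strong references keep pending tasks from being garbage collected mid-flight.
_background_tasks: set[asyncio.Task] = set()


def fire_and_forget(coro) -> asyncio.Task:
    # Schedule the coroutine and return immediately; the caller does not await it.
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def notify_downstream(node_id: str, log: list[str]) -> None:
    # Stand-in for "tell downstream services that this node's outputs changed".
    await asyncio.sleep(0)
    log.append(node_id)


async def main() -> list[str]:
    log: list[str] = []
    tasks = [fire_and_forget(notify_downstream(n, log)) for n in ("a", "b")]
    # Awaited here only so the demo can inspect the result before the loop closes.
    await asyncio.gather(*tasks)
    return log


notified = asyncio.run(main())
```

Because nothing awaits such tasks in real use, an exception inside them does not reach the caller, which is one reason the loop question above is worth checking before merging.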
services/web/server/src/simcore_service_webserver/projects/projects_db.py
        id="different keys but missing key and outputs do not trigger",
    ),
    pytest.param(
        {"key": "simcore/services/frontend/file-picker"},
if I get that correctly now this function will be called from comp_tasks_listening_task and also from PATCH /projects right?
is there any test for the case of computational services, and/or other dynamic services? or is this purely and only for file-picker? in which case please change the name of your functions to reflect that (if not this would be confusing).
Yes, this affects `comp_tasks_listening_task` and `PATCH /projects` (aka, when the frontend saves the project).
The only affected services are frontend services. Also, for now only the file-picker is involved. I've renamed the functions to highlight this.
If it is urgent, I might let it go through, but I am concerned about the circular dependency. I will recheck later and try to help with a solution.
services/web/server/tests/unit/with_dbs/01/test_comp_tasks_listening_task.py
services/web/server/tests/unit/with_dbs/02/test_project_utils.py
# this function is called from:
# - `projects/projects_db.py`
# - `computation_comp_tasks_listening_task.py` (was originally here)
from . import projects_api
This is really not good. We should have a look at it together. The more we delay solving this the worse it becomes
Please add an issue to follow up on this and make sure there is a FIXME here
Added a FIXME and created an issue: #3069
I checked and reduced the dependency a bit, but there is still a loop that IMO reveals a design problem in the changes of your PR.
The problem I see is that you are including the plugin-level API (i.e. `projects_api`) in the implementation of the db-API repository (i.e. `projects_db`), while it should be the other way around, i.e. the plugin-level API uses the db-API.
More details on the code follow ...
Thanks for this. How did you manage to find the loop, just by visually checking?
RTFM? :-D ... jokes aside, I did it both visually and by following the code.
But there is also an option to find them automatically: `--show-cycles`
NOTE: Perhaps add an example of `--show-cycles` in the doc of the script?
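The thread does not say which script provides `--show-cycles`, so the following is not that tool. It is only a sketch of the underlying idea, finding a loop in a module import graph with a depth-first search. The graph below is illustrative; it reuses module names from this PR but does not claim to be the actual import chain.

```python
from typing import Optional


def find_import_cycle(imports: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a list of module names, or None."""
    visiting: list[str] = []  # modules on the current DFS path
    done: set[str] = set()    # modules fully explored without finding a cycle

    def visit(module: str) -> Optional[list[str]]:
        if module in visiting:
            # Back-edge: the path from the first occurrence closes the loop.
            return visiting[visiting.index(module):] + [module]
        if module in done:
            return None
        visiting.append(module)
        for dep in imports.get(module, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(module)
        return None

    for module in imports:
        cycle = visit(module)
        if cycle:
            return cycle
    return None


# Illustrative graph: "X imports Y" is written as X -> [Y].
imports = {
    "projects_api": ["projects_db"],
    "projects_db": ["projects_utils"],
    "projects_utils": ["projects_api"],
}
cycle = find_import_cycle(imports)
```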
services/web/server/src/simcore_service_webserver/projects/projects_utils.py
if "store" in data:
    data["store"] = int(data["store"])

if current_dict.get("key") == "simcore/services/frontend/file-picker":
Sorry, this implementation feels very hacky given the generic name (`find_changed_dict_keys`) you gave to the function! Make sure it is only used in your specific case.
MINOR please consider comment above
I think the comment in the function explains what the issue is. I'd guess that we need a better name for the function, since it was moved from its original module. Changing to `find_changed_node_keys`.
Please check the feedback I provided together with the PR and consider re-designing your changes in `projects/projects_db.py`. Try re-designing this logic on top of (i.e. using) the project's db-API.
The circular import is a warning that we are misusing the hierarchies of our design.
@@ -724,6 +739,10 @@ def _update_workbench(
    .returning(literal_column("*"))
)
project: RowProxy = await result.fetchone()

for frontend_node_update_task in frontend_nodes_update_tasks:
This is the main cause of the circular import discussed in https://github.com/ITISFoundation/osparc-simcore/pull/3058/files#r880401810
IMO the db-API layer for projects should know nothing about front-end nodes etc. The idea here is to create a repository pattern (SEE e.g. https://www.cosmicpython.com/book/chapter_02_repository.html) whose responsibility is accessing the database and hiding all SQLAlchemy-specific logic.
The operations you are doing here IMO belong outside `projects_db`, where the concept of e.g. a front-end node is well defined, but not here, where the only things that are defined are the columns of the table etc.
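The layering described above can be sketched as follows. This is a simplified in-memory illustration, not the PR's code: `ProjectsRepository`, `update_project` and the sample workbench are hypothetical, and a dict stands in for the database. The point is the direction of the dependency: the repository never mentions file-pickers, and the API-level function uses the repository and owns the frontend-node logic.

```python
from typing import Any

FILE_PICKER_KEY = "simcore/services/frontend/file-picker"


class ProjectsRepository:
    """db layer: only stores and loads project data, no service concepts."""

    def __init__(self) -> None:
        self._rows: dict[str, dict[str, Any]] = {}

    def get_workbench(self, project_id: str) -> dict[str, Any]:
        return dict(self._rows.get(project_id, {}))

    def update_workbench(self, project_id: str, workbench: dict[str, Any]) -> None:
        self._rows[project_id] = dict(workbench)


def update_project(
    repo: ProjectsRepository, project_id: str, workbench: dict[str, Any]
) -> list[str]:
    """API layer: persists via the repository, then decides which
    file-picker nodes changed outputs and need a downstream notification."""
    previous = repo.get_workbench(project_id)
    repo.update_workbench(project_id, workbench)
    return [
        node_id
        for node_id, node in workbench.items()
        if node.get("key") == FILE_PICKER_KEY
        and node.get("outputs") != previous.get(node_id, {}).get("outputs")
    ]


repo = ProjectsRepository()
workbench = {"n1": {"key": FILE_PICKER_KEY, "outputs": {"outFile": {"store": 0}}}}
first = update_project(repo, "p1", workbench)   # outputs are new
second = update_project(repo, "p1", workbench)  # nothing changed
```

With this split the db module has no reason to import the API module, so the import loop cannot form.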
I managed to move it outside in the end. You are correct, those functions should not have been inside `projects_db`, since it has a different objective.
👍
services/web/server/tests/unit/with_dbs/02/test_project_utils.py
Very neat, this will close some gaps. Thanks a lot for doing it!
services/web/server/src/simcore_service_webserver/projects/projects_db.py
services/web/server/src/simcore_service_webserver/projects/projects_nodes_utils.py
from models_library.projects import ProjectID
from .projects_utils import get_frontend_node_outputs_changes
from servicelib.utils import fire_and_forget_task
from . import projects_api
actually a little bit weird that a utils calls the project_api... maybe that is one part of your issue with circular dependencies
No, I've removed the circular dependency; this is OK now. This is no longer called from `projects_db` but from `projects_handlers_crud.py`. PC actually pointed out the issue.
Kudos, SonarCloud Quality Gate passed!
What do these changes do?
Related issue/s
How to test
Checklist