Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce Editor Lock for Write, Append, and Upload Operations in DataNode #2122

Closed
wants to merge 7 commits into from
Closed
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 15 additions & 3 deletions taipy/core/data/_file_datanode_mixin.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
from taipy.common.logger._taipy_logger import _TaipyLogger

from .._entity._reload import _self_reload
from ..exceptions import DataNodeIsBeingEdited
from ..reason import InvalidUploadFile, NoFileToDownload, NotAFile, ReasonCollection, UploadFileCanNotBeRead
from .data_node import DataNode
from .data_node_id import Edit
Expand Down Expand Up @@ -100,7 +101,15 @@

return ""

def _upload(self, path: str, upload_checker: Optional[Callable[[str, Any], bool]] = None) -> ReasonCollection:
def _upload(
self,
data: Optional[Any] = None,
path: str = "",
upload_checker: Optional[Callable[[str, Any], bool]] = None,
job_id: Optional["JobId"] = None,

Check failure on line 109 in taipy/core/data/_file_datanode_mixin.py

View workflow job for this annotation

GitHub Actions / partial-tests / linter

Name "JobId" is not defined [name-defined]
editor_id: Optional[str] = None,
**kwargs: Dict[str, Any]
) -> None:
Comment on lines +104 to +112
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this change.

You only need one new argument: an optional editor_id. In particular, we don't want data or job_id. We must also return a reason collection!

Why not add a kwargs parameter to pass it to the track edit method, indeed. But this is out of the scope of the issue.

Suggested change
def _upload(
self,
data: Optional[Any] = None,
path: str = "",
upload_checker: Optional[Callable[[str, Any], bool]] = None,
job_id: Optional["JobId"] = None,
editor_id: Optional[str] = None,
**kwargs: Dict[str, Any]
) -> None:
def _upload(self, path: str, upload_checker: Optional[Callable[[str, Any], bool]] = None, **kwargs: Dict[str, Any]) -> ReasonCollection:

"""Upload a file data to the data node.

Arguments:
Expand All @@ -112,6 +121,9 @@
Returns:
True if the upload was successful, otherwise False.
"""
if self.edit_in_progress and self.editor_id != editor_id:
raise DataNodeIsBeingEdited(self.id, self.editor_id)

Comment on lines +124 to +126
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of raising an exception, you should build a specific Reason and add it to the reason_collection, as it is done below in the code for instance. You can re-use the DataNodeEditInProgress reason class.

I would also move this code after the creation of the reason_collection variable:
reason_collection = ReasonCollection()

from ._data_manager_factory import _DataManagerFactory

reason_collection = ReasonCollection()
Expand All @@ -124,7 +136,7 @@
self.__logger.error(f"Error while uploading {upload_path.name} to data node {self.id}:") # type: ignore[attr-defined]
self.__logger.error(f"Error: {err}")
reason_collection._add_reason(self.id, UploadFileCanNotBeRead(upload_path.name, self.id)) # type: ignore[attr-defined]
return reason_collection

Check failure on line 139 in taipy/core/data/_file_datanode_mixin.py

View workflow job for this annotation

GitHub Actions / partial-tests / linter

No return value expected [return-value]

if upload_checker is not None:
try:
Expand All @@ -138,15 +150,15 @@

if not can_upload:
reason_collection._add_reason(self.id, InvalidUploadFile(upload_path.name, self.id)) # type: ignore[attr-defined]
return reason_collection

Check failure on line 153 in taipy/core/data/_file_datanode_mixin.py

View workflow job for this annotation

GitHub Actions / partial-tests / linter

No return value expected [return-value]

shutil.copy(upload_path, self.path)

self.track_edit(timestamp=datetime.now()) # type: ignore[attr-defined]
self._write(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _upload method should not write data directly like that. It is designed to replace the data of a File data node with a new file without passing through the write mechanism.

Please revert this change.

self.track_edit(job_id=job_id, editor_id=editor_id, **kwargs) # type: ignore[attr-defined]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.track_edit(job_id=job_id, editor_id=editor_id, **kwargs) # type: ignore[attr-defined]
self.track_edit(timestamp=datetime.now(), editor_id=editor_id, **kwargs) # type: ignore[attr-defined]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THere is no possible job_id. This cannot be called by the orchestrator.

self.unlock_edit() # type: ignore[attr-defined]
_DataManagerFactory._build_manager()._set(self) # type: ignore[arg-type]

return reason_collection

Check failure on line 161 in taipy/core/data/_file_datanode_mixin.py

View workflow job for this annotation

GitHub Actions / partial-tests / linter

No return value expected [return-value]

def _read_from_path(self, path: Optional[str] = None, **read_kwargs) -> Any:
raise NotImplementedError
Expand Down
25 changes: 19 additions & 6 deletions taipy/core/data/data_node.py
Original file line number Diff line number Diff line change
Expand Up @@ -415,7 +415,7 @@ def read(self) -> Any:
)
return None

def append(self, data, job_id: Optional[JobId] = None, **kwargs: Dict[str, Any]):
def append(self, data, job_id: Optional[JobId] = None, editor_id: Optional[str] = None, **kwargs: Dict[str, Any]):
"""Append some data to this data node.

Arguments:
Expand All @@ -426,12 +426,14 @@ def append(self, data, job_id: Optional[JobId] = None, **kwargs: Dict[str, Any])
"""
from ._data_manager_factory import _DataManagerFactory

if self.edit_in_progress and self.editor_id != editor_id:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on why you did not use the same if condition as the write function?
What if editor_id is None?

raise DataNodeIsBeingEdited(self.id, self.editor_id)

self._append(data)
self.track_edit(job_id=job_id, **kwargs)
self.unlock_edit()
jrobinAV marked this conversation as resolved.
Show resolved Hide resolved
self.track_edit(job_id=job_id, editor_id=editor_id, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.track_edit(job_id=job_id, editor_id=editor_id, **kwargs)
self.track_edit(job_id=job_id, editor_id=editor_id, **kwargs)
self.unlock_edit()

_DataManagerFactory._build_manager()._set(self)

def write(self, data, job_id: Optional[JobId] = None, **kwargs: Dict[str, Any]):
def write(self, data, job_id: Optional[JobId] = None, editor_id: Optional[str] = None, **kwargs: Dict[str, Any]):
"""Write some data to this data node.

Arguments:
Expand All @@ -442,9 +444,16 @@ def write(self, data, job_id: Optional[JobId] = None, **kwargs: Dict[str, Any]):
"""
from ._data_manager_factory import _DataManagerFactory

if self.edit_in_progress and editor_id and self.editor_id != editor_id:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this covers the case where the data node is already locked by a different editor than the one attempting to write it. We also need to check that the self.editor_expiration_date (if it exists) is not past.

raise DataNodeIsBeingEdited(self.id, self.editor_id)

if not editor_id and self.edit_in_progress:
print("Orchestrator writing without editor_id")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This covers the case where the Orchestrator attempts to write a data node locked by an editor. We can eventually add an info log using the _logger attribute, but we don't want any print in the production code. Please remove it.


self._write(data)
self.track_edit(job_id=job_id, **kwargs)
self.unlock_edit()
Comment on lines -452 to -453
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PLease revert these two lines.

self.last_edit_date = datetime.now()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last_edit_date setting is already done in the track_edit method called right after. Please revert this line.

self.edit_in_progress = False # Ensure it's not locked after writing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unlock edit that you removed should be reverted. It already sets the self.edit_in_progress to False.

self.track_edit(job_id=job_id, editor_id=editor_id, **kwargs)
_DataManagerFactory._build_manager()._set(self)

def track_edit(self, **options):
Expand All @@ -471,6 +480,10 @@ def lock_edit(self, editor_id: Optional[str] = None):
Arguments:
editor_id (Optional[str]): The editor's identifier.
"""

if self._editor_expiration_date and datetime.now() > self._editor_expiration_date:
self.unlock_edit()

Comment on lines +505 to +508
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what case we want to cover here. For me, the next condition already covers this case. Can you elaborate? Otherwise, we should revert it.

if editor_id:
if (
self.edit_in_progress
Expand Down
12 changes: 11 additions & 1 deletion taipy/core/data/in_memory.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@
from typing import Any, Dict, List, Optional, Set

from taipy.common.config.common.scope import Scope

from .._version._version_manager_factory import _VersionManagerFactory
from .data_node import DataNode
from .data_node_id import DataNodeId, Edit
Expand Down Expand Up @@ -98,3 +97,14 @@ def _read(self):

def _write(self, data):
in_memory_storage[self.id] = data

def _append(self, data):
"""Append data to the existing data in the in-memory storage."""
if self.id not in in_memory_storage:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.id must not be None. you can remove it from the condition.

in_memory_storage[self.id] = []

if not isinstance(in_memory_storage[self.id], list):
in_memory_storage[self.id] = [in_memory_storage[self.id]]
Comment on lines +106 to +107
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If in_memory_storage[self.id] is not a list, a NotImplementedError should be raised.


in_memory_storage[self.id].append(data) # Append new data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the previous data of the data node has no append method.
In other words, what if in_memory_storage[self.id] has no append method?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I added a if instance condition to resolve that. Please review it and let me know if any changes required.


30 changes: 30 additions & 0 deletions tests/core/data/test_data_node.py
Original file line number Diff line number Diff line change
Expand Up @@ -745,3 +745,33 @@ def test_change_data_node_name(self):
# This new syntax will be the only one allowed: https://github.com/Avaiga/taipy-core/issues/806
dn.properties["name"] = "baz"
assert dn.name == "baz"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we are missing a few test cases here:

  • An editor editor_id_1 locked the data node, but the expiration date passed. Another "manual" editor, editor_id_2, should succeed in writing the data node.
  • An editor editor_id_1 locked the data node, but the expiration date passed. Another "manual" editor, editor_id_2, should succeed in appending the data node.
  • The orchestrator locks the data node. A "manual" editor should fail to write the data node.
  • The orchestrator locks the data node. The Orchestrator should succeed in writing the data node.
    The orchestrator locks the data node. A "manual" editor should fail to append it.

def test_locked_data_node_write_should_fail_with_wrong_editor(self):
dn = InMemoryDataNode("dn", Scope.SCENARIO)
dn.lock_edit("editor_1")

# Should raise exception for wrong editor
with pytest.raises(DataNodeIsBeingEdited):
dn.write("data", editor_id="editor_2")

# Should succeed with correct editor
dn.write("data", editor_id="editor_1")
assert dn.read() == "data"

def test_locked_data_node_append_should_fail_with_wrong_editor(self):
dn = InMemoryDataNode("dn", Scope.SCENARIO)
dn.lock_edit("editor_1")

with pytest.raises(DataNodeIsBeingEdited):
dn.append("data", editor_id="editor_2")

dn.append("data", editor_id="editor_1")
assert dn.read() == ["data"]

def test_orchestrator_write_without_editor_id(self):
dn = InMemoryDataNode("dn", Scope.SCENARIO)
dn.lock_edit("editor_1")

# Orchestrator write without editor_id should succeed
dn.write("orchestrator_data")
assert dn.read() == "orchestrator_data"
Loading