Add two classes for HasHDF/HasDict compat #1364

pmrv · 2024-03-02T07:51:13Z

and outline transition path in docstrings there.

pmrv · 2024-03-02T08:03:35Z

@jan-janssen and I discussed something like this last week. The problem statement is essentially this: we want everything to go through to_dict, but we don't want to convert all to_hdf implementations at once. Some code also still expects to be able to call to_hdf, and again we don't want to change all of those call sites at once. So this introduces two mixins that can translate implementers of HasHDF and HasDict into each other. If we use them, we can slowly phase out the to_hdf implementations by manually converting them to HasDict over time. Once all objects are converted, we deprecate HasHDFfromDict.to_hdf to show us what we need to update in the surrounding code, and when that is done we remove both classes again.

Tests will need to follow and it's not yet used by any objects, but I want to discuss this plan and the design first.
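To make the plan above concrete, here is a minimal sketch of the two-mixin idea. It is not the code from this PR: the HasDict and HasHDF classes below are toy stand-ins, a plain Python dict plays the role of the HDF group, and the Converted and Legacy classes are hypothetical examples.

```python
from abc import ABC, abstractmethod


class HasDict(ABC):
    """Toy stand-in: objects that serialize themselves to a plain dict."""

    @abstractmethod
    def to_dict(self) -> dict: ...

    @abstractmethod
    def from_dict(self, obj_dict: dict) -> None: ...


class HasHDF(ABC):
    """Toy stand-in: objects that write themselves to an HDF-like group.

    Assumption: a plain dict is used in place of a real HDF group.
    """

    @abstractmethod
    def to_hdf(self, hdf: dict) -> None: ...

    @abstractmethod
    def from_hdf(self, hdf: dict) -> None: ...


class HasHDFfromDict(HasHDF, HasDict):
    """For classes already converted to to_dict: derive to_hdf from it."""

    def to_hdf(self, hdf: dict) -> None:
        hdf.update(self.to_dict())

    def from_hdf(self, hdf: dict) -> None:
        self.from_dict(dict(hdf))


class HasDictfromHDF(HasDict, HasHDF):
    """For classes that still only implement to_hdf: derive to_dict from it."""

    def to_dict(self) -> dict:
        scratch = {}  # in-memory group that collects what to_hdf writes
        self.to_hdf(scratch)
        return scratch

    def from_dict(self, obj_dict: dict) -> None:
        self.from_hdf(obj_dict)


class Converted(HasHDFfromDict):
    """Hypothetical class that has already moved to the dict interface."""

    def __init__(self, value: int = 0):
        self.value = value

    def to_dict(self) -> dict:
        return {"value": self.value}

    def from_dict(self, obj_dict: dict) -> None:
        self.value = obj_dict["value"]


class Legacy(HasDictfromHDF):
    """Hypothetical class that still only knows the HDF interface."""

    def __init__(self, value: int = 0):
        self.value = value

    def to_hdf(self, hdf: dict) -> None:
        hdf["value"] = self.value

    def from_hdf(self, hdf: dict) -> None:
        self.value = hdf["value"]
```

With this layout, callers can use either interface on either kind of object, so converting a class only means swapping its base class and renaming its two methods.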

pyiron_base/interfaces/has_dict.py

pyiron_base/storage/parameters.py

pmrv · 2024-03-03T14:06:01Z

Error: The action 'Test' has timed out after 5 minutes.

pmrv · 2024-03-04T15:03:03Z

pyiron_base/interfaces/has_dict.py

        pass

-    def _type_to_dict(self):
+    def to_dict(self):


Suggested change

-    def to_dict(self):
+    @HasDict.to_dict
+    def to_dict(self):

What about this so people don't have to put _...?

Should we do a literature search on what is common in the community? Is there some reference on OOP best practices?

I poked around for this, but I found nothing conclusive. The best I got was this SO thread from eight years ago; they lay out the same choices we're already aware of (OP's default was to do it the way you have here, if that's anything), but the question is left unanswered as "well, that's opinion". Personally, I find the @abstractmethod flag on the private method of the same name the clearest...but I agree it's opinion.

I would like to stick to the current convention then. Since this is a private class used only internally, it should be fairly easy to change later if we decide differently.
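For readers following this thread, the convention that was kept can be sketched roughly as follows: a public to_dict that does the shared bookkeeping and an abstract private hook of the same name that subclasses fill in. This is an illustration only; the "TYPE" key and the Executable and Incomplete classes are assumptions made for the example, not the actual pyiron_base code.

```python
from abc import ABC, abstractmethod


class HasDict(ABC):
    """Toy version: the public method is shared, the private hook is abstract."""

    def to_dict(self) -> dict:
        # Shared bookkeeping lives here, so subclasses cannot forget it.
        type_info = {"TYPE": type(self).__name__}  # hypothetical key name
        return {**type_info, **self._to_dict()}

    @abstractmethod
    def _to_dict(self) -> dict:
        """Return only the object's own payload."""


class Executable(HasDict):
    """Hypothetical implementer that only provides the private hook."""

    def __init__(self, path: str = ""):
        self.path = path

    def _to_dict(self) -> dict:
        return {"path": self.path}


class Incomplete(HasDict):
    """Forgets to implement _to_dict, so instantiating it raises TypeError."""
```

The @abstractmethod marker makes the missing hook fail at instantiation time rather than at the first to_dict call, which is the clarity argument made above.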

pmrv · 2024-03-19T16:10:50Z

I've added a test in the form of the DataContainer implementation of HasDictfromHDF and adapted DummyHDFio to work with this.

@jan-janssen I'd prefer to merge this in before proceeding with your to_dict series of PRs. Changes needed there to use this would hopefully be minimal.

I've also noticed when running the tests locally that there is one location where we create a job with run_mode=="manual" that hangs for a long time before the tests proceed. This may be the cause of the timeout problems we are seeing in the CI, but I couldn't locate the test case yet. I have a feeling this may be related to the file changes around decompression that we made recently. I have also seen a bunch of unrelated test failures locally.

jan-janssen · 2024-03-19T18:49:36Z

> @jan-janssen I'd prefer to merge this in before proceeding with your to_dict series of PRs. Changes needed there to use this would hopefully be minimal.

The only pull requests which use HasDict are #1380 and #1377. As those are required for pyiron/pyiron_atomistics#1356, I would prefer to merge those first before we change the interface.

pmrv · 2024-03-25T16:04:40Z

The reason for the unit tests hanging is this line

pyiron_base/tests/job/test_worker.py

Line 26 in 0bdde9b

self.sub_project.wait_for_jobs()

pmrv · 2024-03-25T16:10:07Z

The reason for the unit tests hanging is this line

pyiron_base/tests/job/test_worker.py

Line 26 in 0bdde9b

self.sub_project.wait_for_jobs()

What I don't understand is why this is suddenly a problem as this PR doesn't touch worker code at all.

pmrv · 2024-03-26T12:35:11Z

It seems this only breaks once I touch GenericParameters.

pmrv · 2024-03-26T12:54:08Z

@jan-janssen I can't seem to fix this quickly today, so go ahead with the other PRs.

jan-janssen · 2024-08-01T20:19:48Z

pyiron_base/jobs/job/extension/executable.py

-    def to_dict(self):
-        executable_dict = self._type_to_dict()
-        executable_storage_dict = self.storage._type_to_dict()
-        executable_storage_dict["READ_ONLY"] = self.storage._read_only
-        executable_storage_dict.update(self.storage.to_builtin())
-        executable_dict["executable"] = executable_storage_dict
-        return executable_dict
+    def _to_dict(self):
+        full_dict = self.storage.to_dict()
+        data = full_dict.pop("data")
+        return {"executable": {**full_dict, **data}}


In principle, I like the idea with the DummyHDFio. But for classes like the Executable, I am not even sure they need a DataContainer. I think the goal is to get to pickle-able classes in the future, with the to_dict() and from_dict() methods being part of the transition. So I feel switching to _to_dict() here goes in the wrong direction.

The DummyHDFio is not used for the executable or the data container, so I'm not sure what you mean. I agree that in the future Executable could be a simple dataclass or so, but again here I just modified it, so it conforms to the new _to_dict.

jan-janssen · 2024-08-01T20:23:45Z

pyiron_base/jobs/job/extension/server/generic.py

-    def to_dict(self):
-        server_dict = self._type_to_dict()
-        server_dict.update(
-            {
-                "user": self._user,
-                "host": self._host,
-                "run_mode": self.run_mode.mode,
-                "queue": self.queue,
-                "qid": self._queue_id,
-                "cores": self.cores,
-                "threads": self.threads,
-                "new_h5": self.new_hdf,
-                "structure_id": self.structure_id,
-                "run_time": self.run_time,
-                "memory_limit": self.memory_limit,
-                "accept_crash": self.accept_crash,
-            }
-        )
+    def _to_dict(self):
+        # server_dict = self._type_to_dict()
+        server_dict = {
+            "user": self._user,
+            "host": self._host,
+            "run_mode": self.run_mode.mode,
+            "queue": self.queue,
+            "qid": self._queue_id,
+            "cores": self.cores,
+            "threads": self.threads,
+            "new_h5": self.new_hdf,
+            "structure_id": self.structure_id,
+            "run_time": self.run_time,
+            "memory_limit": self.memory_limit,
+            "accept_crash": self.accept_crash,
+        }


In analogy to the Executable class, does the Server class need the transition from to_dict() to _to_dict(), or can we modify the class in a way that __getstate__() returns the same dictionary as to_dict()? Again, my focus would be on simplifying the class hierarchy.

Both Server and Executable already derived from HasDict before this PR. I'm OK with changing this in the future, but here I just made the minimal changes to keep existing subclasses working with the changes I made to HasDict (which is just adding the type info automatically to the results of to_dict).

I'll add this to the bucket list. I would like to replace both with data classes sooner rather than later; then we can potentially also reuse them in the functional approach.
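The __getstate__() idea raised in this thread could look roughly like the sketch below. It is an illustration under assumptions, not the Server class from pyiron_base: the toy class keeps only two of the fields from the diff, and the clone helper is a hypothetical name. Both pickle and copy go through the __getstate__/__setstate__ hooks, so delegating them to to_dict/from_dict keeps a single definition of the object's state.

```python
import copy


class Server:
    """Toy stand-in with two of the fields from the diff above."""

    def __init__(self, cores: int = 1, queue=None):
        self.cores = cores
        self.queue = queue

    def to_dict(self) -> dict:
        return {"cores": self.cores, "queue": self.queue}

    def from_dict(self, obj_dict: dict) -> None:
        self.cores = obj_dict["cores"]
        self.queue = obj_dict["queue"]

    # pickle and copy call these hooks; delegating keeps one source of truth.
    def __getstate__(self) -> dict:
        return self.to_dict()

    def __setstate__(self, state: dict) -> None:
        self.from_dict(state)


def clone(server: Server) -> Server:
    """Round-trip through __getstate__/__setstate__ via copy.deepcopy."""
    return copy.deepcopy(server)
```

A design note: this only works as long as to_dict captures everything needed to rebuild the object, since attributes missing from the dictionary are silently dropped on restore.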

jan-janssen

Looks good to me

jan-janssen · 2024-08-15T16:09:07Z

@pmrv As #1578 and #1580 are now merged - can you update this pull request, so we can include it in the next release? I think the three pull requests together streamline the to_dict() and from_dict() interface, so I would like to release them as part of pyiron_base=0.10.0.

pmrv · 2024-08-16T10:01:00Z

contrib tests fail because its dependencies cannot currently be installed concurrently with the ones from atomistics.

pmrv · 2024-08-16T10:37:03Z

This is in a good state now, but I'm still trying to add recursive to_dict/from_dict (in a separate PR).

Adds a create_from_dict function that takes the output of HasDict.to_dict and turns it into a live object, similar to our earlier to_object() method on ProjectHDFio. Adds an instantiate class method to HasDict to support this; it allows HasDictfromHDF to work and will be useful for dataclasses in the future. HasDict.to_dict now goes over the contents of what _to_dict returns and automatically converts any HasDict/HasHDF objects it finds. I haven't used this in downstream code yet to keep the change small, but in principle it will allow GenericJob/DataContainer to stop calling to_dict on their children explicitly and let the generic interface handle it. The rest of the changes rename everything to _from_dict/_to_dict and normalize the argument name to obj_dict.
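The recursive behaviour described above can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for the example: the "TYPE" key, the _REGISTRY lookup (a stand-in for however the real code resolves the stored type), and the Leaf and Parent classes. It also handles only HasDict children, not HasHDF ones.

```python
_REGISTRY = {}  # hypothetical name -> class lookup for type resolution


class HasDict:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        _REGISTRY[cls.__name__] = cls

    @classmethod
    def instantiate(cls, obj_dict):
        """Build an empty instance; subclasses may read obj_dict for arguments."""
        return cls()

    def to_dict(self):
        out = {"TYPE": type(self).__name__}
        for key, value in self._to_dict().items():
            # Children that are HasDict themselves are converted automatically.
            out[key] = value.to_dict() if isinstance(value, HasDict) else value
        return out

    def from_dict(self, obj_dict):
        payload = {}
        for key, value in obj_dict.items():
            if key == "TYPE":
                continue
            is_child = isinstance(value, dict) and "TYPE" in value
            payload[key] = create_from_dict(value) if is_child else value
        self._from_dict(payload)

    def _to_dict(self):
        raise NotImplementedError

    def _from_dict(self, obj_dict):
        raise NotImplementedError


def create_from_dict(obj_dict):
    """Turn the output of to_dict back into a live object."""
    cls = _REGISTRY[obj_dict["TYPE"]]
    obj = cls.instantiate(obj_dict)
    obj.from_dict(obj_dict)
    return obj


class Leaf(HasDict):
    def __init__(self, value=0):
        self.value = value

    def _to_dict(self):
        return {"value": self.value}

    def _from_dict(self, obj_dict):
        self.value = obj_dict["value"]


class Parent(HasDict):
    def __init__(self, child=None, label=""):
        self.child = child
        self.label = label

    def _to_dict(self):
        # Note: the child is returned as an object, not as child.to_dict().
        return {"child": self.child, "label": self.label}

    def _from_dict(self, obj_dict):
        self.child = obj_dict["child"]
        self.label = obj_dict["label"]
```

In this sketch Parent never calls to_dict on its child; the generic to_dict and from_dict do the recursion, which is the simplification the comment above hopes to bring to GenericJob/DataContainer.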

jan-janssen · 2024-08-17T10:32:12Z

@pmrv Can this be merged? Or should we wait for a new h5io_browser version to include h5io/h5io_browser#61?

pyiron_base/jobs/job/extension/server/generic.py

pmrv · 2024-08-17T11:30:29Z

Neither this nor #1602 needs the h5io fix; that's just something I found while trying to get the jobs to read their dicts from HDF recursively.

[minor] Recursive to/from_dict

PR #1364 changed the storage format to save the contents of the container in a data subgroup, but that breaks backwards compatibility and is not actually necessary for the TemplateJob, so revert it here.

* Revert changes to DataContainer storage format
* Handle list in to_dict

Co-authored-by: Marvin Poul <poul@mpie.de>

pmrv added the enhancement (New feature or request) label Mar 2, 2024

pmrv commented Mar 2, 2024

pyiron_base/interfaces/has_dict.py

pmrv commented Mar 2, 2024

pyiron_base/interfaces/has_dict.py

jan-janssen reviewed Mar 2, 2024

pyiron_base/storage/parameters.py

jan-janssen reviewed Mar 2, 2024

pyiron_base/storage/parameters.py

jan-janssen marked this pull request as draft March 2, 2024 10:02

pmrv commented Mar 4, 2024

pmrv marked this pull request as ready for review March 19, 2024 16:11

jan-janssen marked this pull request as draft March 19, 2024 18:56

pmrv force-pushed the hasdict branch from f49cc89 to 0bdde9b Compare March 25, 2024 15:59

pmrv mentioned this pull request Mar 26, 2024

Implement HasDict for HasStorage #1398

Closed

pmrv force-pushed the hasdict branch 2 times, most recently from 621fc12 to ad9a086 Compare July 29, 2024 18:11

pmrv mentioned this pull request Jul 29, 2024

Adapt GenericParameters #1563

Merged

pmrv force-pushed the hasdict branch from 847059e to fef199f Compare August 1, 2024 14:32

pmrv added the integration (Start the integration tests with pyiron_atomistics/contrib for this PR) and format_black (reformat the code using the black standard) labels Aug 1, 2024

pmrv marked this pull request as ready for review August 1, 2024 15:47

jan-janssen reviewed Aug 1, 2024

jan-janssen approved these changes Aug 2, 2024

pmrv and others added 14 commits August 16, 2024 11:42

Use HasDict to break down values in to_builtin

1fd134d

[pre-commit.ci] auto fixes from pre-commit.com hooks

13aea07

for more information, see https://pre-commit.ci

Adapt Executable to new HasDict

1d328ec

Respect group name

0463018

Implement HasDict directly for DataContainer

f2750c2

Implement HasDict directly for DataContainer

9d9d9e7

[pre-commit.ci] auto fixes from pre-commit.com hooks

777bb50

for more information, see https://pre-commit.ci

Adapt GenericParameters

1b00832

[pre-commit.ci] auto fixes from pre-commit.com hooks

446595f

for more information, see https://pre-commit.ci

Adapt Executable.from_dict to DataContainer

f814bc1

Redefine HasDict._type_to_dict so that GenericJob can overload it

7fbc682

Make _type_to_dict compatible across the various Has* classes

f22eee9

Include DICT_VERSION in internal nodes and restore read_only

a3e4277

[pre-commit.ci] auto fixes from pre-commit.com hooks

5c946df

for more information, see https://pre-commit.ci

pmrv force-pushed the hasdict branch from e90961c to 5c946df Compare August 16, 2024 09:46

pmrv added 2 commits August 16, 2024 17:15

Fix typos

8ae825b

jan-janssen reviewed Aug 17, 2024

pyiron_base/jobs/job/extension/server/generic.py

pmrv and others added 3 commits August 19, 2024 17:54

Add docstrings and rearrange method order

6db8b9e

[pre-commit.ci] auto fixes from pre-commit.com hooks

d710333

for more information, see https://pre-commit.ci

Merge pull request #1602 from pyiron/hasdictrec

013a21b

[minor] Recursive to/from_dict

pmrv merged commit 49e4ffd into main Aug 19, 2024
23 of 27 checks passed

pmrv deleted the hasdict branch August 19, 2024 19:57

pmrv mentioned this pull request Aug 23, 2024

Revert changes to DataContainer storage format #1620

Merged

pmrv Mar 4, 2024

pmrv Mar 4, 2024

liamhuber Mar 4, 2024

pmrv Mar 19, 2024

jan-janssen Aug 1, 2024

pmrv Aug 2, 2024

jan-janssen Aug 1, 2024

pmrv Aug 2, 2024

jan-janssen Aug 2, 2024
