-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add some changes to vcs #1203
base: release
Are you sure you want to change the base?
add some changes to vcs #1203
Conversation
@pgierz, I'm putting this into clickup and assign it to you, if you don't have time for it we can reassign it to me in some point. |
Hey Miguel, I have something that might be able to handle caching very easily. Have a look at the newest commit. |
Cache seems to be working now as I would expect:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice @pgierz! Really cool and elegant solution :) Just one comment on how we can improve it. Let me know when you are done with that, then I can start the real tests.
@@ -708,7 +709,12 @@ def add_vcs_info(config): | |||
vcs_versions[model] = f"Unable to locate model_dir for {model}." | |||
continue | |||
if helpers.is_git_repo(model_dir): | |||
vcs_versions[model] = helpers.get_all_git_info(model_dir) | |||
if esm_vcs_info_file_cache.is_younger_than(model_dir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to think about this, maybe we want to compare it with the age of the binary: is not until one compiles it and uses the binary that the diff needs to be refreshed?
At the same time we need to check whether the binary in the model dir and in the work folder are the same, because ESM-Tools does not update the bin in a simulation unless you run esm_runscripts with the option --update-files=bin
(or some argument like that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wasn't entirely sure about this, I'll think more about it. At the moment, it checks the modification time of the directory. The naive assumption that "changing a file in a directory will change the modification time of the directory" is wrong, apparently:
$ mkdir foo
$ gstat foo # gstat because Darwin != Linux
File: foo
Size: 64 Blocks: 0 IO Block: 4096 directory
Device: 1,14 Inode: 24969841 Links: 2
Access: (0755/drwxr-xr-x) Uid: (952471260/ pgierz) Gid: ( 0/ wheel)
Access: 2024-08-09 15:19:30.833895558 +0200
Modify: 2024-08-09 15:19:30.833895558 +0200
Change: 2024-08-09 15:19:30.834023725 +0200
Birth: 2024-08-09 15:19:30.833895558 +0200
$ sleep 10
$ touch foo/a_file
$ gstat foo
File: foo
Size: 96 Blocks: 0 IO Block: 4096 directory
Device: 1,14 Inode: 24970749 Links: 3
Access: (0755/drwxr-xr-x) Uid: (952471260/ pgierz) Gid: ( 0/ wheel)
Access: 2024-08-09 15:21:26.892522871 +0200
Modify: 2024-08-09 15:21:55.836772107 +0200
Change: 2024-08-09 15:21:55.836772107 +0200
Birth: 2024-08-09 15:21:26.892522871 +0200
$ echo "my_change" > foo/a_file
$ sleep 3
$ gstat foo
File: foo
Size: 96 Blocks: 0 IO Block: 4096 directory
Device: 1,14 Inode: 24970749 Links: 3
Access: (0755/drwxr-xr-x) Uid: (952471260/ pgierz) Gid: ( 0/ wheel)
Access: 2024-08-09 15:22:46.780003463 +0200
Modify: 2024-08-09 15:21:55.836772107 +0200
Change: 2024-08-09 15:21:55.836772107 +0200
Birth: 2024-08-09 15:21:26.892522871 +0200
...so I'll need to think of something else. But that is just a matter of changing the logic inside is_younger_than
. (I like this name, it makes it easy to read 😉)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The access time changes, maybe we can use that...
I added an Audit Log, which has all possible info you could possibly wish for to check modification times, access times, and checksums. See in src/esm_runscripts/checksum_helpers.py This, together with the git stuff, should give all possible information someone would want if their model changed halfway during the experiment. |
Only thing left to do is parallelising the git diff and checksum parts. |
VCS fixes and improvements.
TODO
For this PR I will not implement the checking functionality, only the writing of the git status and diff file.