Use ctime in file digest cache key #18003
856188a
to
e6e6e50
Compare
2723c48
to
254d490
Compare
@janakdr Could you review this fix?
Edit: Is it possible that ctime is changed when the action inputs are made read-only? That could make the digest cache useless for output files with this change.

Edit: Can confirm that the failure goes away if I comment out https://cs.opensource.google/bazel/bazel/+/9800ffd1a471582fb42fd782e2e2eeb39dba6c9b:src/main/java/com/google/devtools/build/lib/skyframe/ActionMetadataHandler.java;l=731, so this probably requires more extensive changes.

Edit: The Windows file system implementation does not support ctimes, which means that the issue isn't fixed by this PR on Windows. However, ctime could be obtained with native calls.
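To make the first question above concrete: on POSIX systems a permission change updates a file's ctime while leaving its mtime alone, which is why making files read-only can invalidate a ctime-keyed cache entry. The following is a minimal sketch, assuming a Unix-like JDK where the `unix:ctime` attribute is available (it is not on Windows); `CtimeOnChmodDemo` and its methods are hypothetical names, not Bazel code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical demo class, not part of Bazel. Unix-only: relies on the "unix" attribute view.
final class CtimeOnChmodDemo {
  /** Reads the inode change time without following symlinks. */
  static FileTime ctime(Path p) throws IOException {
    return (FileTime) Files.getAttribute(p, "unix:ctime", LinkOption.NOFOLLOW_LINKS);
  }

  /** Makes p read-only and returns {mtimeBefore, mtimeAfter, ctimeBefore, ctimeAfter}. */
  static FileTime[] chmodReadOnly(Path p) throws IOException {
    FileTime mtimeBefore = Files.getLastModifiedTime(p, LinkOption.NOFOLLOW_LINKS);
    FileTime ctimeBefore = ctime(p);
    // A chmod touches inode metadata only, so mtime stays put while ctime is updated.
    Files.setPosixFilePermissions(p, PosixFilePermissions.fromString("r--r--r--"));
    FileTime mtimeAfter = Files.getLastModifiedTime(p, LinkOption.NOFOLLOW_LINKS);
    return new FileTime[] {mtimeBefore, mtimeAfter, ctimeBefore, ctime(p)};
  }

  public static void main(String[] args) throws IOException {
    Path p = Files.createTempFile("ctime-demo", ".txt");
    Files.writeString(p, "content");
    FileTime[] t = chmodReadOnly(p);
    System.out.println("mtime unchanged: " + t[0].equals(t[1]));
    System.out.println("ctime before/after: " + t[2] + " / " + t[3]);
    Files.delete(p);
  }
}
```

Whether the printed ctime values visibly differ depends on the file system's timestamp resolution and on how much time passes between file creation and the permission change.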
ad913d0
to
26c0be6
Compare
/cc @justinhorvitz can you take a look at this?
94b6c3c
to
0ba0fad
Compare
Does the latest change in 1affa3e fix this?
@justinhorvitz Yes, at least as far as the tests are concerned. I don't know whether the cache thrashing is only relevant for
Regarding ctime on Windows: I could add native functions that get it, but this requires an additional stat call per file. I have no intuition for performance on Windows; would that still be better than disabling the digest cache?
I don't have much personal stake in external Bazel or Windows performance, so let's ask someone else. @tjgq do you see any potential scenario where tree artifact outputs would have a changed
Just want to note that these are separate issues:
Yup, understood.
1affa3e
to
a2c9a39
Compare
@@ -120,6 +120,22 @@ public long getNodeId() {
    return System.identityHashCode(this);
  }

  @Override
  public synchronized int getPermissions() {
I don't see what this needs to be synchronized with. Also, please make it `final`.
Made it `final` and removed `synchronized`. I am not entirely sure about the intended behavior of `getPermissions` when the file permissions are updated concurrently, but I agree that `synchronized` doesn't help as the setters aren't synchronized.
a2c9a39
to
984683f
Compare
This PR looks good to me, but I think we should get a second opinion specifically for the tree artifact question.
beac457
to
8043408
Compare
@bazel-io flag
@bazel-io fork 6.2.0
@keertk @meteorcloudy I think that we should probably fork this for 5.4.1 and 6.1.2 as well, as it fixes a correctness issue.
@bazel-io fork 5.4.1
@bazel-io fork 6.1.2
@fmeum Some update: we are importing this CL, but one internal test was broken; we are looking into a fix.
@@ -727,7 +727,8 @@ private static FileArtifactValue fileArtifactValueFromStat(
  }

  private void setPathPermissionsIfFile(Path path) throws IOException {
    if (path.isFile(Symlinks.NOFOLLOW)) {
    FileStatus stat = path.stat(Symlinks.NOFOLLOW);
Should this use `statIfFound`? All of the other stats in this file use that to tolerate a nonexistent file.
True, it should to preserve the original semantics. I pushed a commit to do this. I am a bit surprised that no test failed due to this change - do all missing files at this point result in a `missing outputs` error down the line anyway?
The broken test was counting the number of stats, but wasn't doing so in a very robust way. I've updated it which should unblock the import. Meanwhile, while looking into this, I found I have one more question which I asked in |
I'm thinking about https://bazel.build/reference/command-line-reference#flag--experimental_use_hermetic_linux_sandbox that populates sandboxes by creating hard links. Ctime is changed when additional hard links are created to existing inodes (link count changes). How is this change affecting the file digest cache for that use case?
I assume it would make it so that digests are recomputed every time a hard link is created, which doesn't sound good. Could you maybe try this out with Bazel built from this PR? I don't know enough about the hermetic sandbox to think of ways to improve digest caching for it. Since Bazel doesn't really care about changes to metadata, we would ideally like to have a "modification time, but not overrideable by users", for which |
File digests are now additionally keyed by ctime for supported file system implementations. Since Bazel has a non-zero default for `--cache_computed_file_digests`, this may be required for correctness in cases where different files have identical mtime and inode number. For example, this can happen on Linux when files are extracted from a tar file with fixed mtime and are then replaced with `mv`, which preserves inodes.

Since Java (N)IO doesn't have support for reading file ctimes on Windows, a new method backed by a native implementation is added to `WindowsFileOperation`. Adding a call to this function to `stat` uncovered previously silent bugs where Unix-style `PathFragment`s were created on Windows:

1. Bzlmod's `createLocalRepoSpec` did not correctly extract the path from a registry's `file://` URI on Windows.
2. `--package_path` isn't usable with absolute paths on Windows as it splits on `:`. Since the flag is deprecated, this commit fixes the tests rather than the implementation.

Fixes #14723

Closes #18003.

PiperOrigin-RevId: 524297459
Change-Id: I96bfc0210e2f71bf8603c7b7cc5eb06a04048c85
Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>
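The effect of adding ctime to the key can be sketched as follows. This is a simplified illustration, not Bazel's actual cache: the `Key` class, its field set, and the sample values are assumptions chosen to mirror the scenario in the commit message (same path, inode number, and mtime, but different content).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified digest cache key; field names and types are illustrative only.
final class DigestCacheKeySketch {
  static final class Key {
    final String path;
    final long nodeId;
    final long mtime;
    final long ctime;

    Key(String path, long nodeId, long mtime, long ctime) {
      this.path = path;
      this.nodeId = nodeId;
      this.mtime = mtime;
      this.ctime = ctime;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key k = (Key) o;
      return path.equals(k.path) && nodeId == k.nodeId && mtime == k.mtime && ctime == k.ctime;
    }

    @Override
    public int hashCode() {
      return Objects.hash(path, nodeId, mtime, ctime);
    }
  }

  public static void main(String[] args) {
    Map<Key, String> cache = new HashMap<>();
    // Entry recorded for the old file content.
    cache.put(new Key("out/a.txt", 42L, 1000L, 5000L), "digest-of-old-content");
    // The file was replaced: path, inode number, and mtime match, only ctime differs.
    Key afterReplace = new Key("out/a.txt", 42L, 1000L, 6000L);
    // The lookup misses, so the digest would be recomputed instead of reusing a stale one.
    System.out.println(cache.containsKey(afterReplace)); // prints false
  }
}
```

Without the `ctime` field, the two keys would compare equal and the stale digest would be returned, which is the correctness issue the change addresses.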
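The two Windows path pitfalls named in the description can be illustrated with plain JDK calls. This is a sketch of the general class of problem, not a reproduction of Bazel's code: `WindowsPathPitfalls`, its methods, and the sample paths are hypothetical.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; does not reflect Bazel's actual parsing code.
final class WindowsPathPitfalls {
  /** Splitting a path list on ':' also splits after every Windows drive letter. */
  static List<String> splitOnColon(String pathList) {
    return Arrays.asList(pathList.split(":"));
  }

  /** The raw path component of a file:// URI keeps a leading slash before the drive letter. */
  static String rawUriPath(String uri) {
    return URI.create(uri).getPath();
  }

  public static void main(String[] args) {
    // Two absolute Windows paths come out as four fragments.
    System.out.println(splitOnColon("C:/ws:D:/other")); // prints [C, /ws, D, /other]
    // Used as-is, this looks like a Unix-style absolute path rather than a Windows one.
    System.out.println(rawUriPath("file:///C:/registry")); // prints /C:/registry
  }
}
```

In both cases the resulting strings look like Unix-style paths, which stays unnoticed until something passes them to a native Windows file operation.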