Cache is not stored/reused from previous builds #3042
Comments
Is this, perchance, expected behaviour that is supposed to be solved in version 0.7 (P2), according to the roadmap item "P2. All external repositories can use the local cache" at https://bazel.build/roadmap.html? |
No, it doesn't have anything to do with external repositories. The reason is that the Bazel action cache does not store file contents, so we can't recover the content from the first step - it gets overwritten in step 3. @damienmg actually wants to get rid of the action cache (as it currently exists) and use a 'spawn cache' instead, which is more like the remote caching we're working on. However, it's unclear how we can avoid making copies of all the output files and, more generally, prevent unbounded growth of the cache. Put differently, if you use the remote cache, you will not see this behavior. |
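The failure mode described above can be illustrated with a small sketch. It is not Bazel's implementation: `MetadataOnlyCache`, `ContentCache` and `build` are hypothetical names, and the model assumes a cache that remembers only one record per output path. A metadata-only cache can confirm that the output currently on disk is still valid, but once another build overwrites that output there is nothing to restore from; a cache that also keeps the bytes can bring the old output back.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content digest used to identify an output."""
    return hashlib.sha256(data).hexdigest()


class MetadataOnlyCache:
    """Hypothetical: one record per output path, no file contents."""

    def __init__(self):
        self._entries = {}  # output path -> (action key, output digest)

    def record(self, path, action_key, data):
        # A later action writing the same path replaces the earlier record.
        self._entries[path] = (action_key, digest(data))

    def is_up_to_date(self, path, action_key, output_tree):
        entry = self._entries.get(path)
        on_disk = output_tree.get(path)
        if entry is None or on_disk is None:
            return False
        return entry == (action_key, digest(on_disk))


class ContentCache:
    """Hypothetical: keeps the output bytes for every action key."""

    def __init__(self):
        self._blobs = {}  # action key -> output bytes

    def put(self, action_key, data):
        self._blobs[action_key] = data

    def get(self, action_key):
        return self._blobs.get(action_key)


def build(path, action_key, compile_fn, output_tree, meta, content=None):
    """Return 'up-to-date', 'restored' or 'executed' for one action."""
    if meta.is_up_to_date(path, action_key, output_tree):
        return "up-to-date"
    cached = content.get(action_key) if content is not None else None
    if cached is not None:
        output_tree[path] = cached  # restore without re-running the action
        meta.record(path, action_key, cached)
        return "restored"
    data = compile_fn()
    output_tree[path] = data
    meta.record(path, action_key, data)
    if content is not None:
        content.put(action_key, data)
    return "executed"
```

Building branch A, then branch B, then branch A again re-executes the action in the metadata-only case, while the same sequence with a `ContentCache` restores the output of A from the cache.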
Can't the local cache look like the remote cache (content-addressable)? I imagine most remote cache implementations are using a bounded LRU / LFU. |
That is my prototype of an LRU disk-based cache for spawn actions. It is unclear whether it would replace the action cache. |
@jgavris We could certainly make changes here, and that's exactly what @damienmg is proposing. And certainly, we'd want an appropriate eviction strategy. However, it'd have to be larger than the size of your output tree to be useful, so it'd at least double the disk space needed for Bazel, so this isn't a change to make lightly. |
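For readers unfamiliar with the idea, here is a minimal sketch of a content-addressable disk store with a byte budget and least-recently-used eviction. It is an illustration only, not the prototype mentioned above; the class name `LruDiskCas` and its methods are assumptions, and the recency order lives in memory, so it does not survive a restart.

```python
import hashlib
import os
from collections import OrderedDict


class LruDiskCas:
    """Hypothetical content-addressable store on disk with a byte budget."""

    def __init__(self, root, max_bytes):
        self.root = root
        self.max_bytes = max_bytes
        self._sizes = OrderedDict()  # digest -> size, least recently used first
        self._total = 0
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.root, key)

    def put(self, data: bytes) -> str:
        """Store data under its SHA-256 digest and return that digest."""
        key = hashlib.sha256(data).hexdigest()
        if key in self._sizes:
            self._sizes.move_to_end(key)  # already stored: just mark as used
            return key
        with open(self._path(key), "wb") as f:
            f.write(data)
        self._sizes[key] = len(data)
        self._total += len(data)
        self._evict()
        return key

    def get(self, key: str):
        """Return the stored bytes, or None if the blob is absent or evicted."""
        if key not in self._sizes:
            return None
        self._sizes.move_to_end(key)  # mark as most recently used
        with open(self._path(key), "rb") as f:
            return f.read()

    def _evict(self):
        # Drop least recently used blobs until the store fits the budget.
        # A single blob larger than the budget is evicted immediately.
        while self._total > self.max_bytes and self._sizes:
            old_key, size = self._sizes.popitem(last=False)
            os.remove(self._path(old_key))
            self._total -= size
```

As noted in the comment above, such a budget only helps if it is larger than the output tree of a single build; otherwise entries are evicted before they can be reused.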
Nope, only a pending change that I need to rebase; I would love to merge it, though.
…On Thu, Sep 7, 2017 at 7:53 PM Rahul Malik ***@***.***> wrote:
@ulfjack <https://github.com/ulfjack> @damienmg
<https://github.com/damienmg> - Is there a version of this LRU cache
behind an experiment flag? I was thinking of rolling my own solution but
would rather use one built into bazel itself.
|
@ulfjack @damienmg The status of the pending change is not clear to me. Does it work as is? Could early adopters already use it today? I also disagree with the priority here. From the user's perspective, I would just expect sane local disk cache behaviour (across all workspaces and branches on the local machine). That is exactly what Buck does by default:
[cache]
mode = dir
dir = buck-out/cache
Moreover, with my Buck patch "Add support for user home in cache directory name", the cache directory can be generically assigned to the user's home directory:
[cache]
mode = dir
dir = ~/.gerritcodereview/buck-cache
Now, for all workspaces/clones and for all branches, you would get cache hits with Buck.
That's what
To make my POV even clearer: if I were on the Bazel team, I would stop working on anything else until this bug is fixed and released. |
I'd agree, but it sounds like the discussion internally hasn't reached consensus? I'm running a remote cache locally to get around this issue, but I expected this behavior from the start, and it was a shock to me and my team that it didn't work this way. The local remote cache is not an ideal solution, since I'm also setting up a true remote cache on another cluster and I can't configure more than one cache per bazel invocation. |
I don't think at this point anyone is opposing adding support for a local cache, but we're not actively working on it. It should be fairly straightforward to add support for a local cache at this point. @damienmg's change is outdated - most of the classes have been refactored. I'd go about it by moving the LruCache to a more generic location, and possibly refactoring its API. The most straightforward way to do it seems to be to reuse RemoteActionCache, and implement SimpleBlobStore. Ideally, we'd move any code that's independent of remote / local into a generic class (AbstractSpawnCache?), although it might use some protos from the remote protocol to represent action keys and whatnot. |
@ulfjack - While not completely familiar with the code, it seems like we might be able to use |
We have two interfaces: ActionCache and RemoteActionCache. Implementing ActionCache is a lot more complicated and a lot more difficult to get right, so I do not recommend doing that. Implementing RemoteActionCache is much simpler and easier to get right. Ignore the "Remote" prefix; it doesn't have any meaning at this point. Similarly, the "Remote" prefix of RemoteSpawnCache has no meaning (sorry about that). The SimpleBlobStoreActionCache implements RemoteActionCache. You can use a RemoteSpawnCache with a SimpleBlobStoreActionCache with an OnDiskBlobStore to implement local caching - prototyping this should be trivial. There are two problems with doing it that way:
|
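The layering suggested above (a spawn cache on top of an action cache on top of a blob store) can be sketched roughly as follows. All class and method names here are hypothetical and only loosely mirror the names in the comment; they are not Bazel's actual classes or signatures. Action keys are assumed to be safe to use as file names, for example hex digests.

```python
import hashlib
import json
import os
from typing import Optional, Protocol


class BlobStore(Protocol):
    """Hypothetical minimal key/value interface for blobs."""

    def get(self, key: str) -> Optional[bytes]: ...

    def put(self, key: str, data: bytes) -> None: ...


class DiskBlobStore:
    """Blob store backed by one file per key in a local directory."""

    def __init__(self, root: str):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def get(self, key):
        try:
            with open(os.path.join(self.root, key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def put(self, key, data):
        # Write to a temporary file first so readers never see partial blobs.
        tmp = os.path.join(self.root, key + ".tmp")
        with open(tmp, "wb") as f:
            f.write(data)
        os.replace(tmp, os.path.join(self.root, key))


class BlobActionCache:
    """Stores action results and output contents in a single blob store."""

    def __init__(self, store: BlobStore):
        self.store = store

    def upload(self, action_key, outputs):
        manifest = {}
        for path, data in outputs.items():
            d = hashlib.sha256(data).hexdigest()
            self.store.put("cas-" + d, data)
            manifest[path] = d
        self.store.put("ac-" + action_key, json.dumps(manifest).encode())

    def lookup(self, action_key):
        raw = self.store.get("ac-" + action_key)
        if raw is None:
            return None
        outputs = {}
        for path, d in json.loads(raw).items():
            data = self.store.get("cas-" + d)
            if data is None:
                return None  # a missing blob counts as a cache miss
            outputs[path] = data
        return outputs


class CachingSpawnRunner:
    """Runs an action only if its result is not already in the cache."""

    def __init__(self, cache: BlobActionCache):
        self.cache = cache
        self.executions = 0

    def run(self, action_key, execute):
        cached = self.cache.lookup(action_key)
        if cached is not None:
            return cached
        self.executions += 1
        outputs = execute()
        self.cache.upload(action_key, outputs)
        return outputs
```

Because the spawn runner depends only on the small `BlobStore` interface, the same upper layers could sit on a local directory or on a remote service, which is the point being made about the "Remote" prefix.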
Thanks for the tips. I'll look into prototyping something next week. Would it be reasonable to say that the usage of ActionCache would be replaced with SpawnCache or are there reasons to keep ActionCache around? |
|
@davido - Thanks for putting this together! That is a lot more straightforward than what I was thinking of doing (replacing all usage of ActionCache with SpawnCache). I think this is a good start and will likely benefit from a few additions:
Thoughts? Happy to help build these changes on top of your PR. @ulfjack - Given this is behind an experimental flag, will this be able to land before having eviction implemented or is that required for this to merge? |
@rahul-malik Yeah, an eviction strategy is definitely something that would need to be added at some point. I'm not sure I understand your second point about replacing the action cache. There is no action cache right now for the non-remote execution strategy. I've thoroughly tested my CL, and while I see cache hits when building in the same workspace and switching across different branches, I still see cache misses when switching across different workspaces. Steps to reproduce: apply my CL and do something like this:
I would expect cache hits in the last step, but I'm seeing that everything gets recompiled and the cache directory was expanded with new cache entries. Just a WAG: is this because the execution root contributes to the digest algorithm and is part of the cache key, so that this requirement can't work at the moment? Can we tweak the key computation here to make this work? |
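The guess above can be made concrete with a small sketch. It does not reproduce Bazel's key computation, and the thread does not confirm that the execution root is part of the key; `action_key` and `make_action` are hypothetical helpers. The sketch only shows why any absolute path that leaks into the hashed command line or input list makes two otherwise identical workspaces produce different keys.

```python
import hashlib


def action_key(argv, input_digests):
    """Hash the command line and the (path, digest) pairs of all inputs."""
    h = hashlib.sha256()
    for arg in argv:
        h.update(arg.encode())
        h.update(b"\0")
    for path in sorted(input_digests):
        h.update(path.encode())
        h.update(b"\0")
        h.update(input_digests[path].encode())
        h.update(b"\0")
    return h.hexdigest()


def make_action(exec_root, relative_paths):
    """Build the key for the same compile step in a given workspace.

    With relative_paths=False the execution root is baked into every path,
    as it would be if absolute paths were hashed.
    """
    prefix = "" if relative_paths else exec_root + "/"
    argv = ["cc", "-c", prefix + "src/a.c", "-o", prefix + "out/a.o"]
    inputs = {prefix + "src/a.c": "digest-of-a.c"}
    return action_key(argv, inputs)
```

With absolute paths, `make_action("/home/u/ws1", False)` and `make_action("/home/u/ws2", False)` differ, so the second workspace misses the cache; with paths relative to the execution root, both workspaces produce the same key.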
@davido - My second point was mainly around the fact that this PR provides the behavior we expected by default for the local cache. To enable this behavior you have to use the remote strategy but given the code snippet below, you would be unable to utilize this local disk cache in combination with a REST / Hazelcast cache.
|
@ulfjack Ah, right, i saw that, but forgot. Thanks, for pointing this out. I will re-test it later today. |
@davido - Another thought also: Does the cache file path need to be absolute? It would be easier to share the change with other developers if we were able to put it under the home directory for instance. |
@rahul-malik Yeah, exactly! Right now it's not possible, but I'm planning to add another change on top of this CL, something similar, to what I did 4 years ago in Buck, as Gerrit Code Review was using Buck. We must support "~/.gerritcodereview/bazel-cache/cas" and commit it under GIT in |
+1
…On Mon, 18 Sep 2017 at 18:07 David Ostrovsky ***@***.***> wrote:
@rahul-malik <https://github.com/rahul-malik> Yeah, exactly! Right now
it's not possible, but I'm planning to add another change on top of this, something,
similar, to what I did 4 years ago in Buck
<facebook/buck@a1ba001>,
as Gerrit Code Review was using Buck. We *must* support
"$HOME/.gerritcodereview/bazel-cache/cas".
|
This was fixed in the latest patch set of the CL. Now you can pass in generic
Also note, that these options could (and should) be committed into the GIT tree in the
|
@davido : what do you mean with latest patchset of the CL? Is this on Bazel git master? |
@nordlow - The CL he's referring to is on Gerrit right now. You can see the patch here: https://bazel-review.googlesource.com/c/bazel/+/16810 |
@nordlow Can you build a custom version of Bazel with this CL and repeat your benchmark? Note that you would need to pass these three experimental options:
|
@davido - did this end up merging? Looks like the PR is approved? |
@rahul-malik I think it should be merged in the next days. @damienmg Any comment what is the ETA? |
I sent the CR for internal review, but @ulfjack is away today and there seems to be an internal test failure (which there should not be), so probably Monday afternoon, Tuesday at worst. |
Not until it is the default I think.
…On Thu, Sep 28, 2017, 9:26 AM David Ostrovsky ***@***.***> wrote:
@damienmg <https://github.com/damienmg> this can be closed now with
82859b0
<82859b0>,
right?
|
Nope, 0.6.1 was a patch release. You can try the 0.7 release candidate.
…On Thu, Oct 12, 2017, 8:55 AM Rahul Malik ***@***.***> wrote:
@damienmg <https://github.com/damienmg> @davido
<https://github.com/davido> - Did this not make it into 0.6.1? I just
updated and was hoping to have this feature.
|
Confirmed, that this feature is included in 0.7, e.g.: https://storage.googleapis.com/bazel/0.7.0/rc2/index.html . |
@davido just tried |
@Globegitter Unfortunately, home directory resolution (
$ cat ~/.bazelrc | grep cas
build --experimental_local_disk_cache_path=/home/davido/.gerritcodereview/bazel-cache/cas
Note that this is why we cannot add this line to Gerrit Code Review's own tools/bazel.rc file so that it would be enabled by default:
$ cat tools/bazel.rc
build --experimental_local_disk_cache_path=~/.gerritcodereview/bazel-cache/cas |
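To show what home directory resolution for such an option value amounts to, here is a minimal sketch. It is not how Bazel implements it; `resolve_cache_path` is a hypothetical helper, and it handles only a leading `~` or `~/`, not the `~user` form.

```python
import os


def resolve_cache_path(value, home=None):
    """Expand a leading '~' in an option value to the home directory.

    Any other value is returned unchanged. The 'home' argument exists so
    the function can be exercised without depending on the real user.
    """
    if home is None:
        home = os.path.expanduser("~")
    if value == "~":
        return home
    if value.startswith("~/"):
        return home.rstrip("/") + "/" + value[2:]
    return value
```

With this kind of expansion, the same `~/.gerritcodereview/bazel-cache/cas` line could be checked into a shared rc file and still resolve to a per-user directory on each developer's machine.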
@ittaiz @Globegitter home directory resolution (~) should be supported now, see #4852. |
What |
Apparently I misremembered - I thought we had a LruCache class for the repository cache. Sorry for the confusion. |
I think this feature is now available via |
Bazel currently doesn't use caching in step 5 of
Test case (that generates 10000 files that are built to a static library) is here: https://github.com/nordlow/build-system-benchmark/blob/master/test_bazel.sh
This is a severe limitation in continuous integration server clusters that continuously switch between different feature branches that differ only in a very small percentage of the code.
Buck doesn't have this limitation.
Is this a known issue?
I'm using bazel-0.4.5 on Ubuntu 17.04.