cmd/go: clean GOCACHE based on disk usage #29561
Comments
I think this would be a good thing to have, especially since the cache will be required starting in Go 1.12. I don't think there's a fixed maximum cache size that would work for everyone, but maybe we could at least make eviction more aggressive when the disk is low on space.
I seem to remember that @rsc designed the cache to automatically evict based on modification time. Seems sound to me to also automatically evict if the disk is full enough that it might realistically hit errors if it builds a few large packages (say, if there's less than 500MB left).
This sounds like a bit of a can of worms. How do you check for remaining disk space in a cross-platform way? How much free space should trigger a cache eviction? How do you guarantee that, having evicted some cache, the build process won't still run out of disk anyway? If you're building on a small disk, then wouldn't running go clean -cache periodically address the problem just as well?
I personally don't know how easy it would be to control "does the disk have enough space left" in a sane and portable way. I was simply pointing out that since the current eviction algorithm takes into account timestamps, perhaps it should also use some data from the filesystem or disk. If we can make that work, we could avoid adding more knobs to the go tool.
That might not be a great solution - what if you're building a very large project like Kubernetes? It might produce many hundreds of megabytes of build cache, so it's not unreasonable to think that it could on its own be enough to fill up some filesystems.
Indeed. So this proposal would, at best, kick the problem slightly further down the road.
I just filled up my hard drive again, and am waiting for go clean -cache to finish. Since the cache is now required, some form of automatic size management seems necessary.
I'm not going to claim that this is the best possible solution, but I just run go clean -cache from a cron job.
Those of you who do "unusual things", what sizes do you encounter that are a problem? I can't remember the last time I cleaned mine.
I got a 200GB gocache folder once, after 12hrs of compiler fuzzing.
@mvdan this morning it clocked in at a little over 250GB. In the past I’ve hit 400GB. I’d probably have hit that last night except my script died because I ran out of disk space. @ianlancetaylor I see what you did there. :) To expand on why that isn’t a suitable solution for me: In normal use, everything is fine. This only happens to me when I start an overnight computation, like compiling 40 different toolchain commits for every platform. I’d have to run go clean -cache in the middle of the night. I could start a script to do that, but that is an easy thing to forget. I could set up a cron to always clear it every hour, but that would slow down my non-Go-toolchain work, which is a higher priority. Or I could have the script doing the work clean the cache, except that this is a script I’m making publicly available (compilecmp), which means I’d be clearing other people’s caches, which seems unfriendly. I guess I could have my script create a temp dir, set GOCACHE to it, and clear it regularly. I’ll try that.
Another reason to want to be able to disable the cache in these circumstances is to avoid the wear and tear on my SSD of writing and then immediately deleting 100s of GB. I could set up a RAM disk, except that that is fiddly and platform-specific, and I’m trying to maintain a tool to be used by not-just-me.
Your workflows are definitely heavier than mine :) is there a way to expose this "no GOCACHE" mode only to advanced users, so that we don't encourage the broader community to turn off the cache in general? Perhaps hide it behind an undocumented flag?
I'm not sure I fully understand why there's so much concern about people disabling the cache. We've forced it on everyone for long enough that we should be past the FUD. And if people really want to waste resources, that's their business. And for the folks who have a genuine need to disable the cache, they can.
I just ran into this issue. I've known about the Go build cache, but I assumed it was managed automatically by cmd/go, or that it would warn me when manual intervention is required (like how git auto-GCs or whatever, but occasionally nags you into doing that manually). FWIW, running "go clean -cache" ended up freeing 212GB of my workstation's 730GB disk.
@mdempsky too much defer fuzzing? :^) One workaround I've adopted for compiler fuzzing: in the driver, invoking the compiler with GOCACHE pointed at a temporary directory that gets wiped between runs.
I don't think I'm doing anything special (just working on a ~100kLOC project) but I regularly run out of space. It takes a month or two, but I have a 240GB disk with ~100GB free and the go cache will eat all of that eventually. I could set something up but it's pretty obnoxious that Go eats all my available disk space.
@mvdan Maybe an environment variable for "Limit the go cache to X% of the disk"? A default value of 10% or maybe even 2% would be fine on the vast majority of systems. Having more than 10% of my disk consumed by a build cache is not ideal. For seriously space constrained systems, add a note to the documentation, "If you build go on a very small disk you may want to increase this limit".
@firelizzard18 it's surprising that you say it takes a month or two for your cache to fill your disk. Cache entries get cleared at one-day intervals if they haven't been used in five days, as you can see in go/src/cmd/go/internal/cache/cache.go, lines 273 to 289 (at commit e822b1e).
So you would have to build tons of different Go packages within a few days to realistically grow your build cache to hundreds of gigabytes. That's the kind of stress that @ALTree does while fuzzing, or @josharian while compiling many toolchain versions for many platforms. The only other scenario we're aware of is small disk sizes, like the author's 9GiB USB stick. My personal take is that, if the build cache sizes are often a problem while stressing the Go toolchain with tons of different builds, we should start by providing those advanced users with a way to disable build caching entirely, like @josharian's suggestion in #29561 (comment). I don't think any regular users would try to use an advanced escape hatch like that.
@mvdan I wiped my cache (go clean -cache) not long ago, and it is already tens of gigabytes again.
Can you look into what's the oldest file in that directory tree, by modified time? None of the files should be older than five or six days. |
Looks like the oldest file is 5 days old. I'm copying everything onto another drive so I can clear my main drive. |
Is it some IDE-driven process that constantly compiles/tests?
@seankhliao VSCode runs static analysis every time I save a file.
That's probably why your build cache fills with tens of gigabytes within five days. Running static analysis at every file save will rebuild any modified packages and their transitive dependents, adding new entries to the build cache each time. So you're likely adding megabytes to your build cache with every save, and those entries accumulate over the five-day window. We should still do better, but what you're seeing seems to be in line with the current intended behavior.
IMO a build cache should not consume more than 10% of the space on my PC, and maybe not even that much. It makes sense to allow the build cache to consume a lot more of the disk for a dedicated builder. I think a max disk space percentage environment variable (configurable with go env -w) would be a reasonable solution.
The main benefit of the current algorithm is simplicity: we just scan the tree for files that haven't been used in 5 days and delete them. We could drop the number of days, but the idea was to avoid a slow start after a long weekend. It is also possible to do incrementally and requires no global state, locks, or synchronization between different instances of the go command. It also automatically sizes to the working set rather than using a fixed amount of disk or thrashing in too small an amount of disk. I'm happy to consider other algorithms, but they need these properties too.

Here is one possibility. The cache is basically a mark-sweep GC for old files, using the mtime as an auto-expiring mark bit to keep a file from being reclaimed for a while. We could instead use a generational GC, with a young generation and an old generation. If the cache is filling up with files that are used for a single build or only for a few minutes and then never used again, we could collect the young generation more aggressively.

Specifically, we could keep a young/old bit in the mtime as well, perhaps in the low order bits. Every time we write a new cache file, set its mtime to time.Now().Truncate(time.Minute), so that it ends in :00 (0 seconds). If it gets used after an hour, we set the updated mtime to end in :30. Then the sweep can distinguish young vs old files and delete young files more aggressively.

If someone has a usage pattern that creates a lot of cache and wants to try it out, patch in https://go.dev/cl/433295.
Change https://go.dev/cl/433295 mentions this issue: |
I recently reinstalled Linux on a larger drive. I'll check what my cache looks like at the end of next week, then switch to https://go.dev/cl/433295 and see what it looks like the week after that. |
I don't know if my usage has changed or there's something weird about my previous environment, but I haven't seen the cache go over 10 GB since then.
Unfortunately, this issue still exists. I've run into it multiple times in the last couple of weeks while building and debugging Go on a 16GB partition.
Same problem here: the cache regularly grows to >50GB and fills up my partition. I usually don't notice until things (including completely unrelated things) start failing. I have to wonder, I have plenty of caches in ~/.cache, including ccache, node, bazelisk, etc. None of these exhibits this problem, so why is Go special?
We are now regularly (every other week or so) hitting 150GB+ build caches on our CI servers when building a decently-sized project that makes extensive use of generics in core code. It would be nice to have some way of limiting the cache size, or to prune only old/unused cache files, as currently our workaround is to completely clean the cache regularly, which results in very long build times the next time a build job is run. |
I just hit this again: cache >500GB, full hard drive, failing stuff everywhere. I was running a loop in which I perturbed a bit of code, re-ran the test, and continued. A generational cache would probably help. This is one of those infrequent-but-very-painful issues.
We got a bit tired of having to manually manage the cache directory on our CI servers and ended up writing an internal tool that prunes the cache before each build by removing the least recently accessed files (checked by reading each file's atime).

In our case running a single build with a clean cache results in a build cache of ~5 GB, and compiling in different modes (debug, release, test/coverage), possibly with different tags, the cache size quickly adds up over the course of a week, especially when working on more core or heavily generic code. Though it might help if we were more careful about always using the same build options.

Perhaps the cache could track access times itself by updating the mtime when an entry is used, so tools like ours don't have to depend on filesystem atimes.
The original report: The GOCACHE appears to lack a disk size limit. This is a problem in a space-constrained environment and/or when running Go on a disk that is nearing capacity. For example, on the openbsd/arm builder (which runs on a USB stick), the ~/.cache/go-build directory runs past several GB in a very short time, which then leads to various failures (git clone or go builds). The only option that I currently appear to have is to run go clean -cache regularly, in order to keep the cache at a respectable size. It seems that having a configurable upper bound would be preferable, and/or free-disk-space-based checks that prevent writes to the cache from failing when the disk is full partly due to a large GOCACHE (e.g. evict, then write).