Ran test 50N_010E to completion on 10 years (2012-2021) but it failed on the full time series

Even on 2012-2021, unmanaged memory increased over time, going into orange and then red for all workers. Eventually, workers died, but somehow they restarted and finished the time series. The same thing happened twice with the full time series (workers in the red zone for memory and then dying), but I guess it happened too many times and eventually the model died. So, adding dask.config.set({"distributed.nanny.pre-spawn-environ.MALLOC_TRIM_THRESHOLD_": 1}), based on my conversation at dask/distributed#5971 (comment), didn't actually reduce unmanaged memory, but it did let the model push through the accumulated unmanaged memory at least once or twice. Of course, this isn't a viable solution overall. But it is good data: the unmanaged memory accumulation isn't caused by the MALLOC_TRIM_THRESHOLD_ setting.
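For reference, a minimal sketch of how the setting described above can be applied and checked. The config key and value are the ones from this commit; the variable names (TRIM_KEY, trim_threshold) are just for illustration, and this assumes a Linux/glibc environment and a version of distributed that supports pre-spawn-environ. The setting only matters if it is applied before the cluster and its workers are created.

```python
import dask

# Config key taken from this commit. A lower threshold asks glibc to
# return freed heap memory to the OS more eagerly (glibc/Linux only).
TRIM_KEY = "distributed.nanny.pre-spawn-environ.MALLOC_TRIM_THRESHOLD_"

# Set this BEFORE creating the cluster/client: the nanny applies these
# environment variables when it spawns each worker process.
dask.config.set({TRIM_KEY: 1})

# Read the value back to confirm it is active in this process.
trim_threshold = dask.config.get(TRIM_KEY)
print(f"MALLOC_TRIM_THRESHOLD_ = {trim_threshold}")
```

As noted above, this did not reduce unmanaged memory in these runs, so it is a diagnostic step rather than a fix.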