NCZarr memory leak with NetCDF 4.9.0 #2733
Comments
The ZARR dataset is only spatially chunked and not in time.
The data that is read is stored in a per-variable cache. If you have been accessing many variables, you may be holding a separate cache for each of them, which can add up to a lot of memory.
Is it possible to limit the size of the cache to a user-defined value?
Yes. The function nc_set_var_chunk_cache() does this. It sets parameters for the per-variable cache. So I would suggest calling the function for the given variable with the size set to the space you are willing to use, and with the nelems parameter set to some large number (say 1000) so that the size parameter is the only one that will have an effect. The preemption parameter can be set to 0.5 since it is unused.
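A minimal C sketch of that suggestion; the path and variable name are placeholders (they do not appear in the issue), while the size, nelems, and preemption values follow the advice above:

```c
#include <netcdf.h>

/* Minimal sketch: cap the per-variable chunk cache for one variable.
 * The NCZarr path and variable name are placeholders. */
int main(void)
{
    int ncid, varid, status;

    if ((status = nc_open("file:///path/to/example.zarr#mode=nczarr,file",
                          NC_NOWRITE, &ncid)))
        return status;
    if ((status = nc_inq_varid(ncid, "temperature", &varid)))
        return status;

    /* size: the bytes you are willing to spend on this variable's cache;
     * nelems: large (1000) so that size is the limit that takes effect;
     * preemption: 0.5, since it is unused by NCZarr. */
    if ((status = nc_set_var_chunk_cache(ncid, varid,
                                         64UL * 1024 * 1024 /* 64 MiB */,
                                         1000, 0.5f)))
        return status;

    /* ... read the variable as usual ... */

    return nc_close(ncid);
}
```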
re: Unidata#2733

When addressing the above issue, I noticed that there was a disconnect in NCZarr between nc_set_chunk_cache and nc_set_var_chunk_cache. Specifically, setting nc_set_chunk_cache had no impact on the per-variable cache parameters when nc_set_var_chunk_cache was not used. So I modified the NCZarr code so that the per-variable cache parameters are set in this order (#1 is first choice; a sketch follows the list):
1. The values set by nc_set_var_chunk_cache
2. The values set by nc_set_chunk_cache
3. The defaults set by configure.ac
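A minimal sketch of that precedence; the helper function, path handling, and cache sizes are illustrative assumptions, while nc_set_chunk_cache and nc_set_var_chunk_cache are the calls named above:

```c
#include <netcdf.h>

/* Minimal sketch of the precedence above. The helper and its arguments are
 * illustrative; error handling is abbreviated. */
int open_with_cache_defaults(const char *path, int *ncidp)
{
    /* (2) Library-wide default, set before opening, used for variables
     *     that get no explicit per-variable setting. */
    nc_set_chunk_cache(32UL * 1024 * 1024, 1000, 0.5f);

    int status = nc_open(path, NC_NOWRITE, ncidp);
    if (status) return status;

    /* (1) An explicit per-variable call would override that default, e.g.
     *     nc_set_var_chunk_cache(*ncidp, varid, size, nelems, 0.5f); */

    /* (3) With neither call, the configure.ac defaults apply. */
    return NC_NOERR;
}
```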
I have now tested nc_set_var_chunk_cache() and nc_set_chunk_cache() with different parameters and could not see any difference in overall memory usage for my workflow.
re: PR Unidata#2734
re: Issue Unidata#2733

As a result of an investigation by https://github.com/uweschulzweida, I discovered a significant bug in the NCZarr cache management. This PR extends the above PR to fix that bug.

## Change Overview
* Insert extra checks for cache overflow.
* Added test cases contingent on the --enable-large-file-tests option.
* The Columbia server is down, so it has been temporarily disabled.
I have a large ZARR dataset. I want to read it time step by time step. This causes me to exceed the memory limit (1 TB) on my machine. It looks like all data that has been read is kept uncompressed in memory. Is this intentional, or is it a memory leak?
I am using netCDF 4.9.0. In my application only one time step is stored at a time. When I read the same dataset as NetCDF-4, my application only needs 100 MB.
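A minimal sketch of the access pattern described above, assuming a 3-D (time, y, x) float variable; the function name and layout are illustrative, not taken from the reporter's application:

```c
#include <netcdf.h>
#include <stdlib.h>

/* Minimal sketch: read a 3-D (time, y, x) float variable one time step at a
 * time, so only a single step is resident in the application's own buffer. */
int read_stepwise(int ncid, int varid, size_t ntime, size_t ny, size_t nx)
{
    float *slab = malloc(ny * nx * sizeof *slab);
    if (slab == NULL)
        return NC_ENOMEM;

    for (size_t t = 0; t < ntime; t++) {
        size_t start[3] = { t, 0, 0 };
        size_t count[3] = { 1, ny, nx };
        int status = nc_get_vara_float(ncid, varid, start, count, slab);
        if (status) { free(slab); return status; }
        /* ... process this time step ... */
    }

    free(slab);
    return NC_NOERR;
}
```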