-
Notifications
You must be signed in to change notification settings - Fork 262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Quantization not working with NC_FORMAT_NETCDF4_CLASSIC #2441
Comments
Quantization is not expected to work on classic formats (at this time). The problem is that classic formats use a totally different code base than netCDF-4, and I personally am not at all familiar with that code. (Glenn Davis wrote it, Russ maintained it, and I think @DennisHeimbigner has the greatest familiarity with it now, but it's code that is pretty rarely changed. And quantization with classic formats is much less useful, because there is no compression anyway. So it does not speed up I/O nor save disk space. Am I missing something here? So right now we have quantization built into a netCDF-4 function which is the one that does all the data conversions. (For example if you write ints to an NC_FLOAT var, the ints have to be converted into floats.) Zarr piggy-backs on this because it uses the same function for data conversion. However, it should be working for NC_FORMAT_NETCDF4_CLASSIC. That's just a dumb bug probably, and my fault for not testing that. I will take a look at why that is happening. So I would change this issue title to "Quantization not working with NC_FORMAT_NETCDF4_CLASSIC", since classic formats were never intended to be working with quantization. |
Done. I agree there is no real world use-case for quantization in CLASSIC format. NC_FORMAT_NETCDF4_CLASSIC is very commonly used for compression (CMIP6 requires it), so we were a bit careless in not testing quantization with that format. |
OK, the problem is in hdf5attr.c:
This happens when we are writing the quantization attribute. I'm pondering how best to fix this... OK, the real problem is here:
The variable defines, and thus the attempt to write the quantize attribute, is happening in the final statement, in the call to sync_netcdf4_file(), and by this time the flag has already been changed to be out of define mode. Instead, the sync should happen before the flag is changed. |
As I tried various things and reviewed this code, I saw that I had faced a similar problem some time ago when optimizing the storage of dimids to speed opening of files (and avoid the dreaded performance problems with dimension scale searches). The answer is not to attempt to use netCDF functions to write the attributes, but to use HDF5 directly. This avoids all the complicated checks for enddef that are done in order to exactly mimic the netcdf-3 enddef error handling. I have a PR up now, running through CI. We'll see... |
Though the glass is not empty! Quantization succeeds with NETCDF4 and Zarr.
Program nc_test4/tst_quantize.c tests quantization on a NETCDF4-format file.
That test succeeds. There is no problem with that one format. Unfortunately,
NC_FORMAT_NETCDF4 appears to be the only POSIX format that succeeds with the test.
tst_quantize fails with NC_FORMAT_NETCDF4_CLASSIC and with NC_FORMAT_CLASSIC.
I have not tried with 64BIT or 64BIT_DATA, though I suspect those too fail.
The failures can be seen by changing tst_quantize.c to use a different mode flag,
e.g., NC_NETCDF4|NC_CLASSIC_MODEL|NC_CLOBBER, to nc_create() near line 110.
NCO invocations of nc_def_var_quantize() succeed or fail in the same way
that tst_quantize.c succeeds or fails depending on the file format.
Calling nc_def_var_quantize() on a netCDF file with a format other than
NC_FORMAT_NETCDF4 leads to a state that causes nc_enddef() to fail.
In tst_quantize.c this is near line 162.
The nc_enddef() return code is -38 = NC_ENOTINDEFINE = "NetCDF: Operation
not allowed in data mode". Perhaps nc_def_var_quantize() leaves certain
file formats in data mode, rather than in define mode, which causes
nc_enddef() to fail? I looked and failed to spot any obvious bugs.
These problems occur both in 4.9.0 and in today's main trunk.
I have confirmed this behavior on Linux x86_64 with GCC and on MacOS Mac silicon with Clang.
The text was updated successfully, but these errors were encountered: