CDF-5 fix: let NC_var.len be the true size of variable #478
Ok, after checking, this change is incorrect. The use of UINT MAX is used as a flag to |
…llowing the CDF format specification
Whatever fix we get, I think it is important to test the following 6 configurations. |
When running python tests against libnetcdf built with this patch, I see the following (on 64-bit systems only). This will need to be sorted out before merging this fix in or saying that it 'fixes' the problem.
|
Where to obtain netcdf4-python version: 1.3.0? |
I do not have a netcdf4-python installed nearby. |
You are correct. gh478 does not contain your patch. I merged it locally to a temporary branch and observed the failures. I will test the updated patch against NetCDF python tomorrow and follow up here. |
Regarding developing test programs for 32-bit machines, I think NetCDF may not be able to support CDF-5 on 32-bit machines. The main obstacle is that size_t is a 4-byte integer on 32-bit platforms, and most of the netCDF APIs have arguments of type size_t. For instance, one cannot define a dimension of size 2^32 or larger, because the argument "len" of nc_def_dim is of type size_t.
To support CDF-5 on 32-bit machines, NetCDF would need to change its APIs, which would not be backward compatible. Maybe the only option is to disable CDF-5 on 32-bit machines? |
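The size_t limitation described above can be illustrated with a small check. This is only a sketch: the helper names `fits_in_size_t` and `size_t_is_64bit` are hypothetical and do not come from the NetCDF source; the only assumption is that dimension lengths reach the library through size_t arguments, as the comment states for nc_def_dim.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a 64-bit dimension length can be passed
 * through a size_t argument without truncation, and 0 otherwise. */
int fits_in_size_t(uint64_t len)
{
    return len <= (uint64_t)SIZE_MAX;
}

/* Hypothetical helper: returns 1 when size_t is at least 8 bytes wide. */
int size_t_is_64bit(void)
{
    return sizeof(size_t) >= 8;
}
```

On a platform with a 4-byte size_t, `fits_in_size_t((uint64_t)1 << 32)` returns 0, which is the case the comment is concerned about; on a platform with an 8-byte size_t it returns 1.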
@wkliao 1.3.0 is coming through from the |
1.2.9 also fails in the same place. The CDF-5 on 32-bit machines is problematic and I will have to think about it. In the short term we definitely need to add a way at configure time to make CDF5 support optional. If we are on a 32-bit platform we can disable it by default, possibly, if that is the approach we decide to take. In any event it would give us an avenue for providing a release while working around CDF5-specific issues. For what it is worth, this netcdf-python failure is only manifesting on 64-bit platforms, not 32-bit platforms. |
That python program, tst_cdf5.py, itself is also problematic. Line 7 sets "dimsize" to the maximum value of a signed 64-bit integer, 9223372036854775807 or NC_MAX_INT64. (By the way, the comment should be fixed.) Then, line 19 defines a 1-D variable of type unsigned int with NC_MAX_INT64 elements, which makes the variable size (=NC_MAX_INT64 x 4 bytes) bigger than an unsigned 64-bit integer can represent (overflow). In this case, NetCDF should throw an error code, such as NC_EINTOVERFLOW. By the way, in PnetCDF, the limit will be max signed 64-bit integer (X_INT64_MAX - 3), not unsigned, as we use MPI_Offset which is a signed long long.
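The overflow arithmetic in this comment can be checked with a short sketch. It is not the test program from the PR; the function name `mul_u64_checked` is hypothetical, and the 4-byte element size simply follows the unsigned int type mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: multiplies nelems by elem_size and stores the result
 * in *out. Returns 1 on success, or 0 (leaving *out untouched) if the
 * product would exceed UINT64_MAX. */
int mul_u64_checked(uint64_t nelems, uint64_t elem_size, uint64_t *out)
{
    /* Divide instead of multiplying so the check itself cannot overflow. */
    if (elem_size != 0 && nelems > UINT64_MAX / elem_size)
        return 0;
    *out = nelems * elem_size;
    return 1;
}
```

Calling it with `nelems = INT64_MAX` and `elem_size = 4` returns 0, because the product is roughly twice UINT64_MAX and cannot be represented in an unsigned 64-bit integer.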
I wrote a C program to mimic the python test program and it ran without error. However, the correct behavior should be throwing NC_EVARSIZE.
|
… 3 for CDF-5 files
This patch 8b3d32c checks variable size against (X_INT64_MAX - 3) for CDF-5 files and throws NC_EVARSIZE. Below is a revised test program to check the expected error code.
|
…ger then X_INT64_MAX - 3 for CDF-5 files
I think throwing NC_EVARSIZE in nc_def_var is better than in nc_enddef.
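A define-time size check of the kind discussed here might look like the sketch below. It is not the code from the patch: the function `check_var_size` and the `SKETCH_*` macros are hypothetical stand-ins, the limit assumes X_INT64_MAX equals INT64_MAX, and the value -62 is assumed to mirror NC_EVARSIZE in netcdf.h.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NOERR      0
#define SKETCH_EVARSIZE  (-62)            /* stand-in for NC_EVARSIZE (assumed value) */
#define SKETCH_VAR_LIMIT (INT64_MAX - 3)  /* assumed to mirror X_INT64_MAX - 3 */

/* Hypothetical helper: computes elem_size * product(dimlens) step by step
 * and returns SKETCH_EVARSIZE as soon as the running total would exceed
 * the limit. A dimension length of 0 is treated as the record dimension
 * and skipped, which is an assumption of this sketch. */
int check_var_size(const uint64_t *dimlens, size_t ndims, uint64_t elem_size)
{
    const uint64_t limit = (uint64_t)SKETCH_VAR_LIMIT;
    uint64_t total = elem_size;
    size_t i;

    if (total > limit)
        return SKETCH_EVARSIZE;
    for (i = 0; i < ndims; i++) {
        if (dimlens[i] == 0)
            continue;                     /* record dimension: not counted */
        if (total > limit / dimlens[i])   /* total * dimlens[i] > limit */
            return SKETCH_EVARSIZE;
        total *= dimlens[i];
    }
    return SKETCH_NOERR;
}
```

Performing the check when the variable is defined reports the error at the call that introduced the oversized variable, rather than later when the header is finalized.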
|
…n) of all variables follows the same increasing order as they were defined.
Can we get a summary comment about what all this pr is doing? |
The last 5 commits added two test programs and a corrupted CDF-5 file for testing. Without the patches from this pull request, NetCDF should fail both test programs. |
Failure building on Windows; should be a simple fix, once that's sorted will get it merged. |
See my comment on the pr conversation
I do not understand the Also, much of this change should not be cdf5 specific. There is no reason Also, if this is correct |
I meant to say NetCDF-4 is unable to support CDF-5 when size_t is 4-byte. |
Sorry, but I am missing something. I need to see an explicit statement about why |
|
There is probably a larger point here. We should also never be using the type "long" |
I did mean NetCDF-4, specifically the NetCDF-4 APIs. Most of the arguments in the NetCDF-4 APIs are of type size_t, and size_t is always 4 bytes on 32-bit machines. This makes it impossible for NetCDF-4 to support CDF-5 properly on a 32-bit machine. Can you point out the internal variables that should be defined as long long? |
At least these:
In v1hpg.c: In nc3internal.c: Note: ncio.h has a number, but that api probably needs fixing |
Also, if you can compile with proper conversion warning, you should |
With AC_SYS_LARGEFILE in configure.ac, off_t will automatically be set to an 8-byte integer if the 32-bit machine supports large file access. So, the only question left concerns the variables of type size_t. If we agree to disable CDF-5 wherever size_t is not 8 bytes, then the remaining compile-time type-casting warnings become harmless. Adding -Wconversion to CFLAGS will produce a long list of compile-time warnings about type casting, the majority of which are not CDF-5 related. I also believe they are probably harmless. Of course, removing all such warnings would be best, but it requires a big effort. |
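The idea of disabling CDF-5 wherever size_t is not 8 bytes could be expressed as a compile-time guard along these lines. This is a sketch only: the macro `SKETCH_CDF5_OK` and both functions are hypothetical and are not names used by NetCDF or its configure script.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>   /* off_t (POSIX) */

/* Hypothetical feature macro: 1 only when size_t can hold any 64-bit length. */
#if SIZE_MAX >= UINT64_MAX
#  define SKETCH_CDF5_OK 1
#else
#  define SKETCH_CDF5_OK 0
#endif

/* Hypothetical helper: reports whether this build would enable CDF-5. */
int cdf5_supported(void)
{
    return SKETCH_CDF5_OK;
}

/* Hypothetical helper: reports whether off_t is at least 8 bytes wide,
 * which is what large-file support is expected to provide. */
int off_t_is_64bit(void)
{
    return sizeof(off_t) >= 8;
}
```

A guard like this keeps the decision in one place, so code paths that depend on 64-bit lengths can be compiled out together on platforms where size_t is too narrow.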
You should not ever rely on a flag to determine
the type size; this introduces a "hidden" dependency between the code and the type.
It is always safer to use a known size all the time.
More to the point, if we are going to do this, then let's do it right and in the cleanest |
What you are asking for is outside the scope of this PR. By your argument, I wonder why you don't apply the same standard to all other PRs. This PR passes all the tests on both 64-bit and 32-bit machines. I do not see the point of blocking this PR over a problem that has been in NetCDF for a long time. I also disagree with your argument that one should never rely on a flag to determine a type's size. That is common practice in autotools; it really depends on whether you use it properly. We can debate this for a long time, but the most important question is whether or not you would like this PR in NetCDF to fix the CDF5 problem observed and reported by @czender. |
IMO, the whole varying type size thing was a hack because the long long I will leave the call to Ward. If he takes your PR as is, then I am ok with that. |
@wkliao unfortunately I can't help get this PR merged to Unidata netcdf. If you find it useful, I am maintaining HPC NetCDF https://github.com/HPC-NetCDF/netcdf-c which has this PR merged, as well as most of the other PRs pending. HPC Netcdf is a drop-in replacement for Unidata netcdf, with advanced HPC features. (It is intended that all new features from HPC Netcdf will also be back-ported to Unidata netcdf, but that may take a while due to limited Unidata resources.) |
Happy Saturday all; I’ve just checked my work email, and am getting caught up on this. Lacking a demonstrated issue, I think the best course of action will be to accept this PR, and if there is a potential issue that needs to be addressed we can do it via a follow-up issue/pull request discussion. Let me merge a PR that fixes a Windows test and then I’ll follow on with this one. I’ll get that done now. |
@wkliao |
We all want the best for NetCDF; we are just approaching it from different angles. @edhartnett, HPC NetCDF is interesting work. You have recently contributed lots of patches to NetCDF. I wish I could be as energetic as you. |
@wkliao I don't believe in letting the grass grow under my feet. HPC NetCDF is about to get more interesting. ;-) |
This PR should fix the problem discussed in #463