Test failure building against hdf5 1.10.0 #244
Comments
I've also reported to hdf5 folks. Not sure who is to blame. |
You should be aware that HDF5 1.10 is not supported (yet) for netcdf-c. |
Here is the response from the hdf5 folks: It looks like netCDF needs to update their tests to use H5Oget_info in The H5Gget_objtype_by_idx function was deprecated in HDF5-1.8 and https://www.hdfgroup.org/HDF5/doc/RM/RM_H5G.html#Group-GetObjTypeByIdx hdf5 1.10 is officially released now. |
Thanks for sharing your response from the HDF Group. Since the HDF5 1.10 release changed the underlying format, we're going to have to give some thought about how to move forward adopting the HDF5 1.10 feature set. For now, we're going to require the 1.8 libraries until we have a chance to evaluate things. I see from the 1.10 release that they are providing a way to retain the 1.8 API (which is good), but I'm wondering if we'll need to specify a new file format type, to differentiate between netCDF4 files created with 1.8 and netCDF4 files created with 1.10. Almost certainly this will be the case when we leverage the new 1.10 features, but what about files created using the 1.10 library via the 1.8 API? Anyways, more thought is needed; for now, I recommend staying with hdf5 1.8 for netcdf. |
I'm still getting this with 4.4.1-rc1:
|
Ok, I've recreated this and am sorting it out. I had planned on treating it in a straightforward manner: check for the existence of the deprecated function calls and only run the tests if they were found. I've now observed something strange, however: the test fails when building with autoconf, as reported, but building with cmake results in the same test passing. I haven't started tracking down what's going on; I wanted to make a note of it before progressing. |
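The "only run the tests if the deprecated calls exist" idea can be sketched outside the build system. This is a minimal illustration in Python, not the check that netcdf-c actually uses (that would presumably live in `configure.ac` or the CMake files). The helper names `has_symbol` and `should_run_deprecated_tests` are hypothetical, and the library file name in the comment is an assumption.

```python
import ctypes  # only needed for the real-library usage shown in the comment below


def has_symbol(library, name):
    """Return True if the loaded library object exposes `name`."""
    # ctypes resolves symbols lazily on attribute access and raises
    # AttributeError when the dynamic linker cannot find the name.
    try:
        getattr(library, name)
    except AttributeError:
        return False
    return True


def should_run_deprecated_tests(library, symbols=("H5Gget_objtype_by_idx",)):
    """Run the legacy tests only if every deprecated symbol is still exported."""
    return all(has_symbol(library, symbol) for symbol in symbols)


# Hypothetical usage against a real HDF5 build (file name is an assumption):
# lib = ctypes.CDLL("libhdf5.so")
# print(should_run_deprecated_tests(lib))
```

A library built without the deprecated API would make the second function return `False`, so the affected tests would be skipped rather than fail to link.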
Ok. The comments above regarding replacing the deprecated symbols all hold true. However, they don't seem to be the problem here. In In a cmake-based build, the loop exits after 5 iterations, as expected. In an autotools-based build, we run into problems when This suggests there is a memory problem somewhere. My next avenue of investigation will be to recompile libhdf5-1.10.0 with |
Compiling netcdf-c with autotools as follows eliminates the error observed in
These settings mimic the default
Note that if hdf5 is built in Investigation is ongoing. |
Summary of findings thus-far
HDF5-twiddling observations:
Conclusion
At this point, I'm not sure there's much we can do on our end about this other than change the default optimization level used by autotools/configure. We must eventually update these tests to move away from the deprecated functions (as outlined above), but our use of deprecated functions in these tests does not seem to be causing any problems. Rather, something with how the code is being optimized with |
Further summary of compiling netcdf-c, linking against libhdf5 1.10.0.
|
I've modified |
The fix (for now) has been merged, will revisit in the future, but for now it should work out of the box with autoconf as well as cmake. |
Could you make the choice of opt. flags dependent on the HDF5 version? |
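A rough sketch of what a version-dependent choice could look like, written in Python purely for illustration. The function names are hypothetical, `-g -O2` is the distribution default mentioned in this thread, and `-g -O0` is a placeholder for "some reduced level", not necessarily the level the maintainers picked.

```python
def parse_version(text):
    """Turn '1.10.0' or '1.10.0-patch1' into a comparable tuple like (1, 10, 0)."""
    core = text.split("-", 1)[0]  # drop a suffix such as '-patch1'
    return tuple(int(part) for part in core.split("."))


def default_cflags(hdf5_version, normal="-g -O2", reduced="-g -O0"):
    """Pick the reduced flags only for HDF5 1.10.0 and later (placeholder values)."""
    if parse_version(hdf5_version) >= (1, 10, 0):
        return reduced
    return normal
```

Comparing integer tuples rather than strings matters here: as plain strings, "1.10.0" sorts before "1.8.16", which would select the wrong branch.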
This "fix" is a bit problematic for Fedora as "-g -O2" are the default flags used in Fedora. But I'm fine with sticking with hdf5 1.8 in Fedora for now. |
Have you reported this issue to the HDF5 team? I think it should be regarded as a bug if the code cannot work with any level of compiler optimization. |
I haven't, but I will; I'm not certain if the issue is truly with the HDF library, since optimization with GCC 4.4 presents no problems, but I will pass along the info when I speak with them next week. I should also try 1.10.0-patch1 to see if the issue persists. I agree with the sentiment that the removal of the optimization isn't a great fix, and am open to other ideas. This should at least work in the meantime. |
If I understand correctly, you have some code that works with some level of optimization, but does not work with a different level of optimization. That's clearly a bug in HDF5. They are doing something clever with memory and they are not quite being clever enough. |
We also have O2 by default in Arch, and no more hdf5 1.8 for us. And I confirm this is still an issue with GCC 7.1.1 and HDF5 1.10.1:
|
Revisiting this for the imminent release candidate. |
UPDATE: I went back and re-read the thread; never mind the text below. I'll continue to work on this. I'm ~~still~~ failing to recreate this issue, using |
|
Leaving this open for now but this still appears to be a compiler optimization issue, somehow. I still can't recreate it on my end (although I could previously) but will mitigate the deprecated API call at least. |
Hum… My
What’s yours? So I can try it here. What compiler? |
I was not building with parallel, but can try that. My command was as follows:
|
So, I’ve retried with a serial HDF5 version, same issue.
Compile flags:
I’ll try some other configuration flags and see if that changes anything. |
Thanks for the flags @ArchangeGabriel I will try to recreate on a fresh VM with the same flags you've used. |
Ok, using your flags I've been able to recreate this @ArchangeGabriel. Diving into GDB now to see what I can see. |
@ArchangeGabriel What version of gcc are you using? |
7.1.1 as said above. ;) |
Ok. Changing optimization from |
OK, I’ll try that, but in one hour or so, I’m a bit busy right now. ;) |
I am seeing the behavior previously described here although I am still uncertain why exactly it is happening or what I can do about it, at this point. I'll leave this issue open and would love to hear any ideas anybody else has, while I mull this over. |
@ArchangeGabriel No hurry, thanks! Just trying to poke any holes that I can, or establish some sort of pattern. Using the same flags and the same compiler/linker, but configuring with |
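To look for a pattern across compilers and optimization levels, the combinations can be enumerated up front. The following is a small sketch, assuming an autotools build where `CC` and `CFLAGS` are passed to `./configure`; `build_matrix` is a hypothetical helper, and it only produces the command lists without running anything.

```python
from itertools import product


def build_matrix(compilers, opt_levels, debug_flag="-g"):
    """Yield one configure/build/check command sequence per compiler and -O level."""
    for cc, opt in product(compilers, opt_levels):
        cflags = f"{debug_flag} {opt}"
        yield {
            "label": f"{cc} {opt}",
            "commands": [
                ["./configure", f"CC={cc}", f"CFLAGS={cflags}"],
                ["make"],
                ["make", "check"],
            ],
        }
```

Each entry could then be handed to `subprocess.run` in a clean build directory, recording which labels fail.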
I’ve realized I never posted the results, but gcc 7 with |
Thanks; at this point I have to figure it is either a memory issue somehow in the old test (I'll run it through valgrind and check the results when I have a chance), or this hdf5-only test has exposed an issue in libhdf5. In either case, perhaps it is best to disable it as part of the distribution, for the time being, if I can confirm the fix is not something we can do on our end. |
Is there any downside in building with CMake rather than autotools? That could be a solution for most people. |
From a technical standpoint, no, although the majority of users would be adjusting their workflow; |
Upstream changes: ## 4.6.1 - March 15, 2018 * [Bug Fix] Corrected an issue which could result in a dap4 failure. See [Github #888](Unidata/netcdf-c#888) for more information. * [Bug Fix][Enhancement] Allow `nccopy` to control output filter suppresion. See [Github #894](Unidata/netcdf-c#894) for more information. * [Enhancement] Reverted some new behaviors that, while in line with the netCDF specification, broke existing workflows. See [Github #843](Unidata/netcdf-c#843) for more information. * [Bug Fix] Improved support for CRT builds with Visual Studio, improves zlib detection in hdf5 library. See [Github #853](Unidata/netcdf-c#853) for more information. * [Enhancement][Internal] Moved HDF4 into a distinct dispatch layer. See [Github #849](Unidata/netcdf-c#849) for more information. ## 4.6.0 - January 24, 2018 * [Enhancement] Full support for using HDF5 dynamic filters, both for reading and writing. See the file docs/filters.md. * [Enhancement] Added an option to enable strict null-byte padding for headers; this padding was specified in the spec but was not enforced. Enabling this option will allow you to check your files, as it will return an E_NULLPAD error. It is possible for these files to have been written by older versions of libnetcdf. There is no effective problem caused by this lack of null padding, so enabling these options is informational only. The options for `configure` and `cmake` are `--enable-strict-null-byte-header-padding` and `-DENABLE_STRICT_NULL_BYTE_HEADER_PADDING`, respectively. See [Github #657](Unidata/netcdf-c#657) for more information. * [Enhancement] Reverted behavior/handling of out-of-range attribute values to pre-4.5.0 default. See [Github #512](Unidata/netcdf-c#512) for more information. * [Bug] Fixed error in tst_parallel2.c. See [Github #545](Unidata/netcdf-c#545) for more information. * [Bug] Fixed handling of corrupt files + proper offset handling for hdf5 files. See [Github #552](Unidata/netcdf-c#552) for more information. 
* [Bug] Corrected a memory overflow in `tst_h_dimscales`, see [Github #511](Unidata/netcdf-c#511), [Github #505](Unidata/netcdf-c#505), [Github #363](Unidata/netcdf-c#363) and [Github #244](Unidata/netcdf-c#244) for more information. ## 4.5.0 - October 20, 2017 * Corrected an issue which could potential result in a hang while using parallel file I/O. See [Github #449](Unidata/netcdf-c#449) for more information. * Addressed an issue with `ncdump` not properly handling dates on a 366 day calendar. See [GitHub #359](Unidata/netcdf-c#359) for more information. ### 4.5.0-rc3 - September 29, 2017 * [Update] Due to ongoing issues, native CDF5 support has been disabled by **default**. You can use the options mentioned below (`--enable-cdf5` or `-DENABLE_CDF5=TRUE` for `configure` or `cmake`, respectively). Just be aware that for the time being, Reading/Writing CDF5 files on 32-bit platforms may result in unexpected behavior when using extremely large variables. For 32-bit platforms it is best to continue using `NC_FORMAT_64BIT_OFFSET`. * [Bug] Corrected an issue where older versions of curl might fail. See [GitHub #487](Unidata/netcdf-c#487) for more information. * [Enhancement] Added options to enable/disable `CDF5` support at configure time for autotools and cmake-based builds. The options are `--enable/disable-cdf5` and `ENABLE_CDF5`, respectively. See [Github #484](Unidata/netcdf-c#484) for more information. * [Bug Fix] Corrected an issue when subsetting a netcdf3 file via `nccopy -v/-V`. See [Github #425](Unidata/netcdf-c#425) and [Github #463](Unidata/netcdf-c#463) for more information. * [Bug Fix] Corrected `--has-dap` and `--has-dap4` output for cmake-based builds. See [GitHub #473](Unidata/netcdf-c#473) for more information. * [Bug Fix] Corrected an issue where `NC_64BIT_DATA` files were being read incorrectly by ncdump, despite the data having been written correctly. See [GitHub #457](Unidata/netcdf-c#457) for more information. 
* [Bug Fix] Corrected a potential stack buffer overflow. See [GitHub #450](Unidata/netcdf-c#450) for more information. ### 4.5.0-rc2 - August 7, 2017 * [Bug Fix] Addressed an issue with how cmake was implementing large file support on 32-bit systems. See [GitHub #385](Unidata/netcdf-c#385) for more information. * [Bug Fix] Addressed an issue where ncgen would not respect keyword case. See [GitHub #310](Unidata/netcdf-c#310) for more information. ### 4.5.0-rc1 - June 5, 2017 * [Enhancement] DAP4 is now included. Since dap2 is the default for urls, dap4 must be specified by (1) using "dap4:" as the url protocol, or (2) appending "#protocol=dap4" to the end of the url, or (3) appending "#dap4" to the end of the url Note that dap4 is enabled by default but remote-testing is disbled until the testserver situation is resolved. * [Enhancement] The remote testing server can now be specified with the `--with-testserver` option to ./configure. * [Enhancement] Modified netCDF4 to use ASCII for NC_CHAR. See [Github Pull request #316](Unidata/netcdf-c#316) for more information. * [Bug Fix] Corrected an error with how dimsizes might be read. See [Github #410](Unidata/netcdf-c#410) for more information. * [Bug Fix] Corrected an issue where 'make check' would fail if 'make' or 'make all' had not run first. See [Github #339](Unidata/netcdf-c#339) for more information. * [Bug Fix] Corrected an issue on Windows with Large file tests. See [Github #385](Unidata/netcdf-c#385]) for more information. * [Bug Fix] Corrected an issue with diskless file access, see [Pull Request #400](Unidata/netcdf-c#400) and [Pull Request #403](Unidata/netcdf-c#403) for more information. * [Upgrade] The bash based test scripts have been upgraded to use a common test_common.sh include file that isolates build specific information. * [Upgrade] The bash based test scripts have been upgraded to use a common test_common.sh include file that isolates build specific information. 
* [Refactor] the oc2 library is no longer independent of the main netcdf-c library. For example, it now uses ncuri, nclist, and ncbytes instead of its homegrown equivalents. * [Bug Fix] `NC_EGLOBAL` is now properly returned when attempting to set a global `_FillValue` attribute. See [GitHub #388](Unidata/netcdf-c#388) and [GitHub #389](Unidata/netcdf-c#389) for more information. * [Bug Fix] Corrected an issue where data loss would occur when `_FillValue` was mistakenly allowed to be redefined. See [Github #390](Unidata/netcdf-c#390), [GitHub #387](Unidata/netcdf-c#387) for more information. * [Upgrade][Bug] Corrected an issue regarding how "orphaned" DAS attributes were handled. See [GitHub #376](Unidata/netcdf-c#376) for more information. * [Upgrade] Update utf8proc.[ch] to use the version now maintained by the Julia Language project (https://github.com/JuliaLang/utf8proc/blob/master/LICENSE.md). * [Bug] Addressed conversion problem with Windows sscanf. This primarily affected some OPeNDAP URLs on Windows. See [GitHub #365](Unidata/netcdf-c#365) and [GitHub #366](Unidata/netcdf-c#366) for more information. * [Enhancement] Added support for HDF5 collective metadata operations when available. Patch submitted by Greg Sjaardema, see [Pull request #335](Unidata/netcdf-c#335) for more information. * [Bug] Addressed a potential type punning issue. See [GitHub #351](Unidata/netcdf-c#351) for more information. * [Bug] Addressed an issue where netCDF wouldn't build on Windows systems using MSVC 2012. See [GitHub #304](Unidata/netcdf-c#304) for more information. * [Bug] Fixed an issue related to potential type punning, see [GitHub #344](Unidata/netcdf-c#344) for more information. * [Enhancement] Incorporated an enhancement provided by Greg Sjaardema, which may improve read/write times for some complex files. Basically, linked lists were replaced in some locations where it was safe to use an array/table. See [Pull request #328](Unidata/netcdf-c#328) for more information.
Environment Information
configure
)C
code to recreate the issue?
Summary of Issue
Building netcdf 4.4.0 against hdf5 1.10.0 I get the following test failure:
Steps to reproduce the behavior
See https://copr.fedorainfracloud.org/coprs/orion/hdf5110/build/172669/