-
-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace H5detect's build-time detection of C99 integer properties with a #1400
Conversation
table-driven routine, `H5T__init_native_int()`, that is run at library initialization time. Always respect the alignment used by the compiler for integers. The library invites trouble by using different alignment than the compiler expects. Here and there update a comment about alignment. Retire the H5detect code that tries to find the least permissible integer alignment by running experiments and catching any signals thrown or unexpected results.
cd0587b
to
558e75a
Compare
@seanm I believe this overhaul of H5detect / native-type initialization should help with some of the issues you had? |
Does anybody have an opinion about the precision of int_least8_t and int_fast8_t? Seems like it should be 8 bits rather than |
Great, thanks for doing this! Should probably fix the compiler error on OpenBSD that was re-introduced with this revert. Will you be doing some similar treatment for floats too? |
Also, looks like you are close to being able to remove |
The errors below showed up in the "Windows Latest MSVC" test in https://github.com/HDFGroup/hdf5/actions/runs/1762792143. The checks above won't complete because of the clang-format commit. The test log is at https://github.com/HDFGroup/hdf5/runs/4984099680?check_suite_focus=true with these errors beginning at line 385 after a number of warnings beginning at line 342. D:\a\hdf5\hdf5\src\H5Tnative.c(1134,1): error C2099: initializer is not a constant [D:\a\hdf5\build\src\hdf5-shared.vcxproj] |
purposefully make unaligned accesses any longer. Remove some setjmp(3)-related code.
of the macro back the way it was and disable clang-format.
offsetof() instead of pointer arithmetic. I hope that this is what MSVC wants.
I thought those weren't required to be exactly 8 bits, only int8_t is guaranteed to be 8 bits. As far as I can tell this applies to precision as well as width. Then again, strictly speaking, looking at the size isn't guaranteed to give you the correct precision (though I doubt HDF5 would work correctly on a system with weird integer types). |
Have we looked into the potential performance/memory effects of this change? I worked on a similar system a long time ago and we wanted to allow unaligned access wherever the system handled it correctly to maximize performance (for example, otherwise a compound may need to go through a more expensive type conversion process). On a related note, have you looked at the H5_NO_ALIGNEMNT_RESTRICTIONS that is checked in configure and used in H5Tvlen.c? I agree that conforming to the C standard is good as long as it doesn't substantially hurt performance. |
This is correct. The 'least' types are guaranteed to be at least that size, but could be larger. I think we should generally keep the 'least' types as small as possible since they're more related to minimizing memory usage, while the 'fast' types should of course be the fastest size for the platform. |
Has this been checked for a cross-compile environment? Probably one of the systems at an NNSA facility could be used. |
Successfully compiled for knl nodes on haswell login nodes on theta at anl, cori at nersc, and mutrino at sandia with gcc, cce, llvm, and intel compilers. Tests passed on theta and mutrino with the exception of testphdf5_tldsc with some compilers. I wasn't able to run batch jobs at nersc; allocation must have changed. 32 bit and 64 bit on big-endian Power8 echidna with gcc and IBM XL compilers also built successfully and all serial tests passed. |
On Fri, Feb 04, 2022 at 01:06:07PM -0800, jhendersonHDF wrote:
> > Does anybody have an opinion about the precision of int_least8_t and int_fast8_t? Seems like it should be 8 bits rather than `8 * sizeof(int_least8_t)` or `8 * sizeof(int_fast8_t)`, respectively. It's an easy change to make in the `H5T__init_native_int()` table.
>
> I thought those weren't required to be exactly 8 bits, only int8_t is guaranteed to be 8 bits. As far as I can tell this applies to precision as well as width. Then again, strictly speaking, looking at the size isn't guaranteed to give you the correct precision (though I doubt HDF5 would work correctly on a system with weird integer types).
This is correct. The 'least' types are guaranteed to be at least that size, but could be larger. I think we should generally keep the 'least' types as small as possible since they're more related to minimizing memory usage, while the 'fast' types should of course be the fastest size for the platform.
The question I have is a little bit different than the one I believe you
are answering.
Let us say that int_least8_t is 32 bits wide on platform X. When
platform X saves H5T_NATIVE_INT_LEAST8-type data to an HDF5 file, do we
want for the in-file representation to represent that the data has 32
bits of precision, or 8 bits of precision? One the one hand, the use of
H5T_NATIVE_INT_LEAST8 suggests that the user only really intended to use
8 of 32 bits; on the other hand, what if in actual fact they produced
some values outside of the [INT8_MIN, INT8_MAX] interval?
Dave
|
And it's not just hypothetical.
Then they have created a bug I'd say. |
We talked about it and decided we should set the precision to be the full width (unless we want to test if it really does have lower precision). The reasoning is we should save what the type actually is in memory, and not try to divine what the user was trying to use it for. Users can expect "native" types to be represented in exactly the same manner on disk as they are in memory. |
On Fri, Feb 04, 2022 at 12:04:18PM -0800, Neil Fortner wrote:
Have we looked into the potential performance/memory effects of this
change? I worked on a similar system a long time ago and we wanted to
allow unaligned access wherever the system handled it correctly to
maximize performance (for example, otherwise a compound may need to go
through a more expensive type conversion process).
I think that sometimes the in-memory representation of a compound may be
larger. Depends how smart the library is about layout.
Can you recommend an existing program to use to investigate performance
changes? If we find that there *is* a change, then I am sure that we
can bring the performance back up without invoking undefined behavior.
On a related note, have you looked at the H5_NO_ALIGNEMNT_RESTRICTIONS
that is checked in configure and used in H5Tvlen.c?
I see that, now. Optimizing compilers obviate any need for it. If you
`memcpy` to the temporary variable instead of invoking the undefined
behavior of loading through a (potentially) unaligned pointer, GCC 8.3.0
still emits a single `mov` instruction for x86_64.
Dave
|
|
||
if ((*table[j].hidp = H5I_register(H5I_DATATYPE, dt, FALSE)) < 0) | ||
return FAIL; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you can't put the contents of this loop in the NATIVE_ENTRY_INITIALIZER macro and avoid the strange procedure with the tables? If not some comments should be added to explain what the loop is doing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared with what came before, I use the preprocessor sparingly and replace some code with data.
I can add some general comments.
It looks like these values are only used for H5Tget_native_type(), so this change should be fine overall. I see you removed the comment about it only being used for H5Tget_native_type() though, is there a reason for that? |
Looks like a sentence got lost in the shuffle. I added it back. |
I will treat floats, yeah. |
Any news on getting this glorious change merged? |
H5detect.c to H5Tnative.c and rename H5T__init_native_int() to H5T__init_native_internal() to try to match its more general purpose.
weren't turned off to begin with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely looks like a nice improvement over the old H5detect.
} endian_exemplar = {.byte = {1}}; | ||
|
||
return (endian_exemplar.u64 == 1) ? H5T_ORDER_LE : H5T_ORDER_BE; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd be otherwise in favor of using endian.h when available but that's probably just fine for that purpose
struct { | ||
char c; | ||
H5R_ref_t x; | ||
} ref; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you could use a macro to define each of the structs, just to reduce the repetitiveness
, NATIVE_ENTRY_INITIALIZER(INT_LEAST64, int_least64_t, 0, true) | ||
, NATIVE_ENTRY_INITIALIZER(UINT_LEAST64, uint_least64_t, 0, false) | ||
, NATIVE_ENTRY_INITIALIZER(INT_FAST64, int_fast64_t, 0, true) | ||
, NATIVE_ENTRY_INITIALIZER(UINT_FAST64, uint_fast64_t, 0, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here I feel like eventually the generation of the table could be simplified with macros but not a big deal. Or could be grouped with signed / unsigned and could also remove the extra 0.
* Removes unused definitions from module headers (#1624) * Fix these Doxygen warnings #1581 (#1589) * Fixes a typo in H5.c (#1639) * free MPI_Group/MPI_Comm/MPI_Datatype objects (#1638) * free MPI_Group/MPI_Comm/MPI_Datatype objects * fix clang-format style * Adds build and license shields to README.md (#1641) * First stab at a Github status bar * Adds a .tokeignore file for counting lines of code accurately * Yanks lines of code calculation since it wildly overcounts * not depend on doIO to free an MPI_Comm object (#1642) * free MPI datatypes previously created (#1637) * Retrieve MPI-IO hints used by MPI library after file open (#1636) H5Pget_fapl_mpio() should return an MPI info object containing all the MPI-IO hints used by the MPI library underneath, after the file is opened. Some hints, such as cb_nodes (number of I/O aggregators), are useful for HDF5 applications and I/O libraries built on top of HDF5. * OESS-168: Remove clang warnings. (#1309) * OESS-168: Remove clang warnings. * OESS-168: Address @lrknox review. * OESS-168: Remove clang warnings. (#1376) * Remove H5_NO_ALIGNMENT_RESTRICTIONS (#1426) * Do not conditionally compile code that uses a pointer dereference and assignment to copy a potentially unaligned variable to aligned automatic storage, or vice versa. Instead, always use naked `memcpy(3)`s. Disassembling the generated code reveals that the `memcpy(3)`s optimize (`-O3`) to a single `mov` instruction for x86_64, which is not strict about alignment. This change reduces the size of code and scripts by 143 lines, eases our way to cross-compilation, and avoids invoking undefined behavior. * Committing clang-format changes * Per discussion, use HD and add comments. Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * Cleans up some HL library code that inappropriately returns htri_t values cast to herr_t (#1651) * Cleans up some HL library code that inappropriately returns htri_t values cast to herr_t * Committing clang-format changes * Formatted source Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * Mirror vfd test fixes (#1629) * Use the FAPL that was created earlier in the test (and delete an unused variable). This allows 'make check-vfd' to pass with --enable-mirror-vfd. * Check for testing directory before creating, to avoid warning from bash. Clean out .libs directory before re-using it (after a failed test), to remove any files generated by libtool. * Committing clang-format changes * Increment error count on failed file open and skip tests for VFDs that need modified filenames. * Skip the mirror VFD for 'make check-vfd' - the mirror VFD requires networking configuration parameters and can't be provided for an automated test that is configured with an environment variable. Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * Removes HDF Group paths, adds shellcheck fixes (#1656) For more information: https://www.shellcheck.net/wiki/SC2086 -- Double quote to prevent globbing ... https://www.shellcheck.net/wiki/SC2230 -- which is non-standard. Use builti... https://www.shellcheck.net/wiki/SC2006 -- Use $(...) notation instead of le... * HDFFV-11306 Fixed (#1657) * HDFFV-11306, * Fixed it so both h5open_f and h5close_f can be called multiple times. * Fixed an issue with open objects remaining after h5close_f was called. * Added additional tests. * comments clean-up * Develop clang format java (#1653) * added HDFFV-11306 entry (#1662) * Adds the -q flag to all swmr test programs, quieting noisy output (#1665) * Adds paths-ignore to the Github pull request workflow (#1663) * Changes Github action `hdf5 dev CI` to `PR hdf5 dev CI` (#1666) So the PR action name is not the same as the one in main.yml * Replace H5detect's build-time detection of C99 integer properties with a (#1400) * Replace H5detect's build-time detection of C99 integer properties with a table-driven routine, `H5T__init_native_int()`, that is run at library initialization time. * Improve handling of copying of dynamic libraries and clean them up after (#1681) test finishes. Co-authored-by: Allen Byrne <50328838+byrnHDF@users.noreply.github.com> Co-authored-by: Wei-keng Liao <wkliao@users.noreply.github.com> Co-authored-by: H. Joe Lee <hyoklee@hdfgroup.org> Co-authored-by: David Young <dyoung@hdfgroup.org> Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Quincey Koziol <koziol@lbl.gov> Co-authored-by: Scot Breitenfeld <brtnfld@hdfgroup.org>
* Removes unused definitions from module headers (#1624) * Fix these Doxygen warnings #1581 (#1589) * Fixes a typo in H5.c (#1639) * free MPI_Group/MPI_Comm/MPI_Datatype objects (#1638) * free MPI_Group/MPI_Comm/MPI_Datatype objects * fix clang-format style * Adds build and license shields to README.md (#1641) * First stab at a Github status bar * Adds a .tokeignore file for counting lines of code accurately * Yanks lines of code calculation since it wildly overcounts * not depend on doIO to free an MPI_Comm object (#1642) * free MPI datatypes previously created (#1637) * Retrieve MPI-IO hints used by MPI library after file open (#1636) H5Pget_fapl_mpio() should return an MPI info object containing all the MPI-IO hints used by the MPI library underneath, after the file is opened. Some hints, such as cb_nodes (number of I/O aggregators), are useful for HDF5 applications and I/O libraries built on top of HDF5. * OESS-168: Remove clang warnings. (#1309) * OESS-168: Remove clang warnings. * OESS-168: Address @lrknox review. * OESS-168: Remove clang warnings. (#1376) * Remove H5_NO_ALIGNMENT_RESTRICTIONS (#1426) * Do not conditionally compile code that uses a pointer dereference and assignment to copy a potentially unaligned variable to aligned automatic storage, or vice versa. Instead, always use naked `memcpy(3)`s. Disassembling the generated code reveals that the `memcpy(3)`s optimize (`-O3`) to a single `mov` instruction for x86_64, which is not strict about alignment. This change reduces the size of code and scripts by 143 lines, eases our way to cross-compilation, and avoids invoking undefined behavior. * Committing clang-format changes * Per discussion, use HD and add comments. Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * Cleans up some HL library code that inappropriately returns htri_t values cast to herr_t (#1651) * Cleans up some HL library code that inappropriately returns htri_t values cast to herr_t * Committing clang-format changes * Formatted source Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * Mirror vfd test fixes (#1629) * Use the FAPL that was created earlier in the test (and delete an unused variable). This allows 'make check-vfd' to pass with --enable-mirror-vfd. * Check for testing directory before creating, to avoid warning from bash. Clean out .libs directory before re-using it (after a failed test), to remove any files generated by libtool. * Committing clang-format changes * Increment error count on failed file open and skip tests for VFDs that need modified filenames. * Skip the mirror VFD for 'make check-vfd' - the mirror VFD requires networking configuration parameters and can't be provided for an automated test that is configured with an environment variable. Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * Removes HDF Group paths, adds shellcheck fixes (#1656) For more information: https://www.shellcheck.net/wiki/SC2086 -- Double quote to prevent globbing ... https://www.shellcheck.net/wiki/SC2230 -- which is non-standard. Use builti... https://www.shellcheck.net/wiki/SC2006 -- Use $(...) notation instead of le... * HDFFV-11306 Fixed (#1657) * HDFFV-11306, * Fixed it so both h5open_f and h5close_f can be called multiple times. * Fixed an issue with open objects remaining after h5close_f was called. * Added additional tests. * comments clean-up * Develop clang format java (#1653) * added HDFFV-11306 entry (#1662) * Adds the -q flag to all swmr test programs, quieting noisy output (#1665) * Adds paths-ignore to the Github pull request workflow (#1663) * Changes Github action `hdf5 dev CI` to `PR hdf5 dev CI` (#1666) So the PR action name is not the same as the one in main.yml * Replace H5detect's build-time detection of C99 integer properties with a (#1400) * Replace H5detect's build-time detection of C99 integer properties with a table-driven routine, `H5T__init_native_int()`, that is run at library initialization time. * Improve handling of copying of dynamic libraries and clean them up after (#1681) test finishes. * Warnings fixes (#1680) * Clean stack size warnings in sio_engine (#1687) * Fixes stack size warnings in tcoords.c (#1688) * Minor things noticed while bringing VFD SWMR in line with develop (#1691) * Removed dead code, weird formatting, and other badness * Fixed remaining stack size warnings in onion VFD * Committing clang-format changes Co-authored-by: Allen Byrne <50328838+byrnHDF@users.noreply.github.com> Co-authored-by: Wei-keng Liao <wkliao@users.noreply.github.com> Co-authored-by: H. Joe Lee <hyoklee@hdfgroup.org> Co-authored-by: David Young <dyoung@hdfgroup.org> Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Quincey Koziol <koziol@lbl.gov> Co-authored-by: Scot Breitenfeld <brtnfld@hdfgroup.org> Co-authored-by: jhendersonHDF <jhenderson@hdfgroup.org>
On Thu, Apr 14, 2022 at 09:19:03AM -0700, Sean McBride wrote:
Any news on getting this glorious change merged?
It is waiting for one more approval.
Dave
|
Is it? It shows at "merged". Back in April I think... |
table-driven routine,
H5T__init_native_int()
, that is run at libraryinitialization time.
Always respect the alignment used by the compiler for integers. The
library invites trouble by using different alignment than the compiler
expects. Here and there update a comment about alignment.
Retire the H5detect code that tries to find the least permissible
integer alignment by running experiments and catching any signals thrown
or unexpected results.