Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up thread-local error stacks in all threads #4852

Merged
merged 3 commits into from
Oct 1, 2024

Conversation

mattjala
Copy link
Contributor

@mattjala mattjala commented Sep 18, 2024

When threadsafety is enabled, threadlocal error stacks in threads that don't call H5_term_library do not have any error records stored on them cleaned up properly. This leads to leaked memory and, before the recent changes to H5E, an infinite loop during library close.

Set of changes:

  • Cleanup threadlocal error stack at individual thread exit time
  • Rename H5E__clear_stack() to H5E__destroy_stack() to contrast its intention with H5E_clear_stack()
  • Add a ttsafe test

This PR was originally much more complex, and the below description is out-of-date.

=====
Fixing this is complicated by the fact that H5TS's cleanup is currently performed after every individual module is terminated. In order for the records to be released properly, H5TS must terminate before H5E and after H5CX, necessitating rearranging the termination order. As a side effect, most H5TS resources that were previously set up and released on a per-process basis (H5TS_tinfo_mtx_s, H5TS_api_info_p, H5TS_thrd_info_key_g) are now handled per-thread.

This is currently a draft PR due to an issue with thread once macros on MacOS builds. The flag the library uses to perform only one initialization of H5TS across all threads, H5TS_first_thread_init_s, is not currently cleaned up at library close. If the library is closed and re-opened, the library will see the old value and not properly initialize H5TS.

This can be solved on most systems by re-initializing H5TS_first_init_s = H5TS_ONCE_INITIALIZER during H5TS termination. However, it appears that on MacOS, the PTHREAD_ONCE_INIT macro is defined in a manner that causes problems with this approach:

 In file included from /Users/runner/work/hdf5/hdf5/src/H5private.h:65,
                 from /Users/runner/work/hdf5/hdf5/src/H5TSint.c:35:
/Users/runner/work/hdf5/hdf5/src/H5TSint.c: In function 'H5TS_term_package':
/Users/runner/work/hdf5/hdf5/src/H5TSprivate.h:71:36: error: expected expression before '{' token
   71 | #define H5TS_ONCE_INITIALIZER      PTHREAD_ONCE_INIT
      |                                    ^~~~~~~~~~~~~~~~~
/Users/runner/work/hdf5/hdf5/src/H5TSint.c:151:25: note: in expansion of macro 'H5TS_ONCE_INITIALIZER'
  151 |     H5TS_first_init_s = H5TS_ONCE_INITIALIZER;
      |                         ^~~~~~~~~~~~~~~~~~~~~

My best guess is that MacOS enforces the requirement that PTHREAD_ONCE_INIT only be used for static initializations, and this initialization during termination is dynamic. Windows threads have both a static and dynamic once flag initializer (INIT_ONCE_STATIC_INIT and InitOnceInitialize) but I haven't been able to find a dynamic counterpart for pthreads.

@mattjala mattjala added Merge - Develop Only This cannot be merged to previous versions of HDF5 (file format or breaking API changes) Priority - 2. Medium ⏹ It would be nice to have this in the next release Component - C Library Core C library issues (usually in the src directory) Type - Bug Please report security issues to help@hdfgroup.org instead of creating an issue on GitHub labels Sep 18, 2024
@mattjala mattjala self-assigned this Sep 18, 2024
Copy link
Contributor

@qkoziol qkoziol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that most of the library changes should go in. I've added comments, but I'm also happy to talk in-person or on a call.

src/H5.c Outdated Show resolved Hide resolved
src/H5Eint.c Outdated Show resolved Hide resolved
src/H5TSint.c Outdated Show resolved Hide resolved
src/H5TSint.c Outdated Show resolved Hide resolved
src/H5TSint.c Outdated Show resolved Hide resolved
src/H5TSint.c Outdated Show resolved Hide resolved
src/H5TSint.c Outdated Show resolved Hide resolved
src/H5TSint.c Outdated Show resolved Hide resolved
@mattjala
Copy link
Contributor Author

Vastly simplified this after a discussion with Quincey.

  • Cleanup threadlocal error stack at individual thread exit time
  • Rename H5E__clear_stack() to H5E__destroy_stack() to contrast its intention with H5E_clear_stack()
  • Add a ttsafe test

test/ttsafe_error_stacks.c Outdated Show resolved Hide resolved
Copy link
Contributor

@qkoziol qkoziol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small change, otherwise good :-)

@derobins derobins merged commit e2da353 into HDFGroup:develop Oct 1, 2024
56 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component - C Library Core C library issues (usually in the src directory) Merge - Develop Only This cannot be merged to previous versions of HDF5 (file format or breaking API changes) Priority - 2. Medium ⏹ It would be nice to have this in the next release Type - Bug Please report security issues to help@hdfgroup.org instead of creating an issue on GitHub
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants