Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tables_open_file func to retry opening #21

Merged
merged 1 commit into from
Feb 11, 2021
Merged

Conversation

taldcroft
Copy link
Member

Description

This is an attempt to work around errors like this that happen occasionally in several Ska cron jobs (kadi, cheta, mica, arc) that update HDF5 files in place:

2021-02-08 04:03:57,089 Updating stats file /proj/sot/ska3/flight/data/eng_archive/data/acisdeahk/daily/TMP_FEP1_MONG.h5
2021-02-08 04:03:57,175   Adding 3 records
2021-02-08 04:03:57,211 Updating stats file /proj/sot/ska3/flight/data/eng_archive/data/acisdeahk/5min/TMP_FEP1_MONG.h5
Traceback (most recent call last):
  File "/proj/sot/ska3/flight/bin/cheta_update_server_archive", line 33, in <module>
    sys.exit(load_entry_point('Ska.engarchive==4.51.0', 'console_scripts', 'cheta_update_server_archive')())
  File "/proj/sot/ska3/flight/lib/python3.8/site-packages/cheta/update_archive.py", line 1186, in main
    main_loop()
  File "/proj/sot/ska3/flight/lib/python3.8/site-packages/cheta/update_archive.py", line 266, in main_loop
    update_stats(colname, '5min', msid)
  File "/proj/sot/ska3/flight/lib/python3.8/site-packages/cheta/update_archive.py", line 507, in update_stats
    stats = tables.open_file(stats_file, mode='a',
  File "/proj/sot/ska3/flight/lib/python3.8/site-packages/tables/file.py", line 315, in open_file
    return File(filename, mode, title, root_uep, filters, **kwargs)
  File "/proj/sot/ska3/flight/lib/python3.8/site-packages/tables/file.py", line 778, in __init__
    self._g_new(filename, mode, **params)
  File "tables/hdf5extension.pyx", line 492, in tables.hdf5extension.File._g_new
tables.exceptions.HDF5ExtError: HDF5 error back trace

  File "H5F.c", line 509, in H5Fopen
    unable to open file
  File "H5Fint.c", line 1400, in H5F__open
    unable to open file
  File "H5Fint.c", line 1615, in H5F_open
    unable to lock the file
  File "H5FD.c", line 1640, in H5FD_lock
    driver lock request failed
  File "H5FDsec2.c", line 941, in H5FD_sec2_lock
    unable to lock file, errno = 11, error message = 'Resource temporarily unavailable'

End of HDF5 error back trace

Unable to open/create file '/proj/sot/ska3/flight/data/eng_archive/data/acisdeahk/5min/TMP_FEP1_MONG.h5'

Testing

  • [N/A] Passes unit tests

  • Functional testing

  • Updated Ska.engarchive.update_archive to use this new function in every place that an HDF5 file is opened (substitute tables_open_file for tables.open_file.

  • Make a new conda env on HEAD

  • Install this package and Ska.engarchive into the environment using python setup.py install --single-version-externally-managed --record=record.txt

  • Go to ska_testr repo and run run_testr --include Ska.engarchive. This runs the long test that exercises update_archive and this new function. PASS.

@taldcroft taldcroft mentioned this pull request Feb 8, 2021
1 task
@jeanconn
Copy link
Contributor

This overall seems fine, but the default logging behavior isn't clear to me. I'd like to actually see something if the retry "helped". Will I get that from this implementation?

@taldcroft
Copy link
Member Author

Good question. Once again the logging module has surprised me with unexpected behavior (below), but the upshot is that if tables_open_file does "help" then you will see a message to stdout from these lines:

                func_name = getattr(f, '__name__', 'func')
                func_call = f'{func_name}({call_args_str})'
                logger.warning(f'WARNING: {func_call} exception: {e}, retrying '
                               f'in {_delay} seconds...')

Note that logger here will be from logging_logger = logging.getLogger(__name__) in the retry.py module, but that's generally OK in our applications.

The surprise. If you just get a logger and have not done any configuration and there are no handlers, it still will output messages to stdout. It must be delegating to the root logger, but the configuration of the root logger silently changes if you use it once. The logging package really deserves its bad reputation IMO.

In [1]: import logging                                                          

In [2]: my_logger = logging.getLogger('asdfasdf')                               

In [3]: my_logger.warning('this is a warning')                                  
this is a warning

In [4]: my_logger.handlers                                                      
Out[4]: []

In [5]: logging.warning('direct warning through logging')                       
WARNING:root:direct warning through logging

In [6]: my_logger.warning('my_logger behavior is changed')                      
WARNING:asdfasdf:my_logger behavior is changed

@jeanconn
Copy link
Contributor

Thanks for the insight that I should just remain confused about logging ! For my use case this seems fine.

@taldcroft taldcroft merged commit 0867570 into master Feb 11, 2021
@taldcroft taldcroft deleted the tables-open-retry branch February 11, 2021 15:36
This was referenced Feb 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants