NetCDF thread safety take two #5095

Merged: 67 commits, Feb 20, 2023
Changes from 66 commits
Commits
6caae04
Unpin netcdf4.
trexfeathers Nov 22, 2022
9dae5d0
Temporarily enable GHA on this branch.
trexfeathers Nov 22, 2022
cdcd0e2
Temporarily enable GHA on this branch.
trexfeathers Nov 22, 2022
1512708
Temporarily enable GHA on this branch.
trexfeathers Nov 22, 2022
2dc3aff
Experiment to disable wheel CI on forks.
trexfeathers Nov 22, 2022
c5ab0e4
Disable segfaulting routines.
trexfeathers Nov 22, 2022
235e5fa
More temporary changes to get CI passing.
trexfeathers Nov 22, 2022
efaab93
More temporary changes to get CI passing.
trexfeathers Nov 22, 2022
301dbc6
Finessed segfault skipping.
trexfeathers Nov 22, 2022
9ece95c
Bring in changes from SciTools/iris#5061.
pp-mo Nov 11, 2022
5103545
Re-instate test_load_laea_grid.
trexfeathers Nov 22, 2022
95f337a
Adaptations to get the tests passing.
trexfeathers Nov 24, 2022
fa82db7
Use typing.Mapping instead.
trexfeathers Nov 24, 2022
b6756d4
Get doctests passing.
trexfeathers Nov 24, 2022
256936c
CF only resolve non-url filenames.
trexfeathers Nov 24, 2022
cf9ba29
Confirm thread safety fixes.
trexfeathers Nov 25, 2022
aed32c8
Remove dummy assignment.
trexfeathers Nov 25, 2022
835c02d
Restored plot_nemo What's New entry.
trexfeathers Nov 25, 2022
4e2b590
_add_aux_factories temporarily release global lock.
trexfeathers Nov 25, 2022
a8819b4
Remove per-file locking.
trexfeathers Nov 25, 2022
8207d6d
Remove remaining test workarounds.
trexfeathers Nov 25, 2022
c0faa6f
Remove remaining comments.
trexfeathers Nov 25, 2022
8010b2f
Correct use of CFReader context manager.
trexfeathers Nov 25, 2022
2fb274f
Refactor for easier future maintenance.
trexfeathers Nov 30, 2022
3aaaa1a
Rename netcdf _thread_safe, add header.
trexfeathers Nov 30, 2022
02d9a8c
Full use of ThreadSafeAggregators.
trexfeathers Dec 1, 2022
b301120
Full use of ThreadSafeAggregators.
trexfeathers Dec 1, 2022
6b7c15c
Remove remaining imports of NetCDF4.
trexfeathers Dec 1, 2022
ab733a7
Test to ensure netCDF4 is via _thread_safe module.
trexfeathers Dec 2, 2022
d066d39
More refined netcdf._thread_safe classes.
trexfeathers Dec 2, 2022
cea0e22
_thread_safe docstrings.
trexfeathers Dec 6, 2022
5ed0ebb
Restore original NetCDF code where possible.
trexfeathers Dec 6, 2022
5b3e609
Revert changes to 2.3.rst.
trexfeathers Dec 6, 2022
2edb3ce
Update lockfiles.
trexfeathers Dec 6, 2022
c36c5f5
Merge remote-tracking branch 'upstream/main' into netcdf_segs
trexfeathers Dec 6, 2022
3059666
Additions to _thread_safe.py
trexfeathers Dec 8, 2022
97d5891
Remove temporary CI shims.
trexfeathers Dec 8, 2022
01be381
New locking strategy for NetCDFDataProxy.
trexfeathers Dec 9, 2022
2c972f3
NetCDFDataProxy simpler use of netCDF4 lock.
trexfeathers Dec 12, 2022
cb49765
Update lock files.
trexfeathers Dec 12, 2022
140a6c8
Merge remote-tracking branch 'upstream/main' into netcdf_segs
trexfeathers Dec 12, 2022
b2cc5ad
Go back to using a Threading Lock.
trexfeathers Dec 12, 2022
128ebaa
Remove superfluous pass commands in test_cf.py.
trexfeathers Jan 9, 2023
8d72578
Rename _thread_safe to _thread_safe_nc.
trexfeathers Jan 9, 2023
68588b5
Rename thread safe classes to be 'Wrappers'.
trexfeathers Jan 9, 2023
5d351db
Better contained getattr and setattr pattern.
trexfeathers Jan 9, 2023
d53d194
Explicitly name netCDF4 module in _thread_safe_nc docstring.
trexfeathers Jan 9, 2023
d9bce6b
Better docstring for _ThreadSafeWrapper.
trexfeathers Jan 9, 2023
ed17c83
Better comment about THREAD_SAFE_FLAG.
trexfeathers Jan 9, 2023
9b23b30
list() wrapping within _GLOBAL_NETCDF4_LOCK, to account for generators.
trexfeathers Jan 9, 2023
8faa2ea
Merge branch 'main' into netcdf_segs
trexfeathers Jan 9, 2023
3a76fc2
More accurate thread_safe docstrings in netcdf.saver.
trexfeathers Jan 9, 2023
3fb9679
Merge branch 'netcdf_segs' of github.com:trexfeathers/iris into netcd…
trexfeathers Jan 9, 2023
af1aa05
Merge remote-tracking branch 'upstream/main' into netcdf_segs
trexfeathers Feb 10, 2023
22f55e3
Split netcdf integration tests into multiple modules.
trexfeathers Feb 10, 2023
9d75765
Tests for non-thread-safe NetCDF behaviour.
trexfeathers Feb 13, 2023
c86881b
Docstring accuracy.
trexfeathers Feb 14, 2023
5a53d36
Correct use of dask config set (context manager).
trexfeathers Feb 14, 2023
d95f3f8
Update dependencies.
trexfeathers Feb 14, 2023
f6eb131
Review - don't need the first-class import of iris.tests.
trexfeathers Feb 15, 2023
2df6278
Better name for the loading test.
trexfeathers Feb 15, 2023
2095658
Better selection of data to load.
trexfeathers Feb 15, 2023
e03659a
What's New entry.
trexfeathers Feb 15, 2023
93ac4c4
Improve tests.
trexfeathers Feb 15, 2023
3ffe3dc
Merge remote-tracking branch 'upstream/main' into netcdf_segs
trexfeathers Feb 20, 2023
94da55c
Update lock files.
trexfeathers Feb 20, 2023
a6b893c
Increase chunking on test_save.
trexfeathers Feb 20, 2023
1 change: 1 addition & 0 deletions docs/src/common_links.inc
@@ -41,6 +41,7 @@
 .. _issues on GitHub: https://github.com/SciTools/iris/issues?q=is%3Aopen+is%3Aissue+sort%3Areactions-%2B1-desc
 .. _python-stratify: https://github.com/SciTools/python-stratify
 .. _iris-esmf-regrid: https://github.com/SciTools-incubator/iris-esmf-regrid
+.. _netCDF4: https://github.com/Unidata/netcdf4-python


 .. comment
8 changes: 6 additions & 2 deletions docs/src/userguide/glossary.rst
@@ -1,3 +1,5 @@
+.. include:: ../common_links.inc
+
 .. _glossary:

 Glossary
@@ -125,7 +127,7 @@ Glossary
 of formats.

 | **Related:** :term:`CartoPy` **|** :term:`NumPy`
-| **More information:** `Matplotlib <https://scitools.org.uk/cartopy/docs/latest/>`_
+| **More information:** `matplotlib`_
 |

 Metadata
@@ -143,9 +145,11 @@ Glossary
 When Iris loads this format, it also especially recognises and interprets data
 encoded according to the :term:`CF Conventions`.

+__ `NetCDF4`_
+
 | **Related:** :term:`Fields File (FF) Format`
 **|** :term:`GRIB Format` **|** :term:`Post Processing (PP) Format`
-| **More information:** `NetCDF-4 Python Git <https://github.com/Unidata/netcdf4-python>`_
+| **More information:** `NetCDF-4 Python Git`__
 |

 NumPy
8 changes: 6 additions & 2 deletions docs/src/whatsnew/2.1.rst
@@ -1,3 +1,5 @@
+.. include:: ../common_links.inc
+
 v2.1 (06 Jun 2018)
 ******************

@@ -67,7 +69,7 @@ Incompatible Changes
 as an alternative.

 * This release of Iris contains a number of updated metadata translations.
-See this
+See this
 `changelist <https://github.com/SciTools/iris/commit/69597eb3d8501ff16ee3d56aef1f7b8f1c2bb316#diff-1680206bdc5cfaa83e14428f5ba0f848>`_
 for further information.

@@ -84,14 +86,16 @@ Internal
 calendar.

 * Iris updated its time-handling functionality from the
-`netcdf4-python <http://unidata.github.io/netcdf4-python/>`_
+`netcdf4-python`__
 ``netcdftime`` implementation to the standalone module
 `cftime <https://github.com/Unidata/cftime>`_.
 cftime is entirely compatible with netcdftime, but some issues may
 occur where users are constructing their own datetime objects.
 In this situation, simply replacing ``netcdftime.datetime`` with
 ``cftime.datetime`` should be sufficient.

+__ `netCDF4`_
+
 * Iris now requires version 2 of Matplotlib, and ``>=1.14`` of NumPy.
 Full requirements can be seen in the `requirements <https://github.com/SciTools/iris/>`_
 directory of the Iris source.
3 changes: 2 additions & 1 deletion docs/src/whatsnew/latest.rst
@@ -40,7 +40,8 @@ This document explains the changes made to Iris for this release
 🐛 Bugs Fixed
 =============

-#. N/A
+#. `@trexfeathers`_ and `@pp-mo`_ made Iris' use of the `netCDF4`_ library
+thread-safe. (:pull:`5095`)


 💣 Incompatible Changes
3 changes: 2 additions & 1 deletion lib/iris/experimental/ugrid/load.py
@@ -209,7 +209,8 @@ def load_meshes(uris, var_name=None):

     result = {}
     for source in valid_sources:
-        meshes_dict = _meshes_from_cf(CFUGridReader(source))
+        with CFUGridReader(source) as cf_reader:
+            meshes_dict = _meshes_from_cf(cf_reader)
         meshes = list(meshes_dict.values())
         if var_name is not None:
             meshes = list(filter(lambda m: m.var_name == var_name, meshes))
26 changes: 23 additions & 3 deletions lib/iris/fileformats/cf.py
@@ -20,10 +20,10 @@
 import re
 import warnings

-import netCDF4
 import numpy as np
 import numpy.ma as ma

+from iris.fileformats.netcdf import _thread_safe_nc
 import iris.util

 #
@@ -1050,7 +1050,9 @@ def __init__(self, filename, warn=False, monotonic=False):
         #: Collection of CF-netCDF variables associated with this netCDF file
         self.cf_group = self.CFGroup()

-        self._dataset = netCDF4.Dataset(self._filename, mode="r")
+        self._dataset = _thread_safe_nc.DatasetWrapper(
+            self._filename, mode="r"
+        )

         # Issue load optimisation warning.
         if warn and self._dataset.file_format in [
@@ -1068,6 +1070,19 @@ def __init__(self, filename, warn=False, monotonic=False):
         self._build_cf_groups()
         self._reset()

+    def __enter__(self):
+        # Enable use as a context manager.
+        # N.B. this **guarantees** closure of the file when the context is exited.
+        # Note: ideally, the class would not do so much work in the __init__ call, and
+        # would do all that here, after acquiring necessary permissions/locks.
+        # But for legacy reasons, we can't do that. So **effectively**, the context
+        # (in terms of access control) already started when we created the object.
+        return self
+
+    def __exit__(self, exc_type, exc_value, traceback):
+        # When used as a context manager, **always** close the file on exit.
+        self._close()
+
     @property
     def filename(self):
         """The file that the CFReader is reading."""
@@ -1294,10 +1309,15 @@ def _reset(self):
         for nc_var_name in self._dataset.variables.keys():
             self.cf_group[nc_var_name].cf_attrs_reset()

-    def __del__(self):
+    def _close(self):
         # Explicitly close dataset to prevent file remaining open.
         if self._dataset is not None:
             self._dataset.close()
+            self._dataset = None
+
+    def __del__(self):
+        # Be sure to close dataset when CFReader is destroyed / garbage-collected.
+        self._close()


 def _getncattr(dataset, attr, default=None):
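
The CFReader changes in this diff add __enter__ and __exit__ methods around a shared _close() helper, which __del__ also calls. The block below is a minimal, self-contained sketch of that pattern, not Iris code. FakeDataset and Reader are hypothetical names, with FakeDataset standing in for an open file handle so that no netCDF file is needed.

```python
class FakeDataset:
    """Hypothetical stand-in for an open file handle."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


class Reader:
    """Sketch of a reader that closes its dataset deterministically."""

    def __init__(self, path):
        # As the diff's comments note, the file is opened in __init__,
        # so the "context" effectively starts at construction time.
        self._dataset = FakeDataset(path)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close on leaving the with-block, even after an exception.
        self._close()

    def _close(self):
        # Safe to call more than once: the handle is cleared after closing.
        if self._dataset is not None:
            self._dataset.close()
            self._dataset = None

    def __del__(self):
        # Fallback for callers that do not use the with-statement.
        self._close()


with Reader("example.nc") as reader:
    handle = reader._dataset
    assert not handle.closed  # still open inside the context

# After the block, the handle has been closed and cleared.
closed_after_exit = handle.closed
reader._close()  # a second close is a harmless no-op
```

Setting the handle to None after closing is what lets __exit__ and __del__ both call _close() without closing the dataset twice.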