SciTools · ESadek-MO · Nov 21, 2023 · Sep 28, 2022 · Oct 27, 2022 · Feb 20, 2023
diff --git a/docs/src/further_topics/metadata.rst b/docs/src/further_topics/metadata.rst
@@ -91,6 +91,16 @@ actual `data attribute`_ names of the metadata members on the Iris class.
    metadata members are Iris specific terms, rather than recognised `CF Conventions`_
    terms.
 
+.. note::
+
+    :class:`~iris.cube.Cube` :attr:`~iris.cube.Cube.attributes` implement the
+    concept of dataset-level and variable-level attributes, to enable correct
+    NetCDF loading and saving (see :class:`~iris.cube.CubeAttrsDict` and NetCDF
+    :func:`~iris.fileformats.netcdf.saver.save` for more). ``attributes`` on
+    the other classes do not have this distinction, but the ``attributes``
+    members of ALL the classes still have the same interface, and can be
+    compared.
+
 
 Common Metadata API
 ===================
@@ -128,10 +138,12 @@ For example, given the following :class:`~iris.cube.Cube`,
             source                      'Data from Met Office Unified Model 6.05'
 
 We can easily get all of the associated metadata of the :class:`~iris.cube.Cube`
-using the ``metadata`` property:
+using the ``metadata`` property (note the specialised
+:class:`~iris.cube.CubeAttrsDict` for the :attr:`~iris.cube.Cube.attributes`,
+as mentioned earlier):
 
     >>> cube.metadata
-    CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+    CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes=CubeAttrsDict(globals={'Conventions': 'CF-1.5'}, locals={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}), cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
 
 We can also inspect the ``metadata`` of the ``longitude``
 :class:`~iris.coords.DimCoord` attached to the :class:`~iris.cube.Cube` in the same way:
@@ -675,8 +687,8 @@ For example, consider the following :class:`~iris.common.metadata.CubeMetadata`,
 
 .. doctest:: metadata-combine
 
-    >>> cube.metadata  # doctest: +SKIP
-    CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+    >>> cube.metadata
+    CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes=CubeAttrsDict(globals={'Conventions': 'CF-1.5'}, locals={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}), cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
 
 We can perform the **identity function** by comparing the metadata with itself,
 
@@ -701,7 +713,7 @@ which is replaced with a **different value**,
     >>> metadata != cube.metadata
     True
     >>> metadata.combine(cube.metadata)  # doctest: +SKIP
-    CubeMetadata(standard_name=None, long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'STASH': STASH(model=1, section=3, item=236), 'source': 'Data from Met Office Unified Model 6.05', 'Model scenario': 'A1B', 'Conventions': 'CF-1.5'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+    CubeMetadata(standard_name=None, long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05', 'Conventions': 'CF-1.5'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
 
 The ``combine`` method combines metadata by performing a **strict** comparison
 between each of the associated metadata member values,
@@ -724,7 +736,7 @@ Let's reinforce this behaviour, but this time by combining metadata where the
     >>> metadata != cube.metadata
     True
     >>> metadata.combine(cube.metadata).attributes
-    {'Model scenario': 'A1B'}
+    CubeAttrsDict(globals={}, locals={'Model scenario': 'A1B'})
 
 The combined result for the ``attributes`` member only contains those
 **common keys** with **common values**.
@@ -810,16 +822,17 @@ the ``from_metadata`` class method. For example, given the following
 
 .. doctest:: metadata-convert
 
-    >>> cube.metadata  # doctest: +SKIP
-    CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+    >>> cube.metadata
+    CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes=CubeAttrsDict(globals={'Conventions': 'CF-1.5'}, locals={'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}), cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
 
 We can easily convert it to a :class:`~iris.common.metadata.DimCoordMetadata` instance
 using ``from_metadata``,
 
 .. doctest:: metadata-convert
 
-    >>> DimCoordMetadata.from_metadata(cube.metadata)  # doctest: +SKIP
-    DimCoordMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, coord_system=None, climatological=None, circular=None)
+    >>> newmeta = DimCoordMetadata.from_metadata(cube.metadata)
+    >>> print(newmeta)
+    DimCoordMetadata(standard_name=air_temperature, var_name=air_temperature, units=K, attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'})
 
 By examining :numref:`metadata members table`, we can see that the
 :class:`~iris.cube.Cube` and :class:`~iris.coords.DimCoord` container
@@ -849,9 +862,9 @@ class instance,
 
 .. doctest:: metadata-convert
 
-    >>> longitude.metadata.from_metadata(cube.metadata)
-    DimCoordMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, coord_system=None, climatological=None, circular=None)
-
+    >>> newmeta = longitude.metadata.from_metadata(cube.metadata)
+    >>> print(newmeta)
+    DimCoordMetadata(standard_name=air_temperature, var_name=air_temperature, units=K, attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'})
 
 .. _metadata assignment:
 
@@ -978,7 +991,7 @@ Indeed, it's also possible to assign to the ``metadata`` property with a
     >>> longitude.metadata
     DimCoordMetadata(standard_name='longitude', long_name=None, var_name='longitude', units=Unit('degrees'), attributes={}, coord_system=GeogCS(6371229.0), climatological=False, circular=False)
     >>> longitude.metadata = cube.metadata
-    >>> longitude.metadata  # doctest: +SKIP
+    >>> longitude.metadata
     DimCoordMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'Conventions': 'CF-1.5', 'STASH': STASH(model=1, section=3, item=236), 'Model scenario': 'A1B', 'source': 'Data from Met Office Unified Model 6.05'}, coord_system=GeogCS(6371229.0), climatological=False, circular=False)
 
 Note that, only **common** metadata members will be assigned new associated
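
The ``from_metadata`` hunks above show a conversion in which members shared by both metadata classes are carried over, members that exist only on the target (such as ``coord_system``) come out as ``None``, and members that exist only on the source (such as ``cell_methods``) are dropped. A minimal sketch of that idea, using plain namedtuples rather than Iris classes: ``CubeMeta``, ``DimCoordMeta`` and ``convert`` are hypothetical names for illustration and are not the Iris implementation.

```python
from collections import namedtuple

# Hypothetical, cut-down stand-ins for CubeMetadata and DimCoordMetadata.
CubeMeta = namedtuple(
    "CubeMeta", ["standard_name", "units", "attributes", "cell_methods"]
)
DimCoordMeta = namedtuple(
    "DimCoordMeta", ["standard_name", "units", "attributes", "coord_system"]
)


def convert(target_cls, source):
    """Copy the members common to both classes; fill the rest with None."""
    source_fields = source._asdict()
    return target_cls(
        **{field: source_fields.get(field) for field in target_cls._fields}
    )


cube_meta = CubeMeta(
    standard_name="air_temperature",
    units="K",
    attributes={"Conventions": "CF-1.5"},
    cell_methods=("mean: time",),
)
coord_meta = convert(DimCoordMeta, cube_meta)
```

Here ``coord_meta.coord_system`` is ``None`` because the source has no such member, and ``cell_methods`` is absent because the target class does not define it.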

diff --git a/docs/src/userguide/iris_cubes.rst b/docs/src/userguide/iris_cubes.rst
@@ -85,7 +85,10 @@ A cube consists of:
     data dimensions as the coordinate has dimensions.
 
 * an attributes dictionary which, other than some protected CF names, can
-  hold arbitrary extra metadata.
+  hold arbitrary extra metadata. This implements the concept of dataset-level
+  and variable-level attributes when loading and saving NetCDF files (see
+  :class:`~iris.cube.CubeAttrsDict` and NetCDF
+  :func:`~iris.fileformats.netcdf.saver.save` for more).
 * a list of cell methods to represent operations which have already been
   applied to the data (e.g. "mean over time")
 * a list of coordinate "factories" used for deriving coordinates from the

diff --git a/docs/src/whatsnew/latest.rst b/docs/src/whatsnew/latest.rst
@@ -29,6 +29,16 @@ This document explains the changes made to Iris for this release
 
 ✨ Features
 ===========
+#. `@pp-mo`_, `@lbdreyer`_ and `@trexfeathers`_ improved
+   :class:`~iris.cube.Cube` :attr:`~iris.cube.Cube.attributes` handling to
+   better preserve the distinction between dataset-level and variable-level
+   attributes, allowing file-Cube-file round-tripping of NetCDF attributes. See
+   :class:`~iris.cube.CubeAttrsDict`, NetCDF
+   :func:`~iris.fileformats.netcdf.saver.save` and :data:`~iris.Future` for more.
+   (:pull:`5152`, `split attributes project`_)
+
+#. `@rcomer`_ rewrote :func:`~iris.util.broadcast_to_shape` so it now handles
+   lazy data. (:pull:`5307`)
 
 #. `@trexfeathers`_ and `@HGWright`_ (reviewer) sub-categorised all Iris'
    :class:`UserWarning`\s for richer filtering. The full index of
@@ -45,7 +55,7 @@ This document explains the changes made to Iris for this release
    the year of December) instead of the following year (the default behaviour).
    (:pull:`5573`)
 
-   #. `@HGWright`_ added :attr:`~iris.coords.Coord.ignore_axis` to allow manual
+#. `@HGWright`_ added :attr:`~iris.coords.Coord.ignore_axis` to allow manual
    intervention preventing :func:`~iris.util.guess_coord_axis` from acting on a
    coordinate. (:pull:`5551`)
 
@@ -151,4 +161,4 @@ This document explains the changes made to Iris for this release
 
 .. _NEP29 Drop Schedule: https://numpy.org/neps/nep-0029-deprecation_policy.html#drop-schedule
 .. _codespell: https://github.com/codespell-project/codespell
-
+.. _split attributes project: https://github.com/orgs/SciTools/projects/5?pane=info
diff --git a/lib/iris/__init__.py b/lib/iris/__init__.py
@@ -141,7 +141,9 @@ def callback(cube, field, filename):
 class Future(threading.local):
     """Run-time configuration controller."""
 
-    def __init__(self, datum_support=False, pandas_ndim=False):
+    def __init__(
+        self, datum_support=False, pandas_ndim=False, save_split_attrs=False
+    ):
         """
         A container for run-time options controls.
 
@@ -163,6 +165,11 @@ def __init__(self, datum_support=False, pandas_ndim=False):
         pandas_ndim : bool, default=False
             See :func:`iris.pandas.as_data_frame` for details - opts in to the
             newer n-dimensional behaviour.
+        save_split_attrs : bool, default=False
+            Save "global" and "local" cube attributes to NetCDF in appropriately
+            different ways: "global" ones are saved as dataset attributes, where
+            possible, while "local" ones are saved as data-variable attributes.
+            See :func:`iris.fileformats.netcdf.saver.save`.
 
         """
         # The flag 'example_future_flag' is provided as a reference for the
@@ -174,14 +181,18 @@ def __init__(self, datum_support=False, pandas_ndim=False):
         # self.__dict__['example_future_flag'] = example_future_flag
         self.__dict__["datum_support"] = datum_support
         self.__dict__["pandas_ndim"] = pandas_ndim
+        self.__dict__["save_split_attrs"] = save_split_attrs
+
         # TODO: next major release: set IrisDeprecation to subclass
         #  DeprecationWarning instead of UserWarning.
 
     def __repr__(self):
         # msg = ('Future(example_future_flag={})')
         # return msg.format(self.example_future_flag)
-        msg = "Future(datum_support={}, pandas_ndim={})"
-        return msg.format(self.datum_support, self.pandas_ndim)
+        msg = "Future(datum_support={}, pandas_ndim={}, save_split_attrs={})"
+        return msg.format(
+            self.datum_support, self.pandas_ndim, self.save_split_attrs
+        )
 
     # deprecated_options = {'example_future_flag': 'warning',}
     deprecated_options = {}
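
To make the effect of the new keyword concrete, here is a minimal sketch that reproduces only what this hunk shows: a ``threading.local`` subclass storing three flags and reporting them in its ``repr``. ``FutureSketch`` is a hypothetical stand-in, not ``iris.Future`` itself, and it omits whatever validation or deprecation handling the real class performs.

```python
import threading


class FutureSketch(threading.local):
    """Hypothetical stand-in for iris.Future, reduced to the flags in the diff."""

    def __init__(
        self, datum_support=False, pandas_ndim=False, save_split_attrs=False
    ):
        # Store each flag directly in the instance dict, as the diff does.
        self.__dict__["datum_support"] = datum_support
        self.__dict__["pandas_ndim"] = pandas_ndim
        self.__dict__["save_split_attrs"] = save_split_attrs

    def __repr__(self):
        msg = "Future(datum_support={}, pandas_ndim={}, save_split_attrs={})"
        return msg.format(
            self.datum_support, self.pandas_ndim, self.save_split_attrs
        )


future = FutureSketch(save_split_attrs=True)
future_repr = repr(future)
```

The new flag defaults to ``False``, so existing callers that construct the object without arguments see no change other than the extra field in the ``repr``.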

diff --git a/lib/iris/_merge.py b/lib/iris/_merge.py
@@ -22,6 +22,9 @@
     multidim_lazy_stack,
 )
 from iris.common import CoordMetadata, CubeMetadata
+from iris.common._split_attribute_dicts import (
+    _convert_splitattrs_to_pairedkeys_dict as convert_splitattrs_to_pairedkeys_dict,
+)
 import iris.coords
 import iris.cube
 import iris.exceptions
@@ -390,23 +393,27 @@ def _defn_msgs(self, other_defn):
                 )
             )
         if self_defn.attributes != other_defn.attributes:
-            diff_keys = set(self_defn.attributes.keys()) ^ set(
-                other_defn.attributes.keys()
+            attrs_1, attrs_2 = self_defn.attributes, other_defn.attributes
+            diff_keys = sorted(
+                (set(attrs_1.globals) ^ set(attrs_2.globals))
+                | (set(attrs_1.locals) ^ set(attrs_2.locals))
             )
             if diff_keys:
                 msgs.append(
                     "cube.attributes keys differ: "
                     + ", ".join(repr(key) for key in diff_keys)
                 )
             else:
+                attrs_1, attrs_2 = [
+                    convert_splitattrs_to_pairedkeys_dict(dic)
+                    for dic in (attrs_1, attrs_2)
+                ]
                 diff_attrs = [
-                    repr(key)
-                    for key in self_defn.attributes
-                    if np.all(
-                        self_defn.attributes[key] != other_defn.attributes[key]
-                    )
+                    repr(key[1])
+                    for key in attrs_1
+                    if np.all(attrs_1[key] != attrs_2[key])
                 ]
-                diff_attrs = ", ".join(diff_attrs)
+                diff_attrs = ", ".join(sorted(diff_attrs))
                 msgs.append(
                     "cube.attributes values differ for keys: {}".format(
                         diff_attrs
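
The key comparison in this hunk takes the symmetric difference of the global keys and of the local keys separately, then reports the union in sorted order. A minimal sketch of that calculation follows; ``differing_keys`` is a hypothetical helper name, and ``SimpleNamespace`` objects stand in for ``CubeAttrsDict``, assuming only that it exposes ``globals`` and ``locals`` mappings as the diff uses them.

```python
from types import SimpleNamespace


def differing_keys(attrs_1, attrs_2):
    """Return the sorted names that appear on only one side, per category."""
    return sorted(
        (set(attrs_1.globals) ^ set(attrs_2.globals))
        | (set(attrs_1.locals) ^ set(attrs_2.locals))
    )


# Stand-ins for two split-attribute dictionaries.
attrs_a = SimpleNamespace(
    globals={"Conventions": "CF-1.5"},
    locals={"STASH": "m01s03i236"},
)
attrs_b = SimpleNamespace(
    globals={"Conventions": "CF-1.5", "history": "created"},
    locals={},
)

diff_keys = differing_keys(attrs_a, attrs_b)
```

Because the two symmetric differences are merged into one set, a name that is global on one side and local on the other is reported once, without saying which category differs.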

diff --git a/lib/iris/common/_split_attribute_dicts.py b/lib/iris/common/_split_attribute_dicts.py
@@ -0,0 +1,125 @@
+# Copyright Iris contributors
+#
+# This file is part of Iris and is released under the BSD license.
+# See LICENSE in the root of the repository for full licensing details.
+"""
+Dictionary operations for dealing with the CubeAttrsDict "split"-style attribute
+dictionaries.
+
+The idea here is to convert a split-dictionary into a "plain" one for calculations,
+whose keys are all pairs of the form ('global', <keyname>) or ('local', <keyname>),
+and to convert back again after the operation, if the result is a dictionary.
+
+For "strict" operations this clearly does all that is needed.  For lenient ones,
+we _might_ want local and global attributes of the same name to interact.
+However, on careful consideration, it seems that this is not actually desirable for
+any of the common-metadata operations.
+So, we simply treat "global" and "local" attributes of the same name as entirely
+independent, which happily is also the easiest to code and to explain.
+"""
+from collections.abc import Mapping, Sequence
+from functools import wraps
+
+
+def _convert_splitattrs_to_pairedkeys_dict(dic):
+    """
+    Convert a split-attributes dictionary to a "normal" dict.
+
+    Transform a :class:`~iris.cube.CubeAttrsDict` "split" attributes dictionary
+    into a 'normal' :class:`dict`, with paired keys of the form ('global', name) or
+    ('local', name).
+
+    If the input is *not* a split-attrs dict, it is converted to one before
+    transforming it.  This will assign its keys to global/local depending on a standard
+    set of choices (see :class:`~iris.cube.CubeAttrsDict`).
+    """
+    from iris.cube import CubeAttrsDict
+
+    # Convert input to CubeAttrsDict
+    if not hasattr(dic, "globals") or not hasattr(dic, "locals"):
+        dic = CubeAttrsDict(dic)
+
+    def _global_then_local_items(dic):
+        # Routine to produce global, then local 'items' in order, and with all keys
+        # "labelled" as local or global type, to ensure they are all unique.
+        for key, value in dic.globals.items():
+            yield ("global", key), value
+        for key, value in dic.locals.items():
+            yield ("local", key), value
+
+    return dict(_global_then_local_items(dic))
+
+
+def _convert_pairedkeys_dict_to_splitattrs(dic):
+    """
+    Convert an input with global/local paired keys back into a split-attrs dict.
+
+    For now, this is always and only a :class:`iris.cube.CubeAttrsDict`.
+    """
+    from iris.cube import CubeAttrsDict
+
+    result = CubeAttrsDict()
+    for key, value in dic.items():
+        keytype, keyname = key
+        if keytype == "global":
+            result.globals[keyname] = value
+        else:
+            assert keytype == "local"
+            result.locals[keyname] = value
+    return result
+
+
+def adjust_for_split_attribute_dictionaries(operation):
+    """
+    Decorator to make a function of attribute-dictionaries work with split attributes.
+
+    The wrapped function of attribute-dictionaries is currently always one of "equals",
+    "combine" or "difference", with signatures like :
+        equals(left: dict, right: dict) -> bool
+        combine(left: dict, right: dict) -> dict
+        difference(left: dict, right: dict) -> None | (dict, dict)
+
+    The results of the wrapped operation are either :
+    * for "equals" (or "__eq__") :  a boolean
+    * for "combine" :  a (converted) attributes-dictionary
+    * for "difference" :  either None, or a "pair" containing two
+        dictionaries
+
+    Before calling the wrapped operation, its inputs (left, right) are modified by
+    converting any "split" dictionaries to a form where the keys are pairs
+    of the form ("global", name) or ("local", name).
+
+    After calling the wrapped operation, for "combine" or "difference", the result can
+    contain a dictionary or dictionaries.  These are then transformed back from the
+    'converted' form to split-attribute dictionaries, before returning.
+
+    "Split" dictionaries  are all of class :class:`~iris.cube.CubeAttrsDict`, since
+    the only usage of 'split' attribute dictionaries is in Cubes (i.e. they are not
+    used for cube components).
+    """
+
+    @wraps(operation)
+    def _inner_function(*args, **kwargs):
+        # Convert all inputs into 'pairedkeys' type dicts
+        args = [_convert_splitattrs_to_pairedkeys_dict(arg) for arg in args]
+
+        result = operation(*args, **kwargs)
+
+        # Detect known specific cases of 'pairedkeys' dicts in the result, and convert
+        # those back into split-attribute dictionaries.
+        if isinstance(result, Mapping):
+            # Fix a result which is a single dictionary -- for "combine"
+            result = _convert_pairedkeys_dict_to_splitattrs(result)
+        elif isinstance(result, Sequence) and len(result) == 2:
+            # Fix a result which is a pair of dictionaries -- for "difference"
+            left, right = result
+            left, right = (
+                _convert_pairedkeys_dict_to_splitattrs(left),
+                _convert_pairedkeys_dict_to_splitattrs(right),
+            )
+            result = result.__class__([left, right])
+        # ELSE: leave other types of result unchanged. E.G. None, bool
+
+        return result
+
+    return _inner_function
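
The paired-keys approach described in this module can be sketched without Iris. In the example below, ``SplitAttrs`` is a hypothetical stand-in for ``CubeAttrsDict`` holding just two plain dicts, and ``to_paired``, ``from_paired`` and ``split_aware`` are illustrative names that loosely mirror the three functions in the diff. The sketch handles only two-argument operations that return a bool or a single dict, not the "difference" pair case.

```python
from collections.abc import Mapping
from functools import wraps


class SplitAttrs:
    """Hypothetical stand-in for CubeAttrsDict: two plain dicts."""

    def __init__(self, globals=None, locals=None):
        self.globals = dict(globals or {})
        self.locals = dict(locals or {})


def to_paired(split):
    """Flatten to one dict keyed by ('global', name) or ('local', name)."""
    paired = {("global", key): value for key, value in split.globals.items()}
    paired.update({("local", key): value for key, value in split.locals.items()})
    return paired


def from_paired(paired):
    """Rebuild a split dictionary from paired keys."""
    result = SplitAttrs()
    for (keytype, name), value in paired.items():
        target = result.globals if keytype == "global" else result.locals
        target[name] = value
    return result


def split_aware(operation):
    """Run a plain-dict operation on paired-key views of split inputs."""

    @wraps(operation)
    def inner(left, right):
        result = operation(to_paired(left), to_paired(right))
        if isinstance(result, Mapping):
            # A dict result (e.g. from "combine") is converted back.
            result = from_paired(result)
        return result

    return inner


@split_aware
def combine(left, right):
    # Strict combine: keep common keys that also have common values.
    return {k: v for k, v in left.items() if k in right and right[k] == v}


@split_aware
def equals(left, right):
    return left == right


a = SplitAttrs(globals={"Conventions": "CF-1.5"}, locals={"Model scenario": "A1B"})
b = SplitAttrs(globals={"Model scenario": "A1B"}, locals={"Model scenario": "A1B"})
common = combine(a, b)
```

Only the local ``'Model scenario'`` survives in ``common``: the global entry of the same name in ``b`` has no global counterpart in ``a``, which shows global and local attributes of the same name being treated as independent.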