DOCS: Numpydocs1 #5578

Merged: 12 commits, Dec 4, 2023
3 changes: 1 addition & 2 deletions docs/src/conf.py
@@ -15,7 +15,6 @@
#
# All configuration values have a default; values that are commented out
# serve to show the default.

# ----------------------------------------------------------------------------

import datetime
@@ -195,7 +194,7 @@ def _dotv(version):
todo_include_todos = True

# api generation configuration
autodoc_member_order = "groupwise"
autodoc_member_order = "alphabetical"
autodoc_default_flags = ["show-inheritance"]

# https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#confval-autodoc_typehints
51 changes: 26 additions & 25 deletions lib/iris/config.py
@@ -27,6 +27,7 @@
The [optional] name of the logger to notify when first imported.

----------

"""

import configparser
@@ -42,41 +43,37 @@ def get_logger(
name, datefmt=None, fmt=None, level=None, propagate=None, handler=True
):
"""
Create a custom logger.

Create a :class:`logging.Logger` with a :class:`logging.StreamHandler`
and custom :class:`logging.Formatter`.

Args:

* name:
Parameters
----------
name
The name of the logger. Typically this is the module filename that
owns the logger.

Kwargs:

* datefmt:
datefmt: optional
The date format string of the :class:`logging.Formatter`.
Defaults to ``%d-%m-%Y %H:%M:%S``.

* fmt:
fmt: optional
The additional format string of the :class:`logging.Formatter`.
This is appended to the default format string
``%(asctime)s %(name)s %(levelname)s - %(message)s``.

* level:
level: optional
The threshold level of the logger. Defaults to ``INFO``.

* propagate:
propagate: optional
Sets the ``propagate`` attribute of the :class:`logging.Logger`,
which determines whether events logged to this logger will be
passed to the handlers of higher level loggers. Defaults to
``False``.

* handler:
handler: optional
Create and attach a :class:`logging.StreamHandler` to the
logger. Defaults to ``True``.

Returns:
A :class:`logging.Logger`.
Returns
-------
:class:`logging.Logger`.

"""
if level is None:
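
For reference, a minimal usage sketch of ``get_logger`` based on the
signature shown above; the extra format string passed as ``fmt`` is
illustrative only::

    from iris.config import get_logger

    # Create a module-level logger with the default INFO threshold and
    # an attached StreamHandler; ``fmt`` is appended to the default
    # format string described above.
    logger = get_logger(__name__, fmt="[%(funcName)s]")
    logger.info("configuration loaded")
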
@@ -118,6 +115,8 @@ def get_logger(
# Returns simple string options
def get_option(section, option, default=None):
"""
Return the option value for the given section.

Returns the option value for the given section, or the default value
if the section/option is not present.

@@ -131,6 +130,8 @@ def get_option(section, option, default=None):
# Returns directory path options
def get_dir_option(section, option, default=None):
"""
Return the directory path from the given option and section.

Returns the directory path from the given option and section, or
returns the given default value if the section/option is not present
or does not represent a valid directory.
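
A hedged sketch of how these two helpers are used; the section and
option names here are illustrative, not real Iris settings::

    from iris import config

    # Falls back to the default when the section/option is missing.
    log_level = config.get_option("Logging", "level", default="INFO")

    # Additionally validates that the value is an existing directory.
    data_dir = config.get_dir_option("Resources", "data_dir", default=".")
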
@@ -196,20 +197,19 @@ def __init__(self, conventions_override=None):
"""
Set up NetCDF processing options for Iris.

Currently accepted kwargs:

* conventions_override (bool):
Parameters
----------
conventions_override : bool, optional
Define whether the CF Conventions version (e.g. `CF-1.6`) set when
saving a cube to a NetCDF file should be defined by
Iris (the default) or the cube being saved.

If `False` (the default), specifies that Iris should set the
Iris (the default) or the cube being saved. If `False`
(the default), specifies that Iris should set the
CF Conventions version when saving cubes as NetCDF files.
If `True`, specifies that the cubes being saved to NetCDF should
set the CF Conventions version for the saved NetCDF files.

Example usages:

Examples
--------
* Specify, for the lifetime of the session, that we want all cubes
written to NetCDF to define their own CF Conventions versions::

@@ -276,6 +276,7 @@ def _defaults_dict(self):
def context(self, **kwargs):
"""
Allow temporary modification of the options via a context manager.

Accepted kwargs are the same as can be supplied to the Option.

"""
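
The example block above is truncated in this view; a hedged sketch of
both usage styles, assuming the options instance is exposed as
``iris.config.netcdf``::

    import iris.config

    # For the lifetime of the session: let each cube define its own
    # CF Conventions version when written to NetCDF.
    iris.config.netcdf.conventions_override = True

    # Or temporarily, via the context manager described above.
    with iris.config.netcdf.context(conventions_override=True):
        pass  # save cubes to NetCDF here
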
3 changes: 1 addition & 2 deletions lib/iris/fileformats/netcdf/__init__.py
@@ -3,8 +3,7 @@
# This file is part of Iris and is released under the BSD license.
# See LICENSE in the root of the repository for full licensing details.
"""
Module to support the loading and saving of NetCDF files, also using the CF conventions
for metadata interpretation.
Support loading and saving NetCDF files using CF conventions for metadata interpretation.

See : `NetCDF User's Guide <https://docs.unidata.ucar.edu/nug/current/>`_
and `netCDF4 python module <https://github.com/Unidata/netcdf4-python>`_.
80 changes: 45 additions & 35 deletions lib/iris/fileformats/netcdf/_dask_locks.py
@@ -5,45 +5,49 @@
"""
Module containing code to create locks enabling dask workers to co-operate.

This matter is complicated by needing different solutions for different dask scheduler
types, i.e. local 'threads' scheduler, local 'processes' or distributed.
This matter is complicated by needing different solutions for different dask
scheduler types, i.e. local 'threads' scheduler, local 'processes' or
distributed.

In any case, an "iris.fileformats.netcdf.saver.Saver" object contains a netCDF4.Dataset
targeting an output file, and creates a Saver.file_write_lock object to serialise
write-accesses to the file from dask tasks : All dask-task file writes go via a
"iris.fileformats.netcdf.saver.NetCDFWriteProxy" object, which also contains a link
to the Saver.file_write_lock, and uses it to prevent workers from fouling each other.
In any case, an "iris.fileformats.netcdf.saver.Saver" object contains a
netCDF4.Dataset targeting an output file, and creates a Saver.file_write_lock
object to serialise write-accesses to the file from dask tasks : All dask-task
file writes go via a "iris.fileformats.netcdf.saver.NetCDFWriteProxy" object,
which also contains a link to the Saver.file_write_lock, and uses it to prevent
workers from fouling each other.

For each chunk written, the NetCDFWriteProxy acquires the common per-file lock;
opens a Dataset on the file; performs a write to the relevant variable; closes the
Dataset and then releases the lock. This process is obviously very similar to what the
NetCDFDataProxy does for reading lazy chunks.
opens a Dataset on the file; performs a write to the relevant variable; closes
the Dataset and then releases the lock. This process is obviously very similar
to what the NetCDFDataProxy does for reading lazy chunks.

For a threaded scheduler, the Saver.lock is a simple threading.Lock(). The workers
(threads) execute tasks which contain a NetCDFWriteProxy, as above. All of those
contain the common lock, and this is simply **the same object** for all workers, since
they share an address space.
For a threaded scheduler, the Saver.lock is a simple threading.Lock(). The
workers (threads) execute tasks which contain a NetCDFWriteProxy, as above.
All of those contain the common lock, and this is simply **the same object**
for all workers, since they share an address space.

For a distributed scheduler, the Saver.lock is a `distributed.Lock()` which is
identified with the output filepath. This is distributed to the workers by
serialising the task function arguments, which will include the NetCDFWriteProxy.
A worker behaves like a process, though it may execute on a remote machine. When a
distributed.Lock is deserialised to reconstruct the worker task, this creates an object
that communicates with the scheduler. These objects behave as a single common lock,
as they all have the same string 'identity', so the scheduler implements inter-process
communication so that they can mutually exclude each other.
serialising the task function arguments, which will include the
NetCDFWriteProxy. A worker behaves like a process, though it may execute on a
remote machine. When a distributed.Lock is deserialised to reconstruct the
worker task, this creates an object that communicates with the scheduler.
These objects behave as a single common lock, as they all have the same string
'identity', so the scheduler implements inter-process communication so that
they can mutually exclude each other.

It is also *conceivable* that multiple processes could write to the same file in
parallel, if the operating system supports it. However, this also requires that the
libnetcdf C library is built with parallel access option, which is not common.
With the "ordinary" libnetcdf build, a process which attempts to open for writing a file
which is _already_ open for writing simply raises an access error.
In any case, Iris netcdf saver will not support this mode of operation, at present.
parallel, if the operating system supports it. However, this also requires
that the libnetcdf C library is built with parallel access option, which is
not common. With the "ordinary" libnetcdf build, a process which attempts to
open for writing a file which is _already_ open for writing simply raises an
access error. In any case, Iris netcdf saver will not support this mode of
operation, at present.

We don't currently support a local "processes" type scheduler. If we did, the
behaviour should be very similar to a distributed scheduler. It would need to use some
other serialisable shared-lock solution in place of 'distributed.Lock', which requires
a distributed scheduler to function.
behaviour should be very similar to a distributed scheduler. It would need to
use some other serialisable shared-lock solution in place of
'distributed.Lock', which requires a distributed scheduler to function.

"""
import threading
@@ -55,7 +59,7 @@


# A dedicated error class, allowing filtering and testing of errors raised here.
class DaskSchedulerTypeError(ValueError):
class DaskSchedulerTypeError(ValueError): # noqa: D101
pass
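
As an illustration of the per-chunk protocol described in the module
docstring (acquire the shared lock, open, write, close, release), a
simplified sketch; everything except ``netCDF4.Dataset`` is
hypothetical::

    import netCDF4

    def write_chunk(path, varname, keys, data, lock):
        # All workers hold the same per-file lock, so writes to the
        # target file are fully serialised.
        with lock:
            dataset = netCDF4.Dataset(path, mode="r+")
            try:
                dataset.variables[varname][keys] = data
            finally:
                dataset.close()
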


@@ -82,11 +86,13 @@ def get_dask_array_scheduler_type():

Returns one of 'distributed', 'threads' or 'processes'.
The return value is a valid argument for dask.config.set(scheduler=<type>).
This cannot distinguish between distributed local and remote clusters -- both of
those simply return 'distributed'.
This cannot distinguish between distributed local and remote clusters --
both of those simply return 'distributed'.

NOTE: this takes account of how dask is *currently* configured. It will be wrong
if the config changes before the compute actually occurs.
Notes
-----
This takes account of how dask is *currently* configured. It will
be wrong if the config changes before the compute actually occurs.

"""
if dask_scheduler_is_distributed():
@@ -114,8 +120,12 @@ def get_worker_lock(identity: str):
"""
Return a mutex Lock which can be shared by multiple Dask workers.

The type of Lock generated depends on the dask scheduler type, which must therefore
be set up before this is called.
The type of Lock generated depends on the dask scheduler type, which must
therefore be set up before this is called.

Parameters
----------
identity : str
The identity string shared by all workers that must hold the same
lock, typically the output filepath.

"""
scheduler_type = get_dask_array_scheduler_type()
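
The rest of the function body is truncated in this view; a hedged
reconstruction of the dispatch its docstring describes (the error
message wording is an assumption)::

    import threading

    def _get_worker_lock_sketch(identity: str):
        scheduler_type = get_dask_array_scheduler_type()
        if scheduler_type == "threads":
            # Workers share an address space: one in-memory lock works.
            return threading.Lock()
        if scheduler_type == "distributed":
            # Named locks with the same identity are brokered by the
            # scheduler into a single mutual-exclusion domain.
            from distributed import Lock as DistributedLock

            return DistributedLock(identity)
        raise DaskSchedulerTypeError(
            f"Unhandled dask scheduler type: {scheduler_type!r}"
        )
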