Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AbstractCode: Add the with_mpi attribute #5922

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion aiida/cmdline/commands/cmd_code.py
Original file line number Diff line number Diff line change
Expand Up @@ -250,7 +250,10 @@ def export(code, output_file):
else:
value = getattr(code, key)

code_data[key] = str(value)
# If the attribute is not set, for example ``with_mpi`` do not export it, because the YAML won't be valid for
# use in ``verdi code create`` since ``None`` is not a valid value on the CLI.
if value is not None:
ltalirz marked this conversation as resolved.
Show resolved Hide resolved
code_data[key] = str(value)

with open(output_file, 'w', encoding='utf-8') as yfhandle:
yaml.dump(code_data, yfhandle)
Expand Down
56 changes: 38 additions & 18 deletions aiida/engine/processes/calcjobs/calcjob.py
Original file line number Diff line number Diff line change
Expand Up @@ -913,32 +913,52 @@ def presubmit(self, folder: Folder) -> CalcInfo:

if code_info.code_uuid is None:
raise PluginInternalError('CalcInfo should have the information of the code to be launched')
this_code = load_code(code_info.code_uuid)

# To determine whether this code should be run with MPI enabled, we get the value that was set in the inputs
# of the entire process, which can then be overwritten by the value from the `CodeInfo`. This allows plugins
# to force certain codes to run without MPI, even if the user wants to run all codes with MPI whenever
# possible. This use case is typically useful for `CalcJob`s that consist of multiple codes where one or
# multiple codes always have to be executed without MPI.

this_withmpi = self.node.get_option('withmpi')

# Override the value of `withmpi` with that of the `CodeInfo` if and only if it is set
if code_info.withmpi is not None:
this_withmpi = code_info.withmpi
code = load_code(code_info.code_uuid)

# Here are the three values that will determine whether the code is to be run with MPI _if_ they are not
# ``None``. If any of them are explicitly defined but are not equivalent, an exception is raised. We use the
# ``self._raw_inputs`` to determine the actual value passed for ``metadata.options.withmpi`` and
# distinghuish it from the default.
raw_inputs = self._raw_inputs or {} # type: ignore[var-annotated]
with_mpi_option = raw_inputs.get('metadata', {}).get('options', {}).get('withmpi', None)
with_mpi_plugin = code_info.withmpi
with_mpi_code = code.with_mpi

with_mpi_values = [with_mpi_option, with_mpi_plugin, with_mpi_code]
with_mpi_values_defined = [value for value in with_mpi_values if value is not None]
with_mpi_values_set = set(with_mpi_values_defined)

# If more than one value is defined, they have to be identical, or we raise that a conflict is encountered
if len(with_mpi_values_set) > 1:
error = f'Inconsistent requirements as to whether code `{code}` should be run with or without MPI.'
if with_mpi_option is not None:
error += f'\nThe `metadata.options.withmpi` input was set to `{with_mpi_option}`.'
if with_mpi_plugin is not None:
error += f'\nThe plugin require `{with_mpi_plugin}`.'
if with_mpi_code is not None:
error += f'\nThe code `{code}` required `{with_mpi_code}`.'
raise RuntimeError(error)

# At this point we know that the three explicit values agree if they are defined, so we simply set the value
if with_mpi_values_set:
with_mpi = with_mpi_values_set.pop()
else:
# Fall back to the default of the ``metadata.options.withmpi`` of the ``Calcjob`` class
with_mpi = self.node.get_option('withmpi')

if this_withmpi:
prepend_cmdline_params = this_code.get_prepend_cmdline_params(mpi_args, extra_mpirun_params)
if with_mpi:
prepend_cmdline_params = code.get_prepend_cmdline_params(mpi_args, extra_mpirun_params)
else:
prepend_cmdline_params = this_code.get_prepend_cmdline_params()
prepend_cmdline_params = code.get_prepend_cmdline_params()

cmdline_params = this_code.get_executable_cmdline_params(code_info.cmdline_params)
cmdline_params = code.get_executable_cmdline_params(code_info.cmdline_params)

tmpl_code_info = JobTemplateCodeInfo()
tmpl_code_info.prepend_cmdline_params = prepend_cmdline_params
tmpl_code_info.cmdline_params = cmdline_params
tmpl_code_info.use_double_quotes = [computer.get_use_double_quotes(), this_code.use_double_quotes]
tmpl_code_info.wrap_cmdline_params = this_code.wrap_cmdline_params
tmpl_code_info.use_double_quotes = [computer.get_use_double_quotes(), code.use_double_quotes]
tmpl_code_info.wrap_cmdline_params = code.wrap_cmdline_params
tmpl_code_info.stdin_name = code_info.stdin_name
tmpl_code_info.stdout_name = code_info.stdout_name
tmpl_code_info.stderr_name = code_info.stderr_name
Expand Down
16 changes: 6 additions & 10 deletions aiida/manage/tests/pytest_fixtures.py
Original file line number Diff line number Diff line change
Expand Up @@ -428,16 +428,17 @@ def test_1(aiida_local_code_factory):
:rtype: object
"""

def get_code(entry_point, executable, computer=aiida_localhost, label=None, prepend_text=None, append_text=None):
def get_code(entry_point, executable, computer=aiida_localhost, label=None, **kwargs):
ltalirz marked this conversation as resolved.
Show resolved Hide resolved
"""Get local code.

Sets up code for given entry point on given computer.

:param entry_point: Entry point of calculation plugin
:param executable: name of executable; will be searched for in local system PATH.
:param computer: (local) AiiDA computer
:param prepend_text: a string of code that will be put in the scheduler script before the execution of the code.
:param append_text: a string of code that will be put in the scheduler script after the execution of the code.
:param label: Define the label of the code. By default the ``executable`` is taken. This can be useful if
multiple codes need to be created in a test which require unique labels.
:param kwargs: Additional keyword arguments that are passed to the code's constructor.
:return: the `Code` either retrieved from the database or created if it did not yet exist.
:rtype: :py:class:`~aiida.orm.Code`
"""
Expand Down Expand Up @@ -471,15 +472,10 @@ def get_code(entry_point, executable, computer=aiida_localhost, label=None, prep
description=label,
default_calc_job_plugin=entry_point,
computer=computer,
filepath_executable=executable_path
filepath_executable=executable_path,
**kwargs
)

if prepend_text is not None:
code.prepend_text = prepend_text

if append_text is not None:
code.append_text = append_text

return code.store()

return get_code
Expand Down
28 changes: 28 additions & 0 deletions aiida/orm/nodes/data/code/abstract.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ class AbstractCode(Data, metaclass=abc.ABCMeta):
_KEY_ATTRIBUTE_APPEND_TEXT: str = 'append_text'
_KEY_ATTRIBUTE_PREPEND_TEXT: str = 'prepend_text'
_KEY_ATTRIBUTE_USE_DOUBLE_QUOTES: str = 'use_double_quotes'
_KEY_ATTRIBUTE_WITH_MPI: str = 'with_mpi'
_KEY_ATTRIBUTE_WRAP_CMDLINE_PARAMS: str = 'wrap_cmdline_params'
_KEY_EXTRA_IS_HIDDEN: str = 'hidden' # Should become ``is_hidden`` once ``Code`` is dropped

Expand All @@ -49,6 +50,7 @@ def __init__(
append_text: str = '',
prepend_text: str = '',
use_double_quotes: bool = False,
with_mpi: bool | None = None,
is_hidden: bool = False,
wrap_cmdline_params: bool = False,
**kwargs
Expand All @@ -59,6 +61,7 @@ def __init__(
:param append_text: The text that should be appended to the run line in the job script.
:param prepend_text: The text that should be prepended to the run line in the job script.
:param use_double_quotes: Whether the command line invocation of this code should be escaped with double quotes.
:param with_mpi: Whether the command should be run as an MPI program.
:param wrap_cmdline_params: Whether to wrap the executable and all its command line parameters into quotes to
form a single string. This is required to enable support for Docker with the ``ContainerizedCode``.
:param is_hidden: Whether the code is hidden.
Expand All @@ -68,6 +71,7 @@ def __init__(
self.append_text = append_text
self.prepend_text = prepend_text
self.use_double_quotes = use_double_quotes
self.with_mpi = with_mpi
self.wrap_cmdline_params = wrap_cmdline_params
self.is_hidden = is_hidden

Expand Down Expand Up @@ -225,6 +229,24 @@ def use_double_quotes(self, value: bool) -> None:
type_check(value, bool)
self.base.attributes.set(self._KEY_ATTRIBUTE_USE_DOUBLE_QUOTES, value)

@property
def with_mpi(self) -> bool | None:
"""Return whether the command should be run as an MPI program.

:return: ``True`` if the code should be run as an MPI program, ``False`` if it shouldn't, ``None`` if unknown.
"""
return self.base.attributes.get(self._KEY_ATTRIBUTE_WITH_MPI, None)

@with_mpi.setter
def with_mpi(self, value: bool | None) -> None:
"""Set whether the command should be run as an MPI program.

:param value: ``True`` if the code should be run as an MPI program, ``False`` if it shouldn't, ``None`` if
unknown.
"""
type_check(value, bool, allow_none=True)
self.base.attributes.set(self._KEY_ATTRIBUTE_WITH_MPI, value)

@property
def wrap_cmdline_params(self) -> bool:
"""Return whether all command line parameters should be wrapped with double quotes to form a single argument.
Expand Down Expand Up @@ -361,4 +383,10 @@ def _get_cli_options(cls) -> dict:
'with single or double quotes.',
'prompt': 'Escape using double quotes',
},
'with_mpi': {
'is_flag': True,
'default': None,
'help': 'Whether the executable should be run as an MPI program.',
'prompt': 'Run with MPI',
},
}
106 changes: 106 additions & 0 deletions docs/source/topics/calculations/usage.rst
Original file line number Diff line number Diff line change
Expand Up @@ -632,6 +632,112 @@ The ``rerunnable`` option enables the scheduler to re-launch the calculation if

Because this depends on the scheduler, its configuration, and the code used, we cannot say conclusively when it will work -- do your own testing! It has been tested on a cluster using SLURM, but that does not guarantee other SLURM clusters behave in the same way.


.. _topics:calculations:usage:calcjobs:mpi:

Controlling MPI
---------------

The `Message Passing Interface <https://en.wikipedia.org/wiki/Message_Passing_Interface>`_ (MPI) is a standardized and portable message-passing standard designed to function on parallel computing architectures.
AiiDA implements support for running calculation jobs with or without MPI enabled.
There are a number of settings that can be used to control when and how MPI is used.

.. _topics:calculations:usage:calcjobs:mpi:computer:

The ``Computer``
~~~~~~~~~~~~~~~~

Each calculation job is executed on a compute resource, which is modeled by an instance of the :class:`~aiida.orm.computers.Computer` class.
If the computer supports running with MPI, the command to use is stored in the ``mpirun_command`` attribute, which is retrieved and set using the :meth:`~aiida.orm.computers.Computer.get_mpirun_command` and :meth:`~aiida.orm.computers.Computer.get_mpirun_command`, respectively.
For example, if the computer has `OpenMPI <https://docs.open-mpi.org/en/v5.0.x/index.html>`_ installed, it can be set to ``mpirun``.
If the ``Computer`` does not specify an MPI command, then enabling MPI for a calculation job is ineffective.

.. _topics:calculations:usage:calcjobs:mpi:code:

The ``Code``
~~~~~~~~~~~~

.. versionadded:: 2.3

When creating a code, you can tell AiiDA that it should be run as an MPI program, by setting the ``with_mpi`` attribute to ``True`` or ``False``.
From AiiDA 2.3 onward, this is the **recommended** way of controlling MPI behavior.
If the code can be run with or without MPI, setting the ``with_mpi`` attribute can be skipped.
It will default to ``None``, leaving the question of whether to run with or without MPI up to the ``CalcJob`` plugin or user input.

.. _topics:calculations:usage:calcjobs:mpi:calcjob-implementation:

The ``CalcJob`` implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``CalcJob`` implementation instructs AiiDA how the codes should be run through the :class:`~aiida.common.datastructures.CalcInfo` object, which it returns from the :meth:`~aiida.engine.processes.calcjobs.calcjob.CalcJob.prepare_for_submission` method.
For each code that the job should run (usually only a single one), a :class:`~aiida.common.datastructures.CodeInfo` object should be added to the list of the ``CalcInfo.codes_info`` attribute.
If the plugin developer knows that the executable being wrapped is *always* MPI program (no serial version available) or *never* an MPI program, they can set the ``withmpi`` attribute of the ``CodeInfo`` to ``True`` or ``False``, respectively.
Note that this setting is fully optional; if the code could be run either way, it is best not to set it and leave it up to the ``Code`` or the ``metadata.options.withmpi`` input.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Note that this setting is fully optional; if the code could be run either way, it is best not to set it and leave it up to the ``Code`` or the ``metadata.options.withmpi`` input.
Doing so is optional, but prevents the plugin from being used with codes or user inputs that do not agree with this setting (AiiDA will raise a ``RuntimeError``).


.. note::

When implementing a ``CalcJob`` that runs a single code, consider using specifying whether MPI should be enabled or disabled through the :ref:`metadata option<topics:calculations:usage:calcjobs:mpi:calcjob-inputs>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest starting the subsection with this way of doing things.

The semantics of this are also slightly different: This is a way of setting a default behavior, while the CodeInfo is a way of statically enforcing behavior.

In any case, while it is required for backward compatibility, it may be that controlling MPI at the CalcJob level becomes obsolete over time as more and more users start specifying it at the Code level (which makes most sense to me).

This can be accomplished by changing the default in the process specification:

.. code:: python

class SomeCalcJob(CalcJob):

@classmethod
def define(cls, spec):
super().define(spec)
spec.inputs['metadata']['options']['withmpi'].default = True

The advantage over using the ``CodeInfo.withmpi`` attribute is that the default of the metadata option can be introspected programmatically from the process spec, and so is more visible to the user.

Naturally, this approach is not viable for calculation jobs that run multiple codes that are different in whether they require MPI or not.
In this case, one should resort to using the ``CodeInfo.withmpi`` attribute.

.. _topics:calculations:usage:calcjobs:mpi:calcjob-inputs:

The ``CalcJob`` inputs
~~~~~~~~~~~~~~~~~~~~~~

Finally, the MPI setting can be controlled on a per-instance basis, using the ``withmpi`` :ref:`metadata option<topics:calculations:usage:calcjobs:options>`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Finally, the MPI setting can be controlled on a per-instance basis, using the ``withmpi`` :ref:`metadata option<topics:calculations:usage:calcjobs:options>`.
Finally, the MPI setting can be controlled by the user when setting up the calculation by setting the ``withmpi`` :ref:`metadata option<topics:calculations:usage:calcjobs:options>` to ``True`` or ``False``.

If MPI should be enabled or disabled, explicitly set this option to ``True`` or ``False``, respectively.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If MPI should be enabled or disabled, explicitly set this option to ``True`` or ``False``, respectively.

For example, the following instructs to run all codes in the calculation job with MPI enabled:

.. code:: python

inputs = {
...,
'metadata': {
'options': {
'withmpi': True
}
}
}
submit(CalcJob, **inputs)

The default for this option is set to ``False`` on the base ``CalcJob`` implementation, but it will be overruled if explicitly defined.

.. note::

The value set for the ``withmpi`` option will be applied to all codes.
If a calculation job runs more than one code, and each requires a different MPI setting, this option should not be used, and instead MPI should be controlled :ref:`through the code input <topics:calculations:usage:calcjobs:mpi:code>`.

.. _topics:calculations:usage:calcjobs:mpi:conflict-resolution:

Conflict resolution
~~~~~~~~~~~~~~~~~~~

As described above, MPI can be enabled or disabled for a calculation job on a number of levels:

* The ``Code`` input
* The ``CalcJob`` implementation
* The ``metadata.options.withmpi`` input

MPI is enabled or disabled if any of these values is explicitly set to ``True`` or ``False``, respectively.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MPI is enabled or disabled if any of these values is explicitly set to ``True`` or ``False``, respectively.

If multiple values are specified and they are not equivalent, a ``RuntimeError`` is raised.
Depending on the conflict, one has to change the ``Code`` or ``metadata.options.withmpi`` input.
If none of the values are explicitly defined, the value specified by the default of ``metadata.options.withmpi`` is taken.


.. _topics:calculations:usage:calcjobs:launch:

Launch
Expand Down
4 changes: 2 additions & 2 deletions docs/source/topics/data_types.rst
Original file line number Diff line number Diff line change
Expand Up @@ -590,8 +590,8 @@ If a default calculation job plugin is defined, a process builder can be obtaine

.. important::

If a containerized code is used for a calculation that sets the :ref:`metadata option <topics:calculations:usage:calcjobs:options>` ``withmpi`` to ``True``, the MPI command line arguments are placed in front of the container runtime.
For example, when running Singularity with ``metadata.options.withmpi = True``, the runline in the submission script will be written as:
If a containerized code is used for a calculation that enables MPI (see :ref:`Controlling MPI <topics:calculations:usage:calcjobs:mpi>`), the MPI command line arguments are placed in front of the container runtime.
For example, when running Singularity with MPI enabled, the runline in the submission script will be written as:

.. code-block:: bash

Expand Down
Loading