Refactor parallelism utilities for public API #12412

jakelishman · 2024-05-15T16:24:24Z

Summary

should_run_in_parallel was added in a stable manner to enable backport to the 1.1 series, but from 1.3 onwards, we want this to be part of the public interface so that others can rely on it too.

As part of this, the parallelisation configuration was made more robust and controllable with context managers. This is convenient beyond just for users - it makes it far easier to control the parallelism during the test suite runs. Several instances where different parts of Qiskit and its test suite reached into deep internals of the parallelism utilities and made significant assumptions about the internal logic are refactored to use public interfaces to achieve what they wanted to.

The multiprocessing detection is changed from making OS-based assumptions about what Python does to simply querying the module for its configuration. This makes it more robust to changes in Python's handling (especially important since 3.14 will change the default start method on Unix). In the future, we may want to change to making these assumptions only if the user hasn't configured the multiprocessing start method themselves.
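A minimal sketch of what "querying the module" can look like, using only the standard library. The helper `parallel_by_default` and its policy (treat only `"fork"` as a parallel default) are assumptions for illustration, not necessarily the rule this PR implements:

```python
import multiprocessing


def parallel_by_default(start_method: str) -> bool:
    """Hypothetical policy: only the "fork" start method gets a parallel default."""
    return start_method == "fork"


# Ask multiprocessing what it will actually use, rather than inferring it
# from sys.platform or os.name.
# Note: with the default allow_none=False, this call fixes the start method
# to the platform default if nothing has set it yet.
current = multiprocessing.get_start_method()
print(current, parallel_by_default(current))
```

Because the answer comes from `multiprocessing` itself, a change in Python's platform defaults or an explicit `multiprocessing.set_start_method()` call by the user is picked up without any OS-specific branching.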

Details and comments

This was elided from #12410 to make that PR backwards compatible. This PR exposes the feature as part of the public API, so will be new for 1.2.

Depends on #12410.

qiskit-bot · 2024-05-15T16:24:29Z

One or more of the following people are requested to review this:

@Qiskit/terra-core

coveralls · 2024-05-15T16:44:45Z

Pull Request Test Coverage Report for Build 11633906056

Details

81 of 93 (87.1%) changed or added relevant lines in 8 files are covered.
8 unchanged lines in 3 files lost coverage.
Overall coverage increased (+0.01%) to 88.737%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/circuit/quantumcircuit.py	2	3	66.67%
qiskit/transpiler/preset_passmanagers/builtin_plugins.py	1	2	50.0%
qiskit/utils/parallel.py	68	78	87.18%

Files with Coverage Reduction	New Missed Lines	%
crates/qasm2/src/expr.rs	1	94.02%
qiskit/user_config.py	1	86.87%
crates/qasm2/src/lex.rs	6	91.98%

Totals
Change from base Build 11632855301:	0.01%
Covered Lines:	76394
Relevant Lines:	86090

💛 - Coveralls

jakelishman · 2024-05-15T17:40:18Z

Now rebased over #12410.

mtreinish · 2024-05-15T18:19:46Z

releasenotes/notes/parallel-check-public-7faed5f3e20e1d03.yaml

+    decision is dependent on how many CPUs are available to Qiskit, what the :mod:`multiprocessing`
+    start method is, how many processes were requested.


mtreinish · May 15, 2024

I mean sort of, we don't explicitly check what multiprocessing is set to (i.e. we ignore it if the user explicitly sets this to spawn), but the value for PARALLEL_DEFAULT is based on what the OS default start method is.

jakelishman · May 15, 2024

Yeah, I think I was being more aspirational than correct here. We should do the check based on the multiprocessing start method if that's what we care about, but we don't. I can change the wording.

mtreinish · May 15, 2024

Maybe we should just do this; it seems totally in scope to update the logic in the function to call multiprocessing.get_start_method() and get rid of the OS-based logic here for 1.2.

kevinhartman · Oct 30, 2024

> Maybe we should just do this, it seems totally in scope to update the logic in the function to call multiprocessing.get_start_method() and get rid of the OS based logic here for 1.2.

@mtreinish is this still something we ought to do prior to merging this for 1.3?

jakelishman · Oct 30, 2024

Yeah, we still intended to do this.

jakelishman · Nov 1, 2024

I did this, which then of course broke a bunch of weird other assumptions we were making thirty miles away in the library, so I ended up really rejigging the public API of the parallelism utilities so we didn't need to make internal assumptions across file boundaries.

I'm not entirely sold on the way I wrote the multiprocessing test (see the commit message) - I was trying to match the spirit of the previous assumption-heavy code, but I'm not certain we couldn't do a shade better.

kevinhartman · 2024-10-30T15:39:01Z

releasenotes/notes/parallel-check-public-7faed5f3e20e1d03.yaml

+    decision is dependent on how many CPUs are available to Qiskit, what the :mod:`multiprocessing`
+    start method is, how many processes were requested.


@mtreinish is this still something we ought to do prior to merging this for 1.3?

jakelishman · 2024-11-01T17:53:18Z

I've force-pushed a major new commit that properly refactors a bunch of the parallelism utilities to better support should_run_in_parallel at the level of robustness we'd expect from the public interface. This then removes a bunch of our own library code and test code that was reaching into the belly of the internals of the parallel code and layering assumptions on top of assumptions, and wraps the necessary components in safe interfaces.

I also added a QISKIT_IGNORE_USER_SETTINGS environment variable and configured it to be used in all tox and CI runs by default, which lets us isolate the test suite from the environment - that should make the new tests of what I added reliable, but it's probably something we should have had before anyway.
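For readers unfamiliar with this kind of switch, here is a small sketch of how an environment-variable opt-out is commonly read. Only the variable name comes from this PR; the helper `ignore_user_settings` and the set of accepted truthy spellings are assumptions, and the actual parsing in Qiskit may differ:

```python
import os


def ignore_user_settings(environ=None) -> bool:
    """Hypothetical check for the QISKIT_IGNORE_USER_SETTINGS opt-out."""
    environ = os.environ if environ is None else environ
    value = environ.get("QISKIT_IGNORE_USER_SETTINGS", "")
    # Assumed truthy spellings; unset or anything else means "do not ignore".
    return value.strip().lower() in {"1", "true", "yes", "on"}


print(ignore_user_settings({"QISKIT_IGNORE_USER_SETTINGS": "TRUE"}))
print(ignore_user_settings({}))
```

Passing the mapping in explicitly keeps the check testable without mutating the real process environment.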

jakelishman · 2024-11-01T17:56:51Z

Given how major the changes I made here were, and that we're after the feature freeze deadline, I'm fine if we choose to leave this for Qiskit 2.0.

mtreinish · 2024-11-06T11:37:29Z

I agree, given the scope of the changes and that we're only one day out from rc1, let's defer this to 2.0. We can move forward with it pretty soon after rc1 though.

jakelishman added the "on hold" and "Changelog: New Feature" labels May 15, 2024

jakelishman added this to the 1.2.0 milestone May 15, 2024

jakelishman requested a review from a team as a code owner May 15, 2024 16:24

jakelishman force-pushed the parallel-check-public branch from d6d1da9 to 0bdea9c Compare May 15, 2024 17:40

jakelishman removed the "on hold" label May 15, 2024

mtreinish reviewed May 15, 2024

ElePT modified the milestones: 1.2.0, 1.3.0 Jul 30, 2024

raynelfss assigned mtreinish and kevinhartman Oct 29, 2024

kevinhartman reviewed Oct 30, 2024

jakelishman force-pushed the parallel-check-public branch from 0bdea9c to e4fea41 Compare November 1, 2024 17:36

jakelishman requested review from eggerdj and wshanks as code owners November 1, 2024 17:36

jakelishman changed the title ~~Add should_run_in_parallel to public API~~ Refactor parallelism utilities for public API Nov 1, 2024

jakelishman force-pushed the parallel-check-public branch from e4fea41 to 96dbd6e Compare November 1, 2024 17:52

jakelishman requested a review from nonhermitian as a code owner November 1, 2024 17:52

mtreinish modified the milestones: 1.3.0, 2.0.0 Nov 6, 2024

jakelishman mentioned this pull request Nov 10, 2024

qiskit.utils.parallel.should_run_in_parallel() function unexpected return value #13420

Open
