'Peephole' optimization - or: collecting and optimizing two-qubit blocks - before routing #12727

sbrandhsn · 2024-07-05T18:53:33Z

Summary

Details and comments

This PR collects and consolidates two-qubit gates into blocks before routing. Each block is then subsequently analysed to determine whether the two-qubit unitary representing the block equals the identity or is a product of single-qubit gates. In the former case, the gates in the block are removed and in the latter case the block is replaced by single-qubit gates correspondingly.

Points up for debate:

Do we want to run the new pass Split2QUnitaries as part of ConsolidatedBlocks?
Do the passes for gate direction checks and fixes always play nicely with having two-qubit unitaries before routing? There were no test failures but I'm still wondering.
Is the current path of checking for the two-qubit unitaries for being a product of single-qubit unitaries the fastest or should we rely on Weyl-decomposition? If the former is true, we should probably port the corresponding functions to rust as they will touch a lot of two-qubit unitaries. :-)

qiskit-bot · 2024-07-05T18:55:56Z

One or more of the following people are relevant to this code:

@Qiskit/terra-core

coveralls · 2024-07-05T19:26:54Z

Pull Request Test Coverage Report for Build 9812549561

Details

45 of 45 (100.0%) changed or added relevant lines in 3 files are covered.
7 unchanged lines in 2 files lost coverage.
Overall coverage increased (+0.02%) to 89.843%

Files with Coverage Reduction	New Missed Lines	%
crates/qasm2/src/expr.rs	1	94.02%
crates/qasm2/src/lex.rs	6	91.6%

Totals
Change from base Build 9807934619:	0.02%
Covered Lines:	65234
Relevant Lines:	72609

💛 - Coveralls

coveralls · 2024-07-08T08:35:04Z

Pull Request Test Coverage Report for Build 10198580745

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

59 of 62 (95.16%) changed or added relevant lines in 4 files are covered.
299 unchanged lines in 10 files lost coverage.
Overall coverage increased (+0.06%) to 89.764%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
qiskit/transpiler/preset_passmanagers/builtin_plugins.py	28	29	96.55%
qiskit/transpiler/passes/optimization/split_2q_unitaries.py	29	31	93.55%

Files with Coverage Reduction	New Missed Lines	%
crates/qasm2/src/lex.rs	3	93.23%
crates/circuit/src/imports.rs	3	77.78%
qiskit/circuit/library/standard_gates/u.py	3	93.07%
qiskit/qasm3/exporter.py	4	95.46%
qiskit/circuit/commutation_checker.py	5	96.99%
qiskit/circuit/_utils.py	6	92.0%
crates/circuit/src/parameter_table.rs	12	94.55%
crates/circuit/src/circuit_data.rs	35	95.14%
qiskit/circuit/quantumcircuit.py	87	93.54%
crates/circuit/src/operations.rs	141	88.68%

Totals
Change from base Build 10186441336:	0.06%
Covered Lines:	67356
Relevant Lines:	75037

💛 - Coveralls

This commit migrates the two functions in the private module `qiskit.synthesis.two_qubit.local_invariance` to primarily be implemented in rust. Since the two_qubit_local_invariants() function is being used in Qiskit#12727 premptively porting these functions to rust will both potentially speed up the new transpiler pass in that PR and also facilitate a future porting of that pass to rust. The two python space functions continue to exist as a small wrapper to do input type checking/conversion and rounding of the result (since the python API for rounding is simpler). There is no release note included for these functions as they are internal utilities in Qiskit and not exposed as a public interface.

mtreinish · 2024-07-08T19:55:25Z

Is the current path of checking for the two-qubit unitaries for being a product of single-qubit unitaries the fastest or should we rely on Weyl-decomposition? If the former is true, we should probably port the corresponding functions to rust as they will touch a lot of two-qubit unitaries. :-)

I took care of this in: #12739 since it was very simple. From a complexity PoV I would expect the local_invariants() function to be much faster since it's far less complex the the TwoQubitWeylDecomposition class.

mtreinish

Overall this looks great to me, I really like this idea of collecting 2q unitary blocks before routing and deferring synthesis until the translation phase. It avoids the potential risk of early synthesis making routing worse and will potentially make routing faster because we have reduced the total gate count.

I left some inline comments, on the new pass itself most of the suggestions are adjusting the access patterns to get attributes from the DAGOpNode directly instead of using the python op. Although some of these access patterns will shift after #12730 merges so we should keep an eye on potential conflicts there. The other thing is after #12739 merges and decompose_two_qubit_product_gate() is ported to rust we actually can move the majority of this pass to rust, where we pass a list of DAGOpNodes to rust do the filtering and resynthesis in rust, and return a list of 1q unitary pairs. This is basically what's done in Optimize1qGatesDecomposition now and I expect that it will be faster than doing it all python side. But that can wait until we have the necessary features in rust.

qiskit/transpiler/passes/optimization/split_2q_unitaries.py

mtreinish · 2024-07-10T12:13:38Z

test/python/compiler/test_transpiler.py

@@ -109,6 +111,16 @@ def _define(self):
        self._definition = QuantumCircuit(2)
        self._definition.cx(0, 1)

+    def to_matrix(self) -> np.ndarray:


Why is this needed? Should this skip the gate if the matrix is empty?

Looking into this - a couple of tests were failing...

qiskit/transpiler/passes/optimization/split_2q_unitaries.py

mtreinish · 2024-07-10T12:19:10Z

qiskit/transpiler/passes/optimization/split_2q_unitaries.py

+                dag_node.apply_operation_back(UnitaryGate(ur), qargs=(node.qargs[0],))
+                dag_node.apply_operation_back(UnitaryGate(ul), qargs=(node.qargs[1],))
+                dag_node.global_phase += phase
+                dag.substitute_node_with_dag(node, dag_node)


While we can use substitute node with dag here, it'll probably be a bit more performant to do something like:

https://github.com/Qiskit/qiskit/blob/main/qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py#L244-L248

Where we just insert the decomposition gate before the 2q gate and then just remove the node. This should be a bit faster because it avoids all the extra complexity in the dag substitution code.

Sounds good, thanks. I will modify this.

qiskit/transpiler/preset_passmanagers/builtin_plugins.py

mtreinish · 2024-07-10T12:34:53Z

releasenotes/notes/peephole-before-routing-c3d184b740bb7a8b.yaml

+    Added a new pass :class:`.Split2QUnitaries` that iterates over all two-qubit gates or unitaries in a
+    circuit and replaces them with two single-qubit unitaries, if possible without introducing errors, i.e.
+    the two-qubit gate/unitary is actually a (kronecker) product of single-qubit unitaries. In addition,
+    the passes :class:`.Collect2qBlocks`, :class:`.ConsolidateBlocks` and :class:`.Split2QUnitaries` are
+    added to the `init` stage of the preset pass managers with optimization level 2 and optimization level 3
+    if the pass managers target a :class:`.Target` that has a discrete basis gate set, i.e. all basis gates
+    have are not parameterized. The modification of the `init` stage should allow for a more efficient routing
+    for quantum circuits that (a) contain two-qubit unitaries/gates that are actually a product of single-qubit gates
+    or (b) contain multiple two-qubit gates in a continuous block of two-qubit gates. In the former case,
+    the routing of the two-qubit gate can simply be skipped as no real interaction between a pair of qubits
+    occurs. In the latter case, the lookahead space of routing algorithms is not 'polluted' by superfluous
+    two-qubit gates, i.e. for routing it is sufficient to only consider one single two-qubit gate per
+    continuous block of two-qubit gates.


I would split this into two release notes (with separate bullet points under features). One documenting the new pass, the other documenting the modifications to the preset pass manager.

releasenotes/notes/peephole-before-routing-c3d184b740bb7a8b.yaml

qiskit/transpiler/passes/optimization/split_2q_unitaries.py

qiskit/transpiler/preset_passmanagers/builtin_plugins.py

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

mtreinish

I think this is basically ready. I just left some small inline comments on the tests and docs but we can do these in follow ups pretty easily as long as we do it before the 1.2.0 final release.

test/python/compiler/test_transpiler.py

test/python/transpiler/test_split_2q_unitaries.py

qiskit/transpiler/passes/optimization/split_2q_unitaries.py

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

…to peephole-opt

sbrandhsn · 2024-08-01T12:35:48Z

The code looks fine, module plumbing in the approximation. I haven't tried checking the compile speed and/or quality of output. Would intuitively expect modest improvement in both most of the time, but there are probably pathological cases where it is worse. Looks like you are pushing the swap case for a follow up

Thanks, I decided to not include the swap gate case here because we will need to build a new dag for every input circuit. I assume most input circuits won't have swap-like circuits, so we would end up wasting that runtime while not yielding an improvement in circuit size for most cases. Also, when a user inputs a prerouted quantum circuit he would be very surprised to find that his routing solution has been replaced by something qiskit came up with which may or may not be better. I still think this is an interesting approach and should be reconsidered in the future.

mtreinish

LGTM, thanks for the quick updates.

Comments addressed

* Oxidize two qubit local invariance functions This commit migrates the two functions in the private module `qiskit.synthesis.two_qubit.local_invariance` to primarily be implemented in rust. Since the two_qubit_local_invariants() function is being used in Qiskit#12727 premptively porting these functions to rust will both potentially speed up the new transpiler pass in that PR and also facilitate a future porting of that pass to rust. The two python space functions continue to exist as a small wrapper to do input type checking/conversion and rounding of the result (since the python API for rounding is simpler). There is no release note included for these functions as they are internal utilities in Qiskit and not exposed as a public interface. * Add docstring to rust function with citation * Store adjoint magic array statically * Use arraview1 instead of slice for local_equivalence() * Fix rustfmt

…cks - before routing (#12727) * init * up * up * Update builtin_plugins.py * Update builtin_plugins.py * reno * Update builtin_plugins.py * Update builtin_plugins.py * Update peephole-before-routing-c3d184b740bb7a8b.yaml * neko check * check neko * Update builtin_plugins.py * test neko * Update builtin_plugins.py * Update builtin_plugins.py * Update builtin_plugins.py * lint * tests and format * remove FakeTorino test * Update peephole-before-routing-c3d184b740bb7a8b.yaml * Apply suggestions from code review Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * comments from code review * fix precision * up * up * update * up * . * cyclic import * cycl import * cyl import * . * circular import * . * lint * Include new pass in docs * Fix Split2QUnitaries dag manipulation This commit fixes the dag handling to do the 1q unitary insertion. Previously the dag manipulation was being done manually using the insert_node_on_in_edges() rustworkx method. However as the original node had 2 incoming edges for each qubit this caused the dag after running the pass to become corrupted. Each of the new 1q unitary nodes would end up with 2 incident edges and they would be in a sequence. This would result in later passes not being able to correctly understand the state of the circuit correctly. This was causing the unit tests to fail. This commit fixes this by just using `substitute_node_with_dag()` to handle the node substition, while doing it manually to avoid the overhead of checking is probably possible, the case where a unitary is the product of two 1q gates is not very common so optimizing it isn't super critical. * Update releasenotes/notes/peephole-before-routing-c3d184b740bb7a8b.yaml * stricter check for doing split2q * Update qiskit/transpiler/preset_passmanagers/builtin_plugins.py Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * code review * Update qiskit/transpiler/passes/optimization/split_2q_unitaries.py Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * new tests * typo * lint * lint --------- Co-authored-by: Matthew Treinish <mtreinish@kortar.org> (cherry picked from commit 1214d51)

…cks - before routing (#12727) (#12881) * init * up * up * Update builtin_plugins.py * Update builtin_plugins.py * reno * Update builtin_plugins.py * Update builtin_plugins.py * Update peephole-before-routing-c3d184b740bb7a8b.yaml * neko check * check neko * Update builtin_plugins.py * test neko * Update builtin_plugins.py * Update builtin_plugins.py * Update builtin_plugins.py * lint * tests and format * remove FakeTorino test * Update peephole-before-routing-c3d184b740bb7a8b.yaml * Apply suggestions from code review Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * comments from code review * fix precision * up * up * update * up * . * cyclic import * cycl import * cyl import * . * circular import * . * lint * Include new pass in docs * Fix Split2QUnitaries dag manipulation This commit fixes the dag handling to do the 1q unitary insertion. Previously the dag manipulation was being done manually using the insert_node_on_in_edges() rustworkx method. However as the original node had 2 incoming edges for each qubit this caused the dag after running the pass to become corrupted. Each of the new 1q unitary nodes would end up with 2 incident edges and they would be in a sequence. This would result in later passes not being able to correctly understand the state of the circuit correctly. This was causing the unit tests to fail. This commit fixes this by just using `substitute_node_with_dag()` to handle the node substition, while doing it manually to avoid the overhead of checking is probably possible, the case where a unitary is the product of two 1q gates is not very common so optimizing it isn't super critical. * Update releasenotes/notes/peephole-before-routing-c3d184b740bb7a8b.yaml * stricter check for doing split2q * Update qiskit/transpiler/preset_passmanagers/builtin_plugins.py Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * code review * Update qiskit/transpiler/passes/optimization/split_2q_unitaries.py Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * new tests * typo * lint * lint --------- Co-authored-by: Matthew Treinish <mtreinish@kortar.org> (cherry picked from commit 1214d51) Co-authored-by: Sebastian Brandhofer <148463728+sbrandhsn@users.noreply.github.com>

In Qiskit#12727 a check was added to the default init stage's construction to avoid running 2q gate consolidation in the presence of targets with only discrete gates. However the way the target was being used in this check was incorrect. The name for an instruction in the target should be used as its identifier and then if we need the object representation that should query the target for that object based on the name. However the check was doing this backwards getting a list of operation objects and then using the name contained in the object. This will cause issues for instructions that use custom names such as when there are tuned variants or a custom gate instance with a unique name. While there is some question over whether we need this check as we will run the consolidate 2q blocks pass as part of the optimization stage which will have the same effect, this opts to just fix the target usage for it to minimize the diff. Also while not the explicit goal of this check it is protecting against some bugs in the consolidate blocks pass that occur when custom gates are used. So for the short term this check is retained, but in the future when these bugs in consolidate blocks are fixed we can revisit whether we want to remove the conditional logic.

* Fix target handling in discrete basis check In #12727 a check was added to the default init stage's construction to avoid running 2q gate consolidation in the presence of targets with only discrete gates. However the way the target was being used in this check was incorrect. The name for an instruction in the target should be used as its identifier and then if we need the object representation that should query the target for that object based on the name. However the check was doing this backwards getting a list of operation objects and then using the name contained in the object. This will cause issues for instructions that use custom names such as when there are tuned variants or a custom gate instance with a unique name. While there is some question over whether we need this check as we will run the consolidate 2q blocks pass as part of the optimization stage which will have the same effect, this opts to just fix the target usage for it to minimize the diff. Also while not the explicit goal of this check it is protecting against some bugs in the consolidate blocks pass that occur when custom gates are used. So for the short term this check is retained, but in the future when these bugs in consolidate blocks are fixed we can revisit whether we want to remove the conditional logic. * Remove check and fix ConsolidateBlocks bug This commit pivots this PR branch to just remove the additional logic around skipping the optimization passes for discrete basis sets. The value the check was actually providing was not around a discrete basis set target and instead was to workaround a bug in the consolidate blocks pass. If a discrete basis set target was used this would still fail because we will unconditionally call `ConsolidateBlocks` during the optimization stage. This commit opts to just remove the extra complexity of the conditional execution of the peephole optimization passes and instead just fix the underlying bug in `ConsolidateBlocks` and remove the check.

* Fix target handling in discrete basis check In #12727 a check was added to the default init stage's construction to avoid running 2q gate consolidation in the presence of targets with only discrete gates. However the way the target was being used in this check was incorrect. The name for an instruction in the target should be used as its identifier and then if we need the object representation that should query the target for that object based on the name. However the check was doing this backwards getting a list of operation objects and then using the name contained in the object. This will cause issues for instructions that use custom names such as when there are tuned variants or a custom gate instance with a unique name. While there is some question over whether we need this check as we will run the consolidate 2q blocks pass as part of the optimization stage which will have the same effect, this opts to just fix the target usage for it to minimize the diff. Also while not the explicit goal of this check it is protecting against some bugs in the consolidate blocks pass that occur when custom gates are used. So for the short term this check is retained, but in the future when these bugs in consolidate blocks are fixed we can revisit whether we want to remove the conditional logic. * Remove check and fix ConsolidateBlocks bug This commit pivots this PR branch to just remove the additional logic around skipping the optimization passes for discrete basis sets. The value the check was actually providing was not around a discrete basis set target and instead was to workaround a bug in the consolidate blocks pass. If a discrete basis set target was used this would still fail because we will unconditionally call `ConsolidateBlocks` during the optimization stage. This commit opts to just remove the extra complexity of the conditional execution of the peephole optimization passes and instead just fix the underlying bug in `ConsolidateBlocks` and remove the check. (cherry picked from commit 70c2f78)

* Fix target handling in discrete basis check In #12727 a check was added to the default init stage's construction to avoid running 2q gate consolidation in the presence of targets with only discrete gates. However the way the target was being used in this check was incorrect. The name for an instruction in the target should be used as its identifier and then if we need the object representation that should query the target for that object based on the name. However the check was doing this backwards getting a list of operation objects and then using the name contained in the object. This will cause issues for instructions that use custom names such as when there are tuned variants or a custom gate instance with a unique name. While there is some question over whether we need this check as we will run the consolidate 2q blocks pass as part of the optimization stage which will have the same effect, this opts to just fix the target usage for it to minimize the diff. Also while not the explicit goal of this check it is protecting against some bugs in the consolidate blocks pass that occur when custom gates are used. So for the short term this check is retained, but in the future when these bugs in consolidate blocks are fixed we can revisit whether we want to remove the conditional logic. * Remove check and fix ConsolidateBlocks bug This commit pivots this PR branch to just remove the additional logic around skipping the optimization passes for discrete basis sets. The value the check was actually providing was not around a discrete basis set target and instead was to workaround a bug in the consolidate blocks pass. If a discrete basis set target was used this would still fail because we will unconditionally call `ConsolidateBlocks` during the optimization stage. This commit opts to just remove the extra complexity of the conditional execution of the peephole optimization passes and instead just fix the underlying bug in `ConsolidateBlocks` and remove the check. (cherry picked from commit 70c2f78) Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

sbrandhsn added 4 commits July 5, 2024 13:36

init

9965ce3

Merge branch 'Qiskit:main' into peephole-opt

f3af45a

up

c650503

up

092257c

sbrandhsn requested a review from a team as a code owner July 5, 2024 18:53

Update builtin_plugins.py

3befb83

sbrandhsn added 15 commits July 8, 2024 10:51

Update builtin_plugins.py

a8c5b8c

reno

90e725c

Update builtin_plugins.py

fd988e3

Update builtin_plugins.py

e765285

Update peephole-before-routing-c3d184b740bb7a8b.yaml

1805cb8

neko check

bce3fcd

check neko

e6f0b39

Update builtin_plugins.py

6f7445c

test neko

d3a45e7

Update builtin_plugins.py

dec91de

Update builtin_plugins.py

45912f8

Update builtin_plugins.py

5404822

lint

ed790c1

tests and format

1236cd6

remove FakeTorino test

db8aa2d

mtreinish mentioned this pull request Jul 8, 2024

Oxidize two qubit local invariance functions #12739

Merged

Update peephole-before-routing-c3d184b740bb7a8b.yaml

3e28ec6

mtreinish reviewed Jul 10, 2024

View reviewed changes

stricter check for doing split2q

9964d65

mtreinish reviewed Aug 1, 2024

View reviewed changes

qiskit/transpiler/preset_passmanagers/builtin_plugins.py Outdated Show resolved Hide resolved

Update qiskit/transpiler/preset_passmanagers/builtin_plugins.py

b90f9f6

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

mtreinish reviewed Aug 1, 2024

View reviewed changes

sbrandhsn and others added 5 commits August 1, 2024 14:19

code review

a7d3594

Update qiskit/transpiler/passes/optimization/split_2q_unitaries.py

89ac7dc

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

new tests

229b028

Merge branch 'peephole-opt' of https://github.com/sbrandhsn/qiskit in…

d8fba88

…to peephole-opt

typo

48eacd7

sbrandhsn added 2 commits August 1, 2024 14:39

lint

edde611

lint

5404bbc

mtreinish approved these changes Aug 1, 2024

View reviewed changes

mtreinish enabled auto-merge August 1, 2024 13:39

mtreinish added this pull request to the merge queue Aug 1, 2024

Merged via the queue into Qiskit:main with commit 1214d51 Aug 1, 2024
15 checks passed

mtreinish added the stable backport potential The bug might be minimal and/or import enough to be port to stable label Aug 1, 2024

mergify bot mentioned this pull request Aug 1, 2024

'Peephole' optimization - or: collecting and optimizing two-qubit blocks - before routing (backport #12727) #12881

Merged

mtreinish mentioned this pull request Aug 3, 2024

Remove init peephole optimization discrete basis check #12898

Merged

mergify bot mentioned this pull request Aug 8, 2024

Remove init peephole optimization discrete basis check (backport #12898) #12920

Merged

qci-amos mentioned this pull request Aug 16, 2024

Split2QUnitaries assumes node.op is ndarray #12970

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

'Peephole' optimization - or: collecting and optimizing two-qubit blocks - before routing #12727

'Peephole' optimization - or: collecting and optimizing two-qubit blocks - before routing #12727

sbrandhsn commented Jul 5, 2024

qiskit-bot commented Jul 5, 2024

coveralls commented Jul 5, 2024 •

edited

Loading

coveralls commented Jul 8, 2024 •

edited

Loading

mtreinish commented Jul 8, 2024

mtreinish left a comment

mtreinish Jul 10, 2024

sbrandhsn Jul 12, 2024

mtreinish Jul 10, 2024

sbrandhsn Jul 12, 2024

mtreinish Jul 10, 2024

mtreinish left a comment

sbrandhsn commented Aug 1, 2024

mtreinish left a comment

'Peephole' optimization - or: collecting and optimizing two-qubit blocks - before routing #12727

'Peephole' optimization - or: collecting and optimizing two-qubit blocks - before routing #12727

Conversation

sbrandhsn commented Jul 5, 2024

Summary

Details and comments

qiskit-bot commented Jul 5, 2024

coveralls commented Jul 5, 2024 • edited Loading

Pull Request Test Coverage Report for Build 9812549561

Details

💛 - Coveralls

coveralls commented Jul 8, 2024 • edited Loading

Pull Request Test Coverage Report for Build 10198580745

Warning: This coverage report may be inaccurate.

Details

💛 - Coveralls

mtreinish commented Jul 8, 2024

mtreinish left a comment

Choose a reason for hiding this comment

mtreinish Jul 10, 2024

Choose a reason for hiding this comment

sbrandhsn Jul 12, 2024

Choose a reason for hiding this comment

mtreinish Jul 10, 2024

Choose a reason for hiding this comment

sbrandhsn Jul 12, 2024

Choose a reason for hiding this comment

mtreinish Jul 10, 2024

Choose a reason for hiding this comment

mtreinish left a comment

Choose a reason for hiding this comment

sbrandhsn commented Aug 1, 2024

mtreinish left a comment

Choose a reason for hiding this comment

coveralls commented Jul 5, 2024 •

edited

Loading

coveralls commented Jul 8, 2024 •

edited

Loading