Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more default trials to sabre layout #13360

Merged
merged 9 commits into from
Nov 6, 2024

Conversation

mtreinish
Copy link
Member

@mtreinish mtreinish commented Oct 23, 2024

Summary

Right now sabre layout uses n random trials (as specified by the user or defaulting to num_cpus) and since #12453 one additional trial taking the qubits of the densest subgraph as a starting point. There are also a couple of other trivial examples to try which may or may not produce better results depending on the circuit, a trivial layout and a reverse trivial layout. In the case of a hardware efficient circuit the trivial layout will map exactly and would always be picked. When running in a preset pass manager sabre should never be called in this scenario because VF2Layout will find the mapping, but the incremental cost of adding the trial is minimal. Similarly the cost of a reversed trivial layout (where virtual qubit 0 -> n, 1 -> n - 1, etc.) is trivial and may in some scenarios produce a better results.

Details and comments

TODO:

  • Add release note

Right now sabre layout uses n random trials (as specified by the user or
defaulting to num_cpus) and since Qiskit#12453 one additional trial taking the
qubits of the densest subgraph as a starting point. There are also a
couple of other trivial examples to try which may or may not produce
better results depending on the circuit, a trivial layout and a reverse
trivial layout. In the case of a hardware efficient circuit the trivial
layout will map exactly and would always be picked. When running in a
preset pass manager sabre should never be called in this scenario
because VF2Layout will find the mapping, but the incremental cost of
adding the trial is minimal. Similarly the cost of a reversed trivial
layout (where virtual qubit 0 -> n, 1 -> n - 1, etc.) is trivial and
may in some scenarios produce a better results.
@mtreinish mtreinish added Changelog: New Feature Include in the "Added" section of the changelog mod: transpiler Issues and PRs related to Transpiler labels Oct 23, 2024
@mtreinish mtreinish added this to the 1.3.0 milestone Oct 23, 2024
@mtreinish mtreinish requested a review from a team as a code owner October 23, 2024 15:13
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

@coveralls
Copy link

coveralls commented Oct 23, 2024

Pull Request Test Coverage Report for Build 11617356503

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 19 of 64 (29.69%) changed or added relevant lines in 1 file are covered.
  • 219 unchanged lines in 18 files lost coverage.
  • Overall coverage decreased (-0.05%) to 88.668%

Changes Missing Coverage Covered Lines Changed/Added Lines %
crates/accelerate/src/sabre/layout.rs 19 64 29.69%
Files with Coverage Reduction New Missed Lines %
crates/accelerate/src/unitary_synthesis.rs 1 92.2%
qiskit/transpiler/target.py 1 93.84%
qiskit/circuit/store.py 1 97.06%
crates/accelerate/src/basis/basis_translator/basis_search.rs 1 99.31%
qiskit/qasm2/export.py 1 98.48%
crates/qasm2/src/lex.rs 2 93.23%
crates/accelerate/src/basis/basis_translator/compose_transforms.rs 3 97.48%
qiskit/circuit/controlflow/if_else.py 4 97.73%
qiskit/circuit/delay.py 5 74.24%
qiskit/qasm2/parse.py 5 96.72%
Totals Coverage Status
Change from base Build 11610955352: -0.05%
Covered Lines: 76383
Relevant Lines: 86145

💛 - Coveralls

@mtreinish
Copy link
Member Author

I ran the asv levels and utility benchmarks none of which showed any difference with this PR. The benchmarks which I think would show something here with sabre are covered by VF2Layout so it doesn't make a difference. However, I built a quick test script:

from qiskit.circuit.library import EfficientSU2
from qiskit.providers.fake_provider import GenericBackendV2
from qiskit import generate_preset_pass_manager

cmap = [[0, 14], [1, 0], [1, 2], [3, 2], [4, 3], [4, 5], [6, 5], [7, 6], [8, 7], [8, 9], [8, 16], [9, 10], [11, 10], [11, 12], [12, 13], [15, 4], [16, 26], [17, 12], [17, 30], [18, 14], [18, 19], [19, 20], [21, 20], [22, 15], [22, 21], [22, 23], [23, 24], [25, 24], [25, 26], [27, 26], [27, 28], [28, 29], [28, 35], [30, 29], [30, 31], [31, 32], [32, 36], [33, 20], [33, 39], [34, 24], [34, 43], [37, 38], [38, 39], [39, 40], [40, 41], [42, 41], [43, 42], [44, 43], [44, 45], [46, 45], [47, 35], [47, 46], [48, 47], [49, 48], [49, 55], [50, 49], [50, 51], [51, 36], [52, 37], [53, 41], [53, 60], [54, 45], [54, 64], [55, 68], [56, 52], [57, 56], [57, 58], [59, 58], [59, 60], [61, 60], [62, 61], [62, 63], [63, 64], [64, 65], [65, 66], [67, 66], [67, 68], [68, 69], [70, 69], [71, 58], [72, 62], [73, 66], [73, 85], [74, 70], [75, 90], [76, 75], [76, 77], [77, 71], [77, 78], [79, 78], [79, 91], [80, 79], [81, 72], [81, 80], [82, 81], [82, 83], [83, 92], [84, 83], [84, 85], [86, 85], [87, 86], [87, 93], [88, 87], [89, 74], [89, 88], [93, 106], [94, 90], [94, 95], [96, 95], [96, 97], [96, 109], [97, 98], [98, 91], [98, 99], [99, 100], [101, 100], [101, 102], [102, 92], [102, 103], [104, 103], [104, 111], [105, 104], [105, 106], [106, 107], [107, 108], [109, 114], [110, 100], [112, 108], [113, 114], [115, 114], [116, 115], [117, 116], [118, 110], [118, 117], [119, 118], [120, 119], [121, 120], [122, 111], [122, 121], [122, 123], [124, 123], [125, 124], [125, 126], [126, 112]]
backend = GenericBackendV2(127, basis_gates=["ecr", "id", "rz", "sx", "x"], coupling_map=cmap)

qc = EfficientSU2(127, entanglement='circular', reps=3, insert_barriers=True, flatten=True)

import time

pm = generate_preset_pass_manager(2, backend=backend)
times = []
ecr_counts = []
depths = []
for i in range(25):
    start = time.perf_counter()
    res = pm.run(qc)
    stop = time.perf_counter()
    times.append(stop - start)
    ecr_count = res.count_ops()["ecr"]
    ecr_counts.append(ecr_count)
    depth = res.depth()
    depths.append(depth)

import statistics
print(f"Avg runtime: {statistics.geometric_mean(times)}")
print(f"Minimum ECR count: {min(ecr_counts)}")
print(f"Median ecr count: {statistics.median(ecr_counts)}")
print(f"Minimum depth: {min(depths)}")
print(f"Median depth: {statistics.median(depths)}")

and ran it with this PR and main. On main it returned:

Avg runtime: 4.788427858917832
Minimum ECR count: 1506
Median ecr count: 1793
Minimum depth: 5422
Median depth: 6328

with this PR it returned:

Avg runtime: 4.736172954507784
Minimum ECR count: 1374
Median ecr count: 1494
Minimum depth: 4849
Median depth: 5443

ElePT
ElePT previously approved these changes Oct 28, 2024
Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a smart addition, these are the kind of edge cases that come up when benchmarking against other tools :)

@mtreinish mtreinish requested a review from ElePT October 29, 2024 17:45
Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakelishman how comfortable are you with the sneaky HW-dependent extra trials? I see:

  • Pros: better outcome for a lot of practical use cases
  • Cons: it's incomplete. We might generate an expectation on users that wouldn't be met for architectures outside of the ones contemplated in the "hack".

I'm still leaning towards the yes (not only because of the xkcd reference), but I might be missing something.

qiskit/transpiler/passes/layout/sabre_layout.py Outdated Show resolved Hide resolved
crates/accelerate/src/sabre/layout.rs Show resolved Hide resolved
mtreinish and others added 2 commits October 31, 2024 11:22
Co-authored-by: Elena Peña Tapia <57907331+ElePT@users.noreply.github.com>
@alexanderivrii
Copy link
Contributor

Note that for the 127-qubit layout we found a maximum-sized ring (over 104 qubits), and then inserted the remaining 23 qubits near their respective neighbors in the ring, while for the 133- and 156-qubit layouts we just chose a maximum-sized ring.

To decide which of the two strategies is better, I have modified the script above to transpile EfficientSU2(nq, entanglement='circular', reps=3, insert_barriers=True, flatten=True) into the 127-qubit heavy-hex coupling map, for all possible values nq = 1, 2, ..., 127. Each experiment was run 10 times, calculating as in the script the average run time, the minimum and the median ECR counts, and the minimum and the median depths. Furthermore, I have experimented with 4 different strategies to choose the additional default sabre layouts: (1) default = without this PR, (2) default + linear + reverse linear layouts (in this PR), (3) default + ring over 104 qubits (not in this PR), (4) default + ring with addition qubits inserted (in this PR). Attaching the data in case someone wants to make sense of it:

RingLayouts.xlsx

@mtreinish mtreinish requested a review from ElePT November 5, 2024 16:10
Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alexanderivrii, interesting! thanks for running the data. Looking (crudely) at the median depth ratios over the default strategy I see that all 3 strategies lead to some improvement in the geometric mean of the ratios:

  • (2) default + linear + reverse linear layouts (in this PR) -> 0.93
  • (3) default + ring over 104 qubits (not in this PR) -> 0.67
  • (4) default + ring with addition qubits inserted (in this PR) -> 0.68

And I'd say the difference between 3 an 4 doesn't look significative (especially given the number of samples). If I have learned anything from the last week is that the statistical distribution of the 2q depth ratios behaves weirdly enough not to extract conclusions from its geometric mean, but in case you are curious, I got:

  • (2) default + linear + reverse linear layouts (in this PR) -> 0.91
  • (3) default + ring over 104 qubits (not in this PR) -> 0.64
  • (4) default + ring with addition qubits inserted (in this PR) -> 0.69

Given this, and the closeness to the rc1 deadline, I would vote for merging the current state of the PR and leave changes for follow-ups.

@ElePT ElePT added this pull request to the merge queue Nov 6, 2024
Merged via the queue into Qiskit:main with commit a18ef02 Nov 6, 2024
17 checks passed
@mtreinish mtreinish deleted the more-sabre-default-trials branch November 6, 2024 11:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: New Feature Include in the "Added" section of the changelog mod: transpiler Issues and PRs related to Transpiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants