Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potential non-determinism in DAGOpNode sort key #9569

Merged
merged 4 commits into from
Aug 3, 2023

Conversation

mtreinish
Copy link
Member

Summary

This commit fixes potential source of non-determinism when topologically sorting a DAGCircuit object. The lexicographical_topological_sort() function we're using from rustworkx takes a sort key callback that returns a string which is used for tie breaking in the topological sort. The string used for op nodes was previously str(qargs), but this is potentially problematic for standalone Qubit objects. The repr for a qubit without a register set will include the memory address of the bit object. This could result in the topological sort differing between executions of the same program which could lead to unexpected results. This commit fixes this by changing the sort key to be a string of the qubit indices in the dag instead of the qubit objects in qargs.

This commit will likely introduce a small performance regression because we're adding overhead on every DAGOpNode creation. This regression can be addressed when #9389 is implemented because we can just pass the data structure mapping bit objects to indices to the constructor instead of having to build that mapping on each op node.

Additionally, some test cases were fixed as part of this commit, because several DAGCircuit test cases were incorrectly calling apply_operation_back() and setting clbits as qargs which will now fail because of the index lookup on qubits.

Details and comments

This commit fixes potential source of non-determinism when topologically
sorting a DAGCircuit object. The lexicographical_topological_sort()
function we're using from rustworkx takes a sort key callback that
returns a string which is used for tie breaking in the topological sort.
The string used for op nodes was previously `str(qargs)`, but this is
potentially problematic for standalone Qubit objects. The repr for a
qubit without a register set will include the memory address of the bit
object. This could result in the topological sort differing between
executions of the same program which could lead to unexpected results.
This commit fixes this by changing the sort key to be a string of the
qubit indices in the dag instead of the qubit objects in qargs.

This commit will likely introduce a small performance regression because
we're adding overhead on every DAGOpNode creation. This regression can
be addressed when Qiskit#9389 is implemented because we can just pass the data
structure mapping bit objects to indices to the constructor instead of
having to build that mapping on each op node.

Additionally, some test cases were fixed as part of this commit, because
several DAGCircuit test cases were incorrectly calling
apply_operation_back() and setting clbits as qargs which will now fail
because of the index lookup on qubits.
@mtreinish mtreinish added Changelog: None Do not include in changelog mod: transpiler Issues and PRs related to Transpiler labels Feb 10, 2023
@mtreinish mtreinish requested a review from a team as a code owner February 10, 2023 16:34
@qiskit-bot
Copy link
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deeply unfortunately, SabreSwap and LookaheadSwap both manually construct some DAGOpNodes - I think we need to thread the sort-key generation through them as well (and stop them from doing this, at some point). Looking at the SabreSwap code now, it also just for fun mutates the qargs after the node is created, so its produced sort key won't even be correct 👍.

@coveralls
Copy link

coveralls commented Feb 10, 2023

Pull Request Test Coverage Report for Build 5752236784

  • 8 of 8 (100.0%) changed or added relevant lines in 2 files are covered.
  • 8 unchanged lines in 2 files lost coverage.
  • Overall coverage decreased (-0.002%) to 85.914%

Files with Coverage Reduction New Missed Lines %
qiskit/extensions/quantum_initializer/squ.py 2 80.0%
crates/qasm2/src/lex.rs 6 91.39%
Totals Coverage Status
Change from base Build 5750977952: -0.002%
Covered Lines: 73084
Relevant Lines: 85066

💛 - Coveralls

@mtreinish
Copy link
Member Author

Deeply unfortunately, SabreSwap and LookaheadSwap both manually construct some DAGOpNodes - I think we need to thread the sort-key generation through them as well (and stop them from doing this, at some point). Looking at the SabreSwap code now, it also just for fun mutates the qargs after the node is created, so its produced sort key won't even be correct +1.

Looking through the code, SabreSwap and probably LookaheadSwap (I'm confident in SabreSwap and I think LookaheadSwap, but I'm not as familiar with the code and might have missed some details) I think it actually will work correctly. The manually constructed DAGOpNode is used solely as a container of operations and operands between functions. In SabreSwap it permutates the qargs and then calls apply_operation_back(node.op, node.qargs, node.cargs) . LookaheadSwap looks like it does something similar too.

@jakelishman
Copy link
Member

Ah, you're right - it'll just continue to work. But given there's a copy.copy on the node there (Whyyyyyyyyy??? We own it!), that looks like an easy performance win for 1000q Sabre as well. I'll fix that up in a separate PR.

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm kind of concerned that the performance regression here will be too large, until we've got #9389 in place. Maybe it would be best to wait? The potential bug we identified for this is probably super arcane right now, because most topological iteration through DAGs happens in the transpiler, and layout/routing passes canonicalise to a single register, so no loose bits.

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be time to revisit this, now that #10128 is merged. Minor comment: when we're generating the sort key, we might want to zero-pad out the elements of the key so the lexicographical sort works like a numeric sort? It won't affect correctness, but might make iteration slightly less confusing.

I also suspect that this might have some performance improvements for topological sorting; the key comparisons will be comparing much shorted strings.

@mtreinish mtreinish modified the milestones: 0.25.1, 0.45.0 Aug 3, 2023
@mtreinish
Copy link
Member Author

I updated the PR to use DAGCircuit.find_bit() in d34c05d. I'll run a quick asv comparison for some transpile and dag creation benchmarks to see if it detects and change

jakelishman
jakelishman previously approved these changes Aug 3, 2023
Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks cleaner, and hopefully less performance overhead in creation than the previous version as well!

I left one comment, but I don't feel strongly about it.

qiskit/dagcircuit/dagnode.py Outdated Show resolved Hide resolved
This commit updates the sort key construction to do two things, first it
updates the integer indices used in the sort key to be 0 padded so that
the sort order is more intuitive. At the same time the sort key is not
including the cargs for an operation to ensure that's being factored
into the sort order.

Co-authored-by: Jake Lishman <jake.lishman@ibm.com>
Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me now. Preliminary benchmarking from Matt's machine suggests that there's no significant performance regressions or improvements here. There's potentially an issue with topological iteration between two circuits constructed in slightly different orders that contain operations on (e.g.) q0, q1 and separately c0, c1, but we think that the impact will be incredibly low, if it exists at all, and we're likely to change how cargs are handled anyway.

@jakelishman jakelishman enabled auto-merge August 3, 2023 16:37
@jakelishman jakelishman added this pull request to the merge queue Aug 3, 2023
Merged via the queue into Qiskit:main with commit a44185d Aug 3, 2023
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Aug 29, 2023
In a typical circuit there are only certain numbers of qargs that are
used regularly. The best examples are 1q gates on each qubit in the
target, or 2q gates on each edge in the connectivity graph. As the qargs
will be repeated many times in a typical circuit then for most
operations the sort key attribute is also not-unique (as it's just
composed of the positional indices of each qarg and carg). In very large
circuit compilation we spend a non-insignificant amount of time
constructing the sort key strings on nodes during the lifecycle of the
DAGCircuit as it gets transformed. This commit caches the sort keys so
that they're not rebuilt for every new DAGNode. This does add some extra
overhead to the construction the first time a DAGNode is created because
it has to do an additional qarg tuple construction and a dict lookup.
But this is worth it because in most circutis the cache hit rate will be
very high and we'll avoid the more expensive string construction for
most gates in a circuit.

There is also a potential shared state issue here if any qubit or clbit
objects are present in >1 DAGCircuit and the relative positions of those
bits differ between the circuits the sort key will not be the same as
before because the earlier qubit combinations. This is because the cache
is stored as a class attribute on the DAGOpNode class and will be reused
between circuits. In practice this is unlikely to come up very frequently.
But also it's impact is mitigated by these keys just being used for
sorting and the sorting hasn't always been 100% deterministic anyway as
long as it's consistent for the life of the DAGCircuit being exactly the
same between multiple DAGCircuit objects has never been a guarantee
(although we tried to improve it in Qiskit#9569).
github-merge-queue bot pushed a commit that referenced this pull request Aug 30, 2023
* Cache DAGOpNode sort keys

In a typical circuit there are only certain numbers of qargs that are
used regularly. The best examples are 1q gates on each qubit in the
target, or 2q gates on each edge in the connectivity graph. As the qargs
will be repeated many times in a typical circuit then for most
operations the sort key attribute is also not-unique (as it's just
composed of the positional indices of each qarg and carg). In very large
circuit compilation we spend a non-insignificant amount of time
constructing the sort key strings on nodes during the lifecycle of the
DAGCircuit as it gets transformed. This commit caches the sort keys so
that they're not rebuilt for every new DAGNode. This does add some extra
overhead to the construction the first time a DAGNode is created because
it has to do an additional qarg tuple construction and a dict lookup.
But this is worth it because in most circutis the cache hit rate will be
very high and we'll avoid the more expensive string construction for
most gates in a circuit.

There is also a potential shared state issue here if any qubit or clbit
objects are present in >1 DAGCircuit and the relative positions of those
bits differ between the circuits the sort key will not be the same as
before because the earlier qubit combinations. This is because the cache
is stored as a class attribute on the DAGOpNode class and will be reused
between circuits. In practice this is unlikely to come up very frequently.
But also it's impact is mitigated by these keys just being used for
sorting and the sorting hasn't always been 100% deterministic anyway as
long as it's consistent for the life of the DAGCircuit being exactly the
same between multiple DAGCircuit objects has never been a guarantee
(although we tried to improve it in #9569).

* Move sort key cache to be DAG instance scoped

This commit reworks the storage location for the cached dag sort keys.
In the previous commit the sort keys were stored as a class attribute on
the DAGOpNode. As was mentioned in that commit message this had
potential shared state issues. To address that potential issue this
commit moves the cache to be scoped to an instance of DAG as that is
really what the sort key is scoped to. At the same time the lookup key
for the cache is changed to be (qargs, cargs) instead of the flattened
tuple.

* Use a more accurate not overloaded variable name for the cache lookup key
SamD-1998 pushed a commit to SamD-1998/qiskit-terra that referenced this pull request Sep 7, 2023
* Fix potential non-determinism in DAGOpNode sort key

This commit fixes potential source of non-determinism when topologically
sorting a DAGCircuit object. The lexicographical_topological_sort()
function we're using from rustworkx takes a sort key callback that
returns a string which is used for tie breaking in the topological sort.
The string used for op nodes was previously `str(qargs)`, but this is
potentially problematic for standalone Qubit objects. The repr for a
qubit without a register set will include the memory address of the bit
object. This could result in the topological sort differing between
executions of the same program which could lead to unexpected results.
This commit fixes this by changing the sort key to be a string of the
qubit indices in the dag instead of the qubit objects in qargs.

This commit will likely introduce a small performance regression because
we're adding overhead on every DAGOpNode creation. This regression can
be addressed when Qiskit#9389 is implemented because we can just pass the data
structure mapping bit objects to indices to the constructor instead of
having to build that mapping on each op node.

Additionally, some test cases were fixed as part of this commit, because
several DAGCircuit test cases were incorrectly calling
apply_operation_back() and setting clbits as qargs which will now fail
because of the index lookup on qubits.

* Update sort_key creation to use DAGCircuit.find_bit

* 0 pad integer indices and include cargs in sort key

This commit updates the sort key construction to do two things, first it
updates the integer indices used in the sort key to be 0 padded so that
the sort order is more intuitive. At the same time the sort key is not
including the cargs for an operation to ensure that's being factored
into the sort order.

Co-authored-by: Jake Lishman <jake.lishman@ibm.com>

---------

Co-authored-by: Jake Lishman <jake.lishman@ibm.com>
SamD-1998 pushed a commit to SamD-1998/qiskit-terra that referenced this pull request Sep 7, 2023
* Cache DAGOpNode sort keys

In a typical circuit there are only certain numbers of qargs that are
used regularly. The best examples are 1q gates on each qubit in the
target, or 2q gates on each edge in the connectivity graph. As the qargs
will be repeated many times in a typical circuit then for most
operations the sort key attribute is also not-unique (as it's just
composed of the positional indices of each qarg and carg). In very large
circuit compilation we spend a non-insignificant amount of time
constructing the sort key strings on nodes during the lifecycle of the
DAGCircuit as it gets transformed. This commit caches the sort keys so
that they're not rebuilt for every new DAGNode. This does add some extra
overhead to the construction the first time a DAGNode is created because
it has to do an additional qarg tuple construction and a dict lookup.
But this is worth it because in most circutis the cache hit rate will be
very high and we'll avoid the more expensive string construction for
most gates in a circuit.

There is also a potential shared state issue here if any qubit or clbit
objects are present in >1 DAGCircuit and the relative positions of those
bits differ between the circuits the sort key will not be the same as
before because the earlier qubit combinations. This is because the cache
is stored as a class attribute on the DAGOpNode class and will be reused
between circuits. In practice this is unlikely to come up very frequently.
But also it's impact is mitigated by these keys just being used for
sorting and the sorting hasn't always been 100% deterministic anyway as
long as it's consistent for the life of the DAGCircuit being exactly the
same between multiple DAGCircuit objects has never been a guarantee
(although we tried to improve it in Qiskit#9569).

* Move sort key cache to be DAG instance scoped

This commit reworks the storage location for the cached dag sort keys.
In the previous commit the sort keys were stored as a class attribute on
the DAGOpNode. As was mentioned in that commit message this had
potential shared state issues. To address that potential issue this
commit moves the cache to be scoped to an instance of DAG as that is
really what the sort key is scoped to. At the same time the lookup key
for the cache is changed to be (qargs, cargs) instead of the flattened
tuple.

* Use a more accurate not overloaded variable name for the cache lookup key
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog mod: transpiler Issues and PRs related to Transpiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants