
SparsePauliOp.to_matrix() exhibits numerical instabilities #13413

Open
donsano33 opened this issue Nov 8, 2024 · 4 comments
Labels
bug (Something isn't working) · type: feature request (New feature or request)

Comments

@donsano33

Environment

  • Qiskit version: 1.2.4
  • Python version: 3.11
  • Operating system: Ubuntu 24.04.1 LTS

What is happening?

When calculating the matrix representation of a SparsePauliOp consisting of multiple terms via to_matrix(), numbers that should be exactly zero come out not as zero but as values on the order of 1e-18.
This does not happen with qiskit<1.0.

How can we reproduce the issue?

from qiskit.quantum_info import SparsePauliOp

factor = 0.025
terms = [
    factor * SparsePauliOp(data=["IIII", "IIZI", "IIIZ", "IIZZ"], coeffs=[1, -1, -1, 1]),
    factor * SparsePauliOp(data=["IIII", "IZII", "IIZI", "IZZI"], coeffs=[1, -1, -1, 1]),
]
op = SparsePauliOp.sum(terms)
op_matrix = op.to_matrix(sparse=True)
print(op_matrix.diagonal())

This also appears when using sparse=False.

What should happen?

In qiskit<1.0 the output looks like this:

[0.  0.  0.  0.1
0.  0.  0.1 0.2
0.  0.  0.  0.1
0.  0.  0.1 0.2]

With qiskit>=1.0 (here qiskit==1.2.4) the output looks somewhat like this:

[ 6.9388939e-18  6.9388939e-18 -6.9388939e-18  1.0000000e-01
  0.0000000e+00  0.0000000e+00  1.0000000e-01  2.0000000e-01
  6.9388939e-18  6.9388939e-18 -6.9388939e-18  1.0000000e-01
  0.0000000e+00  0.0000000e+00  1.0000000e-01  2.0000000e-01]

Interestingly, the positions of the small numbers change when I rerun the same script as a fresh invocation. If I run the statements in a loop within the same process, the positions of the small numbers stay constant.

Any suggestions?

Just a first guess from looking at the implementation of SparsePauliOp.to_matrix(): the issue might be in that function itself, or in the accelerated to_matrix_sparse() and to_matrix_dense() functions it imports and calls:

data, indices, indptr = to_matrix_sparse(zx, force_serial=force_serial)

return to_matrix_dense(zx, force_serial=force_serial)

@donsano33 added the bug label Nov 8, 2024
@jakelishman
Member

jakelishman commented Nov 8, 2024

The number 0.025 isn't exactly representable in floating-point arithmetic - the closest value to it is actually the huge fraction $\frac{3602879701896397}{144115188075855872}$. That means the "correct" result of the matrix calculation of a sum isn't necessarily exactly what you expect, and in floating-point arithmetic the order in which the additions are done can affect the result.

The fact that there aren't exact zeros isn't a bug. A side-effect of the new algorithm for calculating the matrices (which is several hundred times faster than the previous method, especially for large sparse matrices) is that we first simplify the operator before doing the sum to reduce the amount of work (though that doesn't change anything in your case), and I think that can cause the terms to turn up in a non-deterministic order that's fixed at the time of import qiskit. The other possibility is that the distribution of the matrix calculation onto threads is shifting the order of operations - if you set force_serial=True in the to_matrix call, that might also stabilise it (but I don't think it will in this case).
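(That exact fraction is what Python's standard-library fractions module reports for the stored double, if you want to see it directly:)

>>> from fractions import Fraction
>>> Fraction(0.025)
Fraction(3602879701896397, 144115188075855872)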

Assuming the non-determinism is what I suggested, we can look to fix the order of evaluation there. If I'm right, it should be as simple as swapping the HashMap in MatrixCompressedPaulis::combine out for an indexmap::IndexMap. That could be handled as a bug fix.

As a feature request, we could add a "tolerance" field to to_matrix to manually set a threshold to suppress small terms.
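In the meantime, a rough user-side workaround (not part of the Qiskit API, and with an arbitrarily chosen threshold) is to floor small entries of the returned matrix yourself, reusing op from the reproduction snippet above:

import numpy as np

tol = 1e-12  # user-chosen cut-off for "numerically zero"

# Sparse case: to_matrix(sparse=True) returns a SciPy CSR matrix, so zero out
# the tiny stored values and then drop the explicit zeros.
sparse_matrix = op.to_matrix(sparse=True)
sparse_matrix.data[np.abs(sparse_matrix.data) < tol] = 0
sparse_matrix.eliminate_zeros()

# Dense case: a plain NumPy array, so a boolean mask does the same job.
dense_matrix = op.to_matrix()
dense_matrix[np.abs(dense_matrix) < tol] = 0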

The fact that Qiskit pre-1.0 was giving exact zeros for your matrix before is most likely a bit of chance - the trouble with inexact floating-point numbers and accumulation of round-off error is fundamental to numerical computing.

@donsano33
Copy link
Author

Thanks for the quick reply.
The number 0.025 was actually just a representative example. In our actual use case the factor isn't even constant and usually has many more digits after the decimal point, but I get your point.

As you already expected, using force_serial=True does not change the behavior.

It would be really nice to be able to specify a "tolerance" below which all values are floored to exactly zero.

@jakelishman
Member

jakelishman commented Nov 8, 2024

Sure, no worries! The reason I mentioned the inexact floating-point number is that it's not really specific to 0.025 - that's just representative of the problem. It has a chance of happening with any coefficients that aren't exactly integer multiples of powers of two; the number of digits when written in base 10 doesn't matter. If you tried it with 0.25, you should find it's always entirely stable (for those two operators), because that value is exactly representable in floating point and never accumulates round-off error by being added to itself. Just to illustrate what's going on, in raw Python:

>>> 0.025 + 0.025 + 0.025 - 0.025 - 0.025 - 0.025
6.938893903907228e-18
>>> 0.025 - 0.025 + 0.025 - 0.025 + 0.025 - 0.025
0.0

It so happens that 0.025 + 0.025 in double-precision floating point is also represented by the closest floating-point number to 0.05, but both floating-point numbers are truly ever so slightly higher (by about $1.39\times10^{-18}$ and $2.78\times10^{-18}$ respectively). Then 0.05 + 0.025 in double-precision floating point ends up being ever so slightly closer to $0.07500000000000001$ than to $0.075$ (or at least the closest floats to those two values). So if you add together three 0.025s before subtracting one, you get a round-off accumulation, but if you do 0.025 - 0.025 summed three times, you keep cancelling it out - the order of operations matters numerically, when in "real" maths it doesn't.
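Those intermediate sums are easy to check directly in plain Python (IEEE-754 doubles):

>>> 0.025 + 0.025 == 0.05
True
>>> 0.05 + 0.025
0.07500000000000001
>>> 0.05 + 0.025 == 0.075
False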

The tolerance argument is a feature, and we can do that in a future release (probably Qiskit 2.0 now, since 1.3 closed for new features two weeks ago). Fixing the stability of the order of operations is something we can do in a patch release, though - that's the IndexMap thing I mentioned above.

@donsano33
Author

Thanks again for the wonderful explanation, very insightful.

What are the next steps? The fix for the stability of the order of operations can happen in this bug ticket, right?
For the feature request, should I create a new one, or will you handle that?
