Improve the execution time of computing mean and var #787
Comments
Parallelization is not done under Windows because of differences in how it is implemented across OS platforms, which meant it could not be used there. The basis is described in the WeightedPauliOperator constructor.
By default there is no grouping of the Paulis, and the expectation value of each is computed separately. VQE, however, will by default convert a WeightedPauliOperator to a TPBGroupedWeightedPauliOperator, which groups them by Tensor Product Basis. Performance is something we are continually looking to improve. One thing you can try to improve performance is to use Aer. We are also looking again at operators, and in particular at the way expectation and evolution are used. As I stated, performance is always a concern for us and we are always looking for improvements. Here is the Qiskit/RFCs#8 description of this upcoming effort.
Thanks for your reply, I didn't notice that the
When you say your operator has a length of 1, do you mean the number of Paulis in it? In the WeightedPauliOperator each Pauli string gets run separately as a circuit. In the TPBGrouped operator they are grouped by Tensor Product Basis and each group is run in a single go. If you call print_details() on the TPB operator you can see the number of groups and which Pauli(s) are in each group. In this case the parallel map should be able to split by groups. For parallelized operation, the Aer simulator has backend options for running in parallel too; see https://qiskit.org/documentation/api/qiskit.providers.aer.backends.QasmSimulator.html
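To make the grouping idea concrete, here is a minimal plain-Python sketch of grouping Pauli labels by a shared tensor product basis. It is only an illustration under assumptions: the helper names (`compatible`, `merge_basis`, `group_by_tpb`) are hypothetical, the grouping is a simple greedy pass, and it is not the algorithm Aqua's TPBGroupedWeightedPauliOperator actually uses.

```python
def compatible(basis, label):
    """Two labels fit one basis if, on every qubit, they agree or one is 'I'."""
    return all(x == y or x == "I" or y == "I" for x, y in zip(basis, label))


def merge_basis(basis, label):
    """Fill the identity positions of the basis with the label's operators."""
    return "".join(y if x == "I" else x for x, y in zip(basis, label))


def group_by_tpb(labels):
    """Greedily assign each Pauli label to the first compatible group."""
    groups = []  # each entry is [basis, members]
    for label in labels:
        for group in groups:
            if compatible(group[0], label):
                group[0] = merge_basis(group[0], label)
                group[1].append(label)
                break
        else:
            # No existing group can measure this label, so start a new one.
            groups.append([label, [label]])
    return [(basis, members) for basis, members in groups]


labels = ["ZZ", "ZI", "IZ", "XX", "XI", "YY"]
groups = group_by_tpb(labels)
for basis, members in groups:
    print(basis, members)
```

Each group needs only one measurement circuit. An operator whose Paulis all share one basis (for example, only `Z` and `I` terms) collapses into a single group, which leaves a group-level parallel map with nothing to split.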
Oh, it's clear now, thanks. The TPBGrouped Paulis used in my notebook end up in a single group of 225 Paulis, which explains why the parallelization is not performed. Also, as I mentioned above, splitting this TPB group is not a good approach, because the calculation of certain covariances would then be omitted. I will see how to do the parallelization directly with the Aer simulator.
Aer simulators parallelize by circuits and shots. However, there is no parallelization over operator components in the expectation value calculation. @chriseclectic do you think it is worth pursuing in the statevector, stabilizer, and MPS simulators? VQE instances have many thousands of components.
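For reference, the per-component work being discussed can be sketched in plain Python. This is a hedged illustration, not Aqua's or Aer's implementation: the function names are hypothetical, the counts are made up, and it assumes the circuit was already rotated into the group's basis and that character positions in the Pauli label line up with positions in the bitstring.

```python
def eigenvalue(bits, label):
    """+1 or -1: parity of the measured 1s on the non-identity positions."""
    parity = sum(1 for b, p in zip(bits, label) if p != "I" and b == "1") % 2
    return -1 if parity else 1


def pauli_mean_var(counts, label):
    """Mean and variance of one Pauli estimated from a shared counts dict."""
    shots = sum(counts.values())
    mean = sum(eigenvalue(bits, label) * n for bits, n in counts.items()) / shots
    # A Pauli squares to the identity, so its variance is 1 - mean^2.
    return mean, 1.0 - mean ** 2


def pauli_cov(counts, a, b):
    """Covariance of two Paulis measured in the same shots."""
    shots = sum(counts.values())
    mean_ab = sum(
        eigenvalue(bits, a) * eigenvalue(bits, b) * n for bits, n in counts.items()
    ) / shots
    return mean_ab - pauli_mean_var(counts, a)[0] * pauli_mean_var(counts, b)[0]


# Illustrative counts only, not real results.
counts = {"00": 600, "11": 400}
group = ["ZZ", "ZI", "IZ"]
stats = {label: pauli_mean_var(counts, label) for label in group}
cov_zi_iz = pauli_cov(counts, "ZI", "IZ")
```

Each entry of `stats` depends only on the shared counts and its own label, so the per-Pauli evaluations are independent and could in principle be mapped over workers. The covariance terms, however, need both Paulis evaluated on the same shots, which is why splitting one group into separately measured pieces loses them.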
I received a request from a client on this issue: https://github.ibm.com/IBM-Q-General/client-support/issues/123. Do you have any idea what I can tell them for now to improve their execution time (reducing the number of shots, for example), and whether an improvement in the expectation value calculation is expected soon from the operator redesign?
How much improvement do you see when you set the number of shots to 1? I believe it would lead to significant improvement, and this is what I'd suggest to the clients.
With shots=1 each step takes approximately 45 seconds (circuit creation + transpilation + measurement + energy evaluation), which is much faster than with 1024 shots. But I'm wondering whether the minimum energy is reached in a reasonable number of steps.
What do you mean by "step"? In VQE, circuit creation and transpilation are done only once. Then there are many (300 in the client's snippet) iterations of measurement and energy evaluation.
Sorry, I meant the time between two energy evaluations. I use the logging and I noticed that for one step, as you described above, there are several energy evaluations. Is that right? And before each energy evaluation the log shows a circuit transpilation process.
I guess that the transpilation you see in the log is just the binding of parameters, which sets real values into the parameterized circuit, thus transforming it into a concrete circuit that can be executed. It doesn't perform a transpilation in the sense of optimizing the circuit.
@AzizNgoueya just checking - are you using the newest official release of Aqua, i.e. 0.6.2, which was released in mid-December? (You gave no indication of the version when creating this issue, and the client support issue linked above shows 0.6.0.) Only the newest version has the parameterized circuit support in Aqua, which speeds up the process by only having to transpile once.
OK, I used the 0.6.1 version of Aqua; I will try with 0.6.2.
If you upgrade qiskit
@woodsp-ibm the execution time is better with the
@AzizNgoueya Great, that's good to hear. Out of curiosity, how much did things improve for you in the end?
@woodsp-ibm the energy evaluation takes approximately 100 seconds with 25 qubits (compared with more than 1000 seconds with 1024 shots). It's an incredible improvement.
What is the expected enhancement?
Right now, computing the mean and var using a WeightedPauliOperator and results on 25 qubits takes a long time for just one iteration (1038 seconds). I tried to parallelize, but it seems that parallelization is deactivated on Windows according to this code: https://github.com/Qiskit/qiskit-terra/blob/792e1b7f866b9f3685566341fa6b4b54d5ba33e9/qiskit/tools/parallel.py#L115.
Is there a reason for this?
Also, if we look at this code https://github.com/Qiskit/qiskit-aqua/blob/44a94674e9d3937f277fb19112885fa6073048c4/qiskit/aqua/operators/weighted_pauli_operator.py#L774, the list of WeightedPauliOperator and result counts has a length greater than 1 only if there is more than one Pauli basis. This is important because the parallel_map function can then process the list in several batches. However, in my code I have only one basis, but it is big, so it is executed in one batch and takes a very long time.
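The behaviour described above can be illustrated with a small stand-in written against the Python standard library. This is a sketch under assumptions, not the code of qiskit-terra's `parallel_map`: `guarded_map` and `square` are hypothetical names, and the conditions shown (skip the pool on Windows, and skip it when there is only one item) are a simplified reading of the discussion in this issue.

```python
import platform
from concurrent.futures import ProcessPoolExecutor


def square(x):
    """Placeholder task; it stands in for evaluating one basis group."""
    return x * x


def guarded_map(task, values, num_processes=2):
    """Hypothetical helper: map serially on Windows or for a single item,
    otherwise spread the items over a process pool."""
    values = list(values)
    use_pool = (
        platform.system() != "Windows"
        and num_processes > 1
        and len(values) > 1
    )
    if not use_pool:
        return [task(v) for v in values]
    with ProcessPoolExecutor(max_workers=num_processes) as pool:
        return list(pool.map(task, values))


# A single (large) item gains nothing from the pool, whatever the platform.
single_batch = guarded_map(square, [225], num_processes=4)
```

With one basis group there is one item to map, so the work runs in a single process no matter how many workers are requested. Speeding that case up requires parallelizing inside the group rather than across groups.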