Optimized MPI simulator #1515
bool multi_chunk_swap_enable_ = true; //enable multi-chunk swaps
uint_t chunk_swap_buffer_qubits_ = 15; //maximum buffer size in qubits for chunk swap
uint_t max_multi_swap_; //maximum swaps can be applied at a time, calculated by chunk_swap_buffer_bits_
These values should be included in the result metadata when MPI is used.
I added a minor comment about metadata, but basically all changes look good to me.
LGTM
Summary
This PR optimizes the MPI simulator.
Details and comments
The cache blocking transpiler inserts swap operations to reorder qubits,
and in the MPI simulator some of these swaps require data exchange between processes.
Previously each swap operation was applied independently; this PR merges multiple swaps to reduce data exchange.
This improves scalability.
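As a rough illustration of why merging helps, here is a toy model (not the code in this PR) that counts exchange rounds. It assumes a swap needs inter-process exchange when either qubit index is at or above `chunk_bits`, and that up to `max_multi_swap` consecutive such swaps can share one round. The helper names `crosses_chunk`, `exchanges_independent` and `exchanges_merged` are hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Swap = std::pair<unsigned, unsigned>;

// Hypothetical helper: in this toy model a swap needs data exchange between
// processes when at least one of its qubits lies outside the local chunk.
bool crosses_chunk(const Swap& s, unsigned chunk_bits) {
  return s.first >= chunk_bits || s.second >= chunk_bits;
}

// Exchange rounds when every swap is applied on its own.
std::size_t exchanges_independent(const std::vector<Swap>& swaps,
                                  unsigned chunk_bits) {
  std::size_t rounds = 0;
  for (const auto& s : swaps) {
    if (crosses_chunk(s, chunk_bits)) ++rounds;
  }
  return rounds;
}

// Exchange rounds when consecutive cross-chunk swaps are merged,
// with at most max_multi_swap swaps per round (assumed batching rule).
std::size_t exchanges_merged(const std::vector<Swap>& swaps,
                             unsigned chunk_bits,
                             std::size_t max_multi_swap) {
  if (max_multi_swap == 0) max_multi_swap = 1;  // degenerate case: no merging
  std::size_t rounds = 0;
  std::size_t run = 0;  // swaps already placed in the current round
  for (const auto& s : swaps) {
    if (crosses_chunk(s, chunk_bits)) {
      if (run == 0) ++rounds;  // open a new exchange round
      ++run;
      if (run == max_multi_swap) run = 0;  // round is full
    } else {
      run = 0;  // a local swap ends the current run in this model
    }
  }
  return rounds;
}
```

With `chunk_bits = 30` and the swaps `(0,30), (1,31), (2,32), (3,4), (5,30)`, this model counts four rounds when swaps are applied independently and two when up to three swaps are merged per round.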
The graph shows weak scaling of Quantum Volume simulation with 30, 31, 32 and 33 qubits on 1, 2, 4 and 8 nodes (30 qubits per node) on IBM Power System AC922 (6x Tesla V100 GPUs). QW20 is the result reported in the paper (https://arxiv.org/pdf/2102.02957.pdf), Base is the latest Qiskit Aer without this PR, and Optimized is with this PR.
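The qubit counts follow from the weak-scaling setup: each doubling of the node count doubles the number of amplitudes, which adds one qubit. A minimal sketch of that arithmetic, assuming the node count is a power of two (`total_qubits` is a hypothetical name used only for illustration):

```cpp
// Total qubits simulated when `nodes` (assumed to be a power of two)
// each hold a chunk of 2^qubits_per_node amplitudes.
unsigned total_qubits(unsigned qubits_per_node, unsigned nodes) {
  unsigned extra = 0;
  while (nodes > 1) {  // computes log2(nodes) for a power of two
    nodes >>= 1;
    ++extra;
  }
  return qubits_per_node + extra;
}
```

For 30 qubits per node this gives 30, 31, 32 and 33 qubits for 1, 2, 4 and 8 nodes, matching the configurations listed above.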