MCCL Parallel Processing
Janaka Ranasinghesagara edited this page Nov 8, 2023
A single Monte Carlo simulation can be run across multiple CPUs by using the "cpucount" command line option. For example, to use 8 CPUs:
mc cpucount=8 infile=myinfile.txt
- If the computer has only 4 CPUs, four simulations are started first and the remaining four are started as the first four finish.
- If N, the total number of photons to be launched as specified in the infile, is not divisible by 8, the number of photons actually run is floor(N/8)*8. The results are normalized by this number instead of N.
- You can set "cpucount=all", which uses "Environment.ProcessorCount" to determine how many CPUs are available. Do this only on a private computer. On a public cluster it may request too many CPUs, and the job may wait a long time for that many to become available (see timing results below).
- To ensure that the random number generators used on the different CPUs produce uncorrelated streams, we employ the Dynamic Creator Mersenne Twister (dcmt, https://github.com/MersenneTwister-Lab/dcmt). The dcmt finds sub-streams within the original Mersenne Twister (MT) that do not overlap and are long enough to encompass all random numbers needed in a simulation.
- The dcmt code was written in C and we ported it to C#. The details of the code can be seen by cloning a copy of our source code. Where the purpose of the code was known, we added comments describing it.
- We ran timing studies on the University of California, Irvine High Performance Cluster (https://rcic.uci.edu/hpc3/hpc3.html). The sample infile=infile_one_layer_ROfRho_FluenceOfRhoAndZ.txt was used in all of the simulations. The number of photons launched, N, was set to 1e5, 1e6 and 1e7, and cpucount was set to 1, 2, 4 and 8 (16 was also tried, but it took a long time before 16 CPUs became available). We obtained the following timing results, in seconds:
| N | cpucount=1 | cpucount=2 | cpucount=4 | cpucount=8 |
|---|---|---|---|---|
| 1e7 | 38901 | 17779 | 17151 | 16980 |
| 1e6 | 3051 | 1815 | 1698 | 1350 |
| 1e5 | 448 | 178 | 184 | 208 |
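
The photon-count rule and the timing results above can be explored with a short Python sketch. It is an illustration only and is not taken from the MCCL source (which is written in C#): the function names `split_photons` and `speedup` are hypothetical, and the even split of floor(N/cpucount) photons per CPU is an assumption inferred from the floor(N/8)*8 rule stated above. The timing values are copied from the table.

```python
def split_photons(n_photons: int, cpucount: int) -> tuple[int, int]:
    """Return (photons per CPU, total photons actually run).

    Assumes each CPU receives floor(N / cpucount) photons, so that the
    total run is floor(N / cpucount) * cpucount, as described above.
    """
    if cpucount < 1:
        raise ValueError("cpucount must be at least 1")
    per_cpu = n_photons // cpucount
    return per_cpu, per_cpu * cpucount


# Timing results in seconds, copied from the table above.
# Outer key: N (photons launched); inner key: cpucount.
TIMINGS = {
    10_000_000: {1: 38901, 2: 17779, 4: 17151, 8: 16980},
    1_000_000: {1: 3051, 2: 1815, 4: 1698, 8: 1350},
    100_000: {1: 448, 2: 178, 4: 184, 8: 208},
}


def speedup(n_photons: int, cpucount: int) -> float:
    """Ratio of the cpucount=1 time to the time measured at the given cpucount."""
    row = TIMINGS[n_photons]
    return row[1] / row[cpucount]


if __name__ == "__main__":
    # An N that is not divisible by 8: three photons are dropped.
    per_cpu, total = split_photons(1_000_003, 8)
    print(f"photons per CPU: {per_cpu}, total run (used for normalization): {total}")

    # Speedup relative to a single CPU for every entry in the table.
    for n, row in TIMINGS.items():
        ratios = ", ".join(f"cpucount={c}: {speedup(n, c):.2f}x" for c in row)
        print(f"N={n:.0e}: {ratios}")
```

Because the results are normalized by the number of photons actually run, the first value returned by `split_photons` times `cpucount`, not the N given in the infile, is the count to use when comparing against a single-CPU run.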