Randomized Quasi-Monte Carlo Sampling in The Random Ray Method #3268
base: develop
Conversation
question: Wouldn't it also be nice if this procedure could be generalized for use in stochastic volume runs?
Thanks @spasmann for putting in this PR -- it's nicely implemented and is a very fun addition to the code. I also did some of my own testing of the new feature. I ran 2D C5G7 30 times with each sampling method (with different seeds each time) but otherwise the same configuration (flat source, 142k FSRs, 1750 rays/iteration). The ensemble statistics look fantastic for the new randomized Halton sampling scheme:
| | PRNG | Halton |
|---|---|---|
| k-eff | 1.18675 | 1.18676 |
| k-eff std. dev. [pcm] | 39 | 15 |
| Average Pin Error (AAPE) | 0.53 | 0.47 |
| AAPE std. dev. | 0.11 | 0.05 |
| RMS | 0.68 | 0.59 |
| RMS std. dev. | 0.14 | 0.06 |
| Max pin power error | 2.37 | 2.19 |
| Max std. dev. | 0.46 | 0.29 |
The spatial error metrics show a modest ~10% absolute improvement (and look pretty much identical to what you reported). There is also a really huge reduction in the std. dev. of the spatial error metrics throughout the ensemble, generally around 2x on average. The improvement in the eigenvalue is really major: without introducing any observable bias, the std. dev. is reduced by 2.6x. Assuming approximately equal runtimes, this is like a 7x FOM improvement overall! I'll also note that these are all ensemble uncertainties (rather than what gets listed by OpenMC), so this is definitely a legitimate reduction; we are not getting fooled by underestimation of uncertainty.
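For reference, the figure-of-merit arithmetic behind that estimate (the standard definition with variance $\sigma^2$ and runtime $T$, assuming equal runtimes):

$$
\mathrm{FOM} = \frac{1}{\sigma^2 T}, \qquad
\frac{\mathrm{FOM}_{\text{Halton}}}{\mathrm{FOM}_{\text{PRNG}}} \approx \left( \frac{\sigma_{\text{PRNG}}}{\sigma_{\text{Halton}}} \right)^{2} = 2.6^2 \approx 6.8
$$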
I also tried out the Kobayashi dog leg problem (a fixed source problem). Ensemble testing has shown both methods get the same answer, so that's good! However, there isn't much impact on uncertainty: for a tally in the far region of the problem, PRNG gets a 2.4% std. dev., whereas RQMC gets 2.5%.
Anyway, my early take is that it really helps in some cases and doesn't hurt in others, so on the whole I'd recommend folks use it!
```cpp
// Function to generate randomized Halton sequence samples
//
// Algorithm adapted from:
// A. B. Owen. A randomized Halton algorithm in R. arXiv, June 2017.
// URL https://arxiv.org/abs/1706.02808
vector<vector<float>> rhalton(int N, int dim, uint64_t* seed, int64_t skip = 0)
{
  int b;
  double b2r;
  vector<double> ans(N);
  vector<int> ind(N);
  vector<int> primes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29};
  vector<vector<float>> halton(N, vector<float>(dim, 0.0));

  std::iota(ind.begin(), ind.end(), skip);

  for (int D = 0; D < dim; ++D) {
    b = primes[D];
    b2r = 1.0 / b;
    vector<int> res(ind);
    std::fill(ans.begin(), ans.end(), 0.0);

    while ((1.0 - b2r) < 1.0) {
      vector<int> dig(N);
      // Randomly permute the digits 0..b-1 for this base
      vector<int> perm(b);
      std::iota(perm.begin(), perm.end(), 0);
      fisher_yates_shuffle(perm, seed);

      for (int i = 0; i < N; ++i) {
        dig[i] = res[i] % b;
        ans[i] += perm[dig[i]] * b2r;
        res[i] = (res[i] - dig[i]) / b;
      }
      b2r /= b;
    }

    for (int i = 0; i < N; ++i) {
      halton[i][D] = ans[i];
    }
  }

  return halton;
}
```
- Given that the particle positions and directions are stored in FP64, would it make sense to allow the sampling to also be 64-bit?
- This is a nice and high-level/abstract implementation of the sampling scheme, though the "abstractness" is a double-edged sword, as it requires a significant amount of dynamic memory allocation through all the usage of `vector`. My guess is that this causes most of the added expense of the sampling scheme compared to PRNG sampling. With that in mind, as the function is only ever called in OpenMC as `rhalton(1, 5, current_seed(), skip)`, we could in theory make a simpler and faster function that only generates a single 5-dimensional sample, thus eliminating the need for several vectors (or replacing them with statically sized `std::array`). However, it's possible other areas of OpenMC may want to use Halton sampling in the future, in which case perhaps it is worthwhile to keep things abstract? At the very least, it seems like we can limit things to a single sample at a time (eliminating the `N` argument and thus several dynamically sized vectors), as in the sketch below.
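To make the suggestion above concrete, here is a minimal sketch (not part of the PR) of what an allocation-free, single-sample variant might look like. The function name `rhalton_single` is hypothetical, the permutation uses an inline Fisher-Yates loop in place of the PR's `fisher_yates_shuffle()` helper, and `prn()` is assumed to be OpenMC's native generator from `random_lcg.h`; it also returns FP64 samples, per the first point above.

```cpp
#include <array>
#include <cstdint>
#include <numeric>
#include <utility>

// Assumed available from OpenMC's random_lcg.h: returns a uniform
// sample on [0,1) and advances the seed in place.
double prn(uint64_t* seed);

// Hypothetical allocation-free variant of rhalton(): generates a single
// 5-dimensional randomized Halton point, replacing the dynamically
// sized vectors with fixed-size std::array.
std::array<double, 5> rhalton_single(uint64_t* seed, int64_t skip = 0)
{
  constexpr std::array<int, 5> primes {2, 3, 5, 7, 11};
  constexpr int max_base = 11;
  std::array<double, 5> sample {};

  for (int D = 0; D < 5; ++D) {
    const int b = primes[D];
    double b2r = 1.0 / b;
    int64_t res = skip; // index of this point in the Halton sequence
    double ans = 0.0;

    while ((1.0 - b2r) < 1.0) {
      // Randomly permute the digits 0..b-1 (inline Fisher-Yates)
      std::array<int, max_base> perm;
      std::iota(perm.begin(), perm.begin() + b, 0);
      for (int i = b - 1; i > 0; --i) {
        int j = static_cast<int>(prn(seed) * (i + 1));
        std::swap(perm[i], perm[j]);
      }

      const int dig = static_cast<int>(res % b);
      ans += perm[dig] * b2r;
      res = (res - dig) / b;
      b2r /= b;
    }
    sample[D] = ans;
  }
  return sample;
}
```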
```cpp
// Sample spatial distribution
Position xi {samples[0][0], samples[0][1], samples[0][2]};
Position shift {1e-9, 1e-9, 1e-9};
```
Wherever possible, "magic numbers" in OpenMC are defined in `constants.h`, so it's easier to keep track of how many things like the `1e-9` shift here we have baked into the code. You could add a new constant in `constants.h` with this value, or potentially just reuse an existing one, since this shift is related to geometry floating-point sensitivity. Some existing options from `constants.h`:
```cpp
// Used for surface current tallies
constexpr double TINY_BIT {1e-8};

// Used for precision in geometry
constexpr double FP_PRECISION {1e-14};
constexpr double FP_REL_PRECISION {1e-5};
constexpr double FP_COINCIDENT {1e-12};
```
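For illustration, the first option might look something like the following sketch (the constant name `RAY_SOURCE_SHIFT` is hypothetical, not an existing OpenMC identifier):

```cpp
// In constants.h (hypothetical new entry):
constexpr double RAY_SOURCE_SHIFT {1e-9};

// At the sampling site, replacing the bare 1e-9 literals:
Position shift {RAY_SOURCE_SHIFT, RAY_SOURCE_SHIFT, RAY_SOURCE_SHIFT};
```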
```cpp
// Set random number seed
int64_t batch_seed = (simulation::current_batch - 1) * settings::n_particles;
int64_t skip = id();
init_particle_seeds(batch_seed, seeds());
stream() = STREAM_TRACKING;
```
In a similar manner to my comment regarding simplification vs. leaving things abstract for the `rhalton` function, if we end up deciding to just have it generate a single sample, can we change the interface to rely only on the `current_seed()` (appropriately offset by the particle ID), thus allowing this code to be moved back up to the caller so that it doesn't need to be (partially) reproduced between `sample_rng()` and `sample_halton()`? The logic of RQMC sampling makes it seem like we need to keep the seed batch-specific, but I wanted to double check!
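For illustration, the refactor being suggested might look roughly like this (a hypothetical sketch: the `sample_rng`/`sample_halton` signatures and the `settings::use_halton` flag are guesses, while `init_particle_seeds`, `seeds()`, `stream()`, and `STREAM_TRACKING` come from the diff above):

```cpp
// Caller performs the batch-specific seeding once...
int64_t batch_seed = (simulation::current_batch - 1) * settings::n_particles;
init_particle_seeds(batch_seed, seeds());
stream() = STREAM_TRACKING;

// ...so both samplers can take the prepared seed directly and neither
// needs to (partially) reproduce the setup internally:
SourceSite site = settings::use_halton ? sample_halton(current_seed(), id())
                                       : sample_rng(current_seed());
```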
Hey @ebknudsen! Definitely -- I have talked with @paulromano and @pshriwise about this (in particular, with regard to PR #3129). The tentative plan is to get #3129 merged, and then we can look at testing alternative methods for sampling the rays.
Description
This PR adds a randomized Halton sequence, a form of randomized quasi-Monte Carlo (RQMC) sampling, as an alternative to typical pseudo-random number sampling in The Random Ray Method (TRRM). RQMC methods decrease the variance of the simulation by providing a more uniform distribution of samples for initializing ray positions and directions of travel compared to random samples.
The Halton sampling method can be selected through the Python API's random ray settings. OpenMC's native RNG remains the default sampling method, but it can also be specified explicitly.
QMC & RQMC Background
Quasi-Monte Carlo (QMC) techniques use low-discrepancy sequences in place of the typical pseudo-random number generators used in Monte Carlo techniques. Like pseudo-random number generators, low-discrepancy sequences are intended to produce $N$ points distributed as $U[0,1]^D$. Unlike pseudo-random numbers, low-discrepancy sequences use deterministic algorithms designed to maximize the distance between the points, which is measured using discrepancy statistics.
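As background (standard QMC material, not specific to this PR), the most common such statistic is the star discrepancy: the largest deviation between the empirical fraction of points falling in an anchored box and that box's volume,

$$
D_N^* = \sup_{\mathbf{t} \in [0,1]^D} \left| \frac{\#\{\, \mathbf{x}_i \in [\mathbf{0}, \mathbf{t}) \,\}}{N} - \prod_{d=1}^{D} t_d \right|.
$$

Low-discrepancy constructions such as the Halton sequence achieve $D_N^* = O(\log^D N / N)$, compared with roughly $N^{-1/2}$ for pseudo-random points.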
Despite improved sampling characteristics, QMC techniques are not commonly used in particle transport applications. This is because each point generated in the sequence depends on the previous ones, making QMC techniques ill-suited for modeling the particle random-walk process. Fortunately, as the name implies, TRRM performs a ray-tracing procedure uniformly through the problem, bypassing the need to explicitly model the scattering or fission processes. This makes it a well-suited application for (R)QMC techniques.
Standard QMC methods are inherently deterministic: each time a sequence is initialized, the same samples are generated. However, for each batch in TRRM, rays are emitted at different locations and travel in different directions, so new QMC samples are required for each batch. Most QMC sequences are extensible in $D$; however, QMC performance takes a significant hit beyond $D > 9$, so simply extending the sequence to new dimensions for each batch is impractical. Instead, we can utilize randomized QMC (RQMC) techniques. RQMC refers to a host of techniques that randomize or shuffle the original QMC sequence while maintaining the low discrepancy of the samples. By randomizing the sequence each iteration, we can generate a unique set of RQMC samples for each set of particles. Additionally, by randomizing the QMC samples we can estimate the variance of the integral (just like in analog MC and traditional random ray), something that is difficult to do with unrandomized QMC, as shown below.
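Concretely (standard RQMC practice rather than anything specific to this implementation), running $R$ independent randomizations gives estimates $\hat{\mu}_1, \ldots, \hat{\mu}_R$ that can be combined and used for an unbiased variance estimate, just as with independent MC replicas:

$$
\hat{\mu} = \frac{1}{R} \sum_{r=1}^{R} \hat{\mu}_r, \qquad
\widehat{\operatorname{Var}}(\hat{\mu}) = \frac{1}{R(R-1)} \sum_{r=1}^{R} \left( \hat{\mu}_r - \hat{\mu} \right)^2.
$$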
The image below shows samples from a pseudo-random number generator, the Halton sequence, and the randomized Halton sequence.
A few helpful resources on QMC and RQMC include:
Implementation
The RQMC method implemented here is Owen's randomization of the Halton sequence. Owen's randomization is well-suited for applications like TRRM because it introduces independent random permutations to the samples, making it possible to generate $R$ distinct sample sets. As with OpenMC's native RNG, reproducibility of the samples is ensured by seeding; OpenMC's native RNG is itself used in the randomization process. Additionally, Owen's randomization allows for the efficient generation of new samples in parallel by constructing rows in the range $\left[N, N^\prime\right]$. Finally, Owen's randomization method was chosen for its straightforward implementation and lack of any additional user input.
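The row-range extensibility can be illustrated with the `rhalton()` function from the diff above. This small sketch (not from the PR) shows that rows generated in a separate call via the `skip` argument line up with rows of a single larger call when the starting seeds match, since the per-digit permutations depend only on the seed, not on `N` or `skip`:

```cpp
#include <cstdint>
#include <vector>
using std::vector;

// Signature from this PR's diff:
vector<vector<float>> rhalton(int N, int dim, uint64_t* seed, int64_t skip = 0);

int main()
{
  uint64_t seed_a = 42;
  auto all_rows = rhalton(104, 5, &seed_a, 0); // rows 0..103

  uint64_t seed_b = 42;
  auto tail_rows = rhalton(4, 5, &seed_b, 100); // rows 100..103 directly

  // Expect all_rows[100 + i][d] == tail_rows[i][d] for all i and d,
  // which is what lets independent ranks generate disjoint row ranges
  // in parallel.
  return 0;
}
```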
Verification & Performance
The implementation has been verified by evaluating pin cell fission powers from the 2D C5G7 problem. The performance boost provided by RQMC samples compared to random samples will vary by problem. In general, RQMC should give better performance on problems with higher spatial sensitivity, e.g., a fine spatial mesh or TRRM's linear source approximation.
Results were generated on one Perlmutter node (2x AMD EPYC 7763), utilizing one MPI rank and 256 OpenMP threads. In the following figures, each data point represents the mean result from 80 individual simulations (each set with a different seed), while the shaded regions represent one standard deviation.
142,964 Flat Source Regions
Surprisingly, simulations using Halton samples were on average over 10% faster on the fine mesh.
17,924 Linear Source Regions
The coarse mesh runtimes are nearly identical.
Runtime Performance
It is very unlikely that the Halton sampling algorithm itself is faster than OpenMC's native RNG (although I have not compared them). My current hypothesis is that the Halton samples result in reduced shared memory contention because the samples are generated in such a way that each subsequent point will be "far away" from the previous point. The plot below shows the percent speedup of the Halton sample simulations compared to the RNG simulations as a function of the number of OpenMP threads used and generally supports this hypothesis.
However, the trend is noisy, and I would have expected this phenomenon to be more pronounced on coarse meshes; instead, the runtime performance of the coarse mesh simulations was nearly identical. I'll need to run more tests to figure out exactly what is giving Halton sampling a runtime advantage in the fine mesh simulations. In any case, it doesn't seem that Halton sampling hurts runtime, which makes sense, as the bulk of the compute time is spent in the ray tracing procedure.
Checklist