
In this example, we illustrate how to use [`mgrad_gaussian`](https://blackjax-devs.github.io/blackjax/mcmc.html#blackjax.mgrad_gaussian), the implementation of the marginal sampler from the article [Auxiliary gradient-based sampling algorithms](https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12269). We do so using the simulated data from the example [Gaussian Regression with the Elliptical Slice Sampler](https://blackjax-devs.github.io/blackjax/examples/GP_EllipticalSliceSampler.html). Please also refer to the complementary example [Bayesian Logistic Regression With Latent Gaussian Sampler](https://blackjax-devs.github.io/blackjax/examples/LogisticRegressionWithLatentGaussianSampler.html).

## Sampler Overview

In this section, we give a brief overview of the idea behind this particular sampler. For more details, please refer to the original paper [Auxiliary gradient-based sampling algorithms](https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12269) ([here](https://arxiv.org/abs/1610.09641) you can access the arXiv preprint).

### Motivation: Auxiliary Metropolis-Hastings samplers

Let us recall how to sample from a target density $\pi(\mathbf{x})$ with a Metropolis-Hastings sampler through a *marginal scheme*. The main idea is to have a mechanism that generates proposals $\mathbf{y}$, which we then accept or reject according to a specific criterion. Concretely (a code sketch follows the list below):

1. First we draw an auxiliary variable $\mathbf{u} \sim q(\mathbf{u}|\mathbf{x})$.
2. Then we generate the proposal $\mathbf{y} \sim q(\mathbf{y}|\mathbf{x}, \mathbf{u})$.
3. Compute the Metropolis-Hastings ratio

$$
\varrho = \frac{\pi(\mathbf{y})q(\mathbf{x}|\mathbf{y})}{\pi(\mathbf{x})q(\mathbf{y}|\mathbf{x})}
$$

where $q(\mathbf{y}|\mathbf{x})$ is the overall proposal density, which can be obtained by integrating out the auxiliary variable $\mathbf{u}$:

$$
q(\mathbf{y}|\mathbf{x}) = \int q(\mathbf{y}|\mathbf{x}, \mathbf{u})q(\mathbf{u}|\mathbf{x})\,d\mathbf{u}
$$

4. Accept the proposal $\mathbf{y}$ with probability $\min(1, \varrho)$ and reject it otherwise.
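
To make the scheme concrete, here is a minimal sketch of one such marginal step in JAX. The callables `logpi`, `sample_q`, and `logq` are placeholders for the log target density, a sampler for the overall proposal $q(\mathbf{y}|\mathbf{x})$, and its log density; none of them are part of `blackjax`.

```python
import jax
import jax.numpy as jnp


def marginal_mh_step(rng_key, x, logpi, sample_q, logq):
    # logpi(x): log pi(x); sample_q(key, x): draws y ~ q(y|x) with u
    # marginalized out; logq(y, x): evaluates log q(y|x).
    key_prop, key_acc = jax.random.split(rng_key)
    y = sample_q(key_prop, x)
    # Log of the Metropolis-Hastings ratio varrho
    log_rho = logpi(y) + logq(x, y) - logpi(x) - logq(y, x)
    accept = jnp.log(jax.random.uniform(key_acc)) < log_rho
    return jnp.where(accept, y, x)
```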

This approach can be extended to what is known as *an auxiliary sampler* by considering the extended target (assuming the product form) $\pi(\mathbf{x}, \mathbf{u}) = \pi(\mathbf{x}) q(\mathbf{u}|\mathbf{x})$. For this target we can generate proposals much as before, via Hastings-within-Gibbs (again, a sketch follows the list below):

1. Sample $\mathbf{u}|\mathbf{x} \sim \pi(\mathbf{u}|\mathbf{x}) = q(\mathbf{u}|\mathbf{x})$.
2. Generate the proposal $\mathbf{y}|\mathbf{u}, \mathbf{x} \sim q(\mathbf{y}|\mathbf{x}, \mathbf{u})$.
3. Compute the Metropolis-Hastings ratio

$$
\tilde{\varrho} = \frac{\pi(\mathbf{y}|\mathbf{u})q(\mathbf{x}|\mathbf{y}, \mathbf{u})}{\pi(\mathbf{x}|\mathbf{u})q(\mathbf{y}|\mathbf{x}, \mathbf{u})}
$$

4. Accept the proposal $\mathbf{y}$ with probability $\min(1, \tilde{\varrho})$ and reject it otherwise.
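
The extended scheme has the same shape; in the sketch below (again with placeholder callables) the intractable conditional $\pi(\cdot|\mathbf{u}) \propto \pi(\cdot)q(\mathbf{u}|\cdot)$ is expanded inside the ratio, so only $\pi$, $q(\mathbf{u}|\mathbf{x})$, and $q(\mathbf{y}|\mathbf{x}, \mathbf{u})$ need to be evaluated.

```python
import jax
import jax.numpy as jnp


def auxiliary_mh_step(rng_key, x, logpi, sample_qu, logqu, sample_qy, logqy):
    # sample_qu(key, x) / logqu(u, x): sample and evaluate q(u|x);
    # sample_qy(key, x, u) / logqy(y, x, u): sample and evaluate q(y|x, u).
    key_u, key_y, key_acc = jax.random.split(rng_key, 3)
    u = sample_qu(key_u, x)     # 1. refresh the auxiliary variable
    y = sample_qy(key_y, x, u)  # 2. propose y given x and u
    # 3. log ratio, with pi(.|u) expanded as pi(.) q(u|.) up to a constant
    log_rho = (
        logpi(y) + logqu(u, y) + logqy(x, y, u)
        - logpi(x) - logqu(u, x) - logqy(y, x, u)
    )
    accept = jnp.log(jax.random.uniform(key_acc)) < log_rho  # 4. accept/reject
    return jnp.where(accept, y, x)
```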

### Example: Auxiliary Metropolis-Adjusted Langevin Algorithm (MALA)

Let us consider the case of a random walk proposal $N(\mathbf{u}|\mathbf{x}, (\delta /2) \mathbf{I})$ for $\delta > 0$. In [Section 2.2 of *Auxiliary gradient-based sampling algorithms*](https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/rssb.12269), it is shown that one can use a first-order approximation to sample from the (intractable) density $\pi(\mathbf{x}|\mathbf{u})$, so that

$$
q(\mathbf{y}|\mathbf{u}, \mathbf{x}) \propto N(\mathbf{y}|\mathbf{u} + (\delta/2)\nabla \log \pi(\mathbf{x}), (\delta/2) \mathbf{I}).
$$

The resulting marginal sampler is the Metropolis-adjusted Langevin algorithm (MALA) where

$$
q(\mathbf{y}| \mathbf{x}) \propto N(\mathbf{y}|\mathbf{x} + (\delta/2)\nabla \log \pi(\mathbf{x}), \delta \mathbf{I}).
$$
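
As a quick sanity check of this identity, the sketch below draws from the two-stage proposal for a toy Gaussian target and verifies empirically that the marginal over $\mathbf{y}$ has the MALA mean $\mathbf{x} + (\delta/2)\nabla \log \pi(\mathbf{x})$ and covariance $\delta \mathbf{I}$; all names here are illustrative.

```python
import jax
import jax.numpy as jnp

logpi = lambda x: -0.5 * jnp.sum(x**2)  # toy target: standard Gaussian
grad_logpi = jax.grad(logpi)

delta = 0.5
x = jnp.array([1.0, -2.0])
key_u, key_y = jax.random.split(jax.random.PRNGKey(0))

n = 200_000
# u ~ N(x, (delta/2) I), then y ~ N(u + (delta/2) grad log pi(x), (delta/2) I)
u = x + jnp.sqrt(delta / 2) * jax.random.normal(key_u, (n, 2))
y = u + (delta / 2) * grad_logpi(x) + jnp.sqrt(delta / 2) * jax.random.normal(key_y, (n, 2))

print(y.mean(axis=0))            # ~ x + (delta/2) grad log pi(x)
print(jnp.cov(y, rowvar=False))  # ~ delta * I
```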

### Latent Gaussian Models

A particular case of interest is the latent Gaussian model, where the target density has the form

$$
\pi(\mathbf{x}) \propto \overbrace{\exp\{f(\mathbf{x})\}}^{\text{likelihood}} \underbrace{N(\mathbf{x}|\mathbf{0}, \mathbf{C})}_{\text{Gaussian Prior}}
$$

In this case, combining the random walk proposal $N(\mathbf{u}|\mathbf{x}, (\delta /2) \mathbf{I})$ with the first-order approximation yields the following proposal density:

$$
q(\mathbf{y}|\mathbf{x}, \mathbf{u}) \propto N\left(\mathbf{y}|\frac{2}{\delta} \mathbf{A}\left(\mathbf{u} + \frac{\delta}{2}\nabla f(\mathbf{x})\right), \mathbf{A}\right),
$$

where $\mathbf{A} = (\delta / 2)\left(\mathbf{C} + (\delta / 2)\mathbf{I}\right)^{-1}\mathbf{C}$. The corresponding marginal proposal density is

$$
q(\mathbf{y}|\mathbf{x}) \propto N\left(\mathbf{y}|\frac{2}{\delta} \mathbf{A}\left(\mathbf{x} + \frac{\delta}{2}\nabla f(\mathbf{x})\right), \frac{2}{\delta}\mathbf{A}^2 + \mathbf{A}\right).
$$

Sampling from $\pi(\mathbf{x}, \mathbf{u})$ (and therefore from $\pi(\mathbf{x})$) is then done via Hastings-within-Gibbs, as described above. The sketch below makes the linear algebra explicit.
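
To make the formulas concrete, the following sketch computes $\mathbf{A}$ and the parameters of $q(\mathbf{y}|\mathbf{x})$ directly; `blackjax` performs the equivalent computations internally, so this function is purely illustrative.

```python
import jax.numpy as jnp


def marginal_proposal_params(x, grad_f_x, C, delta):
    # C: prior covariance; grad_f_x: gradient of the log-likelihood f at x.
    dim = C.shape[0]
    # A = (delta/2) (C + (delta/2) I)^{-1} C, via a solve instead of an inverse
    A = (delta / 2) * jnp.linalg.solve(C + (delta / 2) * jnp.eye(dim), C)
    mean = (2 / delta) * A @ (x + (delta / 2) * grad_f_x)
    cov = (2 / delta) * A @ A + A
    return mean, cov
```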

---

Now that we have a high-level understanding of the algorithm, let's see how to use it in `blackjax`.

```python
import jax

# ... (data simulation and model setup are elided in this view) ...


def inference_loop(rng, init_state, kernel, n_iter):
    # Standard scan-driven sampling loop: apply the kernel n_iter times,
    # collecting the chain states and the per-step sampler info.
    keys = jax.random.split(rng, n_iter)

    def step(state, key):
        state, info = kernel(key, state)
        return state, (state, info)

    _, (states, info) = jax.lax.scan(step, init_state, keys)
    return states, info
```

We are now ready to run the sampler! The only extra parameter in the `step` function is `delta`, which (as seen in the sampler description above) corresponds, in a loose sense, to the step size of the MALA algorithm.

**Remark:** Note that one can calibrate the `delta` parameter as described in the example [Bayesian Logistic Regression With Latent Gaussian Sampler](https://blackjax-devs.github.io/blackjax/examples/LogisticRegressionWithLatentGaussianSampler.html).
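
For orientation, a hypothetical end-to-end wiring could look as follows. Here `loglikelihood_fn` and `Sigma` stand in for the model pieces defined earlier, and the exact `mgrad_gaussian` signature may differ across `blackjax` versions, so please check the API documentation linked above.

```python
import blackjax
import jax
import jax.numpy as jnp

# Illustrative sketch only; verify the factory and step signatures against
# the blackjax version you have installed.
algo = blackjax.mgrad_gaussian(loglikelihood_fn, Sigma)

delta = 0.05  # illustrative value; calibrate as discussed in the remark above
kernel = lambda key, state: algo.step(key, state, delta)

init_state = algo.init(jnp.zeros(Sigma.shape[0]))
states, info = inference_loop(jax.random.PRNGKey(0), init_state, kernel, 5_000)
```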
