Chapter 9 Monte Carlo Integration
Obtaining numerical results through repeated random sampling and computation.
According to Murdoch [2000], the term Monte Carlo originally referred to simulations that involved random walks and was first used by John von Neumann and S. M. Ulam in the 1940s.
Monte Carlo methods involve simulating a process multiple times and using statistical inference to approximate unknown quantities.
The fundamental idea is to use randomness to solve deterministic problems.
1. Identify an appropriate model (or models) representing the population of interest.
2. Generate random observations using the model (or models).
3. Compute the outcome based on the sampled inputs. For example, calculate the value of the statistic of interest using the random observations.
4. Repeat Steps 2 and 3 for \(M\) trials.
5. Aggregate the outcomes of the \(M\) trials. For example, use the \(M\) values to study the distribution of the statistic.
R provides a rich set of functions for generating random numbers and running Monte Carlo simulations. Below are some fundamental examples.
Motivation: Probability as Integration of a Density Function
We can obtain probabilities using Monte Carlo simulation, based on the a posteriori (relative-frequency) approach to assigning probabilities.
Suppose you want to obtain the probability \(P(X\leq x)\), where \(X\) follows a distribution \(F\).
The general idea for obtaining a posteriori probabilities using Monte Carlo simulation is as follows:
Generate \(M\) samples from the population \(F\).
The empirical CDF is given by
\[ F_M(x) = \frac{1}{M}\sum_{m=1}^M I(x_m\leq x) \approx P(X\leq x) \] where \(I(\cdot)\) is the indicator function.
Theorem 9.1 (Glivenko-Cantelli Theorem) \[ \text{As } M \rightarrow\infty, \quad\underset{x\in \mathbb{R}}{\sup}|F_M(x)-F(x)| \rightarrow 0 \text{ almost surely} \]
Example 1: Exponential
As a quick example, suppose \(X \sim Exp(\lambda = 2)\). We know that \(P(X\leq 1) = 1 - \exp(-2\cdot 1)\approx 0.86\).
If we draw a large number of samples from \(Exp(2)\), around \(86\%\) of the generated values will be less than or equal to 1.
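A minimal sketch of this check in R (the seed and number of samples are arbitrary choices, so the estimate will vary slightly):

set.seed(1)               # arbitrary seed for reproducibility
M <- 1e5                  # number of Monte Carlo samples
x <- rexp(M, rate = 2)    # draws from Exp(lambda = 2)
round(mean(x <= 1), 2)    # proportion of draws at or below 1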
## [1] 0.86
We can plot the theoretical and empirical CDFs to check whether they really look the same.
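One way to draw this comparison in R, reusing the draws in x from above (a sketch; the plotting details are a matter of taste):

plot(ecdf(x), main = "Empirical vs. theoretical CDF of Exp(2)")   # empirical CDF
curve(pexp(x, rate = 2), add = TRUE, col = "red", lwd = 2)        # theoretical CDF
legend("bottomright", c("empirical", "theoretical"),
       col = c("black", "red"), lty = 1)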
Example 2: CLT
For another demonstration, we visualize the Central Limit Theorem. Recall its (basic) statement:
Theorem 9.2 (Central Limit Theorem) Let \(X_1,X_2,...,X_n\) be a random sample from a population with mean \(\mu\) and finite variance \(\sigma^2\), and let \(\bar{X}_n=\frac{1}{n}\sum_{i=1}^nX_i\).
For a sufficiently large sample size \(n\), the statistic
\[ \frac{\bar{X}_n-\mu}{\sigma/\sqrt{n}} \] approximately follows the standard normal distribution.
Suppose the population has a distribution \(Poisson(\lambda = 6)\). The mean and variance are both \(\lambda = 6\).
Let’s repeatedly draw a random sample of size \(n=30\) and examine the distribution of the statistic \(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\).
M <- 1000   # number of Monte Carlo replications
n <- 30     # sample size per replication
vec <- numeric(M)
for (m in 1:M) {
  set.seed(m)                            # a different, reproducible seed per replication
  X <- rpois(n, lambda = 6)              # random sample of size n from Poisson(6)
  vec[m] <- (mean(X) - 6) / sqrt(6 / n)  # standardized sample mean (Z-statistic)
}
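To see the shape, we can overlay the standard normal density on a histogram of the simulated statistics (a sketch; the bin count is an arbitrary choice):

hist(vec, freq = FALSE, breaks = 30,
     main = "Simulated Z-statistics", xlab = "Z")
curve(dnorm(x), add = TRUE, col = "red", lwd = 2)   # standard normal density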
It looks standard normal, doesn’t it?
Suppose we obtain one more sample of size \(n=30\) from the population. Computing \(\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}\) using this sample, we have:
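A sketch of the computation (the particular sample behind the output below is not reproduced in the text, so a fresh draw will generally give a different value):

x_obs <- rpois(n, lambda = 6)      # one new sample of size n = 30
(mean(x_obs) - 6) / sqrt(6 / n)    # observed Z-statistic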
## [1] 1.565248
By the central limit theorem, \(P(Z \leq 1.565248)\approx \Phi(1.565248)\approx 0.9412\).
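In R, \(\Phi\) is the function pnorm:

pnorm(1.565248)   # standard normal CDF at the observed Z-statistic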
## [1] 0.9412376
But using Monte Carlo simulation, we instead count how many simulated samples have a Z-statistic at or below the observed Z-statistic.
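Using the vector vec of simulated Z-statistics from above:

mean(vec <= 1.565248)   # proportion of simulated Z-statistics at or below the observed one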
## [1] 0.947
Close enough?
Exercise 9.1 Let \(X\) be a random variable that is uniformly distributed over \([-1, 1]\), and let \(Y\) be a random variable from \(N(1, 1)\) that is independent of \(X\).
Using Monte Carlo Simulation with \(M=1000\) and seed number = 142, what is the probability that \(X + Y > 1.5\)?
SUMMARY:
- When we don’t have exact probability formulas, or the formulas don’t apply, we can simulate to obtain arbitrarily good approximations.
- If we can describe the process behind our model, we can set up a simulation of it.
Euler’s Method vs Monte Carlo Integration
Suppose we want to evaluate a definite integral:
\[ \int_a^b f(x)dx \] where \(f\) is some function. For many functions, there is no closed-form expression for such definite integrals.
Numerical analysis provides various means of approximating definite integrals, starting with Euler’s method for one-dimensional integrals:
\[ \int_a^bf(x)dx\approx\sum_{i=1}^{\lfloor(b-a)/h \rfloor} hf\left(a+ih\right) \]
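As a concrete sketch, here is this sum in R for \(f(x)=\sin(x)\) over \([0,\pi]\) (the step size \(h\) is an arbitrary choice):

h <- 0.001                    # step size: smaller h means more accuracy but more work
a <- 0; b <- pi
i <- 1:floor((b - a) / h)     # grid indices
sum(h * sin(a + i * h))       # Riemann-sum approximation; the exact value is 2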
However, this method is very slow, especially in high dimensions: the number of grid points at which \(f\) must be evaluated grows exponentially with the dimension. As an alternative, physicists came to rely on random approximation schemes, i.e., the Monte Carlo method.
Recall that the expected value of \(f\) with respect to a distribution with probability density \(p\) with a support \(D\) is
\[ \mathbb{E}_p[f(X)]=\int_D f(x)p(x)dx \]
Based on the law of large numbers, for IID random variables the sample mean converges to the expected value:
\[ \frac{1}{n}\sum_{i=1}^nf(X_i)\rightarrow\mathbb{E}_p[f(X)]=\int_{D} f(x)p(x)dx \]
The most basic Monte Carlo method for evaluating the definite integral \(\int_a^b f(x)dx\) is to set \(p\) as the pdf of a Uniform random variable.
Draw the random variable \(X\) uniformly over the domain \([a,b]\), so that its pdf is \(\frac{1}{b-a}\) on \([a,b]\). Then
\[ \begin{align} \int_a^b f(x)dx &=(b-a)\int_a^b f(x) \frac{1}{b-a} dx \\ &=(b-a)\mathbb{E}[f(X)] \end{align} \]
and by the law of large numbers,
\[ \frac{b-a}{n}\sum_{i=1}^nf(X_i)\rightarrow\int_a^bf(x)dx \]
Example 1: Bounded Integral
To demonstrate, let us integrate \(f(x)=\sin(x)\) over \([0,\pi]\). The exact solution is
\[ \int_0^\pi\sin(x)dx=2 \]
Using Monte Carlo integration, we can approximate it with these steps (implemented in the sketch below):
1. Generate \(n=10000\) values from \(Uniform(0,\pi)\).
2. Evaluate \(f(x)=\sin(x)\) at these points.
3. Take the average and multiply by the measure of the domain, which is \(\pi-0\).
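A minimal implementation of these steps (the seed is an arbitrary choice):

set.seed(1)                        # arbitrary seed for reproducibility
n <- 10000
x <- runif(n, min = 0, max = pi)   # Step 1: uniform draws on [0, pi]
fx <- sin(x)                       # Step 2: evaluate the integrand
(pi - 0) * mean(fx)                # Step 3: scale the average by the length of the domain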
## [1] 2.00878
Exercise 9.2 Approximate the following definite integral using Monte Carlo Integration:
\[ \int_0^1 e^{-x^2}dx \]
Use \(M = 1\text{e6}\)
Example 2: Unbounded Integral
Sampling from a uniform distribution is useless for an integral over an unbounded domain (there is no uniform distribution on \([0,\infty)\)), such as
\[ \int_0^\infty x^2e^{-x^2}dx \]
The trick is to introduce a pdf \(p\) whose support matches the domain of integration; this is the basic idea behind importance sampling. In general,
\[ \int_D f(x)dx=\int_D \frac{f(x)}{p(x)}p(x)dx = \mathbb{E}_p\left[\frac{f(X)}{p(X)}\right] \]
Now, if the \(X_i\) are generated from \(p\),
\[ \frac{1}{n}\sum_{i=1}^n\frac{f(X_i)}{p(X_i)} \rightarrow \mathbb{E}_p\left[\frac{f(X)}{p(X)}\right] = \int_D f(x)dx \]
Solution for Example 2
Suppose we have the following integral
\[ \int_0^\infty x^2e^{-x^2}dx \]
The exact solution is \(\sqrt{\pi}/4\).
The Exponential distribution has support \([0,\infty)\) and pdf \(\lambda e^{-\lambda x}\).
We can therefore sample from \(Exp(\lambda = 1)\) to approximate the integral above.
\[ \begin{align} \int_0^\infty x^2e^{-x^2}dx &= \int_0^\infty \frac{e^{-x}}{e^{-x}}x^2e^{-x^2}dx\\ &= \int_0^\infty \frac{x^2e^{-x^2}}{e^{-x}}e^{-x}dx\\ &= \int_0^\infty x^2e^{-x^2+x}e^{-x}dx \\ &= \mathbb{E}(X^2\exp(-X^2+X)) \end{align} \] where \(X\sim Exp(1)\)
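A sketch of this estimate in R (the seed and number of draws are arbitrary choices), followed by the exact value for comparison:

set.seed(1)                  # arbitrary seed for reproducibility
n <- 1e6
x <- rexp(n, rate = 1)       # draws from Exp(1)
mean(x^2 * exp(-x^2 + x))    # Monte Carlo estimate of the integral
sqrt(pi) / 4                 # exact value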
## [1] 0.443195
## [1] 0.4431135
Exercise 9.3 The following integral is tedious to evaluate analytically:
\[ \int _{-\infty} ^\infty \frac{1}{1 + x^4}dx \]
Approximate this integral using Monte Carlo integration. Use \(M = 1\text{e6}\)
Estimating \(\pi\) using random number generation
Consider a unit circle centered at the origin with radius \(r=1\). The points \((x, y)\) that lie inside the unit circle satisfy:
\[ x^2 + y^2 \leq 1 \]
We know that the area of a unit circle is \(\pi\).
Now, let’s focus on the first quadrant \((x \geq 0,\ y \geq 0)\), where the area of the quarter circle is
\[ A_{\text{quarter circle}} = \frac{\pi}{4} \]
The Monte Carlo approach approximates this area by randomly sampling points inside the unit square (which has area = 1), then getting the proportion of points that are inside the quarter circle.
Then the final (estimated) value \(\hat{\pi}\) is 4 times the estimated area of the quarter circle.
The following algorithm can be used to approximate \(\pi\).
1. Generate \(M\) random points \((x,y)\), where \(x\) and \(y\) are sampled uniformly from \([0,1]\).
2. Count how many points fall inside the quarter circle:
\[ x^2 + y^2 \leq 1 \]
3. The estimated value of \(\pi\) is
\[ \pi\approx 4 \times \frac{\text{# of points in the quarter circle}}{\text{total points}} \]
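A compact implementation (the seed and number of points are arbitrary choices):

set.seed(1)                    # arbitrary seed for reproducibility
M <- 1e5                       # number of random points
x <- runif(M); y <- runif(M)   # uniform points in the unit square
4 * mean(x^2 + y^2 <= 1)       # 4 times the fraction inside the quarter circle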
## [1] 3.14516
In fact, by the law of large numbers, as the number of Monte Carlo iterations increases, the estimate of \(\pi\) gets closer and closer to its actual value.
Today, the Monte Carlo method refers to any simulation that involves the use of random numbers.
9.0.1 References
Shalizi, C. R. (2013). Lecture Notes on Monte Carlo and Markov Chains. Carnegie Mellon University, Department of Statistics and Data Science.
https://www.stat.cmu.edu/~cshalizi/statcomp/13/lectures/15/lecture-notes-markov.pdf
Rizzo, M. L. (2007). Statistical Computing with R. Chapman and Hall/CRC. https://doi.org/10.1201/9781420010718