Lecture 10
September 30, 2024
Monte Carlo is stochastic simulation.
Monte Carlo is a broad class of methods; our focus here is Monte Carlo estimation.
Monte Carlo estimation involves framing the (deterministic) quantity of interest as a summary statistic of a random process.
We can formalize Monte Carlo estimation as the computation of the expected value of a random quantity \(Y\), \(\mu = \mathbb{E}[Y]\).
To do this, generate \(n\) independent and identically distributed values \(Y_1, \ldots, Y_n\). Then the sample estimate is
\[\tilde{\mu}_n = \frac{1}{n}\sum_{i=1}^n Y_i\]
Important note: The Monte Carlo sample mean \(\tilde{\mu}_n = \frac{1}{n}\sum_{i=1}^n Y_i\) is itself a random variable.
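A minimal sketch of this in Julia, using an assumed toy case where the answer is known exactly (here \(Y\) is drawn from an exponential distribution with mean 2):

using Distributions, Statistics

Y_dist = Exponential(2)        # assumed toy distribution: E[Y] = 2 exactly
n = 10_000
Y_samples = rand(Y_dist, n)    # n independent and identically distributed draws
mu_hat = mean(Y_samples)       # Monte Carlo estimate of E[Y]; should be close to 2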
What properties would we like the Monte Carlo estimate \(\tilde{\mu}_n\) to have?
With some assumptions (the mean of \(Y\) exists and \(Y\) has finite variance), the expected value of the Monte Carlo sample mean is
\[\mathbb{E}[\tilde{\mu}_n] = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[Y_i] = \frac{1}{n} n \mu = \mu.\]
So the Monte Carlo estimate is an unbiased estimate of the mean.
On average (across different samples), the MC estimate \(\tilde{\mu}_n\) has the “right” value.
Will a single sample trajectory converge to the true value?
If
\(Y\) is a random variable and its expectation exists and
\(Y_1, \ldots, Y_n\) are independently and identically distributed
Then, by the weak law of large numbers, for any \(\varepsilon > 0\):
\[\lim_{n \to \infty} \mathbb{P}\left(\left|\tilde{\mu}_n - \mu\right| \leq \varepsilon \right) = 1\]
In other words, eventually Monte Carlo estimates will get within an arbitrary error of the true expectation. But how large is large enough?
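A quick sketch of this convergence, reusing the assumed toy distribution from above (\(\mathbb{E}[Y] = 2\)): the running sample mean drifts toward the true value as \(n\) grows.

using Distributions, Statistics

Y_dist = Exponential(2)   # assumed toy distribution with E[Y] = 2
for n in (10, 100, 1_000, 10_000, 100_000)
    println(n, " samples: ", round(mean(rand(Y_dist, n)); digits=3))
end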
Note that the law of large numbers applies to vector-valued functions as well. The key is that the function \(Y = f(X)\) just needs to be sufficiently well-behaved.
We’d like to know more about the error of this estimate for a given sample size. The variance of this estimator is
\[\tilde{\sigma}_n^2 = \text{Var}\left(\tilde{\mu}_n\right) = \mathbb{E}\left((\tilde{\mu}_n - \mu)^2\right) = \frac{\sigma_Y^2}{n}\]
So as \(n\) increases, the standard error decreases:
\[\tilde{\sigma}_n = \frac{\sigma_Y}{\sqrt{n}}\]
In other words, if we want to decrease the Monte Carlo error by 10x, we need 100x additional samples. This is not an ideal method for high levels of accuracy.
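For example, multiplying the sample size by 100 only divides the standard error by 10:
\[\tilde{\sigma}_{100n} = \frac{\sigma_Y}{\sqrt{100n}} = \frac{1}{10} \frac{\sigma_Y}{\sqrt{n}} = \frac{\tilde{\sigma}_n}{10}\]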
Monte Carlo is an extremely bad method. It should only be used when all alternative methods are worse.
— Sokal, Monte Carlo Methods in Statistical Mechanics, 1996
But…often most alternatives are worse!
If you can compute your answers analytically, you probably should.
But for many systems problems, this is either difficult or impossible without many simplifying assumptions.
This error estimate lets us compute confidence intervals for the MC estimate.
Remember: an \(\alpha\)-confidence interval is an interval such that \(\alpha\%\) of the intervals constructed across repeated experiments will contain the true value.
It is not an interval which contains the true value \(\alpha\%\) of the time: the true value is fixed, not random, so that concept does not exist within frequentist statistics, though this mistake is often made.
To understand confidence intervals, think of horseshoes!
The post is a fixed target, and my accuracy informs how confident I am that I will hit the target with any given toss.
But once I make the throw, I’ve either hit or missed.
The confidence level \(\alpha\%\) expresses the pre-experimental frequency with which a confidence interval constructed by this procedure will contain the true value. So for a 95% confidence interval, there is a 5% chance that a given sample produces an interval which misses the true value.
OK, back to Monte Carlo…
Basic Idea: The Central Limit Theorem says that with enough samples, the errors are approximately normally distributed:
\[\tilde{\mu}_n - \mu \to \mathcal{N}\left(0, \frac{\sigma_Y^2}{n}\right)\]
The \(\alpha\)-confidence interval (where \(\alpha\) is the confidence level, e.g. \(\alpha = 0.95\)) is: \[\tilde{\mu}_n \pm \Phi^{-1}\left(\frac{1 + \alpha}{2}\right) \frac{\sigma_Y}{\sqrt{n}},\] where \(\Phi^{-1}\) is the standard normal quantile function.
For example, the 95% confidence interval is \[\tilde{\mu}_n \pm 1.96 \frac{\sigma_Y}{\sqrt{n}}.\]
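A minimal sketch of this calculation in Julia, continuing the assumed toy example and using the sample standard deviation in place of the unknown \(\sigma_Y\):

using Distributions, Statistics

Y_samples = rand(Exponential(2), 10_000)       # assumed toy example with E[Y] = 2
mu_hat = mean(Y_samples)                       # Monte Carlo estimate
se = std(Y_samples) / sqrt(length(Y_samples))  # estimated standard error
z = quantile(Normal(), 0.975)                  # Φ⁻¹(0.975) ≈ 1.96
ci = (mu_hat - z * se, mu_hat + z * se)        # 95% confidence interval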
Converging at a rate of \(1/\sqrt{n}\) is not great. But:
If you can compute your answer analytically, you probably should.
But often this is difficult if not impossible without many simplifying assumptions.
What is the probability of rolling 4 dice for a total of 19?
using Distributions

# roll `n_dice` dice `n_trials` times and return the total of each roll
function dice_roll_repeated(n_trials, n_dice)
    dice_dist = DiscreteUniform(1, 6)   # each die is uniform on 1, ..., 6
    roll_results = zeros(n_trials)
    for i = 1:n_trials
        roll_results[i] = sum(rand(dice_dist, n_dice))
    end
    return roll_results
end
using Statistics # for std()

nsamp = 10_000
# roll four dice 10,000 times
rolls = dice_roll_repeated(nsamp, 4)
# initialize storage for the running frequency of totals equal to 19
avg_freq = zeros(length(rolls))
# initialize storage for the standard error of the estimate
std_freq = zeros(length(rolls))
# compute the running frequency of 19s and its standard error
avg_freq[1] = (rolls[1] == 19)
std_freq[1] = 0 # no standard error for the first sample
for i in 2:length(rolls)
    # update the running mean with the indicator (rolls[i] == 19)
    avg_freq[i] = (avg_freq[i-1] * (i-1) + (rolls[i] == 19)) / i
    # standard error of the frequency estimate after i samples
    std_freq[i] = std(rolls[1:i] .== 19) / sqrt(i)
end
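One way to report the final estimate and its 95% confidence interval (a small addition to the code above, using the 1.96 multiplier from earlier):

println("P(total = 19) ≈ ", round(100 * avg_freq[end]; digits=1), "% ± ",
        round(100 * 1.96 * std_freq[end]; digits=1), "%")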
After 10,000 iterations, the 95% confidence interval for the MC estimate is 4.3 \(\pm\) 0.4%.
Is this good enough? Should we keep going?
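For a problem this small, one way to answer is to compare against the exact probability, obtained by enumerating all \(6^4 = 1296\) equally likely outcomes (a quick check, sketched below):

# exact probability of a total of 19 with four dice
p_exact = count(sum(r) == 19 for r in Iterators.product(1:6, 1:6, 1:6, 1:6)) / 6^4
# p_exact = 56/1296 ≈ 4.32%, inside the Monte Carlo interval above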
Let’s revisit our two-waste dissolved oxygen example with uncertain wastewater inflows.
At each inflow, there is uncertainty about the amount of waste (which modifies the baseline CBOD/NBOD). The Monte Carlo estimate converges as we add iterations:
| Iterations | Estimate |
|---|---|
| 100 | 1.0 \(\pm\) 1.0% |
| 1000 | 2.5 \(\pm\) 0.5% |
| 5000 | 2.9 \(\pm\) 0.2% |
| 10000 | 2.9 \(\pm\) 0.2% |
“Simple” Monte Carlo analysis: assumes that we can readily sample independent and identically-distributed random variables.
There are other methods for when distributions are hard to sample from or uncertainties aren’t independent.
Random number generators are not really random, only pseudorandom.
This is why setting a seed is important. But even that can go wrong…
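For example, in Julia the seed of the global random number generator can be set with Random.seed!, making a run reproducible (a minimal illustration):

using Random, Distributions

Random.seed!(1)                             # fix the pseudorandom sequence
first_run = rand(DiscreteUniform(1, 6), 5)
Random.seed!(1)                             # resetting the seed replays the same sequence
second_run = rand(DiscreteUniform(1, 6), 5)
first_run == second_run                     # true: identical draws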
Wednesday: Intro to Optimization
Next Week: Linear Programming (Monday)
HW3: Due Thursday (10/3) at 9pm
Prelim 1: Next Wednesday (10/9) in class, includes material through today’s lecture.