
Risk and Uncertainty


BEE 6940 Lecture 2, January 30, 2023
1 / 58

Any Questions?

2 / 58

This is an overview of the topics we'll cover in today's lecture. The italics around the last topic reflect that it's an "optional" topic that we may get to if time allows.


What Is Climate Risk?

4 / 58

What Is Climate Risk?


Climate risk: "risk" created or enhanced by the impacts of climate change

Strong interactions between these impacts and broader socioeconomic systems result in complex dynamics.

5 / 58

Climate Impacts are Diverse


6 / 58

Climate change impacts the intensity, frequency, and duration of a variety of hazards, affecting a large number of sectors. These changes vary considerably in space and time and are highly uncertain for a number of reasons. Despite this uncertainty, we have to make decisions about how to manage these risks on relatively short time scales.

This map actually understates things by focusing on an estimate of the "top" risk in a given location, and there can be a number of compounding effects from multiple stressors. More on that later.

Climate Risks are Worsening


Recent Climate Risk Headlines

7 / 58

What Is Risk?


8 / 58

What Is Risk?


Intuitively: "Risk" is the possibility of loss, damages, or harm.

Risk = Probability of Hazard × Damages From Event

Things we don't think of as "risk":

  • Good or neutral outcomes
  • Deterministic outcomes
9 / 58

Some Cartoons About Risk


XKCD Comic 2107: Launch Risk
Source: XKCD 2107

10 / 58

Some Cartoons About Risk


XKCD Comic 1252: Increased Risk
Source: XKCD 1252

11 / 58

What Is Risk?


Common framework:

Risk as a combination of

  • Hazard
  • Exposure
  • Vulnerability
  • Response (Simpson et al. 2021)
12 / 58

Defining Climate Risk


Climate Risk: Changes in risk stemming from the impacts of or response to climate change.

Hazards

  • Drought/flooding
  • Extreme temperatures
  • Sea level rise
  • Others!

Exposure/Vulnerability

  • Compound events
  • Urbanization
  • Land Use, Land Cover Change
13 / 58

Motivating Questions


  1. What are the potential impacts of climate change?
  2. What can we say about their uncertainties?
  3. What are the impacts of those uncertainties on the performance of risk-management strategies?
14 / 58

Uncertainty and Probability

15 / 58

Uncertainty and Risk Analysis


Uncertainty enters into the hazard-exposure-vulnerability-response model in a few ways:

  • Uncertain hazards
  • Uncertainty in model estimates of exposure or vulnerability
  • Uncertainty in responses
16 / 58

But...


What exactly do we mean by uncertainty?

Glib answer: Uncertainty is a lack of certainty!

Maybe better: Uncertainty refers to an inability to exactly describe current or future states.

17 / 58

Two Categories of Uncertainty


  • Aleatory Uncertainty: Uncertainty resulting from inherent randomness
  • Epistemic Uncertainty: Uncertainty resulting from lack of knowledge

The lines between aleatory and epistemic uncertainty are not always clear! This has implications for modeling and risk analysis.

18 / 58

On Epistemic Uncertainty


XKCD Cartoon: Epistemic Uncertainty
Source: XKCD 2440

19 / 58

Uncertainty and Probability


We often represent or describe uncertainties in terms of probabilities:

  • Long-run frequency of an event (frequentist)
  • Degree of belief that a proposition is true (Bayesian)
20 / 58

Confidence vs. Credible Intervals


The difference between the frequentist and Bayesian perspectives can be illustrated by how each conceptualizes uncertainty in estimates.

21 / 58

Confidence vs. Credible Intervals


A Bayesian credible interval for some random quantity is conceptually straightforward:

An α-credible interval is an interval with an α% probability of containing the realized or "true" value.

Dartboard from Wikipedia Source: Wikipedia

22 / 58

Confidence vs. Credible Intervals


However, this notion breaks down with the frequentist viewpoint: there is some "true value" for the associated estimate based on long-run frequencies.

With this view, it is incoherent to talk about probabilities corresponding to parameters. Instead, the key question is how frequently (based on repeated analyses of different datasets) your estimates are "correct".

23 / 58

Confidence vs. Credible Intervals


In other words, the confidence level α% expresses the pre-experimental frequency with which a confidence interval will contain the true value.

So for a 95% confidence interval, there is a 5% chance that a given sample produced an interval which misses the true value.

24 / 58

Confidence vs. Credible Intervals


To understand frequentist confidence intervals, think of horseshoes! The post is a fixed target, and my accuracy as a horseshoe thrower captures how confident I am that I will hit the target with any given toss.

Cartoon of horseshoes Source: https://www.wikihow.com/Throw-a-Horseshoe

25 / 58

Confidence vs. Credible Intervals


But once I make the throw, I've either hit or missed.

Generating a confidence interval is like throwing a horseshoe with a certain (pre-experimental) degree of accuracy.

Cartoon of horseshoes Source: https://www.wikihow.com/Throw-a-Horseshoe

26 / 58

Probability Distributions


Probabilities are often represented using a probability distribution, which is described by a probability density (or mass) function. Some common examples (sampled in the sketch below):

  • Normal (Gaussian) Distribution: mean μ, variance σ²
  • Poisson Distribution: rate λ
  • Binomial Distribution: # trials n, probability of success p
  • Generalized Extreme Value Distribution: location μ, scale σ, shape ξ
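
A minimal Python sketch of defining and sampling these distributions with scipy.stats; all parameter values are arbitrary choices for illustration.

```python
# Sketch: defining and sampling common probability distributions.
# All parameter values are arbitrary, for illustration only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

dists = {
    "normal": stats.norm(loc=0.0, scale=1.0),           # mean mu, std sigma
    "poisson": stats.poisson(mu=3.0),                    # rate lambda
    "binomial": stats.binom(n=10, p=0.5),                # trials n, success prob p
    "GEV": stats.genextreme(c=-0.1, loc=0.0, scale=1.0)  # scipy's shape c is -xi
}

samples = {name: d.rvs(size=1000, random_state=rng) for name, d in dists.items()}
print({name: round(s.mean(), 3) for name, s in samples.items()})
```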
27 / 58

Probability Models


A key consideration in uncertainty and risk analysis is defining an appropriate probability model for the data.

Many "default" approaches, such as linear regression, assume normal distributions and independent and identically-distributed residuals.

28 / 58

Deviations from Normality


Some typical ways in which these assumptions can fail:

  • skew (more samples on one side of the mean than the other)

Linear regression with normal and skewed residuals

29 / 58

Deviations from Normality


Some typical ways in which these assumptions can fail:

  • skew
  • fat tails (higher probability of extreme values)

Linear regression with normal and skewed residuals

30 / 58

Deviations from Normality


Some typical ways in which these assumptions can fail:

  • skew
  • fat tails
  • (auto-)correlations

Linear regression with normal and skewed residuals

31 / 58

Diagnosing Quality of Fit


How can we know if a proposed probability model is appropriate for a data set?

32 / 58

Diagnosing Quality of Fit


Visual inspection often breaks down: our brains are very good at imposing structure (look up "gestalt principles").

Linear regression with normal and skewed residuals

Linear regression with normal and skewed residuals

33 / 58

Quantile-Quantile Plots


One useful tool is a quantile-quantile (Q-Q) plot, which compares quantiles of two distributions.

If the quantiles match, the points will be roughly along the diagonal line, e.g. this comparison of normally-distributed data with a normal distribution.

Q-Q Plot for Normally Distributed Data
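
A quick way to generate such a plot is a sketch like the following, using scipy.stats.probplot with a normal reference distribution and synthetic data:

```python
# Sketch: Q-Q plot of (synthetic) data against a theoretical normal distribution.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=0.0, scale=1.0, size=500)  # illustrative sample

fig, ax = plt.subplots()
stats.probplot(data, dist="norm", plot=ax)  # points near the line => good fit
ax.set_title("Normal Q-Q Plot")
plt.show()
```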

34 / 58

Quantile-Quantile Plots


If the points are below/above the 1:1 line, the theoretical distribution is over/under-predicting the associated quantiles.

Comparison of Normal and Cauchy distributions

Q-Q Plot for Cauchy Distributed Data

35 / 58

Cumulative Distribution Functions


Q-Q plots show similar information to a Cumulative Distribution Function (CDF) plot.

Comparison of Normal and Cauchy CDFs

Q-Q Plot for Cauchy Distributed Data
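
A sketch of this kind of comparison, plotting the empirical CDF of an illustrative heavy-tailed (Cauchy) sample against the standard normal CDF:

```python
# Sketch: empirical CDF of a heavy-tailed (Cauchy) sample vs. the normal CDF.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(2)
data = np.sort(rng.standard_cauchy(500))        # illustrative heavy-tailed sample
ecdf = np.arange(1, len(data) + 1) / len(data)  # empirical CDF values

x = np.linspace(-10, 10, 400)
plt.step(data, ecdf, where="post", label="empirical CDF (Cauchy sample)")
plt.plot(x, stats.norm.cdf(x), label="standard normal CDF")
plt.xlim(-10, 10)
plt.legend()
plt.show()
```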

36 / 58

Autocorrelation


Another critical question is whether the samples are correlated or independent. For a time series, this can be tested by examining autocorrelation (or cross-correlation for multiple variables).

Autocorrelation Diagram for Independent Samples

Autocorrelation Diagram for Autocorrelated Samples
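
A sketch of how such autocorrelation diagrams can be produced, comparing independent samples to a synthetic AR(1) process (using statsmodels):

```python
# Sketch: autocorrelation of independent vs. correlated (AR(1)) series.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

rng = np.random.default_rng(3)
white_noise = rng.normal(size=500)      # independent samples
ar1 = np.zeros(500)                     # AR(1): each value depends on the last
for t in range(1, 500):
    ar1[t] = 0.8 * ar1[t - 1] + rng.normal()

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
plot_acf(white_noise, ax=axes[0], title="Independent samples")
plot_acf(ar1, ax=axes[1], title="AR(1) samples")
plt.show()
```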

37 / 58

Key Takeaway


Specifying the probability model is important — getting this too wrong can bias resulting inferences and projections.

There's no black-box workflow for this: try exploring different methods, relying on domain knowledge, and looking at different specifications until you convince yourself something makes sense.

38 / 58

Monte Carlo

39 / 58

Monte Carlo Simulation


A common problem in risk/uncertainty analysis is uncertainty propagation: what is the impact of input uncertainties on system outcomes? The most basic way to approach this is through Monte Carlo simulation.

Monte Carlo schematic

40 / 58

Monte Carlo Simulation


Monte Carlo simulation involves:

  1. Sampling input(s) from probability distribution(s);
  2. Simulating the quantity of interest;
  3. Aggregating the results (if desired).

Note that steps 1 and 2 require the ability to generate data from the probability model (or we say that the model is generative). This is not always the case!
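
A minimal sketch of these three steps, using a made-up (hypothetical) damage function as the system model:

```python
# Sketch: Monte Carlo uncertainty propagation through a toy system model.
# The "damage function" below is a made-up illustration, not a real model.
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# 1. Sample the uncertain input from an assumed distribution
warming = rng.normal(loc=2.0, scale=0.5, size=n)  # hypothetical warming (deg C)

# 2. Simulate the quantity of interest for each sample
damages = 0.5 * warming ** 2                      # toy damage function

# 3. Aggregate the results
print(f"mean damages: {damages.mean():.3f}")
print(f"90% interval: {np.quantile(damages, [0.05, 0.95]).round(3)}")
```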

41 / 58

Monte Carlo Simulation


Monte Carlo is a very useful method for calculating complex and high-dimensional integrals (such as expected values), since an integral is an n-dimensional area:

  1. Sample uniformly from the domain;
  2. Compute how many samples are in the area of interest (see the sketch below).
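
For example, a sketch estimating π as the area of the unit circle:

```python
# Sketch: Monte Carlo integration, estimating pi as the area of the unit circle.
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000

# 1. Sample uniformly from the domain [-1, 1] x [-1, 1] (area 4)
x = rng.uniform(-1, 1, size=n)
y = rng.uniform(-1, 1, size=n)

# 2. The fraction of samples inside the circle, times the domain area
pi_hat = 4 * np.mean(x**2 + y**2 <= 1)
print(f"estimate of pi: {pi_hat:.4f}")
```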
42 / 58

Monte Carlo (Formally)


We can formalize this common use of Monte Carlo as the computation of the expected value of a random quantity $f(Y)$, $Y \sim p$, over a domain $D$:

$$\mu = \mathbb{E}[f(Y)] = \int_D f(y)\, p(y)\, dy.$$

43 / 58

Monte Carlo (Formally)


Generate $n$ independent and identically distributed values $Y_1, \ldots, Y_n$. Then the sample estimate is

$$\tilde{\mu} = \frac{1}{n} \sum_{i=1}^n f(Y_i).$$
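
For instance, a sketch estimating $\mu = \mathbb{E}[Y^2]$ for $Y \sim \mathcal{N}(0, 1)$, where the true value is 1:

```python
# Sketch: Monte Carlo estimate of mu = E[f(Y)] with f(y) = y^2 and Y ~ N(0, 1).
# The true value is E[Y^2] = Var(Y) = 1.
import numpy as np

rng = np.random.default_rng(6)
samples = rng.normal(size=100_000)   # Y_1, ..., Y_n
mu_tilde = np.mean(samples**2)       # sample estimate of E[f(Y)]
print(f"estimate: {mu_tilde:.4f} (true value: 1)")
```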

44 / 58

The Law of Large Numbers


Monte Carlo works because of the law of large numbers:

If

  1. $Y$ is a random variable and its expectation exists and
  2. $Y_1, \ldots, Y_n$ are independently and identically distributed

Then by the strong law of large numbers:

$$\tilde{\mu}_n \to \mu \quad \text{almost surely as } n \to \infty$$

45 / 58

Monte Carlo Estimators Are Unbiased


Notice that the sample mean $\tilde{\mu}_n$ is itself a random variable.

With some assumptions (the mean of $Y$ exists and $Y$ has finite variance), the expected Monte Carlo estimate is

$$\mathbb{E}[\tilde{\mu}_n] = \frac{1}{n} \sum_{i=1}^n \mathbb{E}[f(Y_i)] = \frac{1}{n} n \mu = \mu.$$

This means that the Monte Carlo estimate is an unbiased estimate of the mean.

46 / 58

Ok, So That Seems Easy...


The basic Monte Carlo algorithm is straightforward: draw a large enough set of samples from your input distribution, simulate and/or compute your test statistic for each of those samples, and the sample value will necessarily converge to the population value.

However:

  • Are your input distributions correctly specified (including correlations across inputs)?
  • How large is "large enough"?
47 / 58

Monte Carlo Error


This raises a key question: how can we quantify the standard error of a Monte Carlo estimate?

The variance of this estimator is:

$$\tilde{\sigma}_n^2 = \mathrm{Var}(\tilde{\mu}_n) = \mathbb{E}\left[(\tilde{\mu}_n - \mu)^2\right] = \frac{\sigma_Y^2}{n},$$

so the standard error $\tilde{\sigma}_n$ decreases as $1/\sqrt{n}$ as $n$ increases.

48 / 58

Monte Carlo Error


In other words, if we want to decrease the Monte Carlo error by 10x, we need 100x additional samples. This is not an ideal method for high levels of accuracy.

Monte Carlo is an extremely bad method. It should only be used when all alternative methods are worse.

— Sokal, Monte Carlo Methods in Statistical Mechanics, 1996

The thing is, though – for a lot of problems, all alternative methods are worse!

49 / 58

Reporting Monte Carlo Uncertainty


An α-credible interval for a Monte Carlo estimate is straightforward: compute an empirical interval containing α% of the Monte Carlo sample values (e.g. for a 95% credible interval, take the range between the 0.025 and 0.975 quantiles).
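
A sketch of this empirical-quantile approach, applied to an illustrative (lognormal) set of Monte Carlo outputs:

```python
# Sketch: 95% credible interval from Monte Carlo output via empirical quantiles.
import numpy as np

rng = np.random.default_rng(7)
mc_output = rng.lognormal(mean=0.0, sigma=1.0, size=50_000)  # illustrative output
lower, upper = np.quantile(mc_output, [0.025, 0.975])
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```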

50 / 58

Monte Carlo Confidence Intervals


To estimate confidence intervals, we can rely on the variance estimate from before.

For "sufficiently large" sample sizes n, the central limit theorem says that the distribution of the error |μ~nμ| can be approximated by a normal distribution, |μ~nμ|N(0,σy2n)

51 / 58

Monte Carlo Confidence Intervals


This means that we can construct confidence intervals using the inverse cumulative distribution function for the normal distribution.

The α-confidence interval is

$$\tilde{\mu}_n \pm \Phi^{-1}\left(\frac{1 + \alpha}{2}\right) \frac{\sigma_Y}{\sqrt{n}}.$$

For example, the 95% CI is $\tilde{\mu}_n \pm 1.96\, \sigma_Y / \sqrt{n}$.

52 / 58

Monte Carlo Confidence Intervals


Of course, we typically don't know $\sigma_Y$. We can replace it with the sample standard deviation, though this will increase the uncertainty of the estimate.

But this gives us a sense of how many more samples we might need to get a more precise estimate.
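
A sketch combining these pieces, using the sample standard deviation to form a 95% confidence interval for an illustrative Monte Carlo estimate:

```python
# Sketch: Monte Carlo confidence interval, using the sample standard deviation
# as a stand-in for the unknown sigma_Y.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
samples = rng.exponential(scale=2.0, size=10_000)  # illustrative Monte Carlo output

mu_tilde = samples.mean()
se = samples.std(ddof=1) / np.sqrt(len(samples))   # estimated standard error
z = stats.norm.ppf(0.975)                          # approximately 1.96
print(f"95% CI: {mu_tilde:.3f} +/- {z * se:.3f}")
```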

53 / 58

A Dice Example (Cliche Alert!)


What is the probability of rolling 4 dice for a total of 19?

Let's solve this using Monte Carlo.

  • Step 1: Run many trials (say, $n = 10{,}000$) of 4 dice rolls each.
  • Step 2: Compute the frequency of trials for which the sum is 19, i.e. the sample average of the indicator function $\frac{1}{n} \sum_{i=1}^n \mathbb{I}(\text{sum of 4 dice} = 19)$ (see the sketch below).
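
A sketch of this experiment (the exact answer is 56/1296, approximately 4.32%):

```python
# Sketch: Monte Carlo estimate of P(sum of 4 dice = 19).
# The exact answer is 56/1296, approximately 4.32%.
import numpy as np

rng = np.random.default_rng(9)
n = 10_000

rolls = rng.integers(1, 7, size=(n, 4))      # n trials of 4 dice (faces 1-6)
estimate = np.mean(rolls.sum(axis=1) == 19)  # sample average of the indicator
print(f"estimated probability: {estimate:.4f}")
```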
54 / 58

A Dice Example


How does this estimate evolve as we add more samples?

Note: the true value (given by the red line) is 4.32%.

Simulations from Dice Monte Carlo Experiment
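
A sketch of how such a convergence plot can be generated, tracking the running estimate as trials accumulate:

```python
# Sketch: running Monte Carlo estimate of P(sum of 4 dice = 19) vs. sample size.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(10)
n = 10_000
hits = rng.integers(1, 7, size=(n, 4)).sum(axis=1) == 19
running = np.cumsum(hits) / np.arange(1, n + 1)  # estimate after each trial

plt.plot(running, label="Monte Carlo estimate")
plt.axhline(56 / 1296, color="red", label="true value (4.32%)")
plt.xlabel("number of trials")
plt.ylabel("estimated probability")
plt.legend()
plt.show()
```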

55 / 58

More Complex Monte Carlo


We won't spend too much more time here, but for more complex problems, the sample size needed to constrain the Monte Carlo error can be computationally burdensome.

This is typically addressed with more sophisticated sampling schemes which are designed to reduce the variance from random sampling, causing the estimate to converge faster.

  • Importance sampling
  • Quasi-random sampling (e.g. Sobol sequences; see the sketch below)
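
As an illustration of the quasi-random idea, a sketch re-running the earlier π estimate with a scrambled Sobol sequence (scipy.stats.qmc):

```python
# Sketch: quasi-random (Sobol) sampling for the earlier pi-estimation example.
# Low-discrepancy sequences cover the domain more evenly than random draws,
# which typically reduces the Monte Carlo error for a given sample size.
import numpy as np
from scipy.stats import qmc

sampler = qmc.Sobol(d=2, scramble=True, seed=11)
u = sampler.random_base2(m=16)            # 2^16 points in [0, 1)^2
x, y = 2 * u[:, 0] - 1, 2 * u[:, 1] - 1   # rescale to [-1, 1]^2
pi_hat = 4 * np.mean(x**2 + y**2 <= 1)
print(f"Sobol estimate of pi: {pi_hat:.4f}")
```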
56 / 58

Key Takeaways (Monte Carlo)


  • The basic Monte Carlo algorithm is a simple way to propagate uncertainties and compute approximate estimates of statistics, though its rate of convergence is poor.
  • Can also be used for general simulation (which we will do later) and optimization.
  • Note: Monte Carlo is a fundamentally parametric statistical approach; that is, it relies on the specification of the data-generating process, including all parameter values.
  • What if we don't know these specifications a priori? This is the fundamental challenge of uncertainty quantification, which we will discuss more throughout this course.
57 / 58

Upcoming Schedule


Wednesday: Discuss Simpson et al. (2021) and a lab on testing for normality and Monte Carlo (featuring The Price is Right!).

Next Monday: Representing climate uncertainties and implications for risk management.

58 / 58
