Row | year | lsl |
---|---|---|
Int64 | Quantity… | |
1 | 1928 | 5.05343 ft |
2 | 1929 | 3.90728 ft |
3 | 1930 | 4.60245 ft |
Lecture
2023-09-11
Today
Motivation
Flood depths
Functions of random variables
Parameter uncertainty
Wrapup
Recall: \[ \mathbb{E}\left[L(a, \theta) \right] = \int_\theta L(a, \theta) p(\theta) d\theta \] Where \(\theta\) is a vector of parameters, \(a\) is some action or decision, and \(L\) is the loss function.
Note
We previously called \(\theta\) a “state of the world” and \(L\) a “reward function”.
You have been comissioned by a client to assess their exposure to flooding. Specifically, they want to know the probability distribution of annual flood damages at their property if they do not elevate or floodproof their building.
With this information, they can compute metrics like the expected annual damage, the 99th percentile annual damage, and the probability of any flood occurring that will help them make a decision.
Today
Motivation
Flood depths
Functions of random variables
Parameter uncertainty
Wrapup
We fold this long code block to save space.
Row | year | lsl |
---|---|---|
Int64 | Quantity… | |
1 | 1928 | 5.05343 ft |
2 | 1929 | 3.90728 ft |
3 | 1930 | 4.60245 ft |
*=
syntax: x *= 2
is equivalent ot x = x * 2
. We use .*=
to work element-wise on the vector.normalize=:pdf
normalizes the histogram so that the area under the curve is 1.suptitle
adds a title to the entire figure.This data comes from the NOAA Tides and Currents database, specifically a gauge at Sewell’s Point, VA, with sea level rise removed.
We want a probability distribution for flood depths \(p(h)\). We can work with the log of the flood depths and treat them as normally distributed: \[ \log h_i \sim \mathcal{N}(\mu, \sigma^2) \]
We call this a lognormal distribution: \[ h_i \sim \text{LN}(\mu, \sigma) \]
fit_mle
function requires a vector of numbers, without units. We use ustrip
to convert the units to a number, using feet (u"ft"
) as the reference unit.:lsl_ft
column we created.Note
We’ll spend a whole module on extreme value distributions. They should do a better job of modeling annual maxima but require some subtelty.
Flood probabilities are often plotted as return periods. This is just a visualization the CDF on a log (or log-log) scale.
Note
Work through this code and make sure you understand it
It is common to add the data points as dots on the return period curve. This begs the question: what return period is assigned to each point? This is a subjective choice, but a common one is the Weibull plotting position:
🤔
How well does this fit the data?
Today
Motivation
Flood depths
Functions of random variables
Parameter uncertainty
Wrapup
A bounded logistic function provides a plausible depth-damage model for now: \[ d(h) = \mathbb{I}\left[x > 0 \right] \frac{L}{1 + \exp(-k(x - x_0))} \] where \(d\) is damage as a percent of total value, \(h\) is water depth, \(\mathbb{I}\) is the indicator function, \(L\) is the maximum loss, \(k\) is the slope, and \(x_0\) is the inflection point. We fix \(L=1\) (known upper bound: 100% damage) so
We will use \(x_0 = 4\) and \(k = 0.75\).
f(x) = blogistic(x, x0, k)
. It’s similar to lambda functions in pythonPlugging in our bounded logistic model for \(d(h)\) and our lognormal model for \(h\): \[ \begin{align} p(d) &= \int_h d(h) p(h) \, dh \\ &= \int_{-\infty}^\infty \mathbb{I}[h > 0] \text{logistic}(h) \mathcal{N}(h | \mu, \sigma^2) \, dh\\ &= \int_0^\infty \text{logistic}(h) \mathcal{N}(\mu, \sigma^2) \, dh\\ &= \int_0^\infty \frac{1}{1 + \exp(-k * (x - x0))} \frac {1}{\sigma {\sqrt {2\pi }}} \exp \left\{-{\frac {1}{2}}\left({\frac {x-\mu }{\sigma }}\right)^{2} \right\} \, dh \end{align} \]
We might be able to solve this analytically (Wolfram Alpha can’t…). But…
Monte Carlo methods are a set of computational techniques for:
We want to approximate the quantity \[ \int_0^\infty \text{logistic}(h) \mathcal{N}(\mu, \sigma^2) \, dh \]::::: {.incremental} :::: {.columns} ::: {.column width=“50%”} A deterministic strategy:
::: ::: {.column width=“50%”} A sampling strategy
::: :::: :::::
If \(\theta^s \sim p(\theta)\), then \[ \mathbb{E}\left[ f(\theta) \right] = \int_\theta f(\theta) p(\theta) d\theta \approx \frac{1}{S} \sum_{s=1}^S f(\theta^s) \]
A deceptively simple idea:
Given these samples, we can compute expectations of any function of the samples. For example, we can compute the mean damage, the 99th percentile of damage, and the probability of any damage occurring
3-element Vector{Float64}:
0.00235
0.07008
0.03486
Today
Motivation
Flood depths
Functions of random variables
Parameter uncertainty
Wrapup
We have been working with a single probability distribution for flood depths, which we computed by maximum likelihood.
These values are not precise. What happens if we consider the lognormal distribution with slightly different, but still plausible, parameters?
What about the depth-damage parameters \(x_0\) and \(k\)?
Row | params | MLE | alt_dist |
---|---|---|---|
String | Float64 | Float64 | |
1 | Average Annual Loss | 0.00235 | 0.00975 |
2 | Q99 Annual Loss | 0.07008 | 0.16531 |
3 | Probability of Loss | 0.03486 | 0.10076 |
Row | params | MLE | alt_dist | alt_curve | alt_both |
---|---|---|---|---|---|
String | Float64 | Float64 | Float64 | Float64 | |
1 | Average Annual Loss | 0.00235 | 0.00975 | 0.00407 | 0.01529 |
2 | Q99 Annual Loss | 0.07008 | 0.16531 | 0.12715 | 0.24093 |
3 | Probability of Loss | 0.03486 | 0.10076 | 0.03308 | 0.09746 |
Today
Motivation
Flood depths
Functions of random variables
Parameter uncertainty
Wrapup