Extreme Value Statistics


Lecture

2023-11-01

Extreme values

Today

  1. Extreme values

  2. Case studies

  3. Theoretical Frameworks

  4. Terminology

  5. Challenges

Objectives

  • Likelihood of extreme events
  • Extrapolate
  • Univariate (for now)

Applications

  • Engineering design
  • Emergency management
  • Regulation
  • Insurance
  • Managing financial assets

What variables?

  • Streamflow
  • Precipitation rates or totals
  • Wind speed
  • Temperatures

Extremes are rare

  • There are simple and clever methods for estimating the likelihood of rare events
  • Fundamentally, extrapolating is hard
  • Key sources of uncertainty:
    • Estimation uncertainty
    • Model structure uncertainty
    • Sampling uncertainty

Case studies

Today

  1. Extreme values

  2. Case studies

  3. Theoretical Frameworks

  4. Terminology

  5. Challenges

Harvey and Addicks / Barker

I was asked to calculate what would happen under some specific assumptions (no suggestion of unbiasedness!)

  • These were plausible assumptions, not necessarily the “right” or “best” assumptions (other side had many reasonable objections!)
  • Large differences between estimates made under different assumptions underscore the challenge of “objective” esimates
  • A hard problem (interacting drivers of nonstationarity, short records, etc)

Precipitation frequencies in TX

  • TWDB project
  • Atlas 14 (Perica et al., 2018) does not account for climate change
  • Use more stations and account for climate change
  • Joint TWDB/TAMU/Rice project
  • Variable studied: precipitation at multiple durations across the entire state

How likely was Winter Storm Uri?

  • What should we plan for?
  • Was it an “unprecedented” event?
  • Variable studied:
    • temperature at grid cells
    • aggregated “population-weighted” index

Theoretical Frameworks

Today

  1. Extreme values

  2. Case studies

  3. Theoretical Frameworks

  4. Terminology

  5. Challenges

Peak over threashold (POT)

  • Define a threshold
  • Model the distribution of events above the threshold
  • Model the probability of seeing an event over the threshold
  • Advantage: focus on meaningful events, even if they’re rare
  • Disadvantage: threshold is arbitrary, modeling arrival turns out to be tricky

Block maxima

  • Define a block size (e.g., 1 year – how you define “a year” matters)
  • Model the distribution of the extreme in each block
  • Advantage: easier to communicate and implementations are more flexible
  • Disadvantages: timing of extremes; two extremes in one year; sometimes your min/max is not special

Terminology

Today

  1. Extreme values

  2. Case studies

  3. Theoretical Frameworks

  4. Terminology

  5. Challenges

Key terms

  • Exceedance probability (often AEP): \(p\)
  • Return period or recurrence interval (\(T\)): \(\frac{1}{p}\)
  • Return level: the value that will be exceeded with probability \(\frac{1}{T}\), ie the quantile

Plotting position

Two ideas:

  • Ranks: if you have \(N\) events, the largest is rank 1, the second largest rank 2, etc.
  • Use the points that you have as return levels and estimate the associated return periods
  • Common estimator is Weibull plotting position \(p = \frac{m}{N+1}\) where \(m\) is the rank
  • Lots of bickering in the literature about right choice
  • When you see return period plots, if the observations are shown they are using a plotting position of some sort

Log Pearson Type III distribution

  • Different variables have different properties
  • USGS likes this distribution for streamflow

GEV

  • Model block extremes
  • As we will see, has strong theoretical justification
  • Three parameters: location, scale, shape
  • Shape can be tricky to estimate \(\rightarrow\) large parameter uncertainty

GPD

  • Similar: location, scale, and shape parameter
  • Requires a separate model for “arrivals”

Challenges

Today

  1. Extreme values

  2. Case studies

  3. Theoretical Frameworks

  4. Terminology

  5. Challenges

Parametric Uncertainty

  • Many parameter values can be consistent with the data
  • But lead to very different conclusions about the likelihood of extreme events
  • Hard to pin down
  • Relatively easier to resolve through clever approaches we will see

Model Uncertainty

  • Different assumptions lead to different inferences
  • GEV / GPD are theoretically justified
  • As we add clever approaches, we add model structure uncertainty

Sampling Uncertainty

  • We have a finite sample of data and are trying to estimate parameters that tell us about rare events
  • If Harvey had never occurred, we would likely have a very different estimate of the 100-year rainfall!

Nonstationarity

  • These models don’t explicitly account for climate change
  • Interannual variability / correlations

Reading

  • Coles (2001): canonical extreme value textbook

Logistics

  1. Exams
    1. Working on grading exams 1 and 2
  2. Class
    1. Friday 11/3: lab
    2. Monday 11/6: POT and Block Maxima Models
    3. Wednesday 11/8: GEV models and estimators

References

Coles, S. (2001). An introduction to statistical modeling of extreme values. London ; Springer.
Perica, S., Pavlovic, S., St. Laurent, M., Trypaluk, C., Unruh, D., & Wilhite, O. (2018). NOAA Atlas 14 (No. Volume 11 Version 2.0: Texas) (p. 283). Silver Spring, MD: National Weather Service, National Oceanic and Atmospheric Administration, U.S. Department of Commerce. Retrieved from https://www.weather.gov/media/owp/oh/hdsc/docs/Atlas14_Volume11.pdf