Lab 2: Julia Data Handling

DataFrames.jl, Plotting with CairoMakie, TX precipitation datasets, Basic exploratory data analysis

Author

CEVE 543 Fall 2025

Published

Fri., Sep. 5

1 Learning Goals

  1. Download and parse real weather data (annual maximum rainfall) from NOAA
  2. Create DataFrames to organize station metadata and rainfall measurements
  3. Build interactive maps and time series plots
  4. Calculate distances between geographic points
  5. Compare data across multiple weather stations

2 Do Before Lab

  1. Set up GitHub Copilot. You’ll use GitHub Copilot, an AI assistant that helps write code. It’s especially useful for plotting and data-manipulation syntax.
    1. Visit GitHub Copilot and sign up using your education account
    2. Install the GitHub Copilot extension in VS Code
    3. Link your VS Code to your GitHub account when prompted
    4. You should see Copilot suggestions appear as you type code
  2. Accept the assignment. Use the GitHub Classroom link posted on Canvas to “accept” the assignment, which will give you your own GitHub repository.
  3. Clone your repository to your computer using GitHub Desktop or the command line.
  4. Open the repository in VS Code. You can do this from within VS Code, from GitHub Desktop, or from your command line.
  5. Activate the Julia environment. In VS Code, open the command palette (View → Command Palette) and type “Julia: Start REPL”. Alternatively, run julia --project=. in the terminal, then run using Pkg; Pkg.instantiate() to install required packages. The easiest method is to type ] in the REPL to enter package mode, then type instantiate.
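If you prefer to script the environment setup, here is a minimal sketch using Julia's built-in Pkg API (equivalent to the REPL package-mode commands above):

```julia
using Pkg

Pkg.activate(".")    # activate the project environment in the repository root
Pkg.instantiate()    # install the exact package versions pinned in Manifest.toml
Pkg.status()         # list installed packages to confirm everything resolved
```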

3 Do During Lab

  1. Verify rendering works. Run quarto render index.qmd in your terminal; it should create both PDF and HTML documents.

  2. Work through the lab step by step. You can either:

    • Run code line by line in your Julia REPL, or
    • Run quarto preview index.qmd for a live preview that updates automatically

    Make sure you can run each line of code and use the annotations to understand what each line does.

  3. Try these modifications as you work through the lab:

    • Change the target location - Pick a different city and find the nearest weather stations
    • Adjust the number of stations - Plot more or fewer stations in the comparison
    • Modify the map appearance - Try different color schemes or marker sizes
    • Explore different time periods - Focus on specific decades or years
  4. Complete the analysis question at the end of the document.

4 Background

We are deliberately not providing a gentle introduction to Julia here, because everyone is at a different level of familiarity with the language. Instead, refer to the course website’s Julia learning resources (Julia for Nervous Beginners is a good place to start if that sounds like you!). You aren’t expected to start working on the problem set yet, but you should get up to speed with Julia – if you have exposure to Julia or are a proficient R/Matlab/Python programmer, this will be a period of light work for you; if you’ve never seen Julia then it will be heavier work.

Understanding Code Annotations: This lab uses interactive code annotations to explain what each line does. You’ll see numbers like # <1> in the code; hover over them to see explanations in popup tooltips.

5 Code

5.1 Loading Required Packages

First, we need to load all the Julia packages we’ll use in this analysis:

# Core data manipulation packages
using CSV
using DataFrames
using Dates
using Distributions
using Downloads
using TidierData
using TidierFiles
using Unitful

ENV["DATAFRAMES_ROWS"] = 5
1. Load CSV package for reading the NOAA text file into Julia
2. Load DataFrames for working with tabular data (like Excel spreadsheets in code)
3. Load Dates to properly parse and handle time information in weather data
4. Load Distributions for probability distributions (we use the GEV later in this lab)
5. Load Downloads to fetch files from NOAA’s web servers
6. Load TidierData for modern data manipulation with pipeline syntax (@chain, @filter, @select)
7. Load TidierFiles for simplified file reading and writing operations
8. Load Unitful to handle units of measurement (inches, kilometers, degrees) properly
9. Set the maximum number of rows to display in DataFrames by default

Next, we load the plotting packages and configure our visualization backend:

# Plotting and visualization packages
using GLMakie
using CairoMakie
using ColorSchemes
using GeoMakie
using Makie.Unitful

# Configure plotting backend - CairoMakie for high-quality static plots
CairoMakie.activate!(type = "svg")
# If you want interactive plots, uncomment the line below instead:
# GLMakie.activate!()

# Configure Unitful display preferences to use inches
rainfall_conversion = Makie.UnitfulConversion(u"inch")
1. Load GLMakie for interactive plots with GPU acceleration (optional here)
2. Load CairoMakie for high-quality static plots (the backend we activate below)
3. Load ColorSchemes to access professional color palettes (viridis, plasma, etc.)
4. Load GeoMakie for geographic visualizations and map projections
5. Load Makie.Unitful to integrate our unit system with the plotting functions
6. Activate CairoMakie as the plotting backend for high-quality static plots
7. Alternative activation for interactive GLMakie plots if needed
8. Create unit converter to display rainfall in inches instead of default “deca-inches”

Finally, we load our custom utility functions that contain the data parsing logic:

# Load custom utility functions
include("util.jl")
1. Load our custom parsing functions from util.jl (keeps the main notebook clean)

5.2 Downloading Weather Data

Next, we’ll download precipitation data from the National Weather Service:

fname = "dur01d_ams_na14v11.txt"
url = "https://hdsc.nws.noaa.gov/pub/hdsc/data/tx/dur01d_ams_na14v11.txt"

if !isfile(fname)
    Downloads.download(url, fname)
end
1. Set the local filename for our Texas precipitation data
2. NOAA’s web address for the Texas Atlas-14 precipitation dataset
3. Only download if we don’t already have the file (the ! means “not”)
4. Download the file from NOAA’s server to our local directory

This prevents re-downloading the file every time we run the code, which is faster and more polite to NOAA’s servers.

5.3 Parsing the Weather Data

Now we’ll use our custom function to parse the downloaded data into a useful format:

stations, rainfall_data = read_noaa_data(fname)

display(stations)
display(rainfall_data)
1. Call our custom function that parses the NOAA file into two DataFrames
2. Display the stations DataFrame (shows first 5 rows due to the ENV setting)
3. Display the rainfall DataFrame to see the data structure
816×7 DataFrame
811 rows omitted
Row stnid noaa_id name state latitude longitude years_of_data
Int64 String String String Float64 Float64 Int64
1 1 60-0011 CLEAR CK AT BAY AREA BLVD TX 29.4977 -95.1599 31
2 2 60-0019 TURKEY CK AT FM 1959 TX 29.5845 -95.1869 31
3 3 60-0022 ARMAND BYU AT GENOARED BLF RD TX 29.6345 -95.1123 31
815 815 87-0031 SECO CREEK AT MILLER RANCH TX 29.5731 -99.4028 20
816 816 99-2048 COTULLA TX 28.4567 -99.2183 116
61925×4 DataFrame
61920 rows omitted
Row stnid date year rainfall
Int64 Date Int64 Quantity…?
1 1 1987-06-11 1987 6.31 inch
2 1 1988-09-02 1988 5.46 inch
3 1 1989-08-01 1989 11.39 inch
61924 816 2016-08-20 2016 3.44 inch
61925 816 2017-09-25 2017 2.72 inch

The read_noaa_data function separates the messy text file into:

  • stations: A DataFrame with each station’s location, name, and metadata
  • rainfall_data: A DataFrame with all rainfall measurements, using stnid to link to stations
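Because stnid appears in both tables, you can join them whenever you want station metadata next to each measurement. A sketch using DataFrames.jl’s innerjoin (the name rainfall_with_meta is just illustrative):

```julia
using DataFrames

# Attach each station's name and coordinates to every rainfall observation
rainfall_with_meta = innerjoin(
    rainfall_data,
    stations[:, [:stnid, :name, :latitude, :longitude]];
    on = :stnid,
)
```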

5.4 Modern Data Manipulation with TidierData.jl

Now let’s explore our data using TidierData.jl, which provides a modern, readable syntax for data manipulation. TidierData brings tidyverse-style operations to Julia with intuitive verb names and pipeline workflows.

For complete documentation, visit: https://tidierorg.github.io/Tidier.jl/dev/

5.4.1 Basic TidierData Operations

Let’s start with some basic data exploration using TidierData syntax:

@chain stations begin
    @select(stnid, noaa_id, name, state, latitude, longitude, years_of_data)
end
1. Start a pipeline where each operation flows into the next (like %>% in R)
2. Choose specific columns to display (similar to SQL SELECT)
816×7 DataFrame
811 rows omitted
Row stnid noaa_id name state latitude longitude years_of_data
Int64 String String String Float64 Float64 Int64
1 1 60-0011 CLEAR CK AT BAY AREA BLVD TX 29.4977 -95.1599 31
2 2 60-0019 TURKEY CK AT FM 1959 TX 29.5845 -95.1869 31
3 3 60-0022 ARMAND BYU AT GENOARED BLF RD TX 29.6345 -95.1123 31
815 815 87-0031 SECO CREEK AT MILLER RANCH TX 29.5731 -99.4028 20
816 816 99-2048 COTULLA TX 28.4567 -99.2183 116

Note that our rainfall measurements have explicit units (you’ll see inch in the output) thanks to the Unitful.jl package. This makes unit conversions automatic and error-free. For example, to convert to centimeters:

@chain rainfall_data begin
    @mutate(rainfall = uconvert(u"cm", rainfall))
end
1. Convert rainfall measurements from inches to centimeters automatically, with proper unit handling
61925×4 DataFrame
61920 rows omitted
Row stnid date year rainfall
Int64 Date Int64 Quantity…?
1 1 1987-06-11 1987 16.0274 cm
2 1 1988-09-02 1988 13.8684 cm
3 1 1989-08-01 1989 28.9306 cm
61924 816 2016-08-20 2016 8.7376 cm
61925 816 2017-09-25 2017 6.9088 cm
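More generally, Unitful keeps unit arithmetic honest: compatible units convert automatically, while incompatible ones raise an error instead of silently combining. A small sketch:

```julia
using Unitful

x = 1.0u"inch"
uconvert(u"cm", x)       # 2.54 cm (1 inch is exactly 2.54 cm by definition)

1.0u"inch" + 1.0u"cm"    # compatible units promote automatically

# 1.0u"inch" + 1.0u"s"   # incompatible units: throws a DimensionError
```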

5.4.2 Filtering and Transforming Data

# Find stations in Texas with lots of data
@chain stations begin
    @filter(state == "TX" && years_of_data > 50)
    @select(noaa_id, name, years_of_data)
    @arrange(desc(years_of_data))
end
1. Start a pipeline using our stations DataFrame
2. Keep only rows where state is “TX” AND years_of_data > 50
3. Choose which columns to show in our result
4. Sort by years_of_data in descending order (most data first)
638×3 DataFrame
633 rows omitted
Row noaa_id name years_of_data
String String Int64
1 41-0420 AUSTIN 169
2 41-7622 RIO GRANDE CITY 168
3 79-0043 BROWNSVILLE 168
637 41-6119 MT VERNON 51
638 41-8761 SUNRAY 4 SW 51
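For comparison, the same query can be written with base DataFrames.jl functions (not required for the lab; long_tx is an illustrative name):

```julia
using DataFrames

long_tx = sort(
    subset(stations, :state => ByRow(==("TX")), :years_of_data => ByRow(>(50))),
    :years_of_data;
    rev = true,
)[:, [:noaa_id, :name, :years_of_data]]
```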

5.4.3 Adding Calculated Columns

# Create categorical data quality labels
@chain stations begin
    @mutate(
        data_category = case_when(
            years_of_data >= 60 => "Long-term (60+ years)",
            years_of_data >= 30 => "Medium-term (30-59 years)",
            years_of_data >= 10 => "Short-term (10-29 years)",
            true => "Very short (<10 years)",
        )
    )
    @select(noaa_id, name, years_of_data, data_category)
end
1. Start our data pipeline with the stations DataFrame
2. Create or modify columns in the DataFrame
3. Use conditional logic to create a new categorical column
4. If 60+ years of data, categorize as “Long-term”
5. If 30-59 years of data, categorize as “Medium-term”
6. If 10-29 years of data, categorize as “Short-term”
7. Default case for stations with less than 10 years
8. Select only the columns we want to display
816×4 DataFrame
811 rows omitted
Row noaa_id name years_of_data data_category
String String Int64 String
1 60-0011 CLEAR CK AT BAY AREA BLVD 31 Medium-term (30-59 years)
2 60-0019 TURKEY CK AT FM 1959 31 Medium-term (30-59 years)
3 60-0022 ARMAND BYU AT GENOARED BLF RD 31 Medium-term (30-59 years)
815 87-0031 SECO CREEK AT MILLER RANCH 20 Short-term (10-29 years)
816 99-2048 COTULLA 116 Long-term (60+ years)

5.4.4 Grouping and Summarizing Data

# Analyze temporal coverage of rainfall data
obs_per_year = @chain rainfall_data begin
    @group_by(year)
    @summarise(
        station_count = n(),
    )
    @arrange(desc(station_count))
end
1. Start with the rainfall DataFrame to see which years have the most station coverage
2. Group observations by year to count stations active each year
3. Calculate summary statistics for each year group
4. Count how many stations reported data in each year
5. Sort years by station count (years with the most stations first)
170×2 DataFrame
165 rows omitted
Row year station_count
Int64 Int64
1 1995 735
2 1996 735
3 1997 735
169 1850 4
170 1849 1
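The base DataFrames.jl equivalent of this group-and-summarize pipeline uses groupby plus combine; a sketch (obs_per_year_df is an illustrative name):

```julia
using DataFrames

obs_per_year_df = sort(
    combine(groupby(rainfall_data, :year), nrow => :station_count),
    :station_count;
    rev = true,
)
```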

5.5 Creating a Map of Weather Stations

Let’s visualize all the weather stations on a map, with colors showing how many years of data each station has.

For more advanced mapping techniques, see: https://geo.makie.org/dev/introduction

For complete Makie plotting documentation, visit: https://docs.makie.org/stable/

function plot_stations()

    # Create map plot colored by years of data
    fig = Figure(size = (800, 600))
    ga = GeoAxis(fig[1, 1]; source = "+proj=latlong", dest = "+proj=merc",
        title = "Texas Atlas-14 Stations", xgridvisible = false, ygridvisible = false)

    # Add US states (white with black borders)
    states = GeoMakie.naturalearth("admin_1_states_provinces_lakes", 110)
    poly!(ga, states.geometry;
        strokecolor = :black, strokewidth = 1, color = :white)

    # Create scatter plot with viridis colormap
    scatter!(ga, stations.longitude, stations.latitude;
        color = stations.years_of_data, colormap = :viridis, markersize = 10)

    # Set plot bounds based on data bounds plus delta
    delta = 0.3  # degrees of padding around the data bounds
    min_lon, max_lon = extrema(stations.longitude)
    min_lat, max_lat = extrema(stations.latitude)

    xlims!(ga, min_lon - delta, max_lon + delta)
    ylims!(ga, min_lat - delta, max_lat + delta)

    # Add colorbar
    Colorbar(fig[1, 2], label = "Years of Data", colormap = :viridis,
        limits = (minimum(stations.years_of_data), maximum(stations.years_of_data)))

    return fig
end
plot_stations()
1. Create a GeoAxis with a map projection: converts lat/lon coordinates to Mercator for proper display
2. Download US state boundaries from Natural Earth (110 = medium resolution) for the map background
3. Plot weather stations colored by years of data using the viridis color scheme

5.6 Finding Stations Near a Target Location

Let’s say we want to study rainfall at a particular point. We’ll use Rice University in Houston as our example:

target_location = "Rice University"
target_lon = -95.404970
target_lat = 29.718952
1. Descriptive name for our target location (change this for different places!)
2. Rice University’s longitude coordinate (negative means west of Greenwich)
3. Rice University’s latitude coordinate (positive means north of the equator)

5.7 Calculating Distances to Weather Stations

Now we need to calculate how far each weather station is from our target location using the Haversine formula:

# Haversine formula for great circle distance with units
function calc_distance(lon1, lat1, lon2, lat2)
    R = 6378.0u"km"

    # Convert degrees to radians
    φ1 = deg2rad(lat1)
    φ2 = deg2rad(lat2)
    Δφ = deg2rad(lat2 - lat1)
    Δλ = deg2rad(lon2 - lon1)

    # Haversine formula
    a = sin(Δφ / 2)^2 + cos(φ1) * cos(φ2) * sin(Δλ / 2)^2
    c = 2 * atan(sqrt(a), sqrt(1 - a))

    return R * c
end

stations = @chain stations begin
    @mutate(distance_km = calc_distance(longitude, latitude, !!target_lon, !!target_lat))
    @arrange(distance_km)
end
stations
1. Define a function that calculates great circle distance between two lat/lon points
2. Earth’s radius in kilometers (using proper units!)
3. Convert the first latitude from degrees to radians (φ is Greek phi)
4. Convert the second latitude to radians
5. Calculate the difference in latitudes (Δ is Greek delta, meaning “change in”)
6. Calculate the difference in longitudes (λ is Greek lambda)
7. First part of the Haversine formula: the angular distance component
8. Complete the great circle distance calculation
9. Multiply by Earth’s radius to get distance in kilometers
10. Start a TidierData pipeline with our stations DataFrame
11. Add a distance column, using !! to inject external variables into the TidierData expression
12. Sort the stations by distance (closest first)
13. Display the sorted stations (shows the closest ones first due to the ENV row limit)
816×8 DataFrame
811 rows omitted
Row stnid noaa_id name state latitude longitude years_of_data distance_km
Int64 String String String Float64 Float64 Int64 Quantity…
1 780 79-0056 HOUSTON WB CITY TX 29.7622 -95.3593 129 6.53163 km
2 377 41-4321 HOUSTON HEIGHTS TX 29.7914 -95.4261 62 8.31921 km
3 380 41-4326 HOUSTON-PORT TX 29.7456 -95.28 29 12.4388 km
815 807 79-0125 EL PASO TX 31.7587 -106.484 167 1083.55 km
816 428 41-4931 LA TUNA 1 S TX 31.98 -106.597 75 1098.28 km

The Haversine formula accounts for Earth’s curvature, giving much more accurate distances than simple straight-line calculations.
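A quick way to sanity-check a distance function is against a known benchmark: one degree of latitude spans roughly 111 km anywhere on the globe. Here is a self-contained, unit-free copy of the same formula (haversine_km is an illustrative name; the lab's calc_distance with Unitful is the version to use):

```julia
# Haversine distance in kilometers, without Unitful, for quick checks
function haversine_km(lon1, lat1, lon2, lat2)
    R = 6378.0  # Earth's radius in km, as above
    φ1, φ2 = deg2rad(lat1), deg2rad(lat2)
    Δφ = deg2rad(lat2 - lat1)
    Δλ = deg2rad(lon2 - lon1)
    a = sin(Δφ / 2)^2 + cos(φ1) * cos(φ2) * sin(Δλ / 2)^2
    return R * 2 * atan(sqrt(a), sqrt(1 - a))
end

haversine_km(-95.0, 29.0, -95.0, 30.0)  # ≈ 111 km (one degree of latitude)
```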

5.8 Examining the Closest Weather Station

Let’s focus on the closest weather station and look at its rainfall data:

closest_station = first(stations)
closest_stnid = closest_station.stnid
closest_rainfall = @chain rainfall_data begin
    @filter(stnid == !!closest_stnid)
end
closest_rainfall
1. Filter the rainfall data using !! to inject the external variable into TidierData
129×4 DataFrame
124 rows omitted
Row stnid date year rainfall
Int64 Date Int64 Quantity…?
1 780 1889-05-18 1889 3.18 inch
2 780 1890-04-02 1890 4.75 inch
3 780 1891-01-08 1891 5.89 inch
128 780 2016-04-17 2016 5.69 inch
129 780 2017-08-26 2017 14.7 inch
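Before plotting, a few summary statistics give a quick feel for the record; a sketch using the Statistics standard library (assumes the closest_rainfall DataFrame from above):

```julia
using Statistics

# Drop missings so the summary functions work on a plain vector
vals = collect(skipmissing(closest_rainfall.rainfall))
println("observations: ", length(vals))
println("mean:   ", mean(vals))
println("median: ", median(vals))
println("max:    ", maximum(vals))
```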

5.9 Plotting a Time Series

Now let’s create a time series plot to see how rainfall has changed over time at this station:

function plot_time_series(station_row, rainfall_df)
    # Create time series plot of rainfall data
    fig = Figure(size = (800, 400))
    ax = Axis(fig[1, 1],
        ylabel = "Annual Maximum Rainfall",
        title = "$(station_row.noaa_id): $(station_row.name)",
        dim2_conversion = rainfall_conversion)

    # Plot the time series
    lines!(ax, rainfall_df.date, rainfall_df.rainfall)
    scatter!(ax, rainfall_df.date, rainfall_df.rainfall,
        markersize = 10, marker = :circle, strokewidth = 2, color = :transparent)

    fig
end

plot_time_series(closest_station, closest_rainfall)
1. Use string interpolation $() to insert the station ID and name into the plot title
2. Apply the unit converter to display rainfall in inches instead of deca-inches
3. Plot using the date column to show the precise timing of annual maximum events
4. Add hollow circles with transparent fill to emphasize individual data points
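With the CairoMakie backend active you can also write a figure to disk; Makie's save function infers the format from the file extension. A sketch (the filename is just an example):

```julia
fig = plot_time_series(closest_station, closest_rainfall)
save("closest_station_timeseries.png", fig)  # .svg or .pdf also work with CairoMakie
```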

5.10 Comparing Multiple Nearby Stations

This is interesting! How does this compare to other nearby stations? Let’s plot several stations together:

# Compare with other nearby stations
function plot_nearby_comparison(stations_df, rainfall_data; n_stations = 4)
    fig = Figure(size = (800, 400))
    ax = Axis(fig[1, 1],
        ylabel = "Annual Maximum Rainfall",
        title = "Comparison of $(n_stations) Nearest Stations to $(target_location)",
        dim2_conversion = rainfall_conversion)

    colors = [get(colorschemes[:viridis], i / n_stations) for i in 0:n_stations-1]

    for (i, station) in enumerate(eachrow(stations_df[1:n_stations, :]))
        station_id = station.stnid
        rainfall = @chain rainfall_data begin
            @filter(stnid == !!station_id)
        end
        lines!(ax, rainfall.year, rainfall.rainfall,
            color = colors[i], linewidth = 2,
            label = "$(station.noaa_id)")
        scatter!(ax, rainfall.year, rainfall.rainfall,
            color = colors[i], markersize = 10)
    end
    xlims!(ax, 1950, 2025)

    axislegend(ax, position = :lt)
    return fig
end

plot_nearby_comparison(stations, rainfall_data; n_stations = 4)
1. Generate evenly spaced colors from the viridis palette for each station
2. Loop through the first n_stations closest weather stations
3. Use !! to inject the loop variable into the TidierData filter expression
4. Add a legend in the top-left corner to identify the different stations

5.11 GEV Distribution and Return Period Analysis

Now we’ll work with a GEV distribution to create return period plots and compare with observed data using Weibull plotting positions.

Your task: Experiment with different GEV parameters to see how well you can match the observed data from your closest station. You’ll need to systematically test each parameter to answer the questions at the end. Start with these parameters, then modify them one at a time:

# GEV parameters for Houston area (units: inches)
# Try modifying these values to better fit your data!
μ = 4.0    # Location parameter - shifts the distribution left/right
σ = 1.3    # Scale parameter - controls spread (must be > 0)
ξ = 0.15   # Shape parameter - controls tail behavior

houston_gev = GeneralizedExtremeValue(μ, σ, ξ)
1. Create a GEV distribution with the specified parameters
Distributions.GeneralizedExtremeValue{Float64}(μ=4.0, σ=1.3, ξ=0.15)

5.12 Return Period Plot

# Function to calculate return levels from GEV
function gev_return_level(gev_dist, return_period)
    μ, σ, ξ = params(gev_dist)
    if abs(ξ) < 1e-6  # Gumbel case
        return μ - σ * log(-log(1 - 1 / return_period))
    else
        return μ - (σ / ξ) * (1 - (-log(1 - 1 / return_period))^(-ξ))
    end
end

# Create return period plot
function plot_return_periods(gev_dist)
    fig = Figure(size = (800, 400))
    ax = Axis(fig[1, 1],
        xlabel = "Return Period (years)",
        ylabel = "Return Level (inches)",
        title = "GEV Return Period Plot",
        xscale = log10)

    # Plot curve and highlight standard return periods
    T_smooth = 10 .^ range(log10(1.1), log10(250), length = 100)
    levels_smooth = [gev_return_level(gev_dist, T) for T in T_smooth]
    lines!(ax, T_smooth, levels_smooth, color = :blue, linewidth = 2, label = "GEV Model")

    return_periods = [5, 10, 25, 50, 100, 250]
    ax.xticks = return_periods
    return fig
end

plot_return_periods(houston_gev)
1. Extract GEV parameters from the distribution object
2. Gumbel formula for the ξ ≈ 0 case
3. General GEV return level formula
4. Log scale for the x-axis
5. Standard return periods for flood analysis
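Our gev_return_level is the GEV quantile function in disguise: the T-year return level is the (1 - 1/T) quantile of the distribution. Distributions.jl can therefore serve as a cross-check; a sketch:

```julia
using Distributions

d = GeneralizedExtremeValue(4.0, 1.3, 0.15)
T = 100

# 100-year return level straight from the quantile function
level_from_quantile = quantile(d, 1 - 1 / T)

# Same value from the closed-form expression used in gev_return_level
μ, σ, ξ = params(d)
level_from_formula = μ - (σ / ξ) * (1 - (-log(1 - 1 / T))^(-ξ))
```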

5.13 Weibull Plotting Positions - Comparing Theory vs Reality

Now let’s see how well your chosen GEV parameters match the actual observed data! The red dots show the empirical return periods calculated from your station’s data using Weibull plotting positions.

What to look for as you experiment:

  • Do the red dots follow the blue GEV curve?
  • Are there systematic deviations at high or low return periods?
  • How does each parameter change affect the fit?

Systematic experimentation approach:

  1. First, try increasing μ by 1-2 inches, then decreasing it. Note the effect.
  2. Next, try increasing σ by 0.5, then decreasing it. Note the effect.
  3. Finally, try changing ξ to 0.0 (Gumbel), then to -0.1 (light tail), then 0.3 (heavy tail). Note the effect.
  4. Find your best combination!

# Weibull plotting positions
function weibull_plotting_positions(data)
    clean_data = sort(collect(skipmissing(data)))
    n = length(clean_data)
    plotting_positions = [i / (n + 1) for i in 1:n]
    empirical_return_periods = [1 / (1 - p) for p in plotting_positions]
    return clean_data, empirical_return_periods
end

# Plot empirical vs theoretical
function plot_empirical_vs_theoretical(station_data, gev_dist, station_info)
    # Start with the basic return period plot
    fig = plot_return_periods(gev_dist)
    ax = fig.content[1]  # Get the actual Axis object

    # Update title to include station info
    ax.title = "Empirical vs Theoretical Return Periods\n$(station_info.noaa_id): $(station_info.name)"

    # Add empirical data with Weibull positions
    emp_levels, emp_periods = weibull_plotting_positions(station_data.rainfall)
    scatter!(ax, emp_periods, emp_levels,
        color = :red, markersize = 10,
        label = "Observed Data (Weibull positions)")

    axislegend(ax, position = :rb)

    return fig
end

plot_empirical_vs_theoretical(closest_rainfall, houston_gev, closest_station)
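As a concrete check of the plotting-position logic: with n observations, the i-th smallest value gets non-exceedance probability i/(n+1) and empirical return period (n+1)/(n+1-i). A tiny worked example (self-contained copy of the function above):

```julia
function weibull_plotting_positions(data)
    clean_data = sort(collect(skipmissing(data)))
    n = length(clean_data)
    plotting_positions = [i / (n + 1) for i in 1:n]
    empirical_return_periods = [1 / (1 - p) for p in plotting_positions]
    return clean_data, empirical_return_periods
end

vals, T = weibull_plotting_positions([3.2, 1.1, 5.0, 2.4])
# vals == [1.1, 2.4, 3.2, 5.0]
# T    ≈  [1.25, 1.67, 2.5, 5.0]  (the largest value gets a 5-year empirical return period)
```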

6 Analysis Questions

6.2 GEV Parameter Experimentation

Starting Parameters vs Final Parameters:

Record your starting parameters and the final parameters you chose after experimentation:

  • Starting: μ = 4.0, σ = 1.3, ξ = 0.15
  • Final: μ = …, σ = …, ξ = …

Parameter Effects:

What happened when you increased/decreased μ (location parameter)?

Test values like μ = 2.0, 4.0, 6.0. Describe how the entire GEV curve shifted and how this affected the fit to your red data points.

What happened when you increased/decreased σ (scale parameter)?

Test values like σ = 0.8, 1.3, 2.0. Describe how the curve became steeper/flatter and how this affected the spread of return levels.

What happened when you changed ξ (shape parameter)?

Test ξ = -0.1 (light tail), ξ = 0.0 (Gumbel), ξ = 0.15, and ξ = 0.3 (heavy tail). Focus on how the high return period end (right side) of the curve changed.

Final Assessment:

How well does your best-fit GEV distribution represent the closest station’s data?

Do the red dots follow your blue curve? Where are the biggest deviations?

Additional observations:

Optional: Add any other interesting patterns you notice in the return period analysis.