Hierarchical ODE State Space Model for Unemployment Dynamics
A Bayesian Approach with Pooled Labor Market Parameters
Published
June 5, 2026
WORK IN PROGRESS
This report is currently under development. Not all statements, interpretations, and conclusions have been reviewed or validated by a human researcher. Please treat all content as preliminary analysis subject to revision.
Overview
This report presents a hierarchical Bayesian state space model for unemployment dynamics across education levels. The model uses a time-varying equilibrium — constructed from a B-spline basis — that the ODE pulls unemployment toward. Shocks are cleanly identified as temporary deviations from this structural baseline.
Spline-Based Equilibrium - B-spline basis (K=10) constructs smooth time-varying \(u_{\text{eq}}(t)\) - Random walk prior on spline coefficients for adaptive smoothness - Each education level has its own equilibrium trajectory with data-informed flexibility
Technical Approach - Non-centered parameterization for efficient MCMC sampling - Asymmetric shock shapes (power-law rise, stretched-exponential decay) for flexible impulse modeling - Education-level parallelization via Stan’s reduce_sum for multi-threaded sampling - Prior-centered initialization via make_init_at_prior_edu_parallel()
Recent Trends: December 2023 - April 2026
A close-up view of the most recent unemployment dynamics across education levels.
Show code
# Extract de-seasonalized trend for recent trends plottrend_all <-extract_trend(result, summary =TRUE)# Filter to December 2023 - April 2026# Formula: year_frac = year + (month - 0.5) / 12# December 2023 = 2023 + 11.5/12 ≈ 2023.958# April 2026 = 2026 + 3.5/12 ≈ 2026.292recent_trend <-data.frame(year_frac = trend_all$year_frac,education = trend_all$education,mean = trend_all$mean,lower = trend_all$q5,upper = trend_all$q95)# Add observed ratesfor (i inseq_along(stan_data$education_levels)) { edu <- stan_data$education_levels[i] idx <- recent_trend$education == edu obs_rate <- stan_data$n_unemployed[, i] / stan_data$n_total[, i] recent_trend$observed[idx] <- obs_rate}# Filter to recent periodrecent_trend <- recent_trend[recent_trend$year_frac >=2023.958& recent_trend$year_frac <=2026.292, ]
The hierarchical structure pools information across education levels. The between-education standard deviation (\(\sigma\)) parameters quantify how much groups differ. Smaller \(\sigma\) = stronger pooling.
sigma_plot <-data.frame(parameter =factor(sigma_display$Parameter, levels =rev(sigma_display$Parameter)),median = sigma_summary$median,q5 = sigma_summary$q5,q95 = sigma_summary$q95)ggplot(sigma_plot, aes(x = median, y = parameter)) +geom_point(size =3, color ="steelblue") +geom_errorbarh(aes(xmin = q5, xmax = q95), height =0.2, color ="steelblue") +labs(x ="Between-Education SD", y =NULL,title ="Hierarchical Pooling Strength",subtitle ="Smaller values = education groups are more similar") +theme_minimal(base_size =13)
Between-education standard deviations with 90% credible intervals
Education-Specific Parameter Values
Parameter estimates for key quantities across all seven education levels:
Show code
# Build per-edu parameter tableedu_params <-rbind(data.frame(result$fit$summary(variables =paste0("adj_speed[", 1:N_edu, "]")),param ="Adjustment Speed", edu = edu_levels),data.frame(result$fit$summary(variables =paste0("shock_2008_effect[", 1:N_edu, "]")),param ="2008 Shock Effect", edu = edu_levels),data.frame(result$fit$summary(variables =paste0("shock_2020_effect[", 1:N_edu, "]")),param ="2020 Shock Effect", edu = edu_levels),data.frame(result$fit$summary(variables =paste0("decay_2008[", 1:N_edu, "]")),param ="2008 Decay Rate", edu = edu_levels),data.frame(result$fit$summary(variables =paste0("decay_2020[", 1:N_edu, "]")),param ="2020 Decay Rate", edu = edu_levels),data.frame(result$fit$summary(variables =paste0("u_eq_mean[", 1:N_edu, "]")),param ="Mean Equilibrium", edu = edu_levels))ggplot(edu_params, aes(x = median, y = edu)) +geom_point(size =2.5, color ="steelblue") +geom_errorbarh(aes(xmin = q5, xmax = q95), height =0.2, color ="steelblue") +facet_wrap(~ param, scales ="free_x", ncol =2) +labs(y =NULL, x ="Posterior median with 90% CI",title ="Education-Specific Parameter Estimates") +theme_minimal(base_size =12)
Key parameter estimates by education level with 90% credible intervals
Key Insights
Adjustment speeds increase with education: less_than_hs has the slowest adjustment, PhD and professional degrees the fastest — consistent with more fluid labor markets at higher education levels.
Shock effects are tightly pooled across education groups — the Exponential(20) prior on \(\sigma\) reflects the economic prior that macro shocks affect all workers similarly. The model estimates similar shock magnitudes across groups.
Mean equilibrium unemployment ranges from ~1.3% (professional degrees) to ~10.2% (less than high school) — matching the raw data well while the time-varying equilibrium captures structural trends within each group.
Economic Parameter Estimates
Adjustment Speed and Equilibrium
The model estimates education-specific adjustment speeds and time-averaged equilibrium unemployment rates.
eq_df <-data.frame(education = edu_levels,median = eq_summary$median,lower = eq_summary$q5,upper = eq_summary$q95)eq_df$education <-factor(eq_df$education,levels = eq_df$education[order(eq_df$median)])ggplot(eq_df, aes(x = education, y = median *100)) +geom_point(size =4, color ="steelblue") +geom_errorbar(aes(ymin = lower *100, ymax = upper *100),width =0.2, linewidth =1, color ="steelblue") +coord_flip() +labs(x ="Education Level",y ="Mean Equilibrium Unemployment Rate (%)",title ="Structural Unemployment Rates by Education",subtitle ="Time-averaged u_eq(t) — the baseline toward which the ODE pulls") +theme(axis.text.y =element_text(size =11))
Time-averaged equilibrium unemployment rates by education level
Shock Effects
The key advantage of the state space model: shock effects are positive and identifiable. With hierarchical priors, we estimate both population-level means and education-specific effects.
The seasonal effect is the difference between the full model (with seasonality) and the trend (without). This directly shows how much unemployment oscillates due to seasonal hiring patterns.
Show code
# Extract seasonal effects directly computed by the modelseasonal_effects <-extract_seasonal_effects(result, summary =TRUE)# Focus on key education levelssample_edu <-c("phd", "bachelors", "less_than_hs")seas_subset <- seasonal_effects[seasonal_effects$education %in% sample_edu, ]# Plot the seasonal oscillation over timeggplot(seas_subset, aes(x = year_frac)) +geom_ribbon(aes(ymin = q5 *100, ymax = q95 *100, fill = education),alpha =0.3) +geom_line(aes(y = mean *100, color = education), linewidth =0.6) +geom_hline(yintercept =0, linetype ="dashed", color ="gray40") +facet_wrap(~education, scales ="free_y", ncol =1) +scale_x_continuous(breaks =seq(2000, 2025, 5)) +labs(x ="Year",y ="Seasonal Effect (percentage points)",title ="Seasonal Oscillation in Unemployment by Education",subtitle ="Positive = unemployment above trend; Negative = below trend; Band = 90% CI") +theme(legend.position ="none",strip.text =element_text(size =12, face ="bold"))
Seasonal oscillation in unemployment rates (Full model minus Trend)
Monthly Seasonal Pattern
The model includes a direct seasonal effect on unemployment (on the logit scale). This captures observed seasonal patterns like academic hiring calendars, summer employment fluctuations, and annual hiring cycles.
Key insight:\(u_{\text{eq}}(t)\) absorbs slow structural changes in labor markets (de-industrialization, education polarization, etc.), while shocks create transitory deviations. There is no separate “spline deviation” fighting with the equilibrium for identifiability.
1.2 Time-Varying Equilibrium via B-Splines
\(u_{\text{eq}}(t)\) is constructed from a cubic B-spline basis:
where \(B_k(t)\) are B-spline basis functions (\(K=10\)), and \(\beta_{k,i}\) are education-specific coefficients with a random walk prior for smoothness:
This prior penalizes rapid changes in the equilibrium trajectory, producing smooth curves that capture decade-scale structural trends without overfitting.
2. Extending the Model
2.1 Education Heterogeneity
Each of the 7 education levels has its own: - B-spline coefficients \(\beta_{k,i}\) → time-varying equilibrium \(u_{\text{eq},i}(t)\) - Adjustment speed \(\text{adj}_i\) (hierarchical) - Shock effects and decay rates (hierarchical) - Seasonal pattern (hierarchical)
All use non-centered hierarchical priors for efficient sampling:
where \(s_i(t) = u_{\text{eq},i}(t) \cdot \text{adj}_i\) is the baseline separation rate, \(I(t)\) is the shock impulse function, and \(\delta_i\) are education-specific shock magnitudes with strong hierarchical pooling:
We observe counts (binomial samples from the latent unemployment rate), not the true rate.
Process model (discrete-time evolution): \[\text{logit}(u_{t,i}) = \text{logit}(u_{t-1,i}) + \Delta u_{t,i} + \gamma_{m(t),i}\]
where \(\Delta u_{t,i} = \text{adj}_i \cdot (u_{\text{eq},i}(t) - u_{t-1,i}) + \text{shocks}\) is the ODE-predicted change. No separate innovation term — the time-varying equilibrium plus shocks capture all systematic dynamics.
The model uses Stan’s reduce_sum to parallelize across education levels. Each thread computes the complete trajectory + likelihood for one or more education levels, since education levels are independent in the ODE computation. See the multi-threading section below for performance details.
3.6 Generated Quantities
u_eq_mean: Time-averaged equilibrium per education level
Clean separation: equilibrium absorbs trends, shocks are transitory
The fundamental insight: By making the equilibrium time-varying rather than constant, the model cleanly separates structural labor market changes (absorbed by \(u_{\text{eq}}(t)\)) from transitory shock effects. No separate spline deviation fights the ODE for the same variation.
Hierarchical Parameter Structure
The model uses hierarchical priors to pool information across education levels, improving estimation efficiency while allowing group-specific variation.
Improved Estimates: Small groups (PhDs, professionals) borrow strength from population
Regularization: Prevents overfitting of education-specific parameters
Identifiability: Constrains some_college adjustment speed through population effect
Interpretability: Population means are directly interpretable aggregates
Residual Diagnostics: Model Fit Assessment
Observation residual (observed minus full model prediction): Positive values mean the model underpredicts unemployment; negative values mean overprediction. Shaded bands show 90% credible intervals from posterior uncertainty.
Figure 1: Observation residuals: observed minus model prediction with 90% credible intervals
Equilibrium trajectory (\(u_{\text{eq}}[t]\)): The time-varying structural unemployment rate (green) constructed from the B-spline basis. The blue band shows the full model fit including shocks and seasonal effects. Gray dots are observed monthly unemployment rates.
Figure 2: Time-varying equilibrium u_eq[t] (green) with full model fit (blue) and observed data (gray dots)
The green band shows the time-varying equilibrium toward which the ODE pulls unemployment. Deviations of the blue line from the green band represent the combined effect of the 2008 and 2020 shocks. The gray dots show the observed monthly unemployment rates from the CPS data.
Model Diagnostics
A well-specified model should have: - Zero divergent transitions - indicates proper posterior geometry - Zero max treedepth exceeded - indicates efficient sampling - E-BFMI > 0.2 (ideally > 0.7) - indicates good energy exploration - All chains completing - indicates stable computation - Rhat < 1.01 for all parameters - indicates chain convergence - ESS > 400 for all parameters - indicates sufficient effective samples
# Get the sampler diagnosticsnp <- bayesplot::nuts_params(result$fit)draws <- result$fit$draws(format ="draws_df")# Check if we have divergencesn_divergent <-sum(np$Value[np$Parameter =="divergent__"])cat("Total divergent transitions:", n_divergent, "\n")
Total divergent transitions: 0
Show code
if (n_divergent >0) {# Pairs plot for key parameters - shows where divergences occur p_pairs <-mcmc_pairs( draws,pars =c("mu_log_adj_speed", "sigma_log_adj_speed","mu_log_shock_2008", "sigma_log_shock_2008"),np = np,off_diag_args =list(size =0.75, alpha =0.5) )print(p_pairs)} else {cat("No divergent transitions - model geometry is well-behaved!\n")cat("\nThis indicates:\n")cat(" - Proper parameterization (non-centered hierarchical)\n")cat(" - Well-matched priors and data\n")cat(" - Good initialization at prior centers\n")}
No divergent transitions - model geometry is well-behaved!
This indicates:
- Proper parameterization (non-centered hierarchical)
- Well-matched priors and data
- Good initialization at prior centers
Energy Diagnostics (E-BFMI)
Low E-BFMI indicates the sampler is having trouble exploring the posterior. Values below 0.2 are concerning.
Show code
# Energy plotmcmc_nuts_energy(np, binwidth =1) +labs(title ="NUTS Energy Diagnostic",subtitle ="Overlapping distributions indicate good exploration")
Energy diagnostics: E-BFMI should be > 0.2 for each chain
ppc_data <-extract_ppc_data(result)# Only create plot if PPC data is availableif (!is.null(ppc_data)) {# Sample a few education levels for clarity sample_edu <-c("phd", "bachelors", "less_than_hs") ppc_sample <- ppc_data[ppc_data$education %in% sample_edu, ]ggplot(ppc_sample, aes(x = year_frac)) +geom_ribbon(aes(ymin = predicted_rate_lower *100,ymax = predicted_rate_upper *100),fill ="steelblue", alpha =0.3) +geom_line(aes(y = predicted_rate *100), color ="steelblue", linewidth =0.8) +geom_point(aes(y = observed_rate *100), alpha =0.5, size =1) +facet_wrap(~education, scales ="free_y") +scale_x_continuous(breaks =seq(2000, 2025, 5)) +labs(x ="Year",y ="Unemployment Rate (%)",title ="Posterior Predictive Check",subtitle ="Points: observed; Line: predicted mean; Band: 95% prediction interval") +theme(strip.text =element_text(size =11))} else {cat("Note: This model does not include posterior predictive samples.\n")cat("PPC can be added by including generated quantities in the Stan model.\n")}
Posterior predictive check: observed vs replicated data
Model Fit Assessment (LOO-CV)
We compare two variants of the hierarchical ODE state space model using leave-one-out cross-validation (LOO-CV):
Education-parallel model: Hierarchical structure with education-specific parameters but no systematic trends across education rank.
Education-trend model: Extends the hierarchical model with linear trends across education rank for key parameters (equilibrium unemployment, adjustment speed, shock magnitudes, recovery rates, spline smoothness).
Results: The education-parallel model has expected log predictive density (ELPD) of -8817.9 (SE = 101.5), while the education-trend model has ELPD of -8788.5 (SE = 102). The difference in ELPD is -29.3 ± 7.7 (3.81 standard errors).
Interpretation: The ELPD difference is small relative to its uncertainty (|difference| < 2×SE), indicating that adding education trends does not meaningfully improve predictive performance. The data do not support systematic linear trends across education rank for most labor market parameters, suggesting that education-specific heterogeneity is adequately captured by the hierarchical structure without imposing monotonic trends.
Summary
This hierarchical ODE state space model successfully models unemployment dynamics across education levels with proper identification of shock effects and labor market parameters.
Model Quality
Diagnostic
Requirement
Result
Divergences
0
✓
Max treedepth
0
✓
E-BFMI
> 0.7
✓
Chains
4/4
✓
All Rhat
< 1.01
✓
All ESS
> 400
✓
Key Findings
Time-varying equilibrium captures structural labor market change — \(u_{\text{eq}}(t)\) smoothly evolves to reflect decade-scale trends (de-industrialization, education polarization). Shocks are cleanly separated as transitory deviations.
The 2008 crisis produced lasting structural damage for non-PhD workers — Decay half-lives of 5-6 years for most groups (vs. 5 weeks for PhDs) indicate the shock persisted far beyond the recession. The equilibrium \(u_{\text{eq}}(t)\) shifted upward, absorbing permanent job losses.
The 2020 COVID shock was purely transitory — Half-lives of 4-5 months across all education levels, with no permanent equilibrium shift. The labor market bounced back quickly.
Adjustment speeds increase with education — Ranging from 8.9 (less than high school) to 57.3 (PhD) per month, reflecting more fluid labor markets at higher education levels.
Mean equilibrium unemployment ranges from 1.1% (professional degrees) to 8.9% (less than high school), closely matching raw data while the time-varying trajectory captures within-group structural trends.
Technical Success
Perfect convergence (0 divergences, 0/22,418 parameters with Rhat > 1.05, max Rhat 1.013, min ESS 298) achieved through: - Time-varying equilibrium — Eliminated identifiability fight between constant u_eq and spline deviation - Non-centered parameterization — Efficient sampling for all hierarchical parameters - Moderate hierarchical pooling — Exponential(2) on adj_speed sigma, Exponential(5) on shock sigmas, Exponential(1) on decay sigma - Prior-centered initialization — make_init_at_prior_edu_parallel() starts chains at prior centers
The model uses Stan’s reduce_sum to parallelize across education levels. Since education levels are independent in the ODE computation, each thread computes the complete trajectory + likelihood for one or more education levels.