Hierarchical ODE State Space Model for Unemployment Dynamics
A Bayesian Approach with Pooled Labor Market Parameters
WORK IN PROGRESS
This report is currently under development. Not all statements, interpretations, and conclusions have been reviewed or validated by a human researcher. Please treat all content as preliminary analysis subject to revision.
Overview
This report presents a hierarchical Bayesian state space model for unemployment dynamics across education levels. The model explicitly represents labor market flows through an ordinary differential equation (ODE) framework, with hierarchical pooling to share information across education groups.
Key Features
Economic Foundation - Explicit labor market dynamics: \(\frac{dU}{dt} = s(1-U) - fU\) - Separation rate (\(s\)): probability of job loss per month - Finding rate (\(f\)): probability of job finding per month - Equilibrium unemployment: \(u^* = \frac{s}{s+f}\)
Hierarchical Structure - Population-level means with education-specific deviations - Pooling strengthens inference for small groups - Hierarchical parameters: equilibrium unemployment, adjustment speed, shock magnitudes, recovery rates, seasonal effects, spline smoothness
The model is built on the stock-flow framework from labor economics. At any point in time, the labor force is divided into two stocks: employed (\(E\)) and unemployed (\(U\)), with \(E + U = L\) (total labor force). Workers flow between these states:
separation (s)
E ─────────────────────→ U
←────────────────────
finding (f)
1.1 The Continuous-Time ODE
The rate of change in unemployment follows the ordinary differential equation:
\[\frac{dU}{dt} = s \cdot E - f \cdot U = s \cdot (L - U) - f \cdot U\]
Dividing by \(L\) to get rates (lowercase \(u = U/L\)):
\[\frac{du}{dt} = s \cdot (1-u) - f \cdot u\]
Term-by-term interpretation:
\(s \cdot (1-u)\): Inflow to unemployment — employed workers (fraction \(1-u\)) lose jobs at rate \(s\)
\(f \cdot u\): Outflow from unemployment — unemployed workers (fraction \(u\)) find jobs at rate \(f\)
Net change = inflow − outflow
1.2 Equilibrium Analysis
At equilibrium, \(\frac{du}{dt} = 0\), so:
\[s \cdot (1-u^*) = f \cdot u^*\]
Solving for \(u^*\):
\[s - s \cdot u^* = f \cdot u^*\]\[s = u^* \cdot (s + f)\]\[\boxed{u^* = \frac{s}{s + f}}\]
Economic implications:
If \(s = 0.02\) (2% monthly separation) and \(f = 0.30\) (30% monthly finding): \(u^* = 0.02/(0.02+0.30) = 6.25\%\)
This captures observed seasonal patterns such as academic calendar effects for PhDs, summer employment fluctuations, and annual hiring cycles. Prior: \(\gamma^u_{m,i} \sim \text{Normal}(0, 0.05)\) with a sum-to-zero constraint.
Design choice: We use a direct effect on unemployment rather than modulating the finding rate because the two approaches are not separately identifiable from the data. The direct approach is simpler and more flexible.
2.4 State Space Formulation
We observe counts (binomial samples from the latent unemployment rate), not the true rate. The model has two layers:
Process model (latent state evolution): \[\text{logit}(u_{t,i}) = \text{logit}(u_{t-1,i}) + \Delta u_{t,i} + \epsilon_{t,i}\]
where: - \(\Delta u_{t,i}\) is the ODE-predicted change - \(\epsilon_{t,i} \sim \text{Normal}(0, \sigma_{\text{state}})\) is the innovation (captures unexplained dynamics)
Trend trajectory: Evolution without seasonal effects
Pure ODE trajectory: Evolution without innovations or seasonality
Seasonal effects: Difference between full model and trend
Log-likelihood: For LOO cross-validation
Posterior predictive: For model checking
4. Why This Model Succeeds Where GAM Fails
Feature
GAM Approach
State Space Model
Shock mechanism
Indicator variable (absorbed by trend)
Explicit ODE term with decay
Temporal structure
Smooth function (static)
Dynamic state evolution
Prior information
Penalty on curvature
Informative economic priors
Shock identification
Competes with 501-EDF trend
Structurally separated from trend
Interpretation
Abstract smooth values
Separation/finding rates
Recovery dynamics
Not modeled
Education-specific decay rates
Seasonality
Cyclic smooth
Finding rate multiplier
The fundamental insight: By building economic structure into the model (separation → unemployment → finding), shock effects have a clear mechanistic role. The explicit ODE dynamics provide interpretable estimates of labor market flows.
Hierarchical Parameter Structure
The model uses hierarchical priors to pool information across education levels, improving estimation efficiency while allowing group-specific variation.
A key question for hierarchical models: Which unemployment dynamics show the most education-specific variation?
The hierarchical structure allows us to directly compare the strength of pooling across all parameter groups by examining the between-education standard deviation (\(\sigma\)) for each.
Comparing pooling strength across all hierarchical parameters
Interpretation: Universal vs. Education-Specific Dynamics
Tight Pooling (σ < 0.15) - Nearly universal across education:
Show code
tight <- variance_summary[sigma_mean <0.15, parameter]if (length(tight) >0) {cat(paste0(" • ", tight, collapse ="\n"), "\n\n")cat("→ These dynamics are fundamental to all education levels\n")cat("→ Education-specific differences are small relative to uncertainty\n")cat("→ The data supports a common mechanism\n")} else {cat("No parameters show tight pooling.\n")}
• Spline smoothness
• Seasonal effects
→ These dynamics are fundamental to all education levels
→ Education-specific differences are small relative to uncertainty
→ The data supports a common mechanism
Education-specific parameter values for parameters with different pooling strengths
Key Insights
To meaningfully compare variability across parameters with different units, we use the coefficient of variation (CV = σ/|μ|), which measures relative variability:
Most Variable (by CV): Adjustment speed (adj_speed) (CV = 69.4%)
Education levels differ substantially in this aspect of unemployment dynamics
Hierarchical prior prevents overfitting while allowing genuine differences
Raw σ = 0.900
Least Variable (by CV): 2020 decay rate (CV = 1.8%)
All education levels share nearly identical dynamics for this parameter
Strong pooling is appropriate and improves estimation efficiency
Raw σ = 5.503
Practical Implications:
Parameters with high CV require education-specific interpretation
Parameters with low CV can be summarized at the population level
The hierarchy automatically discovers which dynamics are universal vs. education-specific
Note: Comparing raw σ values across parameters is invalid because they have different units
Data Preparation
Show code
# Data and model already loaded in setup chunk for report efficiency# Display data summarycat(sprintf("Time series length: %d months\n", stan_data$T))
# Check all chains completedif (length(result$diagnostics$ebfmi) <4) {cat(sprintf("WARNING: Only %d of 4 chains completed!\n", length(result$diagnostics$ebfmi)))}
Economic Parameter Estimates
Separation and Finding Rates
The model estimates education-specific job separation rates (probability of losing a job per month) and job finding rates (probability of finding a job per month while unemployed).
eq_df <-data.frame(education = stan_data$education_levels,mean = eq_summary$mean,lower = eq_summary$q5,upper = eq_summary$q95)# Order by mean rateeq_df$education <-factor(eq_df$education,levels = eq_df$education[order(eq_df$mean)])ggplot(eq_df, aes(x = education, y = mean *100)) +geom_point(size =4, color ="steelblue") +geom_errorbar(aes(ymin = lower *100, ymax = upper *100),width =0.2, linewidth =1, color ="steelblue") +coord_flip() +labs(x ="Education Level",y ="Equilibrium Unemployment Rate (%)",title ="Long-Run Unemployment Rates by Education",subtitle ="Based on estimated separation and finding rates") +theme(axis.text.y =element_text(size =11))
Equilibrium unemployment rates by education level (with 90% credible intervals)
Shock Effects
The key advantage of the state space model: shock effects are positive and identifiable. With hierarchical priors, we estimate both population-level means and education-specific effects.
The seasonal effect is the difference between the full model (with seasonality) and the trend (without). This directly shows how much unemployment oscillates due to seasonal hiring patterns.
Show code
# Extract seasonal effects directly computed by the modelseasonal_effects <-extract_seasonal_effects(result, summary =TRUE)# Focus on key education levelssample_edu <-c("phd", "bachelors", "less_than_hs")seas_subset <- seasonal_effects[seasonal_effects$education %in% sample_edu, ]# Plot the seasonal oscillation over timeggplot(seas_subset, aes(x = year_frac)) +geom_ribbon(aes(ymin = q5 *100, ymax = q95 *100, fill = education),alpha =0.3) +geom_line(aes(y = mean *100, color = education), linewidth =0.6) +geom_hline(yintercept =0, linetype ="dashed", color ="gray40") +facet_wrap(~education, scales ="free_y", ncol =1) +scale_x_continuous(breaks =seq(2000, 2025, 5)) +labs(x ="Year",y ="Seasonal Effect (percentage points)",title ="Seasonal Oscillation in Unemployment by Education",subtitle ="Positive = unemployment above trend; Negative = below trend; Band = 90% CI") +theme(legend.position ="none",strip.text =element_text(size =12, face ="bold"))
Seasonal oscillation in unemployment rates (Full model minus Trend)
Monthly Seasonal Pattern
The model includes a direct seasonal effect on unemployment (on the logit scale). This captures observed seasonal patterns like academic hiring calendars, summer employment fluctuations, and annual hiring cycles.
Direct monthly seasonal effects on unemployment by education level
Model Decomposition: Full vs Trend vs Pure ODE
This visualization compares three trajectories:
Full model (blue): Includes all components - ODE dynamics, shocks, seasonality, and stochastic innovations
Trend (orange): Removes seasonality but keeps innovations (shows underlying dynamics)
Pure ODE (red): Only structural ODE dynamics - no innovations, no seasonality (shows what separation/finding rates alone predict)
Show code
# Extract pure ODE trajectorypure_ode <-extract_pure_ode(result, summary =TRUE)# Debug: Check data dimensionscat("Full model data rows:", nrow(plot_data), "\n")
Full model data rows: 2170
Show code
cat("Trend data rows:", nrow(trend_data), "\n")
Trend data rows: 2170
Show code
cat("Pure ODE data rows:", nrow(pure_ode), "\n")
Pure ODE data rows: 2170
Show code
# Create separate data frames with only needed columnspure_ode_df <-data.frame(year_frac = pure_ode$year_frac,education = pure_ode$education,rate = pure_ode$mean,type ="Pure ODE")full_df <-data.frame(year_frac = plot_data$year_frac,education = plot_data$education,rate = plot_data$mean,type ="Full Model")trend_df <-data.frame(year_frac = trend_data$year_frac,education = trend_data$education,rate = trend_data$mean,type ="Trend (no seasonal)")# Combine alldecomp_data <-rbind(pure_ode_df, trend_df, full_df)# Debug: Check combined datacat("Combined data rows:", nrow(decomp_data), "\n")