Bayesian Structural Time Series

Bayesian structural time series models decompose a time series into interpretable components — trend, seasonality, regression effects, and irregular variation — with Bayesian priors governing the flexibility of each component.

yₜ = Zₜαₜ + εₜ, αₜ₊₁ = Tₜαₜ + Rₜηₜ

Bayesian structural time series (BSTS) models represent a time series as the sum of latent structural components, each evolving according to a state-space model. A trend component captures the long-run trajectory; a seasonal component captures periodic fluctuations; regression components capture the effects of covariates. The state-space formulation provides a unified framework for estimation, forecasting, and — crucially — causal impact analysis, where the question is: what would have happened if an intervention had not occurred?

The Bayesian approach adds spike-and-slab priors for automatic variable selection among potentially many regressors, and places priors on the variance parameters that control component smoothness. The result is a flexible, interpretable, and probabilistic model that has become widely used in technology companies for measuring the causal effect of marketing campaigns, product launches, and policy changes.

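State Space Formulation

Observation equation:   yₜ = Zₜαₜ + εₜ,    εₜ ~ N(0, Hₜ)
State equation:         αₜ₊₁ = Tₜαₜ + Rₜηₜ,   ηₜ ~ N(0, Qₜ)

Here αₜ is the latent state vector, Zₜ maps the state to the observation, Tₜ is the transition matrix, and Rₜ selects which elements of the state receive the disturbance ηₜ.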
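The simplest instance of these two equations is the local level model, where the state is a single number and Zₜ = Tₜ = Rₜ = 1. The following is a minimal sketch of the Kalman filter for that special case, using only the Python standard library. The function name, argument names, and the large initial variance standing in for a diffuse prior are illustrative assumptions, not part of any particular package.

```python
import math

def local_level_filter(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e7):
    """Kalman filter for the local level model (hypothetical helper).

    Scalar special case of the state-space form with Z = T = R = 1,
    H = sigma2_eps and Q = sigma2_eta. Returns the one-step-ahead
    predicted states and the Gaussian log-likelihood.
    """
    a, p = a0, p0                  # predicted state mean and variance
    predicted, loglik = [], 0.0
    for obs in y:
        predicted.append(a)
        v = obs - a                # one-step-ahead forecast error
        f = p + sigma2_eps         # variance of the forecast error
        k = p / f                  # Kalman gain
        loglik += -0.5 * (math.log(2 * math.pi * f) + v * v / f)
        a = a + k * v              # predicted state for the next period
        p = p * (1 - k) + sigma2_eta
    return predicted, loglik

# Usage: a constant series should pull the predicted level toward its value.
predicted, loglik = local_level_filter([5.0] * 50, sigma2_eps=1.0, sigma2_eta=0.1)
print(predicted[-1], loglik)
```

A full BSTS fit would additionally place priors on the variances and sample them, for example by MCMC; this sketch treats them as known.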

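Common Components

Local linear trend:  μₜ₊₁ = μₜ + δₜ + ω₁ₜ,   δₜ₊₁ = δₜ + ω₂ₜ
Seasonal:            γₜ₊₁ = −Σⱼ₌₀ˢ⁻² γₜ₋ⱼ + ω₃ₜ
Regression:          βₜxₜ  (static or time-varying coefficients)

In the seasonal component, s is the number of seasons per cycle; the recursion makes any s consecutive seasonal effects sum to zero in expectation.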
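To make the component recursions concrete, the sketch below simulates a series as the sum of a trend with a slope term and a seasonal component whose effects over one full cycle sum to zero in expectation. It is an illustration under stated assumptions: the function name, the starting values, and the choice to draw the initial seasonal effects from a standard normal are all hypothetical, and the regression component is omitted.

```python
import random

def simulate_structural_series(n, season_length, sd_level, sd_slope,
                               sd_seasonal, sd_obs, slope0=0.1, seed=0):
    """Simulate y_t = mu_t + gamma_t + noise (hypothetical helper).

    Assumes n >= season_length. Returns (y, mu, delta, gamma).
    """
    rng = random.Random(seed)
    S = season_length
    mu, delta = [0.0], [slope0]
    # S - 1 starting seasonal effects; the recursion determines the rest.
    gamma = [rng.gauss(0.0, 1.0) for _ in range(S - 1)]
    for t in range(n - 1):
        # Trend: the level moves by the current slope plus a disturbance.
        mu.append(mu[t] + delta[t] + rng.gauss(0.0, sd_level))
        # Slope: a random walk.
        delta.append(delta[t] + rng.gauss(0.0, sd_slope))
        # Seasonal: minus the sum of the previous S - 1 effects, plus noise.
        if t + 1 >= S - 1:
            previous = gamma[t - S + 2 : t + 1]
            gamma.append(-sum(previous) + rng.gauss(0.0, sd_seasonal))
    y = [mu[t] + gamma[t] + rng.gauss(0.0, sd_obs) for t in range(n)]
    return y, mu, delta, gamma

# Usage: four years of monthly data with a slowly drifting trend.
y, mu, delta, gamma = simulate_structural_series(
    n=48, season_length=12, sd_level=0.2, sd_slope=0.01,
    sd_seasonal=0.05, sd_obs=0.5)
print(y[:3])
```

Setting a disturbance standard deviation to zero freezes that component: with no slope or level noise the trend is a straight line, and with no seasonal noise the pattern repeats exactly.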

Causal Impact Analysis

The most prominent application of BSTS is the CausalImpact method (Brodersen et al., 2015), developed at Google. The idea is simple: fit a BSTS model to the time series before an intervention, then forecast how the series would have evolved over the post-intervention period had the intervention not occurred. The difference between the observed data and this counterfactual forecast is the estimated causal impact.

Causal Impact

Impact at time t = yₜ(observed) − yₜ(counterfactual)
Cumulative impact = Σₜ [yₜ(observed) − yₜ(counterfactual)]

Bayesian Posterior

The counterfactual yₜ has a full posterior predictive distribution, so the causal impact has a posterior distribution with credible intervals.
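Given posterior predictive draws of the counterfactual, the impact formulas above reduce to simple arithmetic on each draw. The sketch below assumes the draws are already available as a list of simulated post-intervention paths, one per posterior sample; here they are synthetic numbers generated for illustration, not output from a fitted model. The function names and the returned dictionary keys are hypothetical.

```python
import random
import statistics

def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def summarize_impact(observed, counterfactual_draws, level=0.95):
    """Summarize pointwise and cumulative impact (hypothetical helper).

    observed: post-intervention observations, length m.
    counterfactual_draws: list of simulated counterfactual paths, each length m.
    """
    n_draws = len(counterfactual_draws)
    # Posterior mean of the pointwise impact at each time step.
    pointwise_mean = [
        obs - sum(draw[t] for draw in counterfactual_draws) / n_draws
        for t, obs in enumerate(observed)
    ]
    # One cumulative impact per posterior draw.
    cumulative = sorted(
        sum(obs - cf for obs, cf in zip(observed, draw))
        for draw in counterfactual_draws
    )
    tail = (1 - level) / 2
    return {
        "pointwise_mean": pointwise_mean,
        "cumulative_mean": statistics.fmean(cumulative),
        "interval": (quantile(cumulative, tail), quantile(cumulative, 1 - tail)),
        "prob_positive": sum(c > 0 for c in cumulative) / n_draws,
    }

# Usage with synthetic draws (assumed numbers, for illustration only).
rng = random.Random(1)
observed = [110.0] * 20
draws = [[rng.gauss(100.0, 5.0) for _ in range(20)] for _ in range(2000)]
summary = summarize_impact(observed, draws)
print(summary["cumulative_mean"], summary["interval"], summary["prob_positive"])
```

Computing the cumulative impact separately for each draw, and only then taking quantiles, is what lets the interval reflect the joint uncertainty across time steps.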

The Bayesian framework is essential here because it produces a full posterior distribution over the counterfactual, not just a point estimate. The causal impact therefore comes with uncertainty intervals, which are reliable to the extent that the model is well specified. The analyst can say not just "we estimate the intervention increased sales by 15%" but "the probability that the intervention had a positive effect is 97%, with a 95% credible interval of 8% to 22%."

Why Not Just Use Difference-in-Differences?

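Traditional causal inference methods like difference-in-differences require a control group whose outcome would have followed the same trend as the treated unit's in the absence of the intervention (the parallel trends assumption). BSTS models can estimate causal effects even without such a control, using the pre-intervention time series and contemporaneous covariates (related time series that were not affected by the intervention) to build the counterfactual. The estimate is credible only if those covariates were truly unaffected by the intervention and their relationship to the target series stayed stable after it. Under those conditions, BSTS is applicable in many settings where controlled experiments are impossible but observational time series data are abundant.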

Spike-and-Slab Variable Selection

When many potential regressors are available, BSTS models use spike-and-slab priors to automatically select which covariates enter the model. Each coefficient has a prior that is a mixture of a point mass at zero (the "spike," representing exclusion) and a diffuse distribution (the "slab," representing inclusion). The posterior inclusion probability for each variable provides a natural measure of its relevance.
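The mixture structure can be illustrated by sampling from it directly. The sketch below draws coefficient vectors from an independent spike-and-slab prior with a normal slab and then computes inclusion probabilities as the fraction of draws in which each coefficient is nonzero. The independent normal slab and the function names are simplifying assumptions made for illustration; the slab used in a given BSTS implementation may differ.

```python
import random

def draw_spike_and_slab(n_coef, inclusion_prob, slab_sd, rng):
    """One coefficient vector from an independent spike-and-slab prior.

    Each coefficient is exactly zero (the spike) with probability
    1 - inclusion_prob, and otherwise drawn from N(0, slab_sd^2) (the slab).
    """
    return [
        rng.gauss(0.0, slab_sd) if rng.random() < inclusion_prob else 0.0
        for _ in range(n_coef)
    ]

def inclusion_probabilities(draws):
    """Fraction of draws in which each coefficient is nonzero."""
    n_draws = len(draws)
    n_coef = len(draws[0])
    return [sum(d[k] != 0.0 for d in draws) / n_draws for k in range(n_coef)]

# Usage: under the prior, the inclusion frequencies should be close to 0.2.
rng = random.Random(0)
prior_draws = [draw_spike_and_slab(5, 0.2, 1.0, rng) for _ in range(20000)]
print(inclusion_probabilities(prior_draws))
```

Applied to posterior rather than prior draws of the coefficients, the same `inclusion_probabilities` calculation yields the posterior inclusion probabilities described above.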

Software and Applications

The R package bsts (Scott and Varian, 2014) and Google's CausalImpact package, which builds on bsts, have made BSTS models accessible to practitioners. These models are used extensively for marketing mix modeling, website traffic analysis, demand forecasting, and economic policy evaluation. Hal Varian, Google's chief economist, has been a prominent advocate of BSTS methods for business applications.

"The key advantage of Bayesian structural time series models is that they produce probabilistic forecasts, which means we get not just a prediction but a measure of how uncertain that prediction is." — Hal Varian and Steven Scott, "Predicting the Present with Bayesian Structural Time Series" (2014)
