Bayesian Survival Analysis

Bayesian survival analysis applies Bayesian inference to time-to-event data, placing priors on hazard functions, regression coefficients, and frailty parameters to handle censoring, heterogeneity, and complex dependence structures in ways that frequentist methods cannot easily accommodate.

h(t | x) = h₀(t) · exp(β′x), β ~ π(β)

Survival analysis, also called time-to-event analysis or (in engineering) reliability analysis, studies the time until an event of interest occurs: death, machine failure, customer churn, disease recurrence. The defining feature of survival data is censoring. Under right censoring, the most common form, some subjects have not experienced the event by the end of the study or are lost to follow-up, so their event times are known only to exceed their observed follow-up time. Because this partial information must enter the likelihood, survival analysis requires methods beyond standard regression.

The Bayesian approach to survival analysis places prior distributions on the parameters of hazard models, baseline hazard functions, and latent variables. This framework naturally handles the complexities that arise in practice — cure fractions (some subjects may never experience the event), competing risks (multiple event types), time-varying covariates, and hierarchical data structures (patients nested within hospitals).

Key Survival Quantities

Survival function:   S(t) = P(T > t) = exp(−∫₀ᵗ h(u) du)
Hazard function:     h(t) = lim_{δ→0} P(t ≤ T < t+δ | T ≥ t) / δ
Cumulative hazard:   H(t) = ∫₀ᵗ h(u) du = −log S(t)
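
The three quantities above determine one another, which can be checked numerically. The following is a minimal sketch that assumes a Weibull hazard with shape 1.5 and scale 10 (illustrative values, not taken from the text) and compares the closed-form cumulative hazard with a numerical integral of h(u).

```python
import math

def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (k / lam) * (t / lam)^(k - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def weibull_cum_hazard(t, shape, scale):
    """Closed-form cumulative hazard H(t) = (t / lam)^k."""
    return (t / scale) ** shape

def weibull_survival(t, shape, scale):
    """S(t) = exp(-H(t))."""
    return math.exp(-weibull_cum_hazard(t, shape, scale))

def integrate_hazard(t, shape, scale, n_steps=10_000):
    """Midpoint-rule approximation of the integral of h(u) over [0, t]."""
    du = t / n_steps
    return du * sum(
        weibull_hazard((i + 0.5) * du, shape, scale) for i in range(n_steps)
    )

# Illustrative parameter values (assumptions for this sketch).
shape, scale, t = 1.5, 10.0, 4.0

H_closed = weibull_cum_hazard(t, shape, scale)
H_numeric = integrate_hazard(t, shape, scale)
S = weibull_survival(t, shape, scale)

print(f"H(t) closed form: {H_closed:.5f}")
print(f"H(t) by integrating h: {H_numeric:.5f}")
print(f"-log S(t): {-math.log(S):.5f}")
```

All three printed values should agree up to the integration error, reflecting H(t) = ∫₀ᵗ h(u) du = −log S(t).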

Cox Proportional Hazards (Bayesian)

h(t | x) = h₀(t) · exp(β′x)
β ~ N(0, Σ₀)   (prior on regression coefficients)
H₀(t) ~ Gamma process, or h₀(t) piecewise constant   (prior on the baseline hazard)

Bayesian Cox Regression

The Cox proportional hazards model is the most widely used survival model. It separates the hazard into a baseline hazard h₀(t), common to all subjects, and a multiplicative effect exp(β′x) that depends on covariates. The classical approach treats h₀(t) as a nuisance function and estimates β by partial likelihood, avoiding specification of h₀(t) entirely.
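
The partial likelihood mentioned above can be written in a few lines. This is a minimal sketch that assumes no tied event times and uses a small made-up data set; the function name `cox_partial_loglik` is hypothetical. Each observed event contributes the log of its own risk score divided by the summed risk scores of everyone still at risk, so h₀(t) cancels and never appears.

```python
import math

def cox_partial_loglik(beta, times, events, X):
    """Cox partial log-likelihood, assuming no tied event times.

    beta   : list of regression coefficients
    times  : observed follow-up times
    events : 1 if the event was observed, 0 if right-censored
    X      : list of covariate rows, one per subject
    """
    # Linear predictor beta'x for each subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects enter only through risk sets
        # Risk set: everyone whose follow-up time is at least t_i.
        risk_sum = sum(
            math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        loglik += eta[i] - math.log(risk_sum)
    return loglik

# Made-up illustrative data: five subjects, one binary covariate.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]
X = [[1.0], [0.0], [1.0], [0.0], [0.0]]

ll_zero = cox_partial_loglik([0.0], times, events, X)
ll_half = cox_partial_loglik([0.5], times, events, X)
print(f"partial log-likelihood at beta = 0.0: {ll_zero:.4f}")
print(f"partial log-likelihood at beta = 0.5: {ll_half:.4f}")
```

At β = 0 every subject has risk score 1, so each event contributes minus the log of its risk-set size; with risk sets of size 5, 3, and 2 the value is −log 30.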

A fully Bayesian approach instead requires a prior on the baseline hazard. Common choices include the gamma process prior on the cumulative baseline hazard H₀(t) (Kalbfleisch, 1978), piecewise constant hazards with a Markov prior that links adjacent intervals, and Gaussian process priors on the log-baseline hazard. Each choice represents a different belief about the smoothness and shape of the underlying hazard. The posterior provides full uncertainty quantification over both β and h₀(t), and the posterior predictive distribution yields survival curves for new patients with credible bands.
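
To make the piecewise constant option concrete, here is a minimal sketch under simplifying assumptions: no covariates, and independent Gamma(a, b) priors on each interval's hazard rather than the Markov prior described above. With that simplification the posterior is conjugate: the hazard in interval j is Gamma(a + events in j, b + time at risk in j). The data, cut points, and hyperparameters are made up for illustration.

```python
import math
import random

def exposure_and_events(times, events, cuts):
    """Time at risk and event count in each interval (cuts[j], cuts[j+1]]."""
    n_int = len(cuts) - 1
    exposure = [0.0] * n_int
    deaths = [0] * n_int
    for t, d in zip(times, events):
        for j in range(n_int):
            lo, hi = cuts[j], cuts[j + 1]
            if t <= lo:
                break  # subject left the risk set before this interval
            exposure[j] += min(t, hi) - lo
            if d and t <= hi:
                deaths[j] += 1  # the event fell inside this interval
    return exposure, deaths

def sample_survival(exposure, deaths, cuts, t, a=0.1, b=0.1,
                    n_draws=2000, seed=1):
    """Sorted posterior draws of S(t) under independent gamma priors."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        cum_hazard = 0.0
        for j, (e_j, d_j) in enumerate(zip(exposure, deaths)):
            # gammavariate takes (shape, scale); scale = 1 / rate.
            lam = rng.gammavariate(a + d_j, 1.0 / (b + e_j))
            overlap = max(0.0, min(t, cuts[j + 1]) - cuts[j])
            cum_hazard += lam * overlap
        draws.append(math.exp(-cum_hazard))
    return sorted(draws)

# Made-up illustrative data and cut points.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]
cuts = [0.0, 4.0, 8.0, 12.0]

exposure, deaths = exposure_and_events(times, events, cuts)
s8 = sample_survival(exposure, deaths, cuts, t=8.0)
lower, median, upper = s8[50], s8[1000], s8[1949]
print("time at risk per interval:", exposure)
print("events per interval:", deaths)
print(f"S(8): median {median:.2f}, 95% band ({lower:.2f}, {upper:.2f})")
```

Repeating the last step over a grid of time points would trace out a posterior survival curve with a credible band. A Markov or random-walk prior on the log-hazards would smooth adjacent intervals but loses conjugacy and generally needs MCMC.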

Why Go Bayesian for Survival Analysis?

Frequentist survival analysis runs into difficulties with small samples, many covariates, and complex hierarchical structures, and it has no standard mechanism for formally incorporating historical data. Bayesian methods address each of these: priors regularize estimates when data are sparse, hierarchical models borrow strength across groups, and information from previous trials can be formally incorporated through informative priors. In oncology and rare diseases, where sample sizes are small and prior information is often available, Bayesian survival analysis is increasingly common.
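
One way to formalize borrowing from a previous trial is a power prior, which raises the historical likelihood to a discount weight a0 between 0 and 1. The text above does not name a specific method, so this is an illustrative choice. The sketch assumes an exponential (constant hazard) model, for which the calculation is conjugate, and uses made-up event counts and exposure totals.

```python
import math

def power_prior_posterior(d, exposure, d_hist, exposure_hist, a0,
                          a=0.01, b=0.01):
    """Gamma posterior (shape, rate) for a constant hazard.

    d, exposure           : events and total time at risk, current study
    d_hist, exposure_hist : the same quantities from a historical study
    a0                    : discount weight in [0, 1] on the historical data
    a, b                  : initial Gamma(a, b) prior (assumed vague values)
    """
    shape = a + a0 * d_hist + d
    rate = b + a0 * exposure_hist + exposure
    return shape, rate

def gamma_mean_sd(shape, rate):
    """Mean and standard deviation of a Gamma(shape, rate) distribution."""
    return shape / rate, math.sqrt(shape) / rate

# Made-up numbers: a small current study and a larger historical one.
d, exposure = 3, 28.0
d_hist, exposure_hist = 30, 300.0

results = {}
for a0 in (0.0, 0.5, 1.0):
    shape, rate = power_prior_posterior(d, exposure, d_hist, exposure_hist, a0)
    results[a0] = gamma_mean_sd(shape, rate)
    mean, sd = results[a0]
    print(f"a0 = {a0:.1f}: posterior mean {mean:.4f}, sd {sd:.4f}")
```

With a0 = 0 the historical data are ignored, and with a0 = 1 they are pooled with the current data at full weight. Intermediate values shrink the posterior standard deviation while limiting the influence of the earlier study.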

Parametric and Semiparametric Models

Fully parametric models — Weibull, log-normal, log-logistic — specify the entire hazard function through a few parameters. Bayesian estimation is straightforward: place priors on the shape and scale parameters and sample from the posterior using MCMC. Semiparametric models like the Cox model are more flexible but require priors on the infinite-dimensional baseline hazard.
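
As a concrete instance of the parametric recipe, the sketch below fits a Weibull model to right-censored data with a random-walk Metropolis sampler. It is a minimal illustration, not a production sampler: the data are made up, the N(0, 2²) priors on log-shape and log-scale are assumptions, and the step size is untuned. Observed events contribute log h(t) − H(t) to the log-likelihood, while censored observations contribute only −H(t).

```python
import math
import random

def log_posterior(log_shape, log_scale, times, events, prior_sd=2.0):
    """Unnormalized log posterior for a Weibull model with right censoring."""
    k, lam = math.exp(log_shape), math.exp(log_scale)
    loglik = 0.0
    try:
        for t, d in zip(times, events):
            if d:  # observed event: add the log hazard log h(t)
                loglik += math.log(k) - math.log(lam)
                loglik += (k - 1) * (math.log(t) - math.log(lam))
            loglik -= (t / lam) ** k  # every subject: subtract H(t)
    except OverflowError:
        return -math.inf  # treat numerically extreme proposals as impossible
    # Independent normal priors on the log-parameters (an assumption).
    log_prior = -0.5 * (log_shape ** 2 + log_scale ** 2) / prior_sd ** 2
    return loglik + log_prior

def metropolis(times, events, n_iter=5000, step=0.3, seed=0):
    """Random-walk Metropolis over (log_shape, log_scale)."""
    rng = random.Random(seed)
    current = (0.0, math.log(sum(times) / len(times)))
    current_lp = log_posterior(*current, times, events)
    draws = []
    for _ in range(n_iter):
        proposal = (current[0] + rng.gauss(0.0, step),
                    current[1] + rng.gauss(0.0, step))
        proposal_lp = log_posterior(*proposal, times, events)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws

# Made-up illustrative data: 1 = event observed, 0 = right-censored.
times = [2.0, 3.0, 5.0, 7.0, 11.0]
events = [1, 0, 1, 1, 0]

draws = metropolis(times, events)
kept = draws[1000:]  # discard the first 1000 iterations as burn-in
post_mean_shape = sum(math.exp(ls) for ls, _ in kept) / len(kept)
post_mean_scale = sum(math.exp(lc) for _, lc in kept) / len(kept)
print(f"posterior mean shape: {post_mean_shape:.2f}")
print(f"posterior mean scale: {post_mean_scale:.2f}")
```

In practice a probabilistic programming tool would replace the hand-written sampler, but the log-posterior function shows where censoring enters the model.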

Bayesian nonparametric approaches go further, using Dirichlet process mixtures or Pólya tree priors to model the survival distribution with minimal assumptions. These methods are particularly valuable when the true hazard shape is unknown and may have multiple modes, when hazards cross between groups, or when other features arise that standard parametric families cannot capture.

Competing Risks and Multi-State Models

In many applications, subjects face multiple possible events. A cancer patient may die from the cancer, from treatment complications, or from unrelated causes. Bayesian competing risk models place priors on cause-specific hazards and estimate the cumulative incidence of each event type. Multi-state models generalize further, modeling transitions between multiple disease states (healthy → diagnosed → treated → remission → relapse) with Bayesian estimation of transition intensities.
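
The link between cause-specific hazards and cumulative incidence can be sketched in the simplest case. The example assumes constant cause-specific hazards λ₁, …, λ_K with independent Gamma(a, b) priors (a simplification chosen for conjugacy), so each λ_k has a Gamma(a + d_k, b + E) posterior, where d_k counts events of cause k and E is the total time at risk. The cumulative incidence of cause k is then (λ_k / Λ)(1 − exp(−Λt)) with Λ = Σ λ_k. The data are made up.

```python
import math
import random

def cumulative_incidence(hazards, t):
    """Cumulative incidence of each cause by time t, constant hazards."""
    total = sum(hazards)
    return [lam / total * (1.0 - math.exp(-total * t)) for lam in hazards]

def posterior_cif_draws(times, causes, n_causes, t, a=0.1, b=0.1,
                        n_draws=2000, seed=2):
    """Posterior draws of the cumulative incidence functions at time t.

    causes uses 0 for right-censored and 1..n_causes for the event type.
    """
    rng = random.Random(seed)
    exposure = sum(times)  # every subject is at risk for all causes
    counts = [sum(1 for c in causes if c == k)
              for k in range(1, n_causes + 1)]
    draws = []
    for _ in range(n_draws):
        # gammavariate takes (shape, scale); scale = 1 / rate.
        hazards = [rng.gammavariate(a + d_k, 1.0 / (b + exposure))
                   for d_k in counts]
        draws.append(cumulative_incidence(hazards, t))
    return draws

# Made-up data: cause 1 = death from cancer, cause 2 = death from other causes.
times = [2.0, 3.0, 5.0, 7.0, 11.0, 4.0]
causes = [1, 0, 1, 2, 0, 1]

draws = posterior_cif_draws(times, causes, n_causes=2, t=5.0)
mean_cif = [sum(d[k] for d in draws) / len(draws) for k in range(2)]
print(f"posterior mean cumulative incidence by t = 5, cause 1: {mean_cif[0]:.3f}")
print(f"posterior mean cumulative incidence by t = 5, cause 2: {mean_cif[1]:.3f}")
```

In every draw the cumulative incidences sum to 1 − exp(−Λt), the probability of having experienced any event by time t.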

Applications

Bayesian survival analysis is used extensively in clinical trials (adaptive designs, historical borrowing, pediatric extrapolation), reliability engineering (predicting equipment failure), actuarial science (mortality modeling), and customer analytics (churn prediction). The R packages rstanarm and brms support Bayesian survival models, and specialized packages like survHE provide health economic extensions.

"In survival analysis, the Bayesian approach is not a luxury but a necessity — the combination of censoring, time-varying effects, and hierarchical data structures demands a framework that can handle them coherently." — Ibrahim, Chen, and Sinha, Bayesian Survival Analysis (2001)
