Probability Of Direction

The probability of direction (pd) is the posterior probability that a parameter has the same sign as its posterior median, that is, the larger of the probabilities that it is strictly positive or strictly negative. It offers a simple, intuitive Bayesian analogue to the frequentist p-value that avoids some of the interpretive pitfalls of null hypothesis significance testing.

The probability of direction (pd) quantifies the certainty about the sign of a parameter. Given a posterior distribution p(θ | y), the pd is defined as the maximum of the probability that θ is positive and the probability that θ is negative. It ranges from 0.5 (complete uncertainty about the sign) to 1 (complete certainty), and is arguably the simplest summary of "effect existence" available from a Bayesian posterior.

Unlike p-values, the pd has a direct probabilistic interpretation: pd = 0.97 means there is a 97% posterior probability that the effect is in the direction indicated by the posterior median. Unlike Bayes factors, it does not require specification of a prior under a point null hypothesis. And unlike credible intervals, it provides a single number that directly addresses the question "is this effect real?"

Probability of Direction: pd = max(P(θ > 0 | y), P(θ < 0 | y))

Equivalently: pd = max(∫₀^∞ p(θ | y) dθ, ∫₋∞^0 p(θ | y) dθ)

From MCMC samples: pd ≈ max(proportion of samples > 0, proportion of samples < 0)
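
The sample-based approximation can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by any particular package; the function name `probability_of_direction` and the ten draws are made up for the example.

```python
from typing import Sequence


def probability_of_direction(samples: Sequence[float]) -> float:
    """Estimate pd as the larger share of posterior draws on one side of zero."""
    if not samples:
        raise ValueError("need at least one posterior sample")
    n = len(samples)
    # Draws exactly equal to zero count toward neither side, matching the
    # strict inequalities in the definition.
    share_positive = sum(1 for s in samples if s > 0) / n
    share_negative = sum(1 for s in samples if s < 0) / n
    return max(share_positive, share_negative)


# Illustrative draws only (not from a real model): 8 positive, 2 negative.
draws = [0.8, 1.2, -0.3, 0.5, 0.9, 1.1, -0.1, 0.7, 0.4, 1.0]
print(f"pd = {probability_of_direction(draws):.2f}")  # prints "pd = 0.80"
```

With real MCMC output the same function would be applied to the full vector of draws for one parameter, typically several thousand values.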

Relationship to p-values

Under certain conditions, there is an approximate monotonic relationship between the pd and the two-sided p-value. For a symmetric, unimodal posterior centered near the maximum likelihood estimate (which holds asymptotically by the Bernstein-von Mises theorem), the approximate conversion is:

Approximate Conversion: p-value ≈ 2 × (1 − pd)

Under this correspondence, pd = 0.975 maps to p ≈ 0.05, and pd = 0.995 maps to p ≈ 0.01. However, this relationship is approximate and depends on the posterior being roughly symmetric. For skewed posteriors, multimodal posteriors, or posteriors derived from informative priors, the mapping breaks down, and the pd should be interpreted on its own terms.
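
As a quick check of this correspondence, the sketch below assumes the simplest case in which it holds exactly: a normal posterior N(estimate, se²), as arises under a flat prior with a known standard error. The helper name `pd_to_p` and the values of `estimate` and `se` are hypothetical.

```python
from statistics import NormalDist


def pd_to_p(pd: float) -> float:
    """Approximate two-sided p-value implied by a probability of direction."""
    if not 0.5 <= pd <= 1.0:
        raise ValueError("pd must lie in [0.5, 1]")
    return 2.0 * (1.0 - pd)


# Hypothetical estimate and standard error (z = 2).
estimate, se = 1.2, 0.6

# pd from the assumed normal posterior N(estimate, se^2).
posterior = NormalDist(mu=estimate, sigma=se)
prob_negative = posterior.cdf(0.0)
pd = max(prob_negative, 1.0 - prob_negative)

# Two-sided p-value from the usual z-test of theta = 0.
z = estimate / se
p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))

print(f"pd = {pd:.4f}")
print(f"2 * (1 - pd) = {pd_to_p(pd):.4f}")
print(f"two-sided p  = {p_two_sided:.4f}")
```

In this setting the last two printed values agree. With a skewed posterior or an informative prior they would generally differ, which is the breakdown described above.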

Advantages over Traditional Approaches

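The pd has several attractive properties as a measure of evidence. First, although it depends on the prior through the posterior, as any posterior summary does, it does not require specifying a point null or a particular alternative distribution, which is the source of much of the prior sensitivity of Bayes factors. Second, it is invariant to rescaling the parameter by a positive constant and, more generally, to any strictly increasing transformation that maps zero to zero, because such transformations preserve the sign of every posterior draw; it is not invariant to transformations that move the zero point, such as shifting the parameter. Third, it is trivially easy to compute from any set of posterior samples: simply count the proportion on each side of zero.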
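
The invariance property can be checked numerically. The sketch below uses simulated draws from an arbitrary normal distribution as a stand-in for a posterior; the transformations (multiplying by 100, applying `sinh`, subtracting 1) are illustrative choices, not part of any standard procedure.

```python
import math
import random
from typing import Sequence


def probability_of_direction(samples: Sequence[float]) -> float:
    """Larger share of draws strictly above or strictly below zero."""
    n = len(samples)
    positive = sum(1 for s in samples if s > 0) / n
    negative = sum(1 for s in samples if s < 0) / n
    return max(positive, negative)


# Stand-in "posterior": simulated draws, seeded for reproducibility.
rng = random.Random(0)
draws = [rng.gauss(0.3, 1.0) for _ in range(5000)]

pd_original = probability_of_direction(draws)
# Positive rescaling keeps every sign, so pd is unchanged.
pd_rescaled = probability_of_direction([100.0 * d for d in draws])
# sinh is strictly increasing and maps 0 to 0, so pd is unchanged.
pd_sinh = probability_of_direction([math.sinh(d) for d in draws])
# Shifting moves the zero point, so pd generally changes.
pd_shifted = probability_of_direction([d - 1.0 for d in draws])

print(pd_original, pd_rescaled, pd_sinh, pd_shifted)
```

The first three values are identical because the sign of each draw is preserved; the shifted value differs because the question "which side of zero?" is now being asked about a different quantity.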

The pd also avoids the Jeffreys-Lindley paradox that afflicts Bayes factors with vague priors. Because it does not compare a point null against a continuous alternative, it does not suffer from the Bartlett paradox or the sensitivity to prior scale that plagues default Bayes factor computations.
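
A rough numerical illustration of this contrast, under assumptions chosen for convenience: a normal likelihood with known standard error, a point null θ = 0, and a N(0, τ²) prior under the alternative. The values of `estimate`, `se`, and the grid of `tau` are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Tuple


def bf01_and_pd(estimate: float, se: float, tau: float) -> Tuple[float, float]:
    """Bayes factor for theta = 0 versus theta ~ N(0, tau^2), and the pd
    of the conjugate posterior under that same N(0, tau^2) prior."""
    # Marginal density of the estimate under each hypothesis.
    marginal_h0 = NormalDist(0.0, se).pdf(estimate)
    marginal_h1 = NormalDist(0.0, math.sqrt(se**2 + tau**2)).pdf(estimate)
    bf01 = marginal_h0 / marginal_h1

    # Conjugate normal posterior: mean shrinks toward 0, variance = se^2 * shrink.
    shrink = tau**2 / (tau**2 + se**2)
    posterior = NormalDist(shrink * estimate, se * math.sqrt(shrink))
    prob_negative = posterior.cdf(0.0)
    pd = max(prob_negative, 1.0 - prob_negative)
    return bf01, pd


estimate, se = 0.5, 0.2  # hypothetical data summary (z = 2.5)
taus = (1.0, 10.0, 100.0, 1000.0)
results = [bf01_and_pd(estimate, se, tau) for tau in taus]
for tau, (bf01, pd) in zip(taus, results):
    print(f"tau = {tau:7.1f}   BF01 = {bf01:9.3f}   pd = {pd:.4f}")
```

As the prior scale τ grows, the Bayes factor swings increasingly toward the null even though the data are unchanged, while the pd settles near the value it would take under a flat prior.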

The pd in the bayestestR Package

The R package bayestestR (Makowski et al., 2019) popularized the probability of direction as a practical tool for Bayesian inference reporting. The package provides the p_direction() function, which computes the pd from posterior samples. The authors recommend reporting the pd alongside other indices such as the 89% highest density interval and the Region of Practical Equivalence (ROPE) to provide a comprehensive picture of posterior evidence. Their simulation studies show that the pd has excellent sensitivity for detecting true effects while maintaining interpretability for non-statisticians.

Limitations and Critiques

The pd is not without limitations. Most importantly, it says nothing about the magnitude of an effect. A pd of 0.999 tells us the effect is almost certainly positive but says nothing about whether it is large enough to matter practically. For this reason, the pd should always be reported alongside measures of effect size, such as the posterior median and credible interval.

Additionally, the pd is defined relative to zero, which is often an arbitrary threshold. In many applications, the scientifically meaningful boundary is not zero but some minimum effect size. For such cases, the probability of the parameter exceeding a practically meaningful threshold — P(θ > δ | y) — is more informative.
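
The threshold-based alternative is just as easy to estimate from samples. In the sketch below, the function name `prob_exceeds`, the draws, and the minimum effect size `delta` are all hypothetical.

```python
from typing import Sequence


def prob_exceeds(samples: Sequence[float], threshold: float) -> float:
    """Estimate P(theta > threshold | y) as the share of draws above it."""
    if not samples:
        raise ValueError("need at least one posterior sample")
    return sum(1 for s in samples if s > threshold) / len(samples)


# Illustrative draws only (not from a real model).
draws = [0.8, 1.2, -0.3, 0.5, 0.9, 1.1, -0.1, 0.7, 0.4, 1.0]
delta = 0.6  # hypothetical smallest effect of practical interest

print(f"P(theta > 0)     = {prob_exceeds(draws, 0.0):.2f}")    # prints 0.80
print(f"P(theta > delta) = {prob_exceeds(draws, delta):.2f}")  # prints 0.60
```

Here the effect is probably positive, but the probability that it clears the practical threshold is noticeably lower, which is exactly the distinction the pd alone cannot express.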

Some Bayesian purists argue that the pd, by providing a single "significance-like" number, encourages the same dichotomous thinking that Bayesian methods are meant to transcend. The full posterior distribution is always more informative than any single summary, and the pd should be seen as a complement to — not a replacement for — thorough posterior exploration.

Practical Usage

The pd has found particular traction in psychology, neuroscience, and the social sciences, where researchers are familiar with p-values and seek Bayesian analogues that facilitate the transition from frequentist to Bayesian reporting. Guidelines from Makowski et al. suggest: pd ≥ 0.95 as "possibly existing," pd ≥ 0.97 as "likely existing," and pd ≥ 0.999 as "certainly existing," though these thresholds should be adapted to the domain.
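
For reporting, the thresholds quoted above can be wrapped in a small lookup. This is a sketch of the three cut-offs as listed in this section only; the function name `interpret_pd` is hypothetical, and the label returned below 0.95 is a placeholder rather than part of the quoted guidelines.

```python
def interpret_pd(pd: float) -> str:
    """Map a pd value to the verbal labels quoted in the text above."""
    if not 0.5 <= pd <= 1.0:
        raise ValueError("pd must lie in [0.5, 1]")
    if pd >= 0.999:
        return "certainly existing"
    if pd >= 0.97:
        return "likely existing"
    if pd >= 0.95:
        return "possibly existing"
    # Placeholder label (an assumption): the text gives no label for this range.
    return "below the listed thresholds"


for value in (0.933, 0.96, 0.98, 0.9995):
    print(f"pd = {value}: {interpret_pd(value)}")
```

Such labels are a reporting convenience; as noted above, the cut-offs should be adapted to the domain rather than applied mechanically.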

"The probability of direction is the closest Bayesian equivalent to the frequentist p-value — but unlike the p-value, it has a straightforward interpretation: the probability that the effect goes in the indicated direction." — Dominique Makowski et al., Journal of Open Source Software (2019)

Worked Example: Assessing Drug Effect Direction from MCMC Samples

A clinical trial produces 30 MCMC posterior samples for the treatment effect parameter θ (positive = beneficial). We compute the probability of direction to assess whether the drug has a beneficial effect.

Given 30 posterior samples for θ:
0.45, 0.32, 0.67, 0.12, −0.05, 0.55, 0.38, 0.71, 0.28, 0.43,
0.19, 0.60, 0.35, −0.10, 0.50, 0.42, 0.65, 0.22, 0.48, 0.30,
0.58, 0.15, 0.40, 0.52, 0.33, 0.70, 0.25, 0.46, 0.08, 0.55

Step 1: Count Samples by Sign
Positive (θ > 0): 28 out of 30
Negative (θ < 0): 2 out of 30

Step 2: Compute pd
pd = max(28/30, 2/30) = max(0.933, 0.067) = 0.933

Step 3: Approximate p-value
p ≈ 2 × (1 − pd) = 2 × 0.067 = 0.133

Step 4: Posterior Summary
Median = 0.41, Mean = 0.383, SD = 0.212
95% CI: approximately [−0.06, 0.70] (2.5th and 97.5th percentiles; with only 30 draws the exact endpoints depend on the quantile method)

With pd = 93.3%, the evidence that the treatment effect is positive is suggestive but not strong. The approximate equivalent p-value of 0.133 would not meet the conventional α = 0.05 threshold, but the pd states directly that there is a 93.3% posterior probability that the drug is beneficial. Combined with the posterior median of 0.41, this points to a likely positive but uncertain effect, warranting further study with a larger sample.
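
The four steps can be reproduced with Python's standard library. This is a sketch rather than a reference implementation: the variable names are arbitrary, the SD is the sample standard deviation, and the interval uses linearly interpolated empirical percentiles, which is only one of several reasonable choices with so few draws.

```python
import statistics

# The 30 posterior draws from the worked example.
samples = [
    0.45, 0.32, 0.67, 0.12, -0.05, 0.55, 0.38, 0.71, 0.28, 0.43,
    0.19, 0.60, 0.35, -0.10, 0.50, 0.42, 0.65, 0.22, 0.48, 0.30,
    0.58, 0.15, 0.40, 0.52, 0.33, 0.70, 0.25, 0.46, 0.08, 0.55,
]
n = len(samples)

# Step 1: count draws on each side of zero.
n_positive = sum(1 for s in samples if s > 0)
n_negative = sum(1 for s in samples if s < 0)

# Step 2: pd is the larger of the two proportions.
pd = max(n_positive, n_negative) / n

# Step 3: approximate two-sided p-value.
p_approx = 2 * (1 - pd)

# Step 4: posterior summaries.
median = statistics.median(samples)
mean = statistics.mean(samples)
sd = statistics.stdev(samples)  # sample standard deviation (n - 1 denominator)
# Cut points at 2.5%, 5%, ..., 97.5%; keep the first and the last.
cuts = statistics.quantiles(samples, n=40, method="inclusive")
lower, upper = cuts[0], cuts[-1]

print(f"positive = {n_positive}, negative = {n_negative}")
print(f"pd = {pd:.3f}, approximate p = {p_approx:.3f}")
print(f"median = {median:.3f}, mean = {mean:.3f}, sd = {sd:.3f}")
print(f"95% interval = [{lower:.3f}, {upper:.3f}]")
```

In a real analysis the same code would be run on thousands of draws, where the choice of quantile method makes little practical difference.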
