Every medical test produces two types of errors: false positives and false negatives. The sensitivity (true positive rate) and specificity (true negative rate) describe how the test performs, but they do not directly answer the question a patient and clinician care about most: given that the test came back positive, what is the probability that the patient actually has the disease? Answering this question requires Bayes' theorem, and the answer depends critically on the prevalence — the base rate of the disease in the relevant population.
The Base Rate Problem
Consider a screening test with 99% sensitivity and 99% specificity applied to a disease with 0.1% prevalence. Among 100,000 people tested, 100 have the disease, and 99 of them test positive (99% sensitivity). Of the 99,900 without the disease, 999 test falsely positive (1% false positive rate). So of 1,098 positive results, only 99 are true positives: the positive predictive value (PPV) is about 9%. The result is counterintuitive because healthy people outnumber diseased people 999 to 1, so even a 1% false positive rate produces roughly ten times as many false positives as true positives. It is one of the most important applications of Bayesian reasoning in medicine.
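The counting argument can be reproduced in a few lines of Python. This is a minimal sketch rather than part of the original text; the helper name `screening_counts` and its parameter names are illustrative assumptions.

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Expected confusion-matrix counts for a screened population.

    Hypothetical helper for illustration. The counts are expected
    values, so they can be fractional for other inputs.
    """
    diseased = population * prevalence
    healthy = population - diseased
    true_pos = diseased * sensitivity
    false_pos = healthy * (1 - specificity)
    return {
        "TP": true_pos,
        "FN": diseased - true_pos,
        "FP": false_pos,
        "TN": healthy - false_pos,
    }


# Numbers taken from the worked example in the text.
counts = screening_counts(100_000, 0.001, 0.99, 0.99)
ppv = counts["TP"] / (counts["TP"] + counts["FP"])
print(f"true positives:  {counts['TP']:.0f}")
print(f"false positives: {counts['FP']:.0f}")
print(f"PPV: {ppv:.1%}")
```

With the figures from the example, this reports 99 true positives, 999 false positives, and a PPV of about 9%.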
Negative Predictive Value: NPV = [Specificity × (1 − Prevalence)] / [Specificity × (1 − Prevalence) + (1 − Sensitivity) × Prevalence]
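The NPV formula and its PPV counterpart can be written as two short functions. The PPV expression used here is the standard Bayes' theorem form, PPV = Sensitivity × Prevalence / [Sensitivity × Prevalence + (1 − Specificity) × Prevalence-free term], written out in the code; it is not printed in the text above, and the function names and the prevalence values in the loop are illustrative assumptions.

```python
def ppv(sensitivity, specificity, prevalence):
    """P(disease | positive test), by Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """P(no disease | negative test), matching the NPV formula in the text."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Same test (99% sensitivity, 99% specificity) at several assumed prevalences.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    print(
        f"prevalence {prevalence:>5.1%}: "
        f"PPV {ppv(0.99, 0.99, prevalence):.3f}, "
        f"NPV {npv(0.99, 0.99, prevalence):.5f}"
    )
```

Holding the test fixed and varying only the prevalence shows how strongly the predictive values depend on the base rate: the PPV rises from about 9% at 0.1% prevalence to 50% at 1% prevalence.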
The likelihood ratio formulation makes the Bayesian structure even clearer. The positive likelihood ratio LR+ = Sensitivity / (1 − Specificity) tells us how much a positive result should shift our odds. A test with LR+ of 10 moves the odds by a factor of ten — powerful for a patient with moderate pre-test probability, but insufficient for a patient with very low prior odds.
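In odds form, Bayes' theorem says that post-test odds equal pre-test odds multiplied by the likelihood ratio. The sketch below applies an LR+ of 10 to two pre-test probabilities; the function names and the chosen probabilities (30% and 0.1%) are assumptions made for illustration.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = Sensitivity / (1 - Specificity)."""
    return sensitivity / (1 - specificity)


def probability_to_odds(p):
    return p / (1 - p)


def odds_to_probability(odds):
    return odds / (1 + odds)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Multiply the pre-test odds by the likelihood ratio, then convert back."""
    post_odds = probability_to_odds(pre_test_probability) * likelihood_ratio
    return odds_to_probability(post_odds)


# Illustrative pre-test probabilities: moderate (30%) and very low (0.1%).
for pre_test in (0.30, 0.001):
    post = post_test_probability(pre_test, 10)
    print(f"pre-test {pre_test:.1%} -> post-test {post:.1%}")
```

The same tenfold shift in odds takes a 30% pre-test probability to roughly 81%, but leaves a 0.1% pre-test probability at about 1%.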
Sequential Testing and Clinical Reasoning
Clinicians rarely rely on a single test. Bayesian reasoning naturally extends to sequential testing, where the posterior probability after the first test becomes the prior for the second. A screening mammogram followed by a diagnostic ultrasound followed by a biopsy represents a chain of Bayesian updates, each refining the probability of malignancy. The ordering of tests — from cheap and broad to expensive and specific — follows naturally from Bayesian decision theory.
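A chain of updates can be sketched as a loop in which each posterior becomes the next prior. The sketch assumes the tests are conditionally independent given disease status, which real tests often are not, and the test characteristics are hypothetical values rather than figures for mammography, ultrasound, or biopsy.

```python
def update(prior, sensitivity, specificity, positive):
    """One Bayesian update of P(disease) for a single test result."""
    if positive:
        like_disease, like_healthy = sensitivity, 1 - specificity
    else:
        like_disease, like_healthy = 1 - sensitivity, specificity
    numerator = like_disease * prior
    return numerator / (numerator + like_healthy * (1 - prior))


def sequential_posteriors(prior, results):
    """Probability of disease after each test in turn.

    ASSUMPTION: results are conditionally independent given disease
    status, so each posterior can serve directly as the next prior.
    """
    trajectory = [prior]
    for sensitivity, specificity, positive in results:
        trajectory.append(update(trajectory[-1], sensitivity, specificity, positive))
    return trajectory


# Hypothetical chain: two positive results from 99%/99% tests, prior 0.1%.
results = [(0.99, 0.99, True), (0.99, 0.99, True)]
trajectory = sequential_posteriors(0.001, results)
for step, probability in enumerate(trajectory):
    print(f"after {step} test(s): {probability:.4f}")
```

Under these assumptions, one positive result raises the probability from 0.1% to about 9%, and a second independent positive raises it to about 91%.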
Research on judgment under uncertainty, beginning with Kahneman and Tversky's work on base-rate neglect, suggests that both clinicians and patients systematically neglect base rates when interpreting test results. Studies have found that many physicians dramatically overestimate the probability of disease given a positive screening test. Presenting results in natural frequencies (e.g., "out of 1,000 people tested...") rather than conditional probabilities significantly improves intuitive Bayesian reasoning.
ROC Curves and Decision Thresholds
The Receiver Operating Characteristic (ROC) curve plots sensitivity against (1 − specificity) across all possible decision thresholds. From a Bayesian perspective, the optimal threshold depends on the prior probability of disease, the costs of false positives and false negatives, and the patient's preferences. Bayesian decision theory provides a principled framework for choosing thresholds that minimize expected loss rather than arbitrarily maximizing the Youden index.
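The contrast between minimizing expected loss and maximizing the Youden index (sensitivity + specificity − 1) can be illustrated on a small set of ROC operating points. The points, prevalences, and costs below are invented for illustration, and the function names are assumptions; the expected-loss expression weights each error type by its probability and its cost.

```python
def expected_loss(sensitivity, specificity, prevalence, cost_fn, cost_fp):
    """Expected cost per patient at one operating point."""
    missed = prevalence * (1 - sensitivity) * cost_fn
    false_alarm = (1 - prevalence) * (1 - specificity) * cost_fp
    return missed + false_alarm


def bayes_threshold(roc_points, prevalence, cost_fn, cost_fp):
    """Threshold whose operating point minimizes expected loss."""
    best = min(
        roc_points,
        key=lambda pt: expected_loss(pt[1], pt[2], prevalence, cost_fn, cost_fp),
    )
    return best[0]


def youden_threshold(roc_points):
    """Threshold maximizing sensitivity + specificity - 1."""
    return max(roc_points, key=lambda pt: pt[1] + pt[2] - 1)[0]


# Hypothetical operating points: (threshold, sensitivity, specificity).
roc_points = [
    (0.2, 0.99, 0.60),
    (0.4, 0.95, 0.80),
    (0.6, 0.85, 0.92),
    (0.8, 0.60, 0.99),
]

print("Youden:", youden_threshold(roc_points))
print("rare disease, equal costs:", bayes_threshold(roc_points, 0.01, 1, 1))
print("common disease, costly misses:", bayes_threshold(roc_points, 0.5, 10, 1))
```

For these made-up points the Youden index selects the 0.6 threshold regardless of context, whereas the expected-loss rule moves to 0.8 when the disease is rare and errors cost the same, and to 0.2 when the disease is common and a missed case is ten times as costly as a false alarm.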
Applications Across Medicine
Bayesian diagnostic reasoning pervades modern medicine. In radiology, computer-aided detection systems combine imaging features with patient risk factors as priors. In infectious disease, the probability that a positive COVID-19 rapid antigen test reflects a true infection depends on community prevalence, symptom status, and days since exposure. In prenatal screening, the combined first-trimester screen integrates maternal age (prior) with ultrasound and blood markers (likelihood) to produce a posterior risk of chromosomal abnormalities.
"The clinician who ignores the base rate when interpreting a diagnostic test is making the same error as the gambler who ignores the house odds." — David Spiegelhalter, on the centrality of Bayesian thinking in medical reasoning
Current Frontiers
Bayesian meta-analysis of diagnostic test accuracy (the bivariate or HSROC model) pools evidence from multiple studies while accounting for heterogeneity. Machine learning classifiers trained on electronic health records increasingly use Bayesian calibration to produce well-calibrated probability estimates rather than binary classifications. And shared decision-making tools present patients with personalized Bayesian risk estimates that integrate their individual risk factors with test results.