One misconception about the relationship between Bayesian inference and frequentist inference is that the two will lead to the same inferences, and hence that all confidence intervals can simply be interpreted in a Bayesian way. In the case where data are normally distributed, for instance, there is a particular prior that leads to Bayesian credible intervals, computed from the posterior, that are numerically identical to the standard confidence intervals (Jeffreys, 1961; Lindley, 1965). This might lead one to suspect that it does not matter whether one uses confidence procedures or Bayesian procedures. We showed, however, that confidence intervals and credible intervals can disagree markedly. The only way to know that a confidence interval is numerically identical to some credible interval is to prove it. The correspondence cannot, and should not, be assumed.
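To make the special-case correspondence concrete, the following Python sketch compares the two intervals in a deliberately simplified setting: normal data with a known standard deviation and a flat prior on the mean. That setting, the data values, and the function names `z_confidence_interval` and `grid_credible_interval` are assumptions made for illustration; they are not the exact setting of the results cited above. The posterior is computed numerically on a grid, so the agreement is checked rather than assumed.

```python
import math
from statistics import NormalDist, mean


def z_confidence_interval(data, sigma, level=0.95):
    """Classical z-interval for a normal mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / math.sqrt(len(data))
    centre = mean(data)
    return centre - half_width, centre + half_width


def grid_credible_interval(data, sigma, level=0.95, points=20001):
    """Central credible interval for the mean under a flat prior (grid approximation)."""
    centre = mean(data)
    se = sigma / math.sqrt(len(data))
    lo, hi = centre - 8 * se, centre + 8 * se
    step = (hi - lo) / (points - 1)
    grid = [lo + i * step for i in range(points)]
    # With a flat prior, the unnormalised log posterior equals the log likelihood.
    log_post = [-sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2) for mu in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    tail = (1 - level) / 2
    targets = [tail, 1 - tail]
    bounds = []
    cumulative = 0.0
    for mu, w in zip(grid, weights):
        cumulative += w / total
        # Record the grid point at which each target quantile is first reached.
        while len(bounds) < 2 and cumulative >= targets[len(bounds)]:
            bounds.append(mu)
    return bounds[0], bounds[1]


# Hypothetical data and known sigma, chosen only for illustration.
data = [4.9, 5.6, 4.4, 5.1, 5.8, 4.7, 5.3, 5.0]
print("confidence interval:", z_confidence_interval(data, sigma=0.5))
print("credible interval:  ", grid_credible_interval(data, sigma=0.5))
```

The two intervals agree up to the resolution of the grid in this case only because the model and prior were chosen so that they would; changing the prior in `grid_credible_interval` would break the match.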
More broadly, the defense of confidence procedures by noting that, in some restricted cases, they numerically correspond to Bayesian procedures is actually no defense at all. One must first choose which confidence procedure, of many, to use; if one is committed to the procedure that allows a Bayesian interpretation, then one’s time is much better spent simply applying Bayesian theory. If the benefits of Bayesian theory are desired — and they clearly are, by proponents of confidence intervals — then there is no reason why Bayesian inference should not be applied in its full generality, rather than using the occasional correspondence with credible intervals as a hand-waving defense of confidence intervals.
It is important to emphasize, however, that for many of the confidence procedures presented in the applied statistical literature, no effort has been made to show that the intervals have the properties that proponents of confidence intervals desire. We should expect, as a matter of course, that developers of new confidence intervals show that their intervals have the desired inferential properties, instead of just nominal coverage of the true value and “short” width. Because developers of confidence intervals have not done this, the push for confidence intervals rests on uncertain ground. Adopting Bayesian inference, where all inferences arise within a logical, unified framework, would render the problems of assessing the properties of these confidence procedures moot. If desired, coverage of a Bayesian procedure can also be assessed; but if one is interested primarily in reasonable post-data inference, then Bayesian properties should be the priority, not frequentist coverage (cf. Gelman, 2008; Wasserman, 2008).
For advocates of reasoning by intervals, adopting Bayesian inference would have other benefits. The end-points of a confidence interval are always set by the data. Suppose, however, we are interested in determining the plausibility that a parameter is in a particular range. For instance, in the United States, it is against the law to execute criminals who are intellectually disabled. The criterion used for intellectual disability in the US state of Florida is having a true IQ less than 70. Since IQ is measured with error, one might ask what confidence we have that a particular criminal’s true IQ is less than 70 (see Anastasi & Urbina, 1997, or Cronbach, 1990, for an overview of confidence intervals for IQ). In this case, the interval we would like to assess for plausibility is no longer a function of the sample. The long-run probability that the true value is inside a fixed interval is unknown and is either 0 or 1, and hence no confidence procedure can be constructed, even though such information may be critically important to a researcher, policy maker, or criminal defendant (Pratt, Raiffa, & Schlaifer, 1995).
Even in seemingly simple cases where a fixed interval is nested inside a CI, or vice versa, one cannot draw conclusions about the plausibility of a fixed interval. One might assume that an interval nested within a CI must have lower confidence than the CI, given that it is shorter; however, as shown in Figure 1B, a 100% confidence interval (the likelihood) is nested within some of the 50% confidence intervals. Likewise, one might believe that if a CI is nested within a fixed interval, then the fixed interval must have greater probability than the CI. But in Figure 1A, one can imagine a fixed interval just larger than the 50% UMP interval; this will have much lower than 50% probability of containing the true value, because it occupies a small proportion of the likelihood. Knowledge that the FCF is a fallacy prohibits one from using confidence intervals to assess the probability of fixed intervals. Bayesian procedures, on the other hand, offer the ability to compute the plausibility of any given range of values. Because all such inferences must be made from the posterior distribution, inferences must remain mutually consistent (Lindley, 1985; see also Fisher, 1935, for a similar argument).
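The IQ question raised earlier can be used to sketch what such a computation might look like. The Python example below assumes a normal prior on the true score (mean 100, standard deviation 15), normally distributed measurement error with a known standard error of measurement, and a hypothetical observed score of 72 with a standard error of 5. None of these values or function names comes from the sources cited; they are placeholders, and a different prior would give a different answer.

```python
from statistics import NormalDist


def posterior_true_score(observed, sem, prior_mean=100.0, prior_sd=15.0):
    """Posterior for a true score: normal prior combined with normal measurement error."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / sem ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * observed)
    return NormalDist(post_mean, post_var ** 0.5)


def prob_between(posterior, lower=None, upper=None):
    """Posterior probability that the true score lies in (lower, upper)."""
    below_upper = 1.0 if upper is None else posterior.cdf(upper)
    below_lower = 0.0 if lower is None else posterior.cdf(lower)
    return below_upper - below_lower


# Hypothetical observed score and standard error of measurement (assumptions).
posterior = posterior_true_score(observed=72.0, sem=5.0)

p_below_70 = prob_between(posterior, upper=70.0)
p_70_to_80 = prob_between(posterior, lower=70.0, upper=80.0)
p_above_80 = prob_between(posterior, lower=80.0)

print(f"P(true IQ < 70)       = {p_below_70:.3f}")
print(f"P(70 <= true IQ < 80) = {p_70_to_80:.3f}")
print(f"P(true IQ >= 80)      = {p_above_80:.3f}")
```

Because every range is evaluated against the same posterior, the probabilities of disjoint ranges that cover the real line sum to one, which is the mutual consistency described above. The fixed range is chosen by the analyst, not by the data.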
Moving to credible intervals from confidence intervals would necessitate a shift in thinking, however, away from a test-centric view with respect to intervals (e.g., “is 0 in the interval?”). Although every confidence interval can be interpreted as a test, credible intervals cannot be so interpreted. Assessing the Bayesian credibility of a specific parameter value by checking whether it is included in a credible interval is, as J. O. Berger (2006) puts it, “simply wrong.” When testing a specific value is of interest (such as a null hypothesis), that specific value must be assigned non-zero probability a priori. Although such testing is not conceptually difficult, it is beyond the scope of this paper; see Rouder, Speckman, Sun, Morey, & Iverson (2009), E.-J. Wagenmakers, Lee, Lodewyckx, & Iverson (2008), or Dienes (2011) for accessible accounts.
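A small numerical sketch may help show why the two questions differ. The Python example below is not taken from the accounts cited above; it assumes a normal mean with known standard deviation, a point null at zero given prior probability 0.5, and a normal prior with standard deviation `tau` on the mean under the alternative. The function name `point_null_analysis` and all numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def point_null_analysis(xbar, n, sigma, tau, prior_h0=0.5, level=0.95):
    """Compare a point-null test with a credible interval for a normal mean.

    H0: mu = 0 (prior probability prior_h0); H1: mu ~ Normal(0, tau).
    Returns (Bayes factor for H0 over H1, posterior P(H0), credible interval under H1).
    """
    se = sigma / math.sqrt(n)
    # Marginal density of the sample mean under each hypothesis.
    m0 = NormalDist(0.0, se).pdf(xbar)
    m1 = NormalDist(0.0, math.sqrt(tau ** 2 + se ** 2)).pdf(xbar)
    bf01 = m0 / m1
    post_h0 = prior_h0 * m0 / (prior_h0 * m0 + (1 - prior_h0) * m1)
    # Posterior for mu under H1 alone, where zero has no prior point mass.
    post_var = 1.0 / (1.0 / tau ** 2 + 1.0 / se ** 2)
    post_mean = post_var * xbar / se ** 2
    posterior = NormalDist(post_mean, math.sqrt(post_var))
    tail = (1 - level) / 2
    interval = (posterior.inv_cdf(tail), posterior.inv_cdf(1 - tail))
    return bf01, post_h0, interval


# Hypothetical summary statistics, chosen only for illustration.
bf01, post_h0, interval = point_null_analysis(xbar=0.21, n=100, sigma=1.0, tau=1.0)
print(f"Bayes factor (H0 over H1): {bf01:.3f}")
print(f"Posterior P(H0):           {post_h0:.3f}")
print(f"95% credible interval (H1): ({interval[0]:.3f}, {interval[1]:.3f})")
```

With these particular assumed values, the credible interval computed under the alternative excludes zero, yet the posterior probability of the point null remains above one half. The interval answers a question about the parameter given that the null is false, so it cannot by itself say how credible the null value is.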
Finally, we believe that in science, the meaning of our inferences is important. Bayesian credible intervals support an interpretation of probability in terms of plausibility, thanks to the explicit use of a prior. Confidence intervals, on the other hand, are based on a philosophy that does not allow inferences about plausibility, and does not utilize prior information. Using confidence intervals as if they were credible intervals is an attempt to smuggle Bayesian meaning into frequentist statistics, without proper consideration of a prior. As they say, there is no such thing as a free lunch; one must choose. We suspect that researchers, given the choice, would rather specify priors and get the benefits that come from Bayesian theory. We should not pretend, however, that the choice need not be made. Confidence interval theory and Bayesian theory are not interchangeable, and should not be treated as such.