The Problem With Bayes

PSYCHOLOGICAL SCIENCE
Commentary

Peter R. Killeen
Arizona State University

Bayesian analysis is ideal for aggregating information, as likelihood analysis is for comparing alternative models that make theoretically motivated predictions. For all other cases, p_rep permits the evaluation of results without bias from arbitrary priors and ad hoc alternatives.

THE PROBLEM WITH BAYES

The problem with the Bayesian perspective, as portrayed by Wagenmakers and Grünwald's (2006) article in this issue, is the same as the problem with the traditional (frequentist) perspective: Both evaluate hypotheses, rather than data. Hypotheses are guesses about models, and model selection (in the simplest case, of a null or an alternative) depends crucially on the hypothesis space: Just what alternatives are on the table? It is seldom obvious what those should be. Usually researchers just want to know if their observations are reliable, without bemusement over alternatives, or without having an arbitrarily chosen alternative come back to bite them. How can researchers code the question of reliability into formulas adequate to pass statistical muster?

Wagenmakers and Grünwald's Figure 1 epitomizes their Bayesian perspective. Under their assumption of equal prior plausibilities, their posterior probabilities fill most of Figure 1a between .128 and .9, depending on n. Which alternative should we choose? The crystal-ball prior has its attractions, but the authors consider it quite unfair and offer us three exemplary alternatives: β(1, 1), β(6, 3), or β(7, 7). For realistic values of n, the choice matters. Which will it be? How about β(6, 5)? What if we cannot choose? Such indifference requires calculating the probability of the alternatives' union. Why should we do all this? What confidence will we have in any outcome that we obtain?

WHAT IS TRUE?
Wagenmakers and Grünwald argue that we should adopt the Bayesian perspective because with increasing n, it favors the null: Bayes implies that the null is true; p_rep implies that the null is false. But is the null true? Wagenmakers and Grünwald seem to assume that it is. But they should know better. With n increasing from 50 to 10,000, they kept getting significant effects! They programmed their computer to provide consistent evidence against the null, yet they published Figure 1 to support a perspective favoring the null. In contrast, p_rep supports a simple, correct, consistent decision about replicability at every single point along the line: Repeated sampling at the authors' z scores will generate positive effects 92% of the time. If Bayes implies that the null is true, but the null is false, what should we conclude about the authors' Bayesian perspective?

Address correspondence to Peter Killeen, Department of Psychology, Box 871104, Arizona State University, Tempe, AZ 85287-1104, e-mail: killeen@asu.edu. Volume 17.
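The contrast drawn above can be checked roughly in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' procedure: it assumes the normal form p_rep = Φ(z/√2), a point null θ = 0.5 tested against a β(a, b) alternative with equal prior plausibilities, and a crystal-ball prior that puts all its mass on the observed proportion (large-n normal approximation). The function names and the (n, successes) pairs are hypothetical, chosen by hand so that z stays near 1.96.

```python
import math
from statistics import NormalDist


def p_rep(z: float) -> float:
    """Probability that a replication yields an effect of the same sign.

    Assumed form: Phi(z / sqrt(2)), i.e. the predictive variance is
    twice the sampling variance of the observed effect.
    """
    return NormalDist().cdf(z / math.sqrt(2))


def log_beta(a: float, b: float) -> float:
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def posterior_null(n: int, s: int, a: float, b: float) -> float:
    """P(H0 | data) for H0: theta = 0.5 versus H1: theta ~ beta(a, b).

    Assumes equal prior plausibility for H0 and H1. The binomial
    coefficient cancels in the Bayes factor, so it is omitted.
    """
    log_m0 = n * math.log(0.5)                              # marginal likelihood under H0
    log_m1 = log_beta(s + a, n - s + b) - log_beta(a, b)    # marginal likelihood under H1
    log_bf01 = log_m0 - log_m1
    return 1.0 / (1.0 + math.exp(-log_bf01))                # BF01 / (1 + BF01)


def crystal_ball_null(z: float) -> float:
    """Posterior of H0 when H1 is a point at the observed effect.

    Assumed large-n normal approximation: BF01 = exp(-z^2 / 2).
    """
    return 1.0 / (1.0 + math.exp(z * z / 2.0))


if __name__ == "__main__":
    # Hypothetical marginally significant outcomes (z close to 1.96).
    for n, s in [(50, 32), (1000, 531), (10000, 5098)]:
        z = (s - n / 2) / (math.sqrt(n) / 2)
        print(n, round(z, 2), round(p_rep(z), 3),
              round(posterior_null(n, s, 1, 1), 3))
    print(round(crystal_ball_null(1.96), 3))
```

Under these assumptions p_rep stays near .92 at every n because z is held near 1.96, whereas the posterior probability of the null under β(1, 1) climbs as n grows.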
Perhaps Wagenmakers and Grünwald meant to say that p_rep might predict high replicability even when effect sizes are so small as to be useless. But an inferential statistic should not dictate what is useful. I do not think that this is what Wagenmakers and Grünwald meant, in any case. I think they knew that the null was true a priori: They generated their data by computing the number of successes needed for the tests to be marginally significant given that the null was true. But given only the numbers, not their origin, the numbers consistently discredit the null. Either the authors' knowledge (their subjective prior favoring the null) was wrong or their implementation was biased, as they did not sample randomly under the null. Our accepting their worst-case data-generation technique as a representative prior for nature's data-generation techniques would also be wrong. These kinds of mistakes in priors, which even experts can make, demonstrate one of the reasons why it is so dangerous to give alternatives that are based on vague priors equal footing with data.

PRIOR DISPOSITION AND CRYSTAL BALLS

Priors are the rats in the woodpile. How does p_rep avoid them?
It ignores them; it leaves them out in the woodpile. One can get to p_rep either via Fisher's fiducial argument or by using uninformative or reference priors.[1] As soon as enough data are collected to be worth considering, those ignorance priors are displaced by the real data: Alternative hypotheses are implicitly “integrated out,” based on their probability in light of the obtained data. Bayesian priors, however, stay upstage with the null: The difference among the curves of Figure 1 “occurs because the Bayesian analysis explicitly takes the alternative hypothesis into account” (Wagenmakers & Grünwald, 2006, p. 642). If the alternative is a well-defined model of theoretical interest on whose prior plausibility there is consensus, that is wise. If it is vague or composite, or if there is disagreement on its prior plausibility, “taking it into account” is like taking the rats into the house for a closer look.

[1] When available, informative priors will ground a more accurate estimate of replicability, as will estimates of realization variance. But such externalities should be set aside to keep p_rep an unbiased index of merit for the data on the table. They can be considered subsequently, in meta-analytic estimates of effect size given other relevant data, including subjective priors. It is from this vantage point that Wagenmakers and Grünwald's Bayesian perspective is worth taking.

Copyright © 2006 Association for Psychological Science

Well, then, how about that crystal-ball prior? Is it really so unfair to use all the available data to guide our decision?
Notice how its limit, .128, lies just above 1 − p_rep in Figure 1a. The difference occurs because the crystal ball posits the parameter, whereas p_rep, tempered by its fallibility as a statistic, is encumbered with a variance twice that used for the crystal ball. That is what causes its lower elevation in the figure, and that is what makes it fair. p_rep is a simple, honest crystal ball. It is available off the shelf; no tailoring with conjugate priors is necessary. It will not say if the null is true, or how much better than a hypothetical alternative the null may be; but it will predict how often replications of our experiment will support our finding of a positive effect. Perhaps that is the first question to ask about our data.

Acknowledgments: National Science Foundation Grant IBN 0236821 and National Institute of Mental Health Grant 1R01MH066860 supported this work.

REFERENCES

Wagenmakers, E.-J., & Grünwald, P. (2006). A Bayesian perspective on hypothesis testing: A comment on Killeen (2005). Psychological Science, 17, 641–642.

(RECEIVED 11/28/05; ACCEPTED 1/23/06; FINAL MATERIALS RECEIVED 1/27/06)
