Comments on "My Journey From Frequentist to Bayesian Statistics" (Statistical Thinking blog, Frank Harrell)

No, posterior probabilities factor in all uncertainties and are self-contained. But if you want to discuss uncertainties in a parameter, that is represented by the entire posterior distribution.
— Frank Harrell, 2017-09-01

One disadvantage of a Bayesian approach is that it doesn't give you an estimate of error. Can this be accomplished simply by applying bootstrap methods to obtain a confidence level for the posterior probabilities?
— Randy Collica, 2017-08-31

(I realized my previous post misspelled your name; apologies.)

Hello Deborah: I am not sure what is meant by "...required 'the possibility of infinitely many repetitions of identical experiments'". The usefulness of frequentist statistics, as I see it, is the emphasis on how one's model performs over the long run given a set of assumptions (which may or may not be realistic).
As such, I do not see this as much of a problem, so long as we understand that the estimates (e.g., error rates) are derived from many assumptions that probably do not generalize one-to-one to "real" world settings. That said, they can be very useful, and at minimum we should ensure our model has optimal calibration (whether assessed via error rates, coverage, parameter bias, etc.).

The estimates, however, are indeed computed by sampling a population with a set of characteristics many (many) times. Infinite would be nice, but really we need about 5,000 to 10,000 repetitions to get a stable estimate. Of course, we can then vary population values to see what could happen over the long run under different assumptions. So I really do not think the argument against repetition is very useful, as that is just how error (type I, or bias) and power are estimated. This is just the way it is, and may or may not make sense, but it is useful (IMO) so long as we do not take our models too seriously (they are all wrong, after all!).

I prefer Bayesian statistics, but I too study the long-run properties of my models. I do not really see any other way to assess how a given model is performing. Posterior predictive checks are also useful, and often provide similar inferences to simulations. For example, exploring the variances of each group can show misfit when assuming equal variances, and over the long run this misfit results in an inflated error rate.

Furthermore, if confidence and credible intervals are basically equivalent under a so-called "uninformative" prior, it follows that Bayesians have, at minimum, expected error rates. One cannot have equivalent intervals without this being the case! Once said, the notion probably "feels" like common sense. An informative prior centered at zero will just make power lower than for a frequentist estimate, so I do not see how error rates would fail to be controlled.
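Donald's interval-equivalence point can be made concrete with a small conjugate normal-normal sketch (all numbers are hypothetical): with a near-flat prior the 95% credible interval reproduces the frequentist CI, while a zero-centered informative prior shrinks the estimate toward zero.

```python
import math

def posterior_normal(xbar, se, prior_mean, prior_sd):
    """Conjugate normal-normal update for a mean with known standard error."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * xbar)
    return post_mean, math.sqrt(post_var)

xbar, se = 0.40, 0.15  # hypothetical observed effect and its standard error

# Near-flat prior: the 95% credible interval matches the frequentist CI, xbar +/- 1.96*se
m, s = posterior_normal(xbar, se, prior_mean=0.0, prior_sd=100.0)
print(m - 1.96 * s, m + 1.96 * s)  # ~ (0.106, 0.694)

# Zero-centered informative prior: the estimate is pulled toward 0,
# so evidence against a null of zero is weaker
m2, s2 = posterior_normal(xbar, se, prior_mean=0.0, prior_sd=0.5)
print(m2)  # ~ 0.37, shrunk from 0.40
```

With the flat prior the two intervals agree essentially to rounding; the informative prior behaves exactly like the conservative estimator described above.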
If anything, the Bayesian model can be considered sub-optimal (in an NHST framework), since it can provide conservative estimates. This is the kind of Bayesian statistics that I use.

Now, often Bayesians do not focus on error rates, and would prefer to have a model that best describes the data-generating process. This approach often leads to controlled error rates, optimal power, among other things. These occur as a by-product, so to speak, of focusing on modeling the data.

Where Bayesian methods really shine is that we do not need to come up with different ways to estimate the standard error, or ways to approximate the degrees of freedom, to obtain reasonable inferences. For example, even to accommodate unequal variances, Welch's t-test resorts to approximating the degrees of freedom of the sampling distribution of the t-statistic. In a multilevel framework, the sampling distribution is entirely unknown, but people have figured out approximations that ensure optimal error rates. Indeed, some would even say that exact p-values do not exist (so much for exact error rates :-)). In contrast, whether comparing two groups or fitting a multilevel model with varying slopes and intercepts, Bayesian methods do not depend on a known sampling distribution, and everything is estimated in much the same way (yes, even NHST error rates are obtained!). It is actually quite elegant!
— Donald Williams, 2017-03-20

I resonate with Donald's comments on these points and don't see justification for some of Deborah's. Writing simulation pseudo-code will expose many of the issues properly.
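Such pseudo-code is easy to make concrete. A minimal long-run check of the type I error rate for a two-group comparison, using the 5,000-repetition rule of thumb Donald mentions (the Welch statistic with a normal critical value of 1.96, and groups of 30, are simplifying assumptions):

```python
import random, math, statistics

def welch_t(a, b):
    """Welch t statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

random.seed(1)
n, reps, crit = 30, 5000, 1.96   # normal critical value as an approximation
rejections = 0
for _ in range(reps):
    a = [random.gauss(0, 1) for _ in range(n)]   # H0 true: both groups N(0, 1)
    b = [random.gauss(0, 1) for _ in range(n)]
    if abs(welch_t(a, b)) > crit:
        rejections += 1
print(rejections / reps)   # long-run type I error, ~0.05
```

Replacing the N(0, 1) draws with skewed or heteroscedastic populations is exactly how the calibration checks discussed in this thread are run.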
I don't need to show long-run operating characteristics to show that Bayesian methods optimize the probability of making the right inference for a given set of data. True, I need a large number of simulated clinical trials to demonstrate perfect calibration of Bayesian posterior probabilities, but those simulations are made under an entire array of treatment effects, not for one single effect as with frequentist methods.
— Frank Harrell, 2017-03-20

Error statistics (neither Fisherian nor N-P style) never required "the possibility of infinitely many repetitions of identical experiments". That's absurd. When people complain about cherry-picking, p-hacking, optional stopping, data-dependent endpoints, etc., it's because they prevent a stringent test in the case at hand. The appeal to the "ease" (frequency) of producing impressive-looking results, even under H0, only alludes to hypothetical possibilities (nor need they be identical). Such appeals are at the heart of criticisms of bad statistics and bad science. Unless your Bayesianism takes account of "what could have occurred but didn't", I fail to see your grounds for caring about preregistration, RCTs, etc. You seem to have boxed yourself into an inconsistent position -- and I don't know what kind of priors you favor -- based on a mickey-mouse caricature of hypothesis tests.
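The "ease of producing impressive-looking results, even under H0" is simple to simulate. A sketch of optional stopping (peeking every 10 observations per group, and a known-variance z-test, are assumptions made for brevity):

```python
import random, math

def z_test(a, b):
    """Two-sample z statistic assuming known unit variances (a simplification)."""
    n = len(a)
    return (sum(a) / n - sum(b) / n) / math.sqrt(2.0 / n)

random.seed(2)
reps, max_n, peek = 4000, 100, 10
false_positives = 0
for _ in range(reps):
    a, b = [], []
    for _ in range(max_n // peek):
        a += [random.gauss(0, 1) for _ in range(peek)]   # H0 true throughout
        b += [random.gauss(0, 1) for _ in range(peek)]
        if abs(z_test(a, b)) > 1.96:                     # peek and stop at "significance"
            false_positives += 1
            break
print(false_positives / reps)   # well above the nominal 0.05
```

Ten looks at the accumulating data push the false-positive rate far past the nominal level, which is the counterfactual reasoning at issue here.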
On the other hand, if your Bayesian does consider what could have occurred -- counterfactual reasoning that we can simulate on our computers today -- then you can't say such considerations are irrelevant.

Frequentists also estimate, and any statistical inference can be appraised according to how well or stringently tested claims are (that's just semantics).
— Deborah Mayo, 2017-03-19

I am not really sure there are deep problems with confidence intervals.

I use both Bayesian and frequentist methods in simulation studies, but only Bayesian for analyzing "real" data. That said, just because people misinterpret something does not mean it is bad. By that logic, almost all things in life have deep issues. I find confidence intervals useful, although I do not think they necessarily generalize exactly to actual research situations. Exploring long-run outcomes, CI coverage, bias, etc. provides useful information, IMO. The problem, as I see it, is individuals not realizing the limitations of models, whether frequentist, Bayesian, or agent-based.
— Donald Williams, 2017-02-27

I was speaking more of the problem of statisticians and stat grad students understanding the concept. If after multiple attempts at understanding a primary concept in a paradigm one has to give up, there is a problem with the paradigm.
— Frank Harrell, 2017-02-27

Thank you for a very interesting and informative post. One oft-repeated argument against the frequentist perspective that has never resonated with me is the fact that CIs are hard to explain. In most (but not all) cases I'm interested in selecting the technique that will maximize my chance of delivering correct conclusions, regardless of how hard it would be for a collaborator to understand the statistical methods I used. Should I avoid using MCMC because my collaborators can't understand the technique?

I may be at risk of attacking a straw man here, because of course there are deep philosophical/statistical problems with CIs.
But that's kind of my point -- isn't it best to focus on these fundamentally problematic aspects rather than didactic issues?
— a.foss, 2017-02-27

Hi Frank:

I basically agree with everything you have said here. The trimmed mean is far from intuitive. I did some simulations with trimmed means to see how a determined researcher can "find" significance. Basically, it reduces to a multiple-comparisons problem. Even with the exact same data, trying different trimming thresholds inflates the error rate to almost 0.05 * the number of tests (assuming, on average, no difference between groups). Lots of researcher degrees of freedom with this approach, and with others (winsorizing).
— Donald Williams, 2017-02-26

Very interesting, Donald, and do call me Frank. I would emphasize mean squared and mean absolute estimation errors, probably. I don't find trimmed means satisfactory because I'm unable to define to a collaborator what they mean prospectively.
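The kind of simulation Donald describes can be sketched as follows (this uses a crude Welch-type statistic on the trimmed values with a normal cutoff rather than a proper Yuen test, so the exact rates are only illustrative):

```python
import random, math, statistics

def trimmed(xs, prop):
    """Drop the lowest and highest prop fraction of the sample."""
    k = int(len(xs) * prop)
    s = sorted(xs)
    return s[k:len(s) - k] if k else s

def t_stat(a, b):
    """Welch-type statistic on two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

random.seed(3)
n, reps = 40, 3000
trims = [0.0, 0.05, 0.10, 0.20]   # the "researcher degrees of freedom"
single, any_trim = 0, 0
for _ in range(reps):
    a = [random.gauss(0, 1) for _ in range(n)]   # H0 true
    b = [random.gauss(0, 1) for _ in range(n)]
    sig = [abs(t_stat(trimmed(a, p), trimmed(b, p))) > 1.96 for p in trims]
    single += sig[0]        # honest: one pre-specified test
    any_trim += any(sig)    # hacked: try every trim, keep the best
print(single / reps, any_trim / reps)   # the second rate exceeds the first
```

Because the trimmed tests on the same data are correlated, the inflation falls short of the 0.05-per-test worst case, but the min-p strategy is always at least as error-prone as the pre-specified test.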
With Bayes you can estimate the mean or quantiles of the raw-data distribution, or better, estimate the whole distribution.
— Frank Harrell, 2017-02-26

Thanks to all 4 of you for pointing us to some excellent resources.
— Frank Harrell, 2017-02-26

I have a set of blog posts that are intended to provide an accessible introduction:

- http://gandenberger.org/2014/07/21/intro-to-statistical-methods/
- http://gandenberger.org/2014/07/28/intro-to-statistical-methods-2/
- http://gandenberger.org/2014/08/26/intro-to-statistical-methods-3/

— Greg Gandenberger, 2017-02-26

Hi Frank (I hope that is OK). I (with a collaborator) am currently working on a paper "introducing" a Bayesian heteroscedastic skew-normal model. We are basically characterizing parameter bias, error rates, and power, all while estimating the degree of skew as well as sigma for each group. (Yes, error rates. I don't really agree with type I error, but I still find it useful so long as we understand its limitations.)

Interestingly, for those not wanting to learn Stan or Bayesian methods, I would loosely advocate the trimmed-means approach, as long as the degree of trim was not used to get a significant p-value. Counter to my original thoughts, the trimmed-means approach actually does perform rather well.

As you said, the Bayesian framework is much more interpretable (IMO) and does not entail directly altering the data. Furthermore, the Bayesian approach actually provides estimates for skew, sigma, etc., which are important for prospective power analyses. Here, it also becomes clear how important priors can be! Not that they influence the final estimates all that much (although they can), but that the model needs them to converge.

Finally, if one does learn a general Bayesian approach, it will likely fulfill all their needs. This means no jumping around packages or functions to get the "correct" estimate of the standard error. Generalized estimating equations are a good example: there are so many bias corrections (small N, etc.)
for the SE that it can make one's head spin (all resulting in slightly to drastically different p-values).
— Donald Williams, 2017-02-26

Nicely put, Donald. Another example of flexibility that is worked out in detail in the Box and Tiao book is having a parameter specifying the degree of non-normality of the data, and having a prior for that. They show how this leads to something that is almost the trimmed mean but which is much more interpretable.
— Frank Harrell, 2017-02-26

Hi John: Differences in interpretation aside, I see the greatest benefit of using Bayesian methods as flexibility. For example, fitting a skew-normal model in which sigma and the skew are estimated for each group. This can also easily be extended to a multilevel framework. As far as I know, this is not currently possible in a frequentist framework.

Finally, I see the use of so-called noninformative priors as not very Bayesian.
We generally can rule out effects greater than a d of 1, if not less.

In sum, the benefits of Bayesian methods are only fully realized, IMO, when one sees the value of informative priors (especially in MLM) and the great flexibility offered in RStan, brms, etc.
— Donald Williams, 2017-02-25

Agreed. Moreover, a Bayesian perspective at the design stage goes even further: it incorporates uncertainty into the research hypothesis, instead of assuming a specific effect size (or small set of candidate effect sizes) as is traditionally done in a frequentist approach to design.
— John K. Kruschke, 2017-02-24

For a few simple side-by-side comparisons of Bayesian and frequentist methods, for both hypothesis testing and parameter estimation, see the article linked in this blog post: http://doingbayesiandataanalysis.blogspot.com/2017/02/the-bayesian-new-statistics-finally.html
— John K. Kruschke, 2017-02-24

Check out this YouTube lecture series by Richard McElreath. It parallels his outstanding book, "Statistical Rethinking."

https://www.youtube.com/watch?v=WFv2vS8ESkk&list=PLDcUM9US4XdMdZOhJWJJD4mDBMnbTWw_z
— John Kapson, 2017-02-23