Statistical Thinking

This blog is devoted to statistical thinking and its impact on science and everyday life. Emphasis is given to maximizing the use of information, avoiding statistical pitfalls, describing problems caused by the frequentist approach to statistical inference, describing advantages of Bayesian and likelihood methods, and discussing intended and unintended differences between statistics and data science. I’ll also cover regression modeling strategies, clinical trials, drug evaluation, medical diagnosis, and decision making.

Recent Posts


I prefer fractions and ratios over percents. Here are the reasons.


It is easy to compute the sample size N1 needed to reliably estimate how one predictor relates to an outcome. It is next to impossible for a machine learning algorithm entertaining hundreds of features to yield reliable answers when the sample size is less than N1.
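
As a rough illustration of how simple such a calculation can be, here is a minimal sketch. It assumes that "how one predictor relates to an outcome" is summarized by a correlation coefficient, and that "reliable" means a confidence interval no wider than a chosen margin of error. Both choices, the Fisher z approximation, and the function name `n_for_correlation_margin` are assumptions made for this example, not necessarily the calculation used in the post.

```python
import math
from statistics import NormalDist


def n_for_correlation_margin(margin: float, conf: float = 0.95) -> int:
    """Approximate sample size so that a confidence interval for a
    correlation coefficient has half-width <= margin in the worst case.

    Hypothetical helper for illustration. It relies on the Fisher z
    approximation, where atanh(r) has standard error about 1/sqrt(n - 3).
    The interval on the correlation scale is widest when the true
    correlation is 0, so that case is used as the worst case.
    """
    if not 0.0 < margin < 1.0:
        raise ValueError("margin must be strictly between 0 and 1")
    if not 0.0 < conf < 1.0:
        raise ValueError("conf must be strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(0.5 + conf / 2.0)
    # Solve tanh(z_crit / sqrt(n - 3)) <= margin for n.
    return math.ceil(3.0 + (z_crit / math.atanh(margin)) ** 2)


if __name__ == "__main__":
    n1 = n_for_correlation_margin(0.1)
    print(f"N1 for a margin of +/-0.1 at 0.95 confidence: {n1}")
```

Under these assumptions, a margin of ±0.1 already calls for a few hundred observations for a single predictor, which is the scale against which a model entertaining hundreds of features should be judged.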


Methodologic goals and wishes for research and clinical practice for 2018


This post will grow to cover questions about data reduction methods, also known as unsupervised learning methods. These are intended primarily for two purposes: (1) collapsing correlated variables into an overall score so that one does not have to disentangle correlated effects, which is a difficult statistical task; and (2) reducing the effective number of variables to use in a regression or other predictive model, so that fewer parameters need to be estimated. The latter example is the “too many variables too few subjects” problem.


I have been critical of a number of articles, authors, and journals in this growing blog article. Linking the blog with Twitter is a way to expose the blog to more readers. It is far too easy to slip into hyperbole on the blog and even easier on Twitter with its space limitations. Importantly, many of the statistical problems pointed out in my article are very, very common, and I dwell on recent publications to get the point across that inadequate statistical review at medical journals remains a serious problem.




Recent & Upcoming Talks

The Next Generation of Clinical Trial Reporting
Nov 16, 2017 12:00 AM
Why Bayes for Clinical Trials?
Nov 16, 2017 12:00 AM


FDA Office of Biostatistics

Enhancing capabilities of CDER and its Office of Biostatistics in Bayesian clinical trial design and analysis


I teach the BIOS7330 Regression Modeling Strategies course in the Biostatistics Graduate Program at Vanderbilt University in the spring semester. The course web page is here. I teach a 4-day version of this course each May at Vanderbilt. Registration information for the short course may be found here.