This article provides my reflections after the PCORI/PACE Evidence and the Individual Patient meeting on 2018-05-31. The discussion includes a high-level view of heterogeneity of treatment effect in optimizing treatment for individual patients.
Deep learning and other forms of machine learning are getting a lot of press in medicine. The reality doesn’t match the hype, and interpretable statistical models still have a lot to offer.
In my opinion, null hypothesis testing and p-values have done significant harm to science. The purpose of this note is to catalog the many problems caused by p-values. As readers post new problems in their comments, more will be incorporated into the list, so this is a work in progress.
The American Statistical Association has done a great service by issuing its Statement on Statistical Significance and P-values. Now it’s time to act.
Imagine watching a baseball game, seeing the batter get a hit, and hearing the announcer say “The chance that the batter is left-handed is now 0.2!”
No one would care. Baseball fans are interested in the chance that a batter will get a hit conditional on his being right-handed (handedness being already known to the fan), the handedness of the pitcher, etc. Unless one is an archaeologist or medical examiner, the interest is in forward probabilities conditional on current and past states.
It is important to distinguish prediction and classification. In many decision-making contexts, classification represents a premature decision, because classification combines prediction and decision making and usurps the decision maker in specifying costs of wrong decisions. The classification rule must be reformulated if costs/utilities or sampling criteria change. Predictions are separate from decisions and can be used by any decision maker. Classification is best used with non-stochastic/deterministic outcomes that occur frequently, and not when two individuals with identical inputs can easily have different outcomes.
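To make the separation of prediction and decision concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from the post: the function names, the predicted risk of 0.2, and the cost values are all hypothetical, and correct decisions are assumed to cost nothing. Under those assumptions, acting has lower expected cost than not acting when the predicted risk exceeds cost_false_positive / (cost_false_positive + cost_false_negative).

```python
def decision_threshold(cost_false_positive, cost_false_negative):
    """Risk above which acting has lower expected cost than not acting.

    Assumes correct decisions cost nothing (a simplifying assumption).
    Expected cost of acting:     (1 - p) * cost_false_positive
    Expected cost of not acting:  p      * cost_false_negative
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def decide(predicted_risk, cost_false_positive, cost_false_negative):
    """Return True if acting minimizes expected cost for this decision maker."""
    return predicted_risk > decision_threshold(
        cost_false_positive, cost_false_negative
    )


# One prediction (hypothetical value), two decision makers with different costs.
predicted_risk = 0.2

# Decision maker A weighs both kinds of error equally: threshold 0.5.
decision_a = decide(predicted_risk, cost_false_positive=1.0, cost_false_negative=1.0)

# Decision maker B regards a missed event as 9 times worse: threshold 0.1.
decision_b = decide(predicted_risk, cost_false_positive=1.0, cost_false_negative=9.0)

print(decision_a, decision_b)
```

The same predicted risk leads the two hypothetical decision makers to opposite actions. A classifier that had fixed a single cutoff in advance would have made that choice for both of them.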