In my opinion, null hypothesis testing and p-values have done significant harm to science. The purpose of this note is to catalog the many problems caused by p-values. As readers post new problems in their comments, they will be incorporated into the list, so this is a work in progress.
The American Statistical Association has done a great service by issuing its Statement on Statistical Significance and P-values. Now it’s time to act.
Imagine watching a baseball game, seeing the batter get a hit, and hearing the announcer say “The chance that the batter is left-handed is now 0.2!”
No one would care. Baseball fans are interested in the chance that a batter will get a hit, conditional on his being right-handed (handedness being already known to the fan), the handedness of the pitcher, etc. The announcer's probability runs backward, from the observed outcome to a characteristic that was known all along. Unless one is an archaeologist or medical examiner, the interest is in forward probabilities conditional on current and past states.
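To make the contrast between forward and backward probabilities concrete, here is a minimal sketch in Python. All numbers are invented for illustration (they are not real baseball statistics) and were chosen only so that the backward probability comes out to the announcer's 0.2. The function name `prob_left_given_hit` is likewise hypothetical.

```python
def prob_left_given_hit(prior_left, p_hit_given_left, p_hit_given_right):
    """Backward probability P(left-handed | hit), computed with Bayes' rule."""
    # Total probability of a hit, averaging over handedness.
    p_hit = (prior_left * p_hit_given_left
             + (1 - prior_left) * p_hit_given_right)
    return prior_left * p_hit_given_left / p_hit


# Hypothetical inputs, assumed purely for illustration.
prior_left = 0.25          # assumed share of left-handed batters
p_hit_given_left = 0.24    # forward probability: P(hit | left-handed)
p_hit_given_right = 0.32   # forward probability: P(hit | right-handed)

# What the fan cares about: the chance of a hit given what is already known.
forward = p_hit_given_left

# What the announcer reported: the chance of a known trait given the outcome.
backward = prob_left_given_hit(prior_left, p_hit_given_left, p_hit_given_right)

print(f"P(hit | left-handed) = {forward:.2f}")
print(f"P(left-handed | hit) = {backward:.2f}")
```

The forward quantity conditions on what is known before the pitch and is directly usable. The backward quantity requires the prior on handedness plus both forward probabilities, and it answers a question nobody watching the game is asking.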
Classification combines prediction and decision making, and in doing so usurps the decision maker's role in specifying the costs of wrong decisions. The classification rule must be reformulated whenever costs or utilities change. Predictions, by contrast, are separate from decisions and can be used by any decision maker.
The field of machine learning arose somewhat independently of the field of statistics. As a result, machine learning experts tend not to emphasize probabilistic thinking, whereas probabilistic thinking and an understanding of uncertainty and variation are hallmarks of statistics.
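The separation between prediction and decision can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the predicted probability, the cost values, and the function names `classify` and `decide` are all hypothetical, and the decision rule shown (choose the action with the lower expected cost) is one simple way to apply utilities, not the only one.

```python
def classify(p_event, threshold=0.5):
    """A classifier: the threshold silently encodes one fixed cost trade-off."""
    return p_event >= threshold


def decide(p_event, cost_false_positive, cost_false_negative):
    """A decision made from a predicted probability and the user's own costs.

    Acting is wrong when the event does not occur (a false positive);
    waiting is wrong when the event does occur (a false negative).
    """
    expected_cost_act = (1 - p_event) * cost_false_positive
    expected_cost_wait = p_event * cost_false_negative
    return "act" if expected_cost_act < expected_cost_wait else "wait"


# One predicted probability (hypothetical), shared by every decision maker.
p = 0.3

# Decision maker A treats both kinds of error as equally costly.
choice_a = decide(p, cost_false_positive=1, cost_false_negative=1)

# Decision maker B considers a missed event four times as costly.
choice_b = decide(p, cost_false_positive=1, cost_false_negative=4)

# The classifier returns one label and cannot reflect B's costs
# unless its threshold is reformulated.
label = classify(p)

print(choice_a, choice_b, label)
```

With the same predicted probability, the two decision makers reach different conclusions because their costs differ. The fixed-threshold classifier gives both of them the same label, which is the sense in which it takes the cost specification out of the decision maker's hands.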