p-value

Musings on Multiple Endpoints in RCTs

This article discusses issues related to alpha spending, effect sizes used in power calculations, multiple endpoints in RCTs, and endpoint labeling. Changes in endpoint priority are addressed. Included in the discussion is how Bayesian probabilities more naturally allow one to answer multiple questions without all-too-arbitrary designations of endpoints as "primary" and "secondary". And we should not quit trying to learn.

Bayesian vs. Frequentist Statements About Treatment Efficacy

To avoid "false positives" do away with "positive". A good poker player plays the odds by thinking to herself "The probability I can win with this hand is 0.91" and not "

Statistical Errors in the Medical Literature

Misinterpretation of P-values and Main Study Results; Dichotomania; Problems With Change Scores; Improper Subgrouping; Serial Data and Response Trajectories; Cluster Analysis. As Doug Altman famously wrote in his Scandal of Poor Medical Research in BMJ in 1994, the quality of how statistical principles and analysis methods are applied in medical research is quite poor.

My Journey From Frequentist to Bayesian Statistics

The difference between Bayesian and frequentist inference in a nutshell: With Bayes you start with a prior distribution for θ and, given your data, make an inference about the θ-driven process generating your data (whatever that process happened to be), to quantify evidence for every possible value of θ.
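
As a rough illustration of "quantifying evidence for every possible value of θ", here is a minimal sketch in Python. It assumes a binomial data model, a flat prior, and made-up data (7 successes in 10 trials); none of these come from the post itself, and the function name `grid_posterior` is hypothetical.

```python
def grid_posterior(successes, trials, prior, grid):
    """Approximate the posterior for a binomial proportion on a grid of theta values.

    The binomial coefficient is dropped because it cancels when normalizing.
    """
    failures = trials - successes
    unnormalized = [prior(t) * t**successes * (1 - t)**failures for t in grid]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Candidate values of theta: 0.01, 0.02, ..., 0.99
grid = [i / 100 for i in range(1, 100)]

# Assumed for illustration: flat prior and 7 successes out of 10 trials
posterior = grid_posterior(7, 10, lambda theta: 1.0, grid)

# Posterior probability that theta exceeds 0.5
prob_theta_gt_half = sum(p for t, p in zip(grid, posterior) if t > 0.5)

# Grid value with the highest posterior probability
most_probable_theta = grid[posterior.index(max(posterior))]
```

Every grid point receives a posterior probability, so a statement such as "the probability that θ exceeds 0.5" is a simple sum over the grid rather than a separate test.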

A Litany of Problems With p-values

With the many problems that p-values have, and the temptation to "bless" research when the p-value falls below an arbitrary threshold such as 0.05 or 0.005, researchers using p-values should at least be fully aware of what they are getting.

Clinicians' Misunderstanding of Probabilities Makes Them Like Backwards Probabilities Such As Sensitivity, Specificity, and Type I Error

Optimum decision making in the presence of uncertainty comes from probabilistic thinking. The relevant probabilities are of a predictive nature: P(the unknown given the known). Thresholds are not helpful and are completely dependent on the utility/cost/loss function.
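
To make the contrast concrete, the sketch below turns backwards probabilities (sensitivity and specificity) into a predictive probability with Bayes' rule, and shows how a decision threshold falls out of assumed costs. The function names and all numbers are hypothetical, chosen only for illustration.

```python
def prob_disease_given_positive(sensitivity, specificity, prevalence):
    """P(disease | positive test) via Bayes' rule."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


def treatment_threshold(cost_false_positive, cost_false_negative):
    """Probability of disease above which treating has lower expected cost.

    Treat when p * cost_false_negative > (1 - p) * cost_false_positive.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


# Assumed values: 90% sensitivity, 90% specificity, 10% prevalence
predictive_probability = prob_disease_given_positive(0.9, 0.9, 0.1)

# Assumed costs: a missed case is 9 times as costly as an unnecessary treatment
threshold = treatment_threshold(1.0, 9.0)

should_treat = predictive_probability > threshold
```

Changing the assumed costs moves the threshold, which is why no single cutoff can serve every decision maker.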

p-values and Type I Errors are Not the Probabilities We Need

In trying to guard against false conclusions, researchers often attempt to minimize the risk of a “false positive” conclusion. In the field of assessing the efficacy of medical and behavioral treatments for improving subjects’ outcomes, falsely concluding that a treatment is effective when it is not is an important consideration.

Null Hypothesis Significance Testing Never Worked

Much has been written about problems with our most-used statistical paradigm: frequentist null hypothesis significance testing (NHST), p-values, type I and type II errors, and confidence intervals. Rejection of straw-man null hypotheses leads researchers to believe that their theories are supported, and the unquestioning use of a threshold such as p<0.