datamethods.org is a discussion site where data methodologists meet each other and subject matter experts including clinical trialists and clinical researchers. Its development is documented here. Datamethods is provided by the Department of Biostatistics, Vanderbilt University School of Medicine.
I have written some short articles on the site, listed below.
Responder analysis: Loser x 4
Problems with NNT
Should we ignore covariate imbalance and stop presenting a stratified ‘table one’ for randomized trials?
This article provides my reflections after the PCORI/PACE Evidence and the Individual Patient meeting on 2018-05-31. The discussion includes a high-level view of heterogeneity of treatment effect in optimizing treatment for individual patients.
Deep learning and other forms of machine learning are getting a lot of press in medicine. The reality doesn’t match the hype, and interpretable statistical models still have a lot to offer.
With the many problems that p-values have, and the temptation to “bless” research when the p-value falls below an arbitrary threshold such as 0.05 or 0.005, researchers using p-values should at least be fully aware of what they are getting. They need to know exactly what a p-value means and what assumptions are required for it to have that meaning.
A p-value is the probability of getting, in another study, a test statistic that is more extreme than the one obtained in your study if a series of assumptions hold.
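The "another study" reading of a p-value can be checked by simulation. The sketch below is only an illustration under assumptions that are not in the original post: normally distributed data with known standard deviation 1, a null hypothesis of mean 0, a two-sided z-test, and an observed statistic of 2.0. The function names are hypothetical. It repeatedly generates replicate studies with H0 true and counts how often a replicate's statistic is more extreme than the observed one.

```python
import random
from statistics import NormalDist, fmean


def z_statistic(sample, sigma=1.0):
    """z statistic for H0: mean = 0, assuming a known standard deviation."""
    n = len(sample)
    return fmean(sample) / (sigma / n ** 0.5)


def analytic_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulated_p(z_obs, n, reps=20000, seed=1):
    """Fraction of replicate studies, generated with H0 true, whose
    statistic is more extreme than the observed one."""
    rng = random.Random(seed)
    more_extreme = 0
    for _ in range(reps):
        replicate = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(z_statistic(replicate)) > abs(z_obs):
            more_extreme += 1
    return more_extreme / reps


z_obs = 2.0                       # assumed observed statistic, for illustration
p_exact = analytic_p(z_obs)       # textbook two-sided p-value
p_sim = simulated_p(z_obs, n=25)  # long-run frequency across replicate studies
print(f"analytic p = {p_exact:.4f}, simulated p = {p_sim:.4f}")
```

The two numbers agree only because every replicate was generated with the assumptions holding exactly. The simulation says nothing about whether H0 is true for the data actually observed.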
Optimum decision making in the presence of uncertainty comes from probabilistic thinking. The relevant probabilities are of a predictive nature: P(the unknown given the known). Thresholds are not helpful and are completely dependent on the utility/cost/loss function.
Corollary: Since a p-value is P(someone else’s data are more extreme than mine if H0 is true) and we don’t know whether H0 is true, it is a non-predictive probability that is not useful for decision making.
It is important to distinguish prediction and classification. In many decision-making contexts, classification represents a premature decision, because classification combines prediction and decision making and usurps the decision maker in specifying costs of wrong decisions. The classification rule must be reformulated if costs/utilities or sampling criteria change. Predictions are separate from decisions and can be used by any decision maker. Classification is best used with non-stochastic/deterministic outcomes that occur frequently, and not when two individuals with identical inputs can easily have different outcomes.
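A minimal sketch of why predictions can be handed to any decision maker while a fixed classification rule cannot. The predicted risks, the cost values, and the function names are all hypothetical, chosen only for illustration. Each decision maker picks the action with the smaller expected loss, so the same predicted probabilities lead to different decisions when the costs of wrong decisions differ.

```python
def expected_loss(p, act, cost_fp, cost_fn):
    """Expected loss of a decision given predicted risk p.

    Acting when the event would not have occurred costs cost_fp;
    not acting when the event does occur costs cost_fn.
    """
    return (1 - p) * cost_fp if act else p * cost_fn


def decide(p, cost_fp, cost_fn):
    """Act only if acting has the smaller expected loss."""
    return (expected_loss(p, True, cost_fp, cost_fn)
            < expected_loss(p, False, cost_fp, cost_fn))


def implied_threshold(cost_fp, cost_fn):
    """Risk above which acting is preferred under these costs."""
    return cost_fp / (cost_fp + cost_fn)


# Hypothetical predicted risks from some probability model.
predicted_risks = [0.05, 0.15, 0.40, 0.80]

# Decision maker A: a missed event is 9 times as costly as a false alarm.
decisions_a = [decide(p, cost_fp=1, cost_fn=9) for p in predicted_risks]

# Decision maker B: a false alarm is 3 times as costly as a missed event.
decisions_b = [decide(p, cost_fp=3, cost_fn=1) for p in predicted_risks]

print(implied_threshold(1, 9), decisions_a)
print(implied_threshold(3, 1), decisions_b)
```

A classifier trained to output only act or do not act would have baked in one of these cost ratios. Keeping the predicted risks lets each decision maker apply their own.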