The Burden of Demonstrating HTE

Reasons are given for why heterogeneity of treatment effect must be demonstrated, not assumed. An example is presented that shows that HTE must exceed a certain level before personalizing treatment results in better decisions than using the average treatment effect for everyone.

Assessing Heterogeneity of Treatment Effect, Estimating Patient-Specific Efficacy, and Studying Variation in Odds ratios, Risk Ratios, and Risk Differences

This article shows an example formally testing for heterogeneity of treatment effect in the GUSTO-I trial, shows how to use penalized estimation to obtain patient-specific efficacy, and studies variation across patients in three measures of treatment effect.


datamethods.org is a discussion site where data methodologists meet each other and subject matter experts including clinical trialists and clinical researchers. Its development is documented here. Datamethods is provided by the Department of Biostatistics, Vanderbilt University School of Medicine. I have written some short articles on the site, listed below. Responder analysis: Loser x 4 Problems with NNT Should we ignore covariate imbalance and stop presenting a stratified ‘table one’ for randomized trials?

Viewpoints on Heterogeneity of Treatment Effect and Precision Medicine

This article provides my reflections after the PCORI/PACE Evidence and the Individual Patient meeting on 2018-05-31. The discussion includes a high-level view of heterogeneity of treatment effect in optimizing treatment for individual patients.

Musings on Multiple Endpoints in RCTs

This article discusses issues related to alpha spending, effect sizes used in power calculations, multiple endpoints in RCTs, and endpoint labeling. Changes in endpoint priority is addressed. Included in the the discussion is how Bayesian probabilities more naturally allow one to answer multiple questions without all-too-arbitrary designations of endpoints as “primary” and “secondary”. And we should not quit trying to learn.

Information Gain From Using Ordinal Instead of Binary Outcomes

This article gives examples of information gained by using ordinal over binary response variables. This is done by showing that for the same sample size and power, smaller effects can be detected

Statistical Criticism is Easy; I Need to Remember That Real People are Involved

I have been critical of a number of articles, authors, and journals in this growing blog article. Linking the blog with Twitter is a way to expose the blog to more readers. It is far too easy to slip into hyperbole on the blog and even easier on Twitter with its space limitations. Importantly, many of the statistical problems pointed out in my article, are very, very common, and I dwell on recent publications to get the point across that inadequate statistical review at medical journals remains a serious problem.

Continuous Learning from Data: No Multiplicities from Computing and Using Bayesian Posterior Probabilities as Often as Desired

(In a Bayesian analysis) It is entirely appropriate to collect data until a point has been proven or disproven, or until the data collector runs out of time, money, or patience. — Edwards, Lindman, Savage (1963) Introduction Bayesian inference, which follows the likelihood principle, is not affected by the experimental design or intentions of the investigator. P-values can only be computed if both of these are known, and as been described by Berry (1987) and others, it is almost never the case that the computation of the p-value at the end of a study takes into account all the changes in design that were necessitated when pure experimental designs encounter the real world.

Bayesian vs. Frequentist Statements About Treatment Efficacy

To avoid “false positives” do away with “positive”. A good poker player plays the odds by thinking to herself “The probability I can win with this hand is 0.91” and not “I’m going to win this game” when deciding the next move. State conclusions honestly, completely deferring judgments and actions to the ultimate decision makers. Just as it is better to make predictions than classifications in prognosis and diagnosis, use the word “probably” liberally, and avoid thinking “the evidence against the null hypothesis is strong, so we conclude the treatment works” which creates the opportunity of a false positive.

EHRs and RCTs: Outcome Prediction vs. Optimal Treatment Selection

Frank Harrell Professor of Biostatistics Vanderbilt University School of Medicine Laura Lazzeroni Professor of Psychiatry and, by courtesy, of Medicine (Cardiovascular Medicine) and of Biomedical Data Science Stanford University School of Medicine Revised July 17, 2017 It is often said that randomized clinical trials (RCTs) are the gold standard for learning about therapeutic effectiveness. This is because the treatment is assigned at random so no variables, measured or unmeasured, will be truly related to treatment assignment.