The recent PATH (Predictive Approaches to Treatment effect Heterogeneity) Statement outlines principles, criteria, and key considerations for applying predictive approaches to clinical trials to provide patient-centered evidence in support of decision making. Here challenges in implementing the PATH Statement are addressed with the GUSTO-I trial as a case study.

Department of Biomedical Data Sciences Leiden University Medical Center NL @ESteyerberg

Published

November 24, 2020

Evidence is derived from groups while most medical decisions are made for individual patients. — Kent et al, PATH statement

Heterogeneity of treatment effect (HTE) refers to the nonrandom variation in the magnitude of the absolute treatment effect (treatment benefit) across individual patients. The recent PATH (Predictive Approaches to Treatment effect Heterogeneity) Statement outlines principles, criteria, and key considerations for applying predictive approaches to clinical trials to provide patient-centered evidence in support of decision making. The focus of PATH is on modeling of HTE across individual patients.

The PATH statement lists a number of principles and guidelines. A first principle is to establish overall treatment effect. In another blog, I summarized the arguments in favor of covariate adjustment as the primary analysis of a RCT. Illustration was in the GUSTO-I trial. Here I continue that illustration, also following the blog by Frank Harrell on examining HTE.

Illustration in the GUSTO-I trial

Let’s analyze the data from 30,510 patients with an acute myocardial infarction as included in the GUSTO-I trial. In GUSTO-I, 10,348 patients were randomized to receive tissue plasminogen activator (tPA), while 20,162 were randomzied to Streptokinase (SK) and had 30-day mortality status known.

Code

load(url('http://hbiostat.org/data/repo/gusto.rda'))# keep only SK and tPA arms; and selected set of covariatesgusto <-upData(gusto[gusto$tx=="SK"| gusto$tx=="tPA",], keep=Cs(day30, tx, age, Killip, sysbp, pulse, pmi, miloc, sex),print=FALSE)html(describe(gusto), scroll=TRUE)

gusto Descriptives

gusto

9 Variables 30510 Observations

day30

n

missing

distinct

Info

Sum

Mean

Gmd

30510

0

2

0.195

2128

0.06975

0.1298

sex: Sex

n

missing

distinct

30510

0

2

Value male female
Frequency 22795 7715
Proportion 0.747 0.253

Killip: Killip Class

n

missing

distinct

30510

0

4

Value I II III IV
Frequency 26007 3857 417 229
Proportion 0.852 0.126 0.014 0.008

Value Inferior Other Anterior
Frequency 17582 1062 11866
Proportion 0.576 0.035 0.389

pmi: Previous MI

n

missing

distinct

30510

0

2

Value no yes
Frequency 25452 5058
Proportion 0.834 0.166

tx: Tx in 3 groups

n

missing

distinct

30510

0

2

Value SK tPA
Frequency 20162 10348
Proportion 0.661 0.339

Overall treatment effect

The primary outcome was 30-day mortality. Among the tPA group, the 30-day mortality was 653/10,348 = 6.3% vs 1475/20,162 = 7.3% in the SK group. This an absolute difference of 1.0%, or an odds ratio of 0.85 [0.78-0.94].

# 'tpa' as 0/1 variable for tPA vs SK treatmentgusto$tx <-droplevels(gusto$tx) # Drop the SK + tPA category which has 0 obsgusto$tpa <-as.numeric(gusto$tx) -1# 0/1 coding for tPA treatmentlabel(gusto$tpa) <-"tPA"levels(gusto$tpa) <-c("SK", "tPA")tab2 <-table(gusto$day30, gusto$tpa)result <-OddsRatio(tab2, conf.level =0.95)names(result) <-c("Odds Ratio", "Lower CI", "Upper CI")kable(as.data.frame(t(result))) %>%kable_styling(full_width=F, position ="left")

The unadjusted odds ratio of 0.853 is a marginal estimate. As explained in the other blog, a lot can be said in favor of conditional estimates, where we adjust for prognostically important baseline characteristics. In line with Califf 1997 and Steyerberg 2000, we consider a prediction model with 6 baseline covariates, including age and Killip class (a measure for ventricular function). Pulse rate is modeled using a linear spline with a knot at 50 beats/minute.

lrm(formula = day30 ~ tpa + age + Killip + pmin(sysbp, 120) +
lsp(pulse, 50) + pmi + miloc, data = gusto, x = T, maxit = 99)

Model Likelihood
Ratio Test

Discrimination
Indexes

Rank Discrim.
Indexes

Obs 30510

LR χ^{2} 2991.95

R^{2} 0.235

C 0.815

0 28382

d.f. 11

R^{2}_{11,30510} 0.093

D_{xy} 0.631

1 2128

Pr(>χ^{2}) <0.0001

R^{2}_{11,5938.7} 0.395

γ 0.631

max |∂log L/∂β| 7×10^{-10}

Brier 0.056

τ_{a} 0.082

β

S.E.

Wald Z

Pr(>|Z|)

Intercept

-3.0203

0.7973

-3.79

0.0002

tpa

-0.2080

0.0529

-3.93

<0.0001

age

0.0769

0.0025

31.28

<0.0001

Killip=II

0.6137

0.0589

10.42

<0.0001

Killip=III

1.1610

0.1214

9.57

<0.0001

Killip=IV

1.9213

0.1618

11.87

<0.0001

sysbp

-0.0392

0.0019

-20.33

<0.0001

pulse

-0.0242

0.0159

-1.52

0.1282

pulse'

0.0433

0.0162

2.67

0.0075

pmi=yes

0.4472

0.0562

7.96

<0.0001

miloc=Other

0.2863

0.1347

2.13

0.0335

miloc=Anterior

0.5432

0.0511

10.62

<0.0001

So, we note that the adjusted regression coefficient for tPA was -0.208. The adjusted OR = 0.812.
Let’s check for statistical interaction with

Individual covariates

The linear predictor

Code

## check for interaction# 1. traditional approachg <-lrm(day30 ~ tx * (age + Killip +pmin(sysbp, 120) +lsp(pulse, 50) + pmi + miloc), data=gusto, maxit=100)print(anova(g)) # tx interactions: 10 df, p=0.5720; based on LR test

Wald Statistics for day30

χ^{2}

d.f.

P

tx (Factor+Higher Order Factors)

23.48

11

0.0151

All Interactions

8.53

10

0.5772

age (Factor+Higher Order Factors)

981.65

2

<0.0001

All Interactions

1.60

1

0.2065

Killip (Factor+Higher Order Factors)

281.44

6

<0.0001

All Interactions

1.37

3

0.7131

sysbp (Factor+Higher Order Factors)

412.30

2

<0.0001

All Interactions

0.11

1

0.7369

pulse (Factor+Higher Order Factors)

235.92

4

<0.0001

All Interactions

4.22

2

0.1214

Nonlinear (Factor+Higher Order Factors)

7.37

2

0.0251

pmi (Factor+Higher Order Factors)

64.23

2

<0.0001

All Interactions

0.05

1

0.8233

miloc (Factor+Higher Order Factors)

115.23

4

<0.0001

All Interactions

2.99

2

0.2239

tx × age (Factor+Higher Order Factors)

1.60

1

0.2065

tx × Killip (Factor+Higher Order Factors)

1.37

3

0.7131

tx × sysbp (Factor+Higher Order Factors)

0.11

1

0.7369

tx × pulse (Factor+Higher Order Factors)

4.22

2

0.1214

Nonlinear

0.08

1

0.7784

Nonlinear Interaction : f(A,B) vs. AB

0.08

1

0.7784

tx × pmi (Factor+Higher Order Factors)

0.05

1

0.8233

tx × miloc (Factor+Higher Order Factors)

2.99

2

0.2239

TOTAL NONLINEAR

7.37

2

0.0251

TOTAL INTERACTION

8.53

10

0.5772

TOTAL NONLINEAR + INTERACTION

15.57

11

0.1579

TOTAL

2343.20

21

<0.0001

Code

# 2. PATH statement: linear interaction with linear predictor; baseline risk, so tx=ref, SKlp.no.tx <- f$coefficients[1] +0* f$coefficients[2] + f$x[,-1] %*% f$coefficients[-(1:2)]gusto$lp <-as.vector(lp.no.tx) # add lp to data frameh <-lrm(day30 ~ tx * lp, data=gusto)print(anova(h)) # tx interaction: 1 df, p=0.35; based on Wald statistics

Wald Statistics for day30

χ^{2}

d.f.

P

tx (Factor+Higher Order Factors)

16.24

2

0.0003

All Interactions

0.86

1

0.3526

lp (Factor+Higher Order Factors)

2338.37

2

<0.0001

All Interactions

0.86

1

0.3526

tx × lp (Factor+Higher Order Factors)

0.86

1

0.3526

TOTAL

2342.20

3

<0.0001

So:

The overall test for interaction with the individual covariates is far from statistically significant (p>0.5).

Similarly, the test for interaction with the linear predictor is far from statistically significant (p>0.3).

Conclusion: no interaction needed

We conclude that we may proceed by ignoring any interactions. We have no evidence against the assumption that the overall effect of treatment is applicable to all patients.
The patients vary widely in risk, as can easily be seen in the histogram below.

Code

hist(plogis(lp.no.tx), xlim=c(0,.4), main="")

We note that many patients have baseline risks (tPA==0) below 5%. Obviously their maximum benefit is bounded by this risk estimate, even if tPA, hypothetically, would reduce the risk to Null.

Reporting RCT results stratified by a risk model is encouraged when overall trial results are positive to better understand the distribution of effects across the trial population.

How do we provide such reporting of RCT results?

We can provide estimates of relative effects and absolute benefit

By group (e.g. quarters defined by quartiles)

By baseline risk (continuous, as in the histogram above)

1a. Relative effects of treatment by risk-group

The checks for interaction were far from statistically significant in GUSTO-I. We can further illustrate the relative effects in a forest plot. Many reports from RCTs include forest plots that show relative effects by single variables, such as men vs women; young vs old age; disease subtype; etc.
The PATH statement encourages reporting by risk-based subgroup. How can such reporting be done?

Let’s do some data processing to make a better forest plot.

A PATH compatible forest plot

Let’s expand the standard forest plot for subgroup effects with risk-based subgroups.
Below subgroup effects are defined for 4 risk-based groups using cut2(lp.no.tx, g=4),
and for 3 classical subgroups (by sex, age, type of infarction).

Code

## quantiles, suggest to use quartersgroups <-cut2(lp.no.tx, g=4)group0 <- groups[gusto$tpa==0] # SK gropupgroup1 <- groups[gusto$tpa==1] # tPA grouprate0 <-prop.table(table(group0, gusto$day30[gusto$tpa==0]),1 )[,2]rate1 <-prop.table(table(group1, gusto$day30[gusto$tpa==1]),1 )[,2]ratediff <- rate0-rate1 # benefit of tPA by group# Make a data frame for the resultsdata.subgroups <-as.data.frame(matrix(nrow=(4+6+1), ncol=10))colnames(data.subgroups) <-c("tevent", "tnoevent", "cevent", "cnoevent", "name", "type", "tn", "pt", "cn", "pc")data.subgroups[11,1:4] <-table(gusto$tpa,gusto$day30)[4:1] # overall results# define event and non-event numbersevents1 <-table(group0, gusto$day30[gusto$tpa==0])[,2]nevents1 <-table(group0, gusto$day30[gusto$tpa==0])[,1]events2 <-table(group1, gusto$day30[gusto$tpa==1])[,2]nevents2 <-table(group1, gusto$day30[gusto$tpa==1])[,1]n1 <- events1 + nevents1n2 <- events2 + nevents2data.subgroups[10:7,1:4] <-cbind(events2,nevents2,events1,nevents1)

Data for classic subgroups are managed below:

Code

# Use `table` to get the summary of cell numbers, by subgroup# SEXdata.subgroups[5,1:4] <-table(1-gusto$day30,1-gusto$tpa, gusto$sex)[1:4]data.subgroups[6,1:4] <-table(1-gusto$day30,1-gusto$tpa, gusto$sex)[5:8]# AGEdata.subgroups[3,1:4] <-table(1-gusto$day30,1-gusto$tpa, gusto$age>=75)[1:4]data.subgroups[4,1:4] <-table(1-gusto$day30,1-gusto$tpa, gusto$age>=75)[5:8]# ANTdata.subgroups[1,1:4] <-table(1-gusto$day30,1-gusto$tpa, gusto$miloc=="Anterior")[1:4]data.subgroups[2,1:4] <-table(1-gusto$day30,1-gusto$tpa, gusto$miloc=="Anterior")[5:8]# Namesdata.subgroups[11,5] <-"Overall"data.subgroups[10:7,5] <-paste("Quarter",1:4, sep=" ")data.subgroups[5:6,5] <-c("Male sex","Female sex")data.subgroups[3:4,5] <-c("Age <75","Age>=75")data.subgroups[1:2,5] <-c("Other MI","Anterior")# Type of subgroupdata.subgroups[11,6] <-""data.subgroups[10:7,6] <-c(rep("Risk-based subgroups", length(ratediff)))data.subgroups[1:6,6] <-c(rep("Location",2), rep("Age",2), rep("Sex",2))data.subgroups[,7] <- data.subgroups[,1] + data.subgroups[,2]data.subgroups[,8] <-paste(round(100*data.subgroups[,1] / data.subgroups[,7] , 1),"%", sep="")data.subgroups[,9] <- data.subgroups[,3] + data.subgroups[,4]data.subgroups[,10] <-paste(round(100*data.subgroups[,3] / data.subgroups[,9] , 1),"%", sep="")# Show the datakable(as.data.frame((data.subgroups))) %>%kable_styling(full_width=F, position ="left")

tevent

tnoevent

cevent

cnoevent

name

type

tn

pt

cn

pc

301

6025

649

11669

Other MI

Location

6326

4.8%

12318

5.3%

352

3670

826

7018

Anterior

Location

4022

8.8%

7844

10.5%

397

8622

981

16781

Age <75

Age

9019

4.4%

17762

5.5%

256

1073

494

1906

Age>=75

Age

1329

19.3%

2400

20.6%

383

7342

888

14182

Male sex

Sex

7725

5%

15070

5.9%

270

2353

587

4505

Female sex

Sex

2623

10.3%

5092

11.5%

455

2163

1000

4009

Quarter 4

Risk-based subgroups

2618

17.4%

5009

20%

130

2479

300

4719

Quarter 3

Risk-based subgroups

2609

5%

5019

6%

49

2481

130

4967

Quarter 2

Risk-based subgroups

2530

1.9%

5097

2.6%

19

2572

45

4992

Quarter 1

Risk-based subgroups

2591

0.7%

5037

0.9%

653

1475

9695

18687

Overall

2128

30.7%

28382

34.2%

In this table tevent means #events among treated; cevent means #events among non-treated; etc

Results can be plotted with metafor functions:

Code

par(mar=c(4,4,1,2))### fit random-effects model (use slab argument to define "study" labels)res <-rma(ai=tevent, bi=tnoevent, ci=cevent, di=cnoevent, data=data.subgroups, measure="OR",slab=name, method="ML")### set up forest plot (with 2x2 table counts added); rows argument is used### to specify exactly in which rows the outcomes will be plotted)forest(res, xlim=c(-8, 2.5), at=log(c(0.5, 1)), alim=c(log(0.2), log(2)), atransf=exp,ilab=cbind(data.subgroups$tn, data.subgroups$pt, data.subgroups$cn, data.subgroups$pc),ilab.xpos=c(-5,-4,-3,-2), adj=1,cex=.9, ylim=c(0, 19),rows=c(1:2, (4:5)-.5, 6:7, 10:13, 15),xlab="", mlab="", psize=.8, addfit=F)# lines(x=c(-.15, -.15), y=c(0, 17)) ## could add a reference line of the overall treatment effecttext(c(-5,-4,-3,-2, 2.2), 18, c("n", "%mort", "n", "%mort", "OR [95% CI]"), font=2, adj=1, cex=.9)text(-8, 18, c("GUSTO-I trial"), font=2, adj=0, cex=.9)text(c(-4.5,-2.5), 19, c("tPA", "SK"), font=2, adj=1)

Code

# This can be improved

This forest plot shows the unadjusted overall effect of tPA vs SK treatment; risk-based subgroup effects; and traditional one at a time subgroup effects. The latter are to be interpreted with much caution; many false-positive findings may arise.

Q: What R function can assist trialists in their reporting of risk-based subgroups together with classic subgroups?

We can also estimate the same subgroup effects, adjusted for baseline risk.

Code

# function for adjustmentsubgroup.adj <-function(data=gusto, subgroup=gusto$sex) { coef.unadj <-by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa)$coef[2]) var.unadj <-by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa)$var[2,2]) coef.adj <-by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa, offset=x$lp)$coef[2]) var.adj <-by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa, offset=x$lp)$var[2,2]) result <-cbind(coef.unadj, coef.adj, coef.ratio=coef.adj/coef.unadj, SEunadj=sqrt(var.unadj), SEadj=sqrt(var.adj), SEratio=sqrt(var.adj)/sqrt(var.unadj)) result} # end functionoptions(digits=3)

Overall trial result (un)adjusted

Code

kable(as.data.frame(subgroup.adj(gusto, gusto$tpa>-1))) %>%kable_styling(full_width=F, position ="left")

coef.unadj

coef.adj

coef.ratio

SEunadj

SEadj

SEratio

TRUE

-0.159

-0.208

1.31

0.049

0.053

1.09

Risk-based subgroups

Code

kable(as.data.frame(subgroup.adj(gusto, groups))) %>%kable_styling(full_width=F, position ="left")

coef.unadj

coef.adj

coef.ratio

SEunadj

SEadj

SEratio

[-7.18,-4.01)

-0.199

-0.198

0.993

0.275

0.275

1.00

[-4.01,-3.20)

-0.282

-0.281

0.998

0.169

0.170

1.00

[-3.20,-2.36)

-0.193

-0.187

0.971

0.108

0.108

1.00

[-2.36, 4.70]

-0.170

-0.208

1.219

0.063

0.067

1.07

Classic subgroup: men vs women

Code

kable(as.data.frame(subgroup.adj(gusto, gusto$sex))) %>%kable_styling(full_width=F, position ="left")

coef.unadj

coef.adj

coef.ratio

SEunadj

SEadj

SEratio

male

-0.183

-0.237

1.29

0.063

0.068

1.08

female

-0.127

-0.164

1.29

0.078

0.085

1.10

Classic subgroup: old (>=75) vs younger age

Code

kable(as.data.frame(subgroup.adj(gusto, gusto$age>=75))) %>%kable_styling(full_width=F, position ="left")

coef.unadj

coef.adj

coef.ratio

SEunadj

SEadj

SEratio

FALSE

-0.239

-0.262

1.1

0.061

0.065

1.06

TRUE

-0.083

-0.100

1.2

0.086

0.093

1.08

The unadjusted and adjusted results are usually quite in line; only subtle differences is the estimates of the relative effects are noted. We might hypothesize that the unadjusted effect for women is confounded by the higher age of women (where higher age was associated with a somewhat weaker treatment effect); this was not confirmed.

Q: What R function can be developed that extends the unadjusted forest plots to provide subgroup effects, adjusted for baseline characteristics?

1b. Absolute benefit of treatment by risk-group

Code

# 95% CI CI <-BinomDiffCI(x1 = events1, n1 = n1, x2 = events2, n2 = n2, method ="scorecc")colnames(CI) <-c("Absolute difference", "Lower CI", "Upper CI")rownames(CI) <-names(events1)result <-round(CI, 3) # absolute difference with confidence intervalkable(as.data.frame(result)) %>%kable_styling(full_width=F, position ="left")

Absolute difference

Lower CI

Upper CI

[-7.18,-4.01)

0.002

-0.003

0.006

[-4.01,-3.20)

0.006

-0.001

0.013

[-3.20,-2.36)

0.010

-0.001

0.020

[-2.36, 4.70]

0.026

0.007

0.044

So, we see a substantial difference in absolute benefit. Low risk according to the linear predictor (lp<-2.36) implies low benefit (<1%); higher risk, higher benefit (>2%). As Frank Harrell would also emphasize, the grouping in quarters might primarily be considered for illustration. A better estimation of benefit avoids grouping, and conditions on the baseline risk.

2a. Relative effects of treatment by baseline risk

The checks for interaction with the linear predictor were far from statistically significant in GUSTO-I, as shown above; supporting the assumption that 1 single adjusted, relative effect applies across baseline risk.

2b. Absolute benefit of treatment by baseline risk

Estimation of absolute benefit can follow a parametric approach, i.e. following the no interaction, main effect model that includes baseline characteristics and a treatment effect; the model f considered above for the primary analysis of the trial.
Further down we will consider relaxations of the proportionality of effect that is assumed in this model.

This plot shows the benefit by tPA treatment over SK. The red line assumes a proportional effect of treatment, which may be quite reasonable here and in many other diseases. The quarters provide for a non-parametric confirmation of the benefit across baseline risk.

Relaxation of the proportional effect assumption

If we want to relax the proportional effect assumption, the blog by Frank Harrell on examining HTE provides an illustration with penalized logistic regression.

Another possible relaxation is by including interaction with the linear predictor. We consider linear interaction and a non-linear interaction (rcs, 2 df for non-linearity).
And we could try a more non-paramteric approach as in Califf 1997. There, loess smoothers were used for risk estimation the tPA (day30~lp, subset=tpa==1) and SK groups (day30~lp, subset=tpa==0).
Benefit was the differences between these 2 risk groups conditional on baseline risk.

This plot confirms that all estimates of benefit by baseline risk are more or less similar, with benefit clearly increasing by baseline risk. For very low baseline risk, the loess estimates are implausible.

We can also add the grouped observations by decile, as in Califf 1997. The 95% confidence intervals show that the uncertainty per risk group is huge.

Q: How many risk-based groups should be used for illustration of benefit by risk? Default: use quartiles to define 4 quarters; perhaps 3 or only 2 groups in smaller trials?

The GUSTO-I trial serves well to illustrate the impact of conditioning on baseline covariates when we consider relative and absolute effects of treatment on binary outcomes. The risk-adjusted estimate of the overall treatment effect has a different interpretation than the unadjusted estimate: the effect for ‘Patients with acute MI’ versus ‘A patient with a certain risk profile’.

Implications for reporting of RCTs

RCTs typically report on:

an overall effect in the primary analysis;
this analysis should condition on baseline covariates as argued in another blog.

effects stratified by single characteristics: one at a time subgroup analyses;
these analyses should be regarded as secondary and exploratory.

Future RCT reports should include:

an adjusted estimate of the overall treatment effect as the primary analysis;

effects stratified by baseline risk;
typically in 4 risk-based subgroups for illustration,
and in an analysis with continuous baseline risk, typically plotted as benefit by baseline risk.

traditional subgroup analyses only as secondary and exploratory information; not to influence decision-making based on the current RCT, but to inform future studies and new RCTs. An exception may be the situation that strong prior hypotheses exist on effect modeification on the relative scale, as discussed in the ICEMAN report.

The Predictive Approaches to Treatment effect Heterogeneity (PATH) Statement David M. Kent, MD, MS; Jessica K. Paulus, ScD; David van Klaveren, PhD; Ralph D’Agostino, PhD; Steve Goodman, MD, MHS, PhD; Rodney Hayward, MD; John P.A. Ioannidis, MD, DSc; Bray Patrick-Lake, MFS; Sally Morton, PhD; Michael Pencina, PhD; Gowri Raman, MBBS, MS; Joseph S. Ross, MD, MHS; Harry P. Selker, MD, MSPH; Ravi Varadhan, PhD; Andrew Vickers, PhD; John B. Wong, MD; and Ewout W. Steyerberg, PhD Ann Intern Med. 2020;172:35-45.

---title: Implementation of the PATH Statementauthor: - name: Ewout Steyerberg affiliation: Department of Biomedical Data Sciences<br>Leiden University Medical Center NL<br>`@ESteyerberg` url: https://scholar.google.com/citations?user=_75LDyMAAAAJ&hl=nl email: e.w.steyerberg@lumc.nldate: 2020-11-24contents: [RCT, drug-evaluation, generalizability, medicine, personalized-medicine, prediction, subgroup, 2020]description: 'The recent PATH (Predictive Approaches to Treatment effect Heterogeneity) Statement outlines principles, criteria, and key considerations for applying predictive approaches to clinical trials to provide patient-centered evidence in support of decision making. Here challenges in implementing the PATH Statement are addressed with the GUSTO-I trial as a case study.'---```{r setup, include=FALSE}require(rms)require(table1)require(DescTools)require(metafor)require(knitr)require(kableExtra)options(digits=3)mu <- markupSpecs$html # in Hmisc```::: {.quoteit}Evidence is derived from groups while most medical decisions are made for individual patients. — Kent _et al_, PATH statement:::Heterogeneity of treatment effect (**_HTE_**) refers to the nonrandom variation in the magnitude of the absolute treatment effect (**_treatment benefit_**) across individual patients. The recent **_[PATH](https://www.ncbi.nlm.nih.gov/pubmed/31711134)_** (Predictive Approaches to Treatment effect Heterogeneity) Statement outlines principles, criteria, and key considerations for applying predictive approaches to clinical trials to provide patient-centered evidence in support of decision making. The focus of PATH is on modeling of **_HTE_** across individual patients. [[Comments](https://hbiostat.org/comment.html)]{.aside}The PATH statement lists a number of principles and guidelines. A first principle is to establish **_overall treatment effect_**. In another [blog](../covadj), I summarized the arguments in favor of covariate adjustment as the primary analysis of a RCT. Illustration was in the GUSTO-I trial. Here I continue that illustration, also following the blog by **Frank Harrell** on **_[examining HTE](../varyor)_**.## Illustration in the GUSTO-I trialLet's analyze the data from 30,510 patients with an acute myocardial infarction as included in the GUSTO-I trial. In GUSTO-I, 10,348 patients were randomized to receive tissue plasminogen activator (tPA), while 20,162 were randomzied to Streptokinase (SK) and had 30-day mortality status known.```{r IDA.gusto}load(url('http://hbiostat.org/data/repo/gusto.rda'))# keep only SK and tPA arms; and selected set of covariatesgusto <- upData(gusto[gusto$tx=="SK" | gusto$tx=="tPA",], keep=Cs(day30, tx, age, Killip, sysbp, pulse, pmi, miloc, sex), print=FALSE)html(describe(gusto), scroll=TRUE)```### Overall treatment effectThe primary outcome was 30-day mortality. Among the tPA group, the 30-day mortality was 653/10,348 = 6.3% vs 1475/20,162 = 7.3% in the SK group. This an absolute difference of 1.0%, or an odds ratio of 0.85 [0.78-0.94].```{r overall.tx}# simple cross-tabletable1(~ as.factor(day30) | tx, data=gusto, digits=2) # 'tpa' as 0/1 variable for tPA vs SK treatmentgusto$tx <- droplevels(gusto$tx) # Drop the SK + tPA category which has 0 obsgusto$tpa <- as.numeric(gusto$tx) - 1 # 0/1 coding for tPA treatmentlabel(gusto$tpa) <- "tPA"levels(gusto$tpa) <- c("SK", "tPA")tab2 <- table(gusto$day30, gusto$tpa)result <- OddsRatio(tab2, conf.level = 0.95)names(result) <- c("Odds Ratio", "Lower CI", "Upper CI")kable(as.data.frame(t(result))) %>% kable_styling(full_width=F, position = "left")# BinomDiffCI(x1 = events1, n1 = n1, x2 = events2, n2 = n2, ...)CI <- BinomDiffCI(x1 = tab2[2,1], n1 = sum(tab2[,1]), x2 = tab2[2,2], n2 = sum(tab2[,2]), method = "scorecc")colnames(CI) <- c("Absolute difference", "Lower CI", "Upper CI")result <- round(CI, 3) # absolute difference with confidence intervalkable(as.data.frame(result)) %>% kable_styling(full_width=F, position = "left")```***### Adjustment for baseline covariatesThe unadjusted odds ratio of `r round(OddsRatio(tab2, conf.level = 0.95)[1], 3)` is a marginal estimate. As explained in the other blog, a lot can be said in favor of *conditional estimates*, where we adjust for prognostically important baseline characteristics. In line with [Califf 1997](https://www.sciencedirect.com/science/article/pii/S0002870397701649) and [Steyerberg 2000](https://www.sciencedirect.com/science/article/pii/S0002870300900012), we consider a prediction model with 6 baseline covariates, including age and Killip class (a measure for ventricular function). Pulse rate is modeled using a linear spline with a knot at 50 beats/minute.```{r adjusted.tx}options(prType='html')f <- lrm(day30 ~ tpa + age + Killip + pmin(sysbp, 120) + lsp(pulse, 50) + pmi + miloc, data=gusto, x=T, maxit=99)print(f) # coef tPA: -0.2080```So, we note that the **adjusted** regression coefficient for tPA was `r round(f$coef[2],4)`. The adjusted OR = `r round(exp(f$coef[2]),3)`. Let's check for statistical interaction with1. Individual covariates 2. The linear predictor ```{r interaction.tx}## check for interaction# 1. traditional approachg <- lrm(day30 ~ tx * (age + Killip + pmin(sysbp, 120) + lsp(pulse, 50) + pmi + miloc), data=gusto, maxit=100)print(anova(g)) # tx interactions: 10 df, p=0.5720; based on LR test# 2. PATH statement: linear interaction with linear predictor; baseline risk, so tx=ref, SKlp.no.tx <- f$coefficients[1] + 0 * f$coefficients[2] + f$x[,-1] %*% f$coefficients[-(1:2)]gusto$lp <- as.vector(lp.no.tx) # add lp to data frameh <- lrm(day30 ~ tx * lp, data=gusto)print(anova(h)) # tx interaction: 1 df, p=0.35; based on Wald statistics```So: 1. The overall test for interaction with the individual covariates is far from statistically significant (p>0.5). 2. Similarly, the test for interaction with the linear predictor is far from statistically significant (p>0.3). #### Conclusion: no interaction neededWe conclude that we may proceed by ignoring any interactions. We have no evidence against the assumption that the overall effect of treatment is applicable to all patients. The patients vary widely in risk, as can easily be seen in the histogram below. ```{r histogram.lp}hist(plogis(lp.no.tx), xlim=c(0,.4), main="")```We note that many patients have baseline risks (tPA==0) below 5%. Obviously their maximum benefit is bounded by this risk estimate, even if tPA, hypothetically, would reduce the risk to Null.## PATH principle: perform risk-based subgrouping**_[Fig 3](https://www.ncbi.nlm.nih.gov/pubmed/31711134)_** of the PATH Statement starts with: ::: {.quoteit}Reporting RCT results stratified by a risk model is encouraged when overall trial results are positive to better understand the distribution of effects across the trial population.:::How do we provide such reporting of RCT results? We can provide estimates of relative effects and absolute benefit1. By group (e.g. quarters defined by quartiles) 2. By baseline risk (continuous, as in the histogram above) ### 1a. Relative effects of treatment by risk-groupThe checks for interaction were far from statistically significant in GUSTO-I. We can further illustrate the relative effects in a forest plot. Many reports from RCTs include forest plots that show relative effects by single variables, such as men vs women; young vs old age; disease subtype; etc. The *PATH* statement encourages reporting by risk-based subgroup. How can such reporting be done? Let's do some data processing to make a better forest plot.#### A PATH compatible forest plotLet's expand the standard forest plot for subgroup effects with risk-based subgroups. Below subgroup effects are defined for 4 risk-based groups using `cut2(lp.no.tx, g=4)`, and for 3 classical subgroups (by sex, age, type of infarction). ```{r risk.subgroup.effects}## quantiles, suggest to use quartersgroups <- cut2(lp.no.tx, g=4)group0 <- groups[gusto$tpa==0] # SK gropupgroup1 <- groups[gusto$tpa==1] # tPA grouprate0 <- prop.table(table(group0, gusto$day30[gusto$tpa==0]),1 )[,2]rate1 <- prop.table(table(group1, gusto$day30[gusto$tpa==1]),1 )[,2]ratediff <- rate0-rate1 # benefit of tPA by group# Make a data frame for the resultsdata.subgroups <- as.data.frame(matrix(nrow=(4+6+1), ncol=10))colnames(data.subgroups) <- c("tevent", "tnoevent", "cevent", "cnoevent", "name", "type", "tn", "pt", "cn", "pc")data.subgroups[11,1:4] <- table(gusto$tpa,gusto$day30)[4:1] # overall results# define event and non-event numbersevents1 <- table(group0, gusto$day30[gusto$tpa==0])[,2]nevents1 <- table(group0, gusto$day30[gusto$tpa==0])[,1]events2 <- table(group1, gusto$day30[gusto$tpa==1])[,2]nevents2 <- table(group1, gusto$day30[gusto$tpa==1])[,1]n1 <- events1 + nevents1n2 <- events2 + nevents2data.subgroups[10:7,1:4] <- cbind(events2,nevents2,events1,nevents1)```Data for classic subgroups are managed below: ```{r classic.subgroup.effects}# Use `table` to get the summary of cell numbers, by subgroup# SEXdata.subgroups[5,1:4] <- table(1-gusto$day30,1-gusto$tpa, gusto$sex)[1:4]data.subgroups[6,1:4] <- table(1-gusto$day30,1-gusto$tpa, gusto$sex)[5:8]# AGEdata.subgroups[3,1:4] <- table(1-gusto$day30,1-gusto$tpa, gusto$age>=75)[1:4]data.subgroups[4,1:4] <- table(1-gusto$day30,1-gusto$tpa, gusto$age>=75)[5:8]# ANTdata.subgroups[1,1:4] <- table(1-gusto$day30,1-gusto$tpa, gusto$miloc=="Anterior")[1:4]data.subgroups[2,1:4] <- table(1-gusto$day30,1-gusto$tpa, gusto$miloc=="Anterior")[5:8]# Namesdata.subgroups[11,5] <- "Overall"data.subgroups[10:7,5] <- paste("Quarter",1:4, sep=" ")data.subgroups[5:6,5] <- c("Male sex","Female sex")data.subgroups[3:4,5] <- c("Age <75","Age>=75")data.subgroups[1:2,5] <- c("Other MI","Anterior")# Type of subgroupdata.subgroups[11,6] <- ""data.subgroups[10:7,6] <- c(rep("Risk-based subgroups", length(ratediff)))data.subgroups[1:6,6] <- c(rep("Location",2), rep("Age",2), rep("Sex",2))data.subgroups[,7] <- data.subgroups[,1] + data.subgroups[,2]data.subgroups[,8] <- paste(round(100*data.subgroups[,1] / data.subgroups[,7] , 1),"%", sep="")data.subgroups[,9] <- data.subgroups[,3] + data.subgroups[,4]data.subgroups[,10] <- paste(round(100*data.subgroups[,3] / data.subgroups[,9] , 1),"%", sep="")# Show the datakable(as.data.frame((data.subgroups))) %>% kable_styling(full_width=F, position = "left")```In this table **tevent** means #events among treated; **cevent** means #events among non-treated; etc Results can be plotted with `metafor` functions: ```{r subgroup.effects}par(mar=c(4,4,1,2))### fit random-effects model (use slab argument to define "study" labels)res <- rma(ai=tevent, bi=tnoevent, ci=cevent, di=cnoevent, data=data.subgroups, measure="OR", slab=name, method="ML")### set up forest plot (with 2x2 table counts added); rows argument is used### to specify exactly in which rows the outcomes will be plotted)forest(res, xlim=c(-8, 2.5), at=log(c(0.5, 1)), alim=c(log(0.2), log(2)), atransf=exp, ilab=cbind(data.subgroups$tn, data.subgroups$pt, data.subgroups$cn, data.subgroups$pc), ilab.xpos=c(-5,-4,-3,-2), adj=1, cex=.9, ylim=c(0, 19), rows=c(1:2, (4:5)-.5, 6:7, 10:13, 15), xlab="", mlab="", psize=.8, addfit=F)# lines(x=c(-.15, -.15), y=c(0, 17)) ## could add a reference line of the overall treatment effecttext(c(-5,-4,-3,-2, 2.2), 18, c("n", "%mort", "n", "%mort", "OR [95% CI]"), font=2, adj=1, cex=.9)text(-8, 18, c("GUSTO-I trial"), font=2, adj=0, cex=.9)text(c(-4.5,-2.5), 19, c("tPA", "SK"), font=2, adj=1)# This can be improved```This forest plot shows the unadjusted overall effect of tPA vs SK treatment; risk-based subgroup effects; and traditional one at a time subgroup effects. The latter are to be interpreted with much caution; many false-positive findings may arise. ::: {.quoteit}Q: What R function can assist trialists in their reporting of risk-based subgroups together with classic subgroups?:::We can also estimate the same subgroup effects, adjusted for baseline risk.```{r adjusted.subgroup.effects.function}# function for adjustmentsubgroup.adj <- function(data=gusto, subgroup=gusto$sex) { coef.unadj <- by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa)$coef[2]) var.unadj <- by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa)$var[2,2]) coef.adj <- by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa, offset=x$lp)$coef[2]) var.adj <- by(data, subgroup, function(x)lrm.fit(y=x$day30, x=x$tpa, offset=x$lp)$var[2,2]) result <- cbind(coef.unadj, coef.adj, coef.ratio=coef.adj/coef.unadj, SEunadj=sqrt(var.unadj), SEadj=sqrt(var.adj), SEratio=sqrt(var.adj)/ sqrt(var.unadj)) result} # end functionoptions(digits=3)```Overall trial result (un)adjusted```{r}kable(as.data.frame(subgroup.adj(gusto, gusto$tpa>-1))) %>%kable_styling(full_width=F, position ="left")```Risk-based subgroups```{r}kable(as.data.frame(subgroup.adj(gusto, groups))) %>%kable_styling(full_width=F, position ="left")```Classic subgroup: men vs women```{r}kable(as.data.frame(subgroup.adj(gusto, gusto$sex))) %>%kable_styling(full_width=F, position ="left")```Classic subgroup: old (>=75) vs younger age```{r}kable(as.data.frame(subgroup.adj(gusto, gusto$age>=75))) %>%kable_styling(full_width=F, position ="left")```The unadjusted and adjusted results are usually quite in line; only subtle differences is the estimates of the relative effects are noted. We might hypothesize that the unadjusted effect for women is confounded by the higher age of women (where higher age was associated with a somewhat weaker treatment effect); this was not confirmed. ::: {.quoteit}Q: What R function can be developed that extends the unadjusted forest plots to provide subgroup effects, adjusted for baseline characteristics?:::### 1b. Absolute benefit of treatment by risk-group ```{r grouping}# 95% CI CI <- BinomDiffCI(x1 = events1, n1 = n1, x2 = events2, n2 = n2, method = "scorecc")colnames(CI) <- c("Absolute difference", "Lower CI", "Upper CI")rownames(CI) <- names(events1)result <- round(CI, 3) # absolute difference with confidence intervalkable(as.data.frame(result)) %>% kable_styling(full_width=F, position = "left")```So, we see a substantial difference in absolute benefit. Low risk according to the linear predictor (lp<-2.36) implies low benefit (<1%); higher risk, higher benefit (>2%). As **Frank Harrell** would also emphasize, the grouping in quarters might primarily be considered for illustration. A better estimation of benefit avoids grouping, and conditions on the baseline risk. ### 2a. Relative effects of treatment by baseline riskThe checks for interaction with the linear predictor were far from statistically significant in GUSTO-I, as shown above; supporting the assumption that 1 single adjusted, relative effect applies across baseline risk. ### 2b. Absolute benefit of treatment by baseline riskEstimation of absolute benefit can follow a parametric approach, i.e. following the no interaction, main effect model that includes baseline characteristics and a treatment effect; the model `f` considered above for the primary analysis of the trial. Further down we will consider relaxations of the proportionality of effect that is assumed in this model.```{r plot.benefit}# create baseline predictions: X-axisxp <- seq(0.002,.5,by=0.001)logxp0 <- log(xp/(1-xp))# expected difference, if covariate adjusted model holdsp1exp <- plogis(logxp0) - plogis(logxp0+coef(f)[2]) # proportional effect assumedplot(x=xp, y=p1exp, type='l', lty=2, lwd=3, xlim=c(0,.35), ylim=c(-0.007,.05), col="red", xlab="Baseline risk", ylab="Benefit by tPA", cex.lab=1.2, las=1, bty='l' )# add horizontal linelines(x=c(0,.5), y=c(0,0)) # distribution of predicted histSpike(plogis(lp.no.tx), add=T, side=1, nint=300, frac=.15) points(x=rate0, y=ratediff, pch=1, cex=2, lwd=2, col="blue")arrows(x0=rate0, x1=rate0, y0=CI[,2], y1=CI[,3], angle=90, code=3,len=.1, col="blue")legend("topleft", lty=c(2,NA), pch=c(NA,1), lwd=c(3,2), bty='n',col=c("red", "blue"), cex=1.2, legend=c("Expected with proportional effect", "Grouped patients"))```This plot shows the benefit by tPA treatment over SK. The red line assumes a proportional effect of treatment, which may be quite reasonable here and in many other diseases. The quarters provide for a non-parametric confirmation of the benefit across baseline risk. ### Relaxation of the proportional effect assumptionIf we want to relax the proportional effect assumption, the blog by **Frank Harrell** on **_[examining HTE](../varyor/)_** provides an illustration with **penalized logistic regression**.Another possible relaxation is by including **interaction with the linear predictor**. We consider linear interaction and a non-linear interaction (`rcs`, 2 *df* for non-linearity). And we could try a more non-paramteric approach as in [Califf 1997](https://www.sciencedirect.com/science/article/pii/S0002870397701649). There, `loess` smoothers were used for risk estimation the tPA (`day30~lp, subset=tpa==1`) and SK groups (`day30~lp, subset=tpa==0`). Benefit was the differences between these 2 risk groups conditional on baseline risk.```{r interaction.with.lp}h <- lrm(day30 ~ tpa + tpa * lp, data=gusto, eps=0.005, maxit=30)h2 <- lrm(day30 ~ tpa + rcs(lp,3)*tpa, data=gusto, eps=0.005, maxit=99)# loess smoothingl0 <- loess(day30 ~ lp, data=gusto, subset=tpa==0)l1 <- loess(day30 ~ lp, data=gusto, subset=tpa==1)# subtract predicted risks with from without txp1 <- plogis(Predict(h, tpa=0, lp = logxp0)[,3]) - plogis(Predict(h, tpa=1, lp = logxp0)[,3])p2 <- plogis(Predict(h2, tpa=0, lp = logxp0)[,3]) - plogis(Predict(h2, tpa=1, lp = logxp0)[,3])l <- predict(l0, data.frame(lp = logxp0)) - predict(l1, data.frame(lp = logxp0))plot(x=xp, y=p1exp, type='l', lty=1, lwd=4, xlim=c(0,.35), ylim=c(-0.007,.05), col="red", xlab="Baseline risk", ylab="Benefit by tPA", cex.lab=1.2, las=1, bty='l' )# benefit with interaction termslines(x=xp, y=p1, type='l', lty=2, lwd=3, col="dark blue") lines(x=xp, y=p2, type='l', lty=3, lwd=2, col="purple") lines(x=xp, y=l, type='l', lty=1, lwd=3, col="black") # horizontal linelines(x=c(0,.5), y=c(0,0)) # distribution of predicted histSpike(plogis(lp.no.tx), add=T, side=1, nint=300, frac=.1) legend("topleft", lty=c(1,2,3,1), pch=c(NA,NA,NA,NA), lwd=c(4,3,2,3), bty='n', col=c("red", "dark blue", "purple","black"), cex=1.2, legend=c("Expected with proportional effect", "Linear interaction", "Spline smoothing, 2 df", "Loess"))```This plot confirms that all estimates of benefit by baseline risk are more or less similar, with benefit clearly increasing by baseline risk. For very low baseline risk, the `loess` estimates are implausible. We can also add the grouped observations by decile, as in [Califf 1997](https://www.sciencedirect.com/science/article/pii/S0002870397701649). The 95% confidence intervals show that the uncertainty per risk group is huge.::: {.quoteit}Q: How many risk-based groups should be used for illustration of benefit by risk? Default: use quartiles to define 4 quarters; perhaps 3 or only 2 groups in smaller trials?:::```{r interaction.with.lp.deciles}g <- 10 # first decilesh <- lrm(day30 ~ tpa + tpa * lp, data=gusto, eps=0.005, maxit=30)h2 <- lrm(day30 ~ tpa + rcs(lp,3)*tpa, data=gusto, eps=0.005, maxit=99)# loess smoothingl0 <- loess(day30 ~ lp, data=gusto, subset=tpa==0)l1 <- loess(day30 ~ lp, data=gusto, subset=tpa==1)# subtract predicted risks with from without txp1 <- plogis(Predict(h, tpa=0, lp = logxp0)[,3]) - plogis(Predict(h, tpa=1, lp = logxp0)[,3])p2 <- plogis(Predict(h2, tpa=0, lp = logxp0)[,3]) - plogis(Predict(h2, tpa=1, lp = logxp0)[,3])l <- predict(l0, data.frame(lp = logxp0)) - predict(l1, data.frame(lp = logxp0))plot(x=xp, y=p1exp, type='l', lty=1, lwd=4, xlim=c(0,.35), ylim=c(-0.007,.05), col="red", xlab="Baseline risk", ylab="Benefit by tPA", cex.lab=1.2, las=1, bty='l' )# benefit with interaction termslines(x=xp, y=p1, type='l', lty=2, lwd=3, col="dark blue") lines(x=xp, y=p2, type='l', lty=3, lwd=2, col="purple") lines(x=xp, y=l, type='l', lty=1, lwd=3, col="black") # horizontal linelines(x=c(0,.5), y=c(0,0)) # distribution of predicted histSpike(plogis(lp.no.tx), add=T, side=1, nint=300, frac=.1) # risk groupsgroups <- cut2(lp.no.tx, g=g)group0 <- groups[gusto$tpa==0] # SK gropupgroup1 <- groups[gusto$tpa==1] # tPA grouprate0 <- prop.table(table(group0, gusto$day30[gusto$tpa==0]),1 )[,2]rate1 <- prop.table(table(group1, gusto$day30[gusto$tpa==1]),1 )[,2]ratediff <- rate0-rate1 # benefit of tPA by group# CI re-calculationevents1 <- table(group0, gusto$day30[gusto$tpa==0])[,2]nevents1 <- table(group0, gusto$day30[gusto$tpa==0])[,1]events2 <- table(group1, gusto$day30[gusto$tpa==1])[,2]nevents2 <- table(group1, gusto$day30[gusto$tpa==1])[,1]n1 <- events1 + nevents1n2 <- events2 + nevents2CI <- BinomDiffCI(x1 = events1, n1 = n1, x2 = events2, n2 = n2, method = "scorecc")points(x=rate0, y=ratediff, pch=1, cex=1, lwd=1, col="black")arrows(x0=rate0, x1=rate0, y0=CI[,2], y1=CI[,3], angle=90, code=3,len=.1, col="black", lwd=.5)legend("topleft", lty=c(1,2,3,1,NA), pch=c(NA,NA,NA,NA,1), lwd=c(4,3,2,3,1), bty='n', col=c("red", "dark blue", "purple","black","black"), cex=1.2, legend=c("Expected with proportional effect", "Linear interaction", "Spline smoothing, 2 df", "Loess", "Grouped patients"))``````{r interaction.with.lp.quarters}g <- 4 # quartersplot(x=xp, y=p1exp, type='l', lty=1, lwd=4, xlim=c(0,.35), ylim=c(-0.007,.05), col="red", xlab="Baseline risk", ylab="Benefit by tPA", cex.lab=1.2, las=1, bty='l' )# benefit with interaction termslines(x=xp, y=p1, type='l', lty=2, lwd=3, col="dark blue") lines(x=xp, y=p2, type='l', lty=3, lwd=2, col="purple") lines(x=xp, y=l, type='l', lty=1, lwd=3, col="black") # horizontal linelines(x=c(0,.5), y=c(0,0)) # distribution of predicted histSpike(plogis(lp.no.tx), add=T, side=1, nint=300, frac=.1) # risk groupsgroups <- cut2(lp.no.tx, g=g)group0 <- groups[gusto$tpa==0] # SK gropupgroup1 <- groups[gusto$tpa==1] # tPA grouprate0 <- prop.table(table(group0, gusto$day30[gusto$tpa==0]),1 )[,2]rate1 <- prop.table(table(group1, gusto$day30[gusto$tpa==1]),1 )[,2]ratediff <- rate0-rate1 # benefit of tPA by group# CI re-calculationevents1 <- table(group0, gusto$day30[gusto$tpa==0])[,2]nevents1 <- table(group0, gusto$day30[gusto$tpa==0])[,1]events2 <- table(group1, gusto$day30[gusto$tpa==1])[,2]nevents2 <- table(group1, gusto$day30[gusto$tpa==1])[,1]n1 <- events1 + nevents1n2 <- events2 + nevents2CI <- BinomDiffCI(x1 = events1, n1 = n1, x2 = events2, n2 = n2, method = "scorecc")points(x=rate0, y=ratediff, pch=1, cex=1, lwd=1, col="black")arrows(x0=rate0, x1=rate0, y0=CI[,2], y1=CI[,3], angle=90, code=3,len=.1, col="black", lwd=.5)legend("topleft", lty=c(1,2,3,1,NA), pch=c(NA,NA,NA,NA,1), lwd=c(4,3,2,3,1), bty='n', col=c("red", "dark blue", "purple","black","black"), cex=1.2, legend=c("Expected with proportional effect", "Linear interaction", "Spline smoothing, 2 df", "Loess", "Grouped patients"))```## ConclusionsThe GUSTO-I trial serves well to illustrate the impact of conditioning on baseline covariates when we consider relative and absolute effects of treatment on binary outcomes. The risk-adjusted estimate of the overall treatment effect has a different interpretation than the unadjusted estimate: the effect for *‘Patients with acute MI’* versus *‘A patient with a certain risk profile’*. ### Implications for reporting of RCTsRCTs typically report on: 1. an overall effect in the primary analysis; this analysis should condition on baseline covariates as argued in another blog. 2. effects stratified by single characteristics: one at a time subgroup analyses; these analyses should be regarded as secondary and exploratory.Future RCT reports should include: 1. an adjusted estimate of the overall treatment effect as the primary analysis; 2. effects stratified by baseline risk; typically in 4 risk-based subgroups for illustration, and in an analysis with continuous baseline risk, typically plotted as benefit by baseline risk. 3. traditional subgroup analyses only as secondary and exploratory information; not to influence decision-making based on the current RCT, but to inform future studies and new RCTs. An exception may be the situation that strong prior hypotheses exist on effect modeification on the relative scale, as discussed in the [ICEMAN report](https://macsphere.mcmaster.ca/handle/11375/24375).### References#### GUSTO-I referencesGusto Investigators - New England Journal of Medicine, 1993 [An international randomized trial comparing four thrombolytic strategies for acute myocardial infarction](https://www.nejm.org/doi/full/10.1056/NEJM199309023291001)Califf R, …, ML Simoons, EJ Topol, GUSTO-I Investigators - American heart journal, 1997 [Selection of thrombolytic therapy for individual patients: development of a clinical model](https://www.sciencedirect.com/science/article/pii/S0002870397701649)EW Steyerberg, PMM Bossuyt, KL Lee - American heart journal, 2000 [Clinical trials in acute myocardial infarction: should we adjust for baseline characteristics?](https://www.sciencedirect.com/science/article/pii/S0002870300900012)#### PATH Statement references**The Predictive Approaches to Treatment effect Heterogeneity(PATH) Statement**<br>David M. Kent, MD, MS; Jessica K. Paulus, ScD; David van Klaveren, PhD; Ralph D’Agostino, PhD;Steve Goodman, MD, MHS, PhD; Rodney Hayward, MD; John P.A. Ioannidis, MD, DSc; Bray Patrick-Lake, MFS; Sally Morton, PhD;Michael Pencina, PhD; Gowri Raman, MBBS, MS; Joseph S. Ross, MD, MHS; Harry P. Selker, MD, MSPH; Ravi Varadhan, PhD;Andrew Vickers, PhD; John B. Wong, MD; and Ewout W. Steyerberg, PhD *Ann Intern Med. 2020;172:35-45.* [Annals of Internal Medicine, main text](https://annals.org/acp/content_public/journal/aim/938321/aime202001070-m183667.pdf?casa_token=rjMhlqehmZYAAAAA:s2gnJxSo---fGeU5d3BYBJ0s3S8UMxMWhH7RQXNlK8WmHH7WvghrEZJODK1fI_8KsvW1bPavUew)[Annals of Internal Medicine, Explanation and Elaboration](https://annals.org/acp/content_public/journal/aim/938321/aime202001070-m183668.pdf?casa_token=HSOwDdYuIfUAAAAA:LprkkDo9-n-qcHqcOC5IHsDWQtjd3CO_SXrSdF0SimqTuIG5I6rKEwhT24ySEZwxTr7GLpn56vM)[Editorial by Localio et al, 2020](https://annals.org/aim/fullarticle/2755584/advancing-personalized-medicine-through-prediction)#### Risks of traditional subgroup analysesSF Assmann, SJ Pocock, LE Enos, LE Kasten - The Lancet, 2000 [Subgroup analysis and other (mis) uses of baseline data in clinical trials](https://www.sciencedirect.com/science/article/pii/S0140673600020390)PM Rothwell - The Lancet, 2005 [Subgroup analysis in randomised controlled trials: importance, indications, and interpretation](https://www.sciencedirect.com/science/article/pii/S0140673605177095)RA Hayward, DM Kent, S Vijan… - BMC medical research nmethodology, 2006 [Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis](https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/1471-2288-6-18)AV Hernández, E Boersma, GD Murray… - American heart journal, 2006 [Subgroup analyses in therapeutic cardiovascular clinical trials: are most of them misleading?](https://www.sciencedirect.com/science/article/pii/S0002870305004394)JD Wallach, …, KL Sainani, EW Steyerberg, JPA Ioannidis - JAMA internal medicine, 2017 [Evaluation of evidence of statistical support and corroboration of subgroup claims in randomized clinical trials](https://jamanetwork.com/journals/jamainternalmedicine/article-abstract/2601419)JD Wallach, …, JF Trepanowski, EW Steyerberg, JPA Ioannidis - BMJ, 2016 [Sex based subgroup differences in randomized controlled trials: empirical evidence from Cochrane meta-analyses](https://www.bmj.com/content/355/bmj.i5826)