Three-level survival models - IPD meta-analysis of recurrent event data

The do file with all the code from this tutorial is available to download here

In this example I’ll look at the analysis of clustered survival data with three levels. Such data arises in the meta-analysis of recurrent event data, where we have events, $k$, nested within patients, $j$, nested within trials, $i$.

Random intercepts

The first example will consider a model with a random intercept at level 1 and a random intercept at level 2,

$$ h_{ijk}(t) = h_{0}(t) \exp (X_{ij}\beta + b_{1i} + b_{2ij}) $$

where

$$ b_{1i} \sim N(0,\sigma_{1}^{2}) \text{ and } b_{2ij} \sim N(0,\sigma^{2}_{2})$$

First we must be clear about what assumptions this model has. By assuming a random intercept at the trial level 1, we are assuming that each trial comes from a distribution of trials, which arguably is a strong and perhaps implausible assumption. We may prefer to stratify by trial, but for the sake of illustration, here I’ll proceed with a random trial effect. A second important note is that our covariate effects, $\beta$, are interpreted as conditonal on the random effects.

I’ll start by showing how to simulate data from such a model. Let’s assume we have data from 15 trials, so we generate a trial variable which will indicate which trial the observations come from, and also our trial level random effect

. clear

. set seed 498093

. set obs 15
number of observations (_N) was 0, now 15

. gen trial = _n

. gen b1 = rnormal(0,1)

Note that I set the simulation seed for reproducibility!

Now let’s assume we recruit 100 patients within each trial. We can use expand for this,

. expand 100
(1,485 observations created)

. sort trial

and then generate a unique patient id variable, and our patient level random effect

. gen id = _n

. gen b2 = rnormal(0,1)

We also generate a binary treatment variable, at the patient level,

. gen trt = runiform()>0.5

Finally, we allow each patient to experience up to two events,

. expand 2
(1,500 observations created)

. sort trial id

Now we can simulate our observation level survival times, by using survsim. I’ll assume a Weibull baseline hazard function with $\lambda=0.1$ and $\gamma=1.2$, and a log hazard ratio of $\beta=-0.5$ for the treatment effect,

. survsim stime died, dist(weibull) lambda(0.1) gamma(1.2) maxtime(5)     ///
>                          covariates(trt -0.5 b1 1 b2 1)

Note that I’m simulating the recurrent events under a clock-reset approach, i.e. after each event the timescale is reset to 0. In a further example I will look at the clock-forward approach, i.e. one continuous timescale.

We can fit our data-generating model, i.e. the true model, with merlin as follows,

. merlin (stime trt M1[trial]@1 M2[trial>id]@1, family(weibull, failure(died))) 

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -4250.8981  (not concave)
Iteration 1:   log likelihood = -4183.9242  
Iteration 2:   log likelihood = -4176.4546  
Iteration 3:   log likelihood = -4175.4023  
Iteration 4:   log likelihood = -4175.3996  
Iteration 5:   log likelihood = -4175.3996  

Mixed effects regression model                  Number of obs     =      3,000
Log likelihood = -4175.3996
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |            
         trt |  -.5449902   .0757995    -7.19   0.000    -.6935545    -.396426
   M1[trial] |          1          .        .       .            .           .
M2[trial>id] |          1          .        .       .            .           .
       _cons |   -1.99857   .2224338    -8.99   0.000    -2.434533   -1.562608
  log(gamma) |    .192227   .0259834     7.40   0.000     .1413005    .2431534
-------------+----------------------------------------------------------------
trial:       |            
      sd(M1) |   .8217196    .156671                      .5654985    1.194031
-------------+----------------------------------------------------------------
id:          |            
      sd(M2) |    .971307   .0566649                      .8663601    1.088967
------------------------------------------------------------------------------

Random trial and random treatment effect

Now we consider a second example, but still sticking within a meta-analysis context. In many situations, we would investigate the presence of heterogeneity, i.e. unobserved variability, in any treatment effects. We can do this by modelling a random treatment effect, with the following model.

$$ h_{ijk}(t) = h_{0}(t) \exp (b_{12i} + (\beta + b_{11i}) X_{ij} + b_{2ij}) $$

where

$$ b_{11i} \sim N(0,\sigma_{11}^{2}), b_{12i} \sim N(0,\sigma_{12}^{2}) \text{ and } b_{2ij} \sim N(0,\sigma^{2}_{2})$$

Once more let’s assume we have data from 15 trials, so we generate a trial variable which will indicate which trial the observations come from, but this time we have a random intercept and also a random treatment effect, which we assume are independent (we could use drawnorm if we wanted to impose correlated random effects),

. clear

. set seed 498093

. set obs 15
number of observations (_N) was 0, now 15

. gen trial = _n

. gen b11 = rnormal(0,1)

. gen b12 = rnormal(0,1)

Again let’s assume we recruit 100 patients within each trial,

. expand 100
(1,485 observations created)

. sort trial

and then generate our unique patient id variable, and our patient level random effect

. gen id = _n

. gen b2 = rnormal(0,1)

We also generate a binary treatment variable, at the patient level,

. gen trt = runiform()>0.5

Finally, we allow each patient to experience up to two events,

. expand 2
(1,500 observations created)

. sort trial id

Now to impose our random treatment effect (varying at the trial level), we can first generate its overall effect using

. . gen trtb12 = (-0.5 + b12) * trt

and then simulate our survival times as follows,

. survsim stime died, dist(weibull) lambda(0.1) gamma(1.2) maxtime(5)     ///
>                          covariates(trtb12 1 b11 1 b2 1)

We can fit our data-generating model, i.e. the true model, with merlin as follows,

. merlin (stime trt trt#M1[trial]@1 M2[trial]@1 M3[trial>id]@1, family(weibull, failure(died))) 

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -4116.4692  (not concave)
Iteration 1:   log likelihood = -4016.0111  
Iteration 2:   log likelihood = -4009.4163  
Iteration 3:   log likelihood = -4009.1759  
Iteration 4:   log likelihood = -4009.1752  
Iteration 5:   log likelihood = -4009.1752  

Mixed effects regression model                  Number of obs     =      3,000
Log likelihood = -4009.1752
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |            
         trt |  -.5156967   .3010874    -1.71   0.087    -1.105817    .0744238
trt#M1[tri~] |          1          .        .       .            .           .
   M2[trial] |          1          .        .       .            .           .
M3[trial>id] |          1          .        .       .            .           .
       _cons |  -2.027292   .2255663    -8.99   0.000    -2.469394   -1.585191
  log(gamma) |   .2010559   .0258223     7.79   0.000     .1504451    .2516668
-------------+----------------------------------------------------------------
trial:       |            
      sd(M1) |   1.120972   .2208732                      .7618621    1.649351
      sd(M2) |   .8307332   .1644114                      .5636367    1.224402
-------------+----------------------------------------------------------------
id:          |            
      sd(M3) |    1.03136   .0559169                      .9273872     1.14699
------------------------------------------------------------------------------

Related

comments powered by Disqus