Joint longitudinal-survival models

This page describes examples of what can be considered the standard joint model framework, which we can define as follows,

$$h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha m_{i}(t))$$

and

$$y_{}(t) = m_{i}(t) + \epsilon_{ij}$$

where

$$m_{i}(t) = X_{2ij}(t)\mathbf{\beta}_{2} + Z_{ij}(t)b_{i}$$

and

$$b_{i} \sim N(0,\Sigma)$$

This formulation is known as the current value parametrisation, where $\alpha$ is a log hazard ratio for a one-unit increase in the biomarker, at time $t$. Let’s assume a simple random intercept and random linear trend for the biomarker trajectory, i.e.

$$m_{i}(t) = (\beta_{20} + b_{0i}) + (\beta_{21} + b_{1i})t$$

I’ll use the standard Primary Biliary Cirrhosis (PBC) example, often used to illustrate joint models.

We load the dataset

. use https://www.mjcrowther.co.uk/data/jm_example.dta, clear
(Example dataset for joint modelling, Michael Crowther 2011)

I can fit the above joint model very simply using merlin, as follows

. merlin (stime trt EV[logb], family(weibull, failure(died)) timevar(stime))      ///
>        (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, family(gaussian) timevar(time)) ///
>        , covariance(unstructured) restartvalues(M2 0.1)
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -3147.7168  (not concave)
Iteration 1:   log likelihood = -2332.4702  (not concave)
Iteration 2:   log likelihood = -2282.1053  (not concave)
Iteration 3:   log likelihood =  -2138.644  
Iteration 4:   log likelihood = -2086.5157  
Iteration 5:   log likelihood = -1952.1697  
Iteration 6:   log likelihood =    -1921.1  
Iteration 7:   log likelihood = -1919.2203  
Iteration 8:   log likelihood = -1919.2164  
Iteration 9:   log likelihood = -1919.2164  

Mixed effects regression model                  Number of obs     =      1,945
Log likelihood = -1919.2164
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |            
         trt |   .0441736     .17909     0.25   0.805    -.3068362    .3951835
        EV[] |   1.240676   .0932792    13.30   0.000     1.057852      1.4235
       _cons |  -4.411849   .2741419   -16.09   0.000    -4.949157   -3.874541
  log(gamma) |    .019314   .0825814     0.23   0.815    -.1425425    .1811706
-------------+----------------------------------------------------------------
logb:        |            
        fp() |   .1850394   .0133236    13.89   0.000     .1589256    .2111532
 fp()#M2[id] |          1          .        .       .            .           .
      M1[id] |          1          .        .       .            .           .
       _cons |   .4929444   .0582791     8.46   0.000     .3787195    .6071692
  sd(resid.) |   .3471211   .0066724                      .3342868    .3604481
-------------+----------------------------------------------------------------
id:          |            
      sd(M1) |   1.002467   .0426595                      .9222474    1.089664
      sd(M2) |   .1808176   .0123978                      .1580803    .2068253
 corr(M2,M1) |   .4252257   .0729127                      .2725388    .5570211
------------------------------------------------------------------------------

Here, I use the EV[] element type to link the expected value of the response for the model for logb, my repeatedly measured biomarker, directly to survival. Because this is a time-dependent association structure, I can’t simply use the variable time in the linear predictor for the biomarker model, as I need to integrate out time in my survival model contribution to the log likelihood. Therefore, anything to do with time must be done using the element functions fp() and rcs(), and consequently we must specify the timevar() option as well, in both models.

You’ll see I’ve used the restartvalues() option to specify a starting value for the variance of the random effect on the slope. Starting values are a big challenge in such a general framework such as merlin, and as such the default of 1 doesn’t work. Within a joint model, we can fit a separate linear mixed effects model to get an idea of the scale of the variance, to provide a more sensible starting value. I’m working on improving this.

Looking at the results, we get an estimate of $\alpha = 1.241$ (95% CI: 1.058, 1.423), showing a very strong association between the biomarker and survival, with a hazard ratio of 3.459, indicating a large increase in the mortality rate for a one-unit increase in the log of serum bilirubin, at time $t$.

Note, this is a simple random intercept and random linear slope model for my biomarker trajectory. We should investigate whether something more complex is required, by using fractional polynomials or splines, but I leave that to you. Hint…it does matter (Crowther et al., 2016)

We can of course fit the same model using stjm, but we need to setup the data into start and stop times for each observation window, and then stset our data.

. stset stop, enter(start) f(event)

     failure event:  event != 0 & event < .
obs. time interval:  (0, stop]
 enter on or after:  time start
 exit on or before:  failure

------------------------------------------------------------------------------
      1,945  total observations
          0  exclusions
------------------------------------------------------------------------------
      1,945  observations remaining, representing
        140  failures in single-record/single-failure data
  2,000.307  total analysis time at risk and under observation
                                                at risk from t =         0
                                     earliest observed entry t =         0
                                          last observed exit t =  14.30566

. stjm logb , panel(id) survmodel(weibull) rfp(1) survcov(trt)
-> gen double _time_1 = X^(1)
(where X = _t0)

Obtaining initial values:

Fitting full model:

-> Conducting adaptive Gauss-Hermite quadrature

Iteration 0:   log likelihood = -1923.9524  
Iteration 1:   log likelihood = -1919.2258  
Iteration 2:   log likelihood = -1919.2035  
Iteration 3:   log likelihood = -1919.2034  

Joint model estimates                            Number of obs.     =     1945
Panel variable: id                               Number of panels   =      312
                                                 Number of failures =      140

Log-likelihood = -1919.2034

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Longitudinal |
     _time_1 |    .184966   .0133105    13.90   0.000     .1588779    .2110541
       _cons |   .4927079    .058284     8.45   0.000     .3784733    .6069424
-------------+----------------------------------------------------------------
Survival     |
 assoc:value |
       _cons |   1.240578    .093102    13.32   0.000     1.058101    1.423054
   ln_lambda |
         trt |   .0438859   .1790461     0.25   0.806    -.3070381    .3948098
       _cons |  -4.410736    .274055   -16.09   0.000    -4.947874   -3.873598
    ln_gamma |
       _cons |   .0188261   .0827756     0.23   0.820     -.143411    .1810632
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
                 sd(_time_1) |   .1807736   .0123757      .1580745    .2067322
                   sd(_cons) |   1.002508   .0426641      .9222805    1.089715
         corr(_time_1,_cons) |     .42525   .0728635      .2726711    .5569635
-----------------------------+------------------------------------------------
                sd(Residual) |   .3471243   .0066718       .334291    .3604502
------------------------------------------------------------------------------

 Longitudinal submodel: Linear mixed effects model
     Survival submodel: Weibull proportional hazards model
    Integration method: Adaptive Gauss-Hermite quadrature using 5 nodes
     Cumulative hazard: Gauss-Kronrod quadrature using 15 nodes

You can immediately see how much quicker, with fewer iterations, that stjm reaches convergence - because it’s a specific joint model implementation, I can provide very good starting values.

The current value association structure is one of many ways to link the outcomes models, so let’s look at how to fit some others.

Random effects association structure

We can directly link individual random effects to survival, for example the random intercept,

$$ h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha b_{0i}) $$

This can be fitted with merlin as follows,

. merlin (stime trt M1[id]      , family(weibull, failure(died))) ///
>        (logb time time#M2[id]@1 M1[id]@1, family(gaussian))     ///
>        , covariance(unstructured) restartvalues(M2 0.1)

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -3147.7128  (not concave)
Iteration 1:   log likelihood =   -2283.14  (not concave)
Iteration 2:   log likelihood = -2007.3943  
Iteration 3:   log likelihood =  -1966.757  
Iteration 4:   log likelihood = -1955.9423  
Iteration 5:   log likelihood = -1955.7024  
Iteration 6:   log likelihood = -1955.7016  
Iteration 7:   log likelihood = -1955.7016  

Mixed effects regression model                  Number of obs     =      1,945
Log likelihood = -1955.7016
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |            
         trt |   .1564189    .176379     0.89   0.375    -.1892776    .5021155
      M1[id] |   1.333705   .1138538    11.71   0.000     1.110556    1.556855
       _cons |  -3.883383   .2765513   -14.04   0.000    -4.425414   -3.341353
  log(gamma) |    .432533   .0722285     5.99   0.000     .2909677    .5740983
-------------+----------------------------------------------------------------
logb:        |            
        time |   .1782451   .0126941    14.04   0.000     .1533652    .2031251
 time#M2[id] |          1          .        .       .            .           .
      M1[id] |          1          .        .       .            .           .
       _cons |   .5007474   .0581737     8.61   0.000     .3867291    .6147657
  sd(resid.) |   .3516559   .0068733                      .3384392    .3653888
-------------+----------------------------------------------------------------
id:          |            
      sd(M1) |   .9999101   .0426313                      .9197501    1.087056
      sd(M2) |   .1653823   .0113989                      .1444842    .1893031
 corr(M2,M1) |   .4816441   .0743056                      .3234511     .613646
------------------------------------------------------------------------------

I simply include the M1[id] element in both linear predictors, and estimate a coefficient for it in the survival model. Because it’s named the same, merlin knows there are still only two random effects in this model.

Our $\alpha$ now represents the log hazard ratio for a one-unit increase in the patient-specific deviation from the mean intercept, i.e. baseline value of the biomarker. Given we are not using a time-dependent association structure in this example, we no longer require numerical integration to calculate the cumulative hazard function, and therefore don’t need to use the fp() or rcs() elements with the timevar() option. Such a structure provides us with computational speed gains because of this.

I can fit this model with stjm as well,

. stjm logb , panel(id) survmodel(weibull) rfp(1) survcov(trt) nocurrent intassoc nocoef
-> gen double _time_1 = X^(1)
(where X = _t0)

Obtaining initial values:

Fitting full model:

-> Conducting adaptive Gauss-Hermite quadrature

Iteration 0:   log likelihood = -1956.3262  
Iteration 1:   log likelihood = -1955.7027  
Iteration 2:   log likelihood = -1955.7015  
Iteration 3:   log likelihood = -1955.7015  

Joint model estimates                            Number of obs.     =     1945
Panel variable: id                               Number of panels   =      312
                                                 Number of failures =      140

Log-likelihood = -1955.7015

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Longitudinal |
     _time_1 |   .1782451    .012694    14.04   0.000     .1533653    .2031248
       _cons |   .5007461   .0581723     8.61   0.000     .3867305    .6147618
-------------+----------------------------------------------------------------
Survival     |
   assoc:int |
       _cons |   1.333708    .113852    11.71   0.000     1.110562    1.556854
   ln_lambda |
         trt |     .15642   .1763785     0.89   0.375    -.1892755    .5021156
       _cons |  -3.883387   .2765534   -14.04   0.000    -4.425422   -3.341353
    ln_gamma |
       _cons |   .4325332   .0722285     5.99   0.000      .290968    .5740984
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
                 sd(_time_1) |   .1653823   .0113989      .1444842    .1893031
                   sd(_cons) |   .9999101   .0426311      .9197505    1.087056
         corr(_time_1,_cons) |   .4816445   .0743054       .323452    .6136461
-----------------------------+------------------------------------------------
                sd(Residual) |   .3516559   .0068733      .3384392    .3653886
------------------------------------------------------------------------------

 Longitudinal submodel: Linear mixed effects model
     Survival submodel: Weibull proportional hazards model
    Integration method: Adaptive Gauss-Hermite quadrature using 5 nodes
     Cumulative hazard: Gauss-Kronrod quadrature using 15 nodes

where I turn off the default current value association (nocurrent), and use the intercept association without including the fixed coefficient (intassoc nocoef).

Gradient or rate of change association structure

We can link the current gradient or rate of change of the biomarker directly to survival, for example

$$ h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha m^{\prime}_{i}(t)) $$

where

$$m^{\prime}_{i}(t) = \frac{\partial m_{i}(t)}{\partial t}$$

This can be fitted with merlin using the dEV[] syntax as follows

merlin (stime trt dEV[logb]        , family(weibull, failure(died)) timevar(stime))	///
       (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, family(gaussian) timevar(time))	///
       , covariance(unstructured) restartvalues(M2 0.1)

where $\alpha$ now represents the log hazard ratio for a one-unit increase in the gradient of the biomarker, at time $t$. Since the derivative of this linear model is constant, this model may struggle to estimate, as dEV[] uses numerical differentiation with respect to time (used so it can accommodate any trajectory), to calculate the derivative, which clearly isn’t needed in this case.

The same model in stjm can be fitted with,

stjm logb , panel(id) survmodel(weibull) rfp(1) survcov(trt) nocurrent derivassoc

Cumulative association structure

We can assess the cumulative effect of the biomarker directly to survival, for example

$$ h(t) = h_{0}(t) \exp \left( X_{1ij}\mathbf{\beta_{1}} + \alpha \int_{0}^{t} m_{i}(u) \partial u \right) $$

where the integral is generally approximated using numerical integration. This can be fitted with merlin using the iEV[] syntax as follows

. merlin (stime trt iEV[logb]        , family(weibull, failure(died)) timevar(stime))     ///
>        (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, family(gaussian) timevar(time)) ///
>        , covariance(unstructured) restartvalues(M2 0.1)
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -3147.7168  (not concave)
Iteration 1:   log likelihood = -2312.6852  (not concave)
Iteration 2:   log likelihood = -2263.9611  (not concave)
Iteration 3:   log likelihood =  -2126.485  
Iteration 4:   log likelihood = -2048.1428  
Iteration 5:   log likelihood = -1991.9977  
Iteration 6:   log likelihood = -1985.8128  
Iteration 7:   log likelihood = -1985.7485  
Iteration 8:   log likelihood = -1985.7485  

Mixed effects regression model                  Number of obs     =      1,945
Log likelihood = -1985.7485
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |            
         trt |  -.0979662   .1729337    -0.57   0.571      -.43691    .2409775
       iEV[] |   .1571468   .0143682    10.94   0.000     .1289857    .1853079
       _cons |  -2.644713   .2011413   -13.15   0.000    -3.038943   -2.250484
  log(gamma) |  -.2273467   .1007825    -2.26   0.024    -.4248768   -.0298166
-------------+----------------------------------------------------------------
logb:        |            
        fp() |   .1779682   .0130593    13.63   0.000     .1523724    .2035639
 fp()#M2[id] |          1          .        .       .            .           .
      M1[id] |          1          .        .       .            .           .
       _cons |   .4946745    .058134     8.51   0.000     .3807339    .6086151
  sd(resid.) |   .3487027   .0067517                      .3357176    .3621901
-------------+----------------------------------------------------------------
id:          |            
      sd(M1) |   .9993602   .0425709                       .919311     1.08638
      sd(M2) |   .1726163   .0118729                      .1508461    .1975283
 corr(M2,M1) |    .416575   .0764731                      .2563378    .5545283
------------------------------------------------------------------------------

where $\alpha$ now represents the log hazard ratio for a one-unit increase in the cumulative value of the biomarker, at time $t$. This example is discussed in more detail, and extended, in this tutorial. This is the first model in this tutorial that we can’t fit with stjm.

Related

comments powered by Disqus