# Joint longitudinal-survival models

This page describes examples of what can be considered the standard joint model framework, which we can define as follows,

$$h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha m_{i}(t))$$

and

$$y_{}(t) = m_{i}(t) + \epsilon_{ij}$$

where

$$m_{i}(t) = X_{2ij}(t)\mathbf{\beta}_{2} + Z_{ij}(t)b_{i}$$

and

$$b_{i} \sim N(0,\Sigma)$$

This formulation is known as the current value parametrisation, where $\alpha$ is a log hazard ratio for a one-unit increase in the biomarker, at time $t$. Let’s assume a simple random intercept and random linear trend for the biomarker trajectory, i.e.

$$m_{i}(t) = (\beta_{20} + b_{0i}) + (\beta_{21} + b_{1i})t$$

I’ll use the standard Primary Biliary Cirrhosis (PBC) example, often used to illustrate joint models.

. use https://www.mjcrowther.co.uk/data/jm_example.dta, clear
(Example dataset for joint modelling, Michael Crowther 2011)



I can fit the above joint model very simply using merlin, as follows

. merlin (stime trt EV[logb], family(weibull, failure(died)) timevar(stime))      ///
>        (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, family(gaussian) timevar(time)) ///
>        , covariance(unstructured) restartvalues(M2 0.1)
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -3147.7168  (not concave)
Iteration 1:   log likelihood = -2332.4702  (not concave)
Iteration 2:   log likelihood = -2282.1053  (not concave)
Iteration 3:   log likelihood =  -2138.644
Iteration 4:   log likelihood = -2086.5157
Iteration 5:   log likelihood = -1952.1697
Iteration 6:   log likelihood =    -1921.1
Iteration 7:   log likelihood = -1919.2203
Iteration 8:   log likelihood = -1919.2164
Iteration 9:   log likelihood = -1919.2164

Mixed effects regression model                  Number of obs     =      1,945
Log likelihood = -1919.2164
------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |
trt |   .0441736     .17909     0.25   0.805    -.3068362    .3951835
EV[] |   1.240676   .0932792    13.30   0.000     1.057852      1.4235
_cons |  -4.411849   .2741419   -16.09   0.000    -4.949157   -3.874541
log(gamma) |    .019314   .0825814     0.23   0.815    -.1425425    .1811706
-------------+----------------------------------------------------------------
logb:        |
fp() |   .1850394   .0133236    13.89   0.000     .1589256    .2111532
fp()#M2[id] |          1          .        .       .            .           .
M1[id] |          1          .        .       .            .           .
_cons |   .4929444   .0582791     8.46   0.000     .3787195    .6071692
sd(resid.) |   .3471211   .0066724                      .3342868    .3604481
-------------+----------------------------------------------------------------
id:          |
sd(M1) |   1.002467   .0426595                      .9222474    1.089664
sd(M2) |   .1808176   .0123978                      .1580803    .2068253
corr(M2,M1) |   .4252257   .0729127                      .2725388    .5570211
------------------------------------------------------------------------------



Here, I use the EV[] element type to link the expected value of the response for the model for logb, my repeatedly measured biomarker, directly to survival. Because this is a time-dependent association structure, I can’t simply use the variable time in the linear predictor for the biomarker model, as I need to integrate out time in my survival model contribution to the log likelihood. Therefore, anything to do with time must be done using the element functions fp() and rcs(), and consequently we must specify the timevar() option as well, in both models.

You’ll see I’ve used the restartvalues() option to specify a starting value for the variance of the random effect on the slope. Starting values are a big challenge in such a general framework such as merlin, and as such the default of 1 doesn’t work. Within a joint model, we can fit a separate linear mixed effects model to get an idea of the scale of the variance, to provide a more sensible starting value. I’m working on improving this.

Looking at the results, we get an estimate of $\alpha = 1.241$ (95% CI: 1.058, 1.423), showing a very strong association between the biomarker and survival, with a hazard ratio of 3.459, indicating a large increase in the mortality rate for a one-unit increase in the log of serum bilirubin, at time $t$.

Note, this is a simple random intercept and random linear slope model for my biomarker trajectory. We should investigate whether something more complex is required, by using fractional polynomials or splines, but I leave that to you. Hint…it does matter (Crowther et al., 2016)

We can of course fit the same model using stjm, but we need to setup the data into start and stop times for each observation window, and then stset our data.

. stset stop, enter(start) f(event)

failure event:  event != 0 & event < .
obs. time interval:  (0, stop]
enter on or after:  time start
exit on or before:  failure

------------------------------------------------------------------------------
1,945  total observations
0  exclusions
------------------------------------------------------------------------------
1,945  observations remaining, representing
140  failures in single-record/single-failure data
2,000.307  total analysis time at risk and under observation
at risk from t =         0
earliest observed entry t =         0
last observed exit t =  14.30566

. stjm logb , panel(id) survmodel(weibull) rfp(1) survcov(trt)
-> gen double _time_1 = X^(1)
(where X = _t0)

Obtaining initial values:

Fitting full model:

Iteration 0:   log likelihood = -1923.9524
Iteration 1:   log likelihood = -1919.2258
Iteration 2:   log likelihood = -1919.2035
Iteration 3:   log likelihood = -1919.2034

Joint model estimates                            Number of obs.     =     1945
Panel variable: id                               Number of panels   =      312
Number of failures =      140

Log-likelihood = -1919.2034

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Longitudinal |
_time_1 |    .184966   .0133105    13.90   0.000     .1588779    .2110541
_cons |   .4927079    .058284     8.45   0.000     .3784733    .6069424
-------------+----------------------------------------------------------------
Survival     |
assoc:value |
_cons |   1.240578    .093102    13.32   0.000     1.058101    1.423054
ln_lambda |
trt |   .0438859   .1790461     0.25   0.806    -.3070381    .3948098
_cons |  -4.410736    .274055   -16.09   0.000    -4.947874   -3.873598
ln_gamma |
_cons |   .0188261   .0827756     0.23   0.820     -.143411    .1810632
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Random effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
sd(_time_1) |   .1807736   .0123757      .1580745    .2067322
sd(_cons) |   1.002508   .0426641      .9222805    1.089715
corr(_time_1,_cons) |     .42525   .0728635      .2726711    .5569635
-----------------------------+------------------------------------------------
sd(Residual) |   .3471243   .0066718       .334291    .3604502
------------------------------------------------------------------------------

Longitudinal submodel: Linear mixed effects model
Survival submodel: Weibull proportional hazards model
Cumulative hazard: Gauss-Kronrod quadrature using 15 nodes



You can immediately see how much quicker, with fewer iterations, that stjm reaches convergence - because it’s a specific joint model implementation, I can provide very good starting values.

The current value association structure is one of many ways to link the outcomes models, so let’s look at how to fit some others.

### Random effects association structure

We can directly link individual random effects to survival, for example the random intercept,

$$h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha b_{0i})$$

This can be fitted with merlin as follows,

. merlin (stime trt M1[id]      , family(weibull, failure(died))) ///
>        (logb time time#M2[id]@1 M1[id]@1, family(gaussian))     ///
>        , covariance(unstructured) restartvalues(M2 0.1)

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -3147.7128  (not concave)
Iteration 1:   log likelihood =   -2283.14  (not concave)
Iteration 2:   log likelihood = -2007.3943
Iteration 3:   log likelihood =  -1966.757
Iteration 4:   log likelihood = -1955.9423
Iteration 5:   log likelihood = -1955.7024
Iteration 6:   log likelihood = -1955.7016
Iteration 7:   log likelihood = -1955.7016

Mixed effects regression model                  Number of obs     =      1,945
Log likelihood = -1955.7016
------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |
trt |   .1564189    .176379     0.89   0.375    -.1892776    .5021155
M1[id] |   1.333705   .1138538    11.71   0.000     1.110556    1.556855
_cons |  -3.883383   .2765513   -14.04   0.000    -4.425414   -3.341353
log(gamma) |    .432533   .0722285     5.99   0.000     .2909677    .5740983
-------------+----------------------------------------------------------------
logb:        |
time |   .1782451   .0126941    14.04   0.000     .1533652    .2031251
time#M2[id] |          1          .        .       .            .           .
M1[id] |          1          .        .       .            .           .
_cons |   .5007474   .0581737     8.61   0.000     .3867291    .6147657
sd(resid.) |   .3516559   .0068733                      .3384392    .3653888
-------------+----------------------------------------------------------------
id:          |
sd(M1) |   .9999101   .0426313                      .9197501    1.087056
sd(M2) |   .1653823   .0113989                      .1444842    .1893031
corr(M2,M1) |   .4816441   .0743056                      .3234511     .613646
------------------------------------------------------------------------------



I simply include the M1[id] element in both linear predictors, and estimate a coefficient for it in the survival model. Because it’s named the same, merlin knows there are still only two random effects in this model.

Our $\alpha$ now represents the log hazard ratio for a one-unit increase in the patient-specific deviation from the mean intercept, i.e. baseline value of the biomarker. Given we are not using a time-dependent association structure in this example, we no longer require numerical integration to calculate the cumulative hazard function, and therefore don’t need to use the fp() or rcs() elements with the timevar() option. Such a structure provides us with computational speed gains because of this.

I can fit this model with stjm as well,

. stjm logb , panel(id) survmodel(weibull) rfp(1) survcov(trt) nocurrent intassoc nocoef
-> gen double _time_1 = X^(1)
(where X = _t0)

Obtaining initial values:

Fitting full model:

Iteration 0:   log likelihood = -1956.3262
Iteration 1:   log likelihood = -1955.7027
Iteration 2:   log likelihood = -1955.7015
Iteration 3:   log likelihood = -1955.7015

Joint model estimates                            Number of obs.     =     1945
Panel variable: id                               Number of panels   =      312
Number of failures =      140

Log-likelihood = -1955.7015

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
Longitudinal |
_time_1 |   .1782451    .012694    14.04   0.000     .1533653    .2031248
_cons |   .5007461   .0581723     8.61   0.000     .3867305    .6147618
-------------+----------------------------------------------------------------
Survival     |
assoc:int |
_cons |   1.333708    .113852    11.71   0.000     1.110562    1.556854
ln_lambda |
trt |     .15642   .1763785     0.89   0.375    -.1892755    .5021156
_cons |  -3.883387   .2765534   -14.04   0.000    -4.425422   -3.341353
ln_gamma |
_cons |   .4325332   .0722285     5.99   0.000      .290968    .5740984
------------------------------------------------------------------------------

------------------------------------------------------------------------------
Random effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
id: Unstructured             |
sd(_time_1) |   .1653823   .0113989      .1444842    .1893031
sd(_cons) |   .9999101   .0426311      .9197505    1.087056
corr(_time_1,_cons) |   .4816445   .0743054       .323452    .6136461
-----------------------------+------------------------------------------------
sd(Residual) |   .3516559   .0068733      .3384392    .3653886
------------------------------------------------------------------------------

Longitudinal submodel: Linear mixed effects model
Survival submodel: Weibull proportional hazards model
Cumulative hazard: Gauss-Kronrod quadrature using 15 nodes



where I turn off the default current value association (nocurrent), and use the intercept association without including the fixed coefficient (intassoc nocoef).

### Gradient or rate of change association structure

We can link the current gradient or rate of change of the biomarker directly to survival, for example

$$h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha m^{\prime}_{i}(t))$$

where

$$m^{\prime}_{i}(t) = \frac{\partial m_{i}(t)}{\partial t}$$

This can be fitted with merlin using the dEV[] syntax as follows

merlin (stime trt dEV[logb]        , family(weibull, failure(died)) timevar(stime))	///
(logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, family(gaussian) timevar(time))	///
, covariance(unstructured) restartvalues(M2 0.1)


where $\alpha$ now represents the log hazard ratio for a one-unit increase in the gradient of the biomarker, at time $t$. Since the derivative of this linear model is constant, this model may struggle to estimate, as dEV[] uses numerical differentiation with respect to time (used so it can accommodate any trajectory), to calculate the derivative, which clearly isn’t needed in this case.

The same model in stjm can be fitted with,

stjm logb , panel(id) survmodel(weibull) rfp(1) survcov(trt) nocurrent derivassoc


### Cumulative association structure

We can assess the cumulative effect of the biomarker directly to survival, for example

$$h(t) = h_{0}(t) \exp \left( X_{1ij}\mathbf{\beta_{1}} + \alpha \int_{0}^{t} m_{i}(u) \partial u \right)$$

where the integral is generally approximated using numerical integration. This can be fitted with merlin using the iEV[] syntax as follows

. merlin (stime trt iEV[logb]        , family(weibull, failure(died)) timevar(stime))     ///
>        (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, family(gaussian) timevar(time)) ///
>        , covariance(unstructured) restartvalues(M2 0.1)
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -3147.7168  (not concave)
Iteration 1:   log likelihood = -2312.6852  (not concave)
Iteration 2:   log likelihood = -2263.9611  (not concave)
Iteration 3:   log likelihood =  -2126.485
Iteration 4:   log likelihood = -2048.1428
Iteration 5:   log likelihood = -1991.9977
Iteration 6:   log likelihood = -1985.8128
Iteration 7:   log likelihood = -1985.7485
Iteration 8:   log likelihood = -1985.7485

Mixed effects regression model                  Number of obs     =      1,945
Log likelihood = -1985.7485
------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime:       |
trt |  -.0979662   .1729337    -0.57   0.571      -.43691    .2409775
iEV[] |   .1571468   .0143682    10.94   0.000     .1289857    .1853079
_cons |  -2.644713   .2011413   -13.15   0.000    -3.038943   -2.250484
log(gamma) |  -.2273467   .1007825    -2.26   0.024    -.4248768   -.0298166
-------------+----------------------------------------------------------------
logb:        |
fp() |   .1779682   .0130593    13.63   0.000     .1523724    .2035639
fp()#M2[id] |          1          .        .       .            .           .
M1[id] |          1          .        .       .            .           .
_cons |   .4946745    .058134     8.51   0.000     .3807339    .6086151
sd(resid.) |   .3487027   .0067517                      .3357176    .3621901
-------------+----------------------------------------------------------------
id:          |
sd(M1) |   .9993602   .0425709                       .919311     1.08638
sd(M2) |   .1726163   .0118729                      .1508461    .1975283
corr(M2,M1) |    .416575   .0764731                      .2563378    .5545283
------------------------------------------------------------------------------



where $\alpha$ now represents the log hazard ratio for a one-unit increase in the cumulative value of the biomarker, at time $t$. This example is discussed in more detail, and extended, in this tutorial. This is the first model in this tutorial that we can’t fit with stjm.