The standard joint model was described and illustrated with merlin
here.
In this example we consider time-dependent effects in the survival submodel. Once again we have,
$$h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha m_{i}(t))$$
and
$$y_{}(t) = m_{i}(t) + \epsilon_{ij}$$
where
$$m_{i}(t) = X_{2ij}(t)\mathbf{\beta}_{2} + Z_{ij}(t)b_{i}$$
and
$$b_{i} \sim N(0,\Sigma)$$
Given the survival submodel is a proportional hazards model, we are making the assumption that both our hazard ratios $\exp (\beta_{1})$ for the covariates $X_{1ij}$, and the hazard ratio for the effect of our biomarker, $\exp(\alpha)$, are constant across follow-up time. We can relax these assumptions to allow for non-proportional hazards i.e. by modelling time-dependent effects. First we consider the following model,
$$h(t) = h_{0}(t) \exp(X_{1ij}\beta_{1}(t) + \alpha m_{i}(t))$$
where if we assume $X_{1ij}$ is a binary treatment variable, we can allow its associated hazard ratio $\beta_{1}(t)$ to be time-dependent, by making it a function of $\log (t)$
$$h(t) = h_{0}(t) \exp(X_{1ij} \beta_{1} + X_{1ij} \times \beta_{2} \times \log (t) + \alpha m_{i}(t))$$
We can fit this model with merlin
as follows
. use https://www.mjcrowther.co.uk/data/jm_example.dta, clear
(Example dataset for joint modelling, Michael Crowther 2011)
. merlin (stime trt trt#fp(stime,pow(0)) EV[logb], family(weibull, failure(died)) timevar(stime)) ///
> (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, family(gaussian) timevar(time))
variables created for model 1, component 2: _cmp_1_2_1 to _cmp_1_2_1
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1
Fitting fixed effects model:
Fitting full model:
Iteration 0: log likelihood = -3328.5706
Iteration 1: log likelihood = -2539.587
Iteration 2: log likelihood = -2107.9207
Iteration 3: log likelihood = -1962.2205
Iteration 4: log likelihood = -1933.5759
Iteration 5: log likelihood = -1932.1907
Iteration 6: log likelihood = -1932.1887
Iteration 7: log likelihood = -1932.1887
Mixed effects regression model Number of obs = 1,945
Log likelihood = -1932.1887
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime: |
trt | .1236867 .2526515 0.49 0.624 -.3715012 .6188746
trt#fp() | -.0929622 .1662182 -0.56 0.576 -.4187439 .2328196
EV[] | 1.261099 .0955683 13.20 0.000 1.073788 1.448409
_cons | -4.508608 .3321893 -13.57 0.000 -5.159688 -3.857529
log(gamma) | .0631748 .1101585 0.57 0.566 -.152732 .2790815
-------------+----------------------------------------------------------------
logb: |
fp() | .1690606 .013109 12.90 0.000 .1433674 .1947537
fp()#M2[id] | 1 . . . . .
M1[id] | 1 . . . . .
_cons | .4993858 .0595809 8.38 0.000 .3826093 .6161623
sd(resid.) | .345686 .006619 .3329534 .3589055
-------------+----------------------------------------------------------------
id: |
sd(M1) | 1.026386 .0433006 .9449324 1.114861
sd(M2) | .1805098 .0123637 .1578334 .2064441
------------------------------------------------------------------------------
We can also relax the proportional hazards assumption on our association parameter, as follows,
$$h(t) = h_{0}(t) \exp(X_{1ij}\mathbf{\beta_{1}} + \alpha_{1} m_{i}(t) + \alpha_{2} \times \log (t) \times m_{i}(t) )$$
which can be fitted using,
. merlin (stime trt EV[logb] EV[logb]#fp(stime,pow(0)) ///
> , family(weibull, failure(died)) timevar(stime)) ///
> (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1 ///
> , family(gaussian) timevar(time))
variables created for model 1, component 3: _cmp_1_3_1 to _cmp_1_3_1
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1
Fitting fixed effects model:
Fitting full model:
Iteration 0: log likelihood = -3328.8882 (not concave)
Iteration 1: log likelihood = -2385.0776
Iteration 2: log likelihood = -1966.1995
Iteration 3: log likelihood = -1933.5298
Iteration 4: log likelihood = -1929.9819
Iteration 5: log likelihood = -1929.8716
Iteration 6: log likelihood = -1929.8674
Iteration 7: log likelihood = -1929.8674
Mixed effects regression model Number of obs = 1,945
Log likelihood = -1929.8674
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime: |
trt | .0760779 .1805075 0.42 0.673 -.2777102 .429866
EV[] | 1.518906 .1637623 9.28 0.000 1.197937 1.839874
EV[]#fp() | -.2095986 .0977554 -2.14 0.032 -.4011957 -.0180015
_cons | -5.362488 .5439862 -9.86 0.000 -6.428681 -4.296295
log(gamma) | .3840735 .1604291 2.39 0.017 .0696382 .6985087
-------------+----------------------------------------------------------------
logb: |
fp() | .1698028 .0132098 12.85 0.000 .1439119 .1956936
fp()#M2[id] | 1 . . . . .
M1[id] | 1 . . . . .
_cons | .5003171 .0595577 8.40 0.000 .3835862 .6170479
sd(resid.) | .3454356 .0066109 .3327185 .3586388
-------------+----------------------------------------------------------------
id: |
sd(M1) | 1.025622 .0432787 .9442107 1.114053
sd(M2) | .1815506 .0124789 .1586684 .2077327
------------------------------------------------------------------------------
For completeness we can model non-proportional hazards in both our treatment effect and our association parameter as follows,
. merlin (stime trt trt#fp(stime,pow(0)) EV[logb] EV[logb]#fp(stime,pow(0)), ///
> family(weibull, failure(died)) timevar(stime)) ///
> (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1, ///
> family(gaussian) timevar(time))
variables created for model 1, component 2: _cmp_1_2_1 to _cmp_1_2_1
variables created for model 1, component 4: _cmp_1_4_1 to _cmp_1_4_1
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1
Fitting fixed effects model:
Fitting full model:
Iteration 0: log likelihood = -3328.5706 (not concave)
Iteration 1: log likelihood = -2384.8956
Iteration 2: log likelihood = -1964.4008
Iteration 3: log likelihood = -1933.1154
Iteration 4: log likelihood = -1929.7005
Iteration 5: log likelihood = -1929.5334
Iteration 6: log likelihood = -1929.5161
Iteration 7: log likelihood = -1929.516
Mixed effects regression model Number of obs = 1,945
Log likelihood = -1929.516
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime: |
trt | .2345087 .2600758 0.90 0.367 -.2752304 .7442478
trt#fp() | -.1396825 .1661591 -0.84 0.401 -.4653484 .1859834
EV[] | 1.54404 .1675975 9.21 0.000 1.215555 1.872525
EV[]#fp() | -.220537 .0990953 -2.23 0.026 -.4147602 -.0263138
_cons | -5.544952 .5856712 -9.47 0.000 -6.692847 -4.397058
log(gamma) | .4437689 .1675443 2.65 0.008 .1153881 .7721497
-------------+----------------------------------------------------------------
logb: |
fp() | .1698974 .0132107 12.86 0.000 .1440049 .1957899
fp()#M2[id] | 1 . . . . .
M1[id] | 1 . . . . .
_cons | .500362 .0595612 8.40 0.000 .3836241 .6170999
sd(resid.) | .3454272 .00661 .3327118 .3586286
-------------+----------------------------------------------------------------
id: |
sd(M1) | 1.025705 .0432803 .9442896 1.114139
sd(M2) | .1815853 .0124773 .1587055 .2077636
------------------------------------------------------------------------------
In any of the models above, there is no restriction on what functional form we use to model each of the time-dependent
effects. If a more complex form is desired, then we can use the fp()
or rcs()
syntax to get more flexible time
functions. For example using fractional polynomials,
. merlin (stime trt EV[logb] EV[logb]#fp(stime,pow(1 2)) ///
> , family(weibull, failure(died)) timevar(stime)) ///
> (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1 ///
> , family(gaussian) timevar(time))
variables created for model 1, component 3: _cmp_1_3_1 to _cmp_1_3_2
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1
Fitting fixed effects model:
Fitting full model:
Iteration 0: log likelihood = -3328.8882 (not concave)
Iteration 1: log likelihood = -2387.2505
Iteration 2: log likelihood = -2102.9719
Iteration 3: log likelihood = -1961.3853
Iteration 4: log likelihood = -1931.6287
Iteration 5: log likelihood = -1930.0736
Iteration 6: log likelihood = -1930.066
Iteration 7: log likelihood = -1930.066
Mixed effects regression model Number of obs = 1,945
Log likelihood = -1930.066
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime: |
trt | .0678673 .1803207 0.38 0.707 -.2855548 .4212893
EV[] | 1.489866 .2001506 7.44 0.000 1.097578 1.882154
EV[]#fp():1 | -.0563373 .0658021 -0.86 0.392 -.185307 .0726324
EV[]#fp():2 | .0012084 .0044445 0.27 0.786 -.0075026 .0099195
_cons | -4.948354 .4383442 -11.29 0.000 -5.807492 -4.089215
log(gamma) | .2286858 .1430934 1.60 0.110 -.0517721 .5091437
-------------+----------------------------------------------------------------
logb: |
fp() | .1689365 .0131413 12.86 0.000 .14318 .194693
fp()#M2[id] | 1 . . . . .
M1[id] | 1 . . . . .
_cons | .5000151 .0596196 8.39 0.000 .3831628 .6168673
sd(resid.) | .3455656 .0066158 .3328391 .3587787
-------------+----------------------------------------------------------------
id: |
sd(M1) | 1.026716 .0433295 .9452087 1.115251
sd(M2) | .1805794 .0123842 .1578673 .206559
------------------------------------------------------------------------------
or restricted cubic splines,
. merlin (stime trt EV[logb] EV[logb]#rcs(stime, df(3) log event) ///
> , family(weibull, failure(died)) timevar(stime)) ///
> (logb fp(time,pow(1)) fp(time,pow(1))#M2[id]@1 M1[id]@1 ///
> , family(gaussian) timevar(time))
variables created for model 1, component 3: _cmp_1_3_1 to _cmp_1_3_3
variables created for model 2, component 1: _cmp_2_1_1 to _cmp_2_1_1
variables created for model 2, component 2: _cmp_2_2_1 to _cmp_2_2_1
Fitting fixed effects model:
Fitting full model:
Iteration 0: log likelihood = -3328.8882 (not concave)
Iteration 1: log likelihood = -2385.6863
Iteration 2: log likelihood = -2100.3945 (not concave)
Iteration 3: log likelihood = -1976.7383
Iteration 4: log likelihood = -1933.5373
Iteration 5: log likelihood = -1929.5419
Iteration 6: log likelihood = -1929.3196
Iteration 7: log likelihood = -1929.316
Iteration 8: log likelihood = -1929.316
Mixed effects regression model Number of obs = 1,945
Log likelihood = -1929.316
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
stime: |
trt | .0860597 .1812671 0.47 0.635 -.2692172 .4413366
EV[] | 1.574605 .2154207 7.31 0.000 1.152389 1.996822
EV[]#rcs():1 | -.1524374 .1325173 -1.15 0.250 -.4121665 .1072917
EV[]#rcs():2 | -.0205148 .118086 -0.17 0.862 -.251959 .2109295
EV[]#rcs():3 | .0474106 .1847866 0.26 0.798 -.3147645 .4095856
_cons | -5.395892 .5639953 -9.57 0.000 -6.501303 -4.290482
log(gamma) | .387279 .1669141 2.32 0.020 .0601334 .7144246
-------------+----------------------------------------------------------------
logb: |
fp() | .1695192 .013202 12.84 0.000 .1436436 .1953947
fp()#M2[id] | 1 . . . . .
M1[id] | 1 . . . . .
_cons | .5005547 .0596044 8.40 0.000 .3837322 .6173771
sd(resid.) | .3454686 .0066128 .3327479 .3586755
-------------+----------------------------------------------------------------
id: |
sd(M1) | 1.026383 .0433163 .9449014 1.114892
sd(M2) | .1812489 .0124586 .1584039 .2073886
------------------------------------------------------------------------------
I’ve also only shown a time-dependent effect under the current value parameterisation. Any of the other structures shown here can also be made time-dependent in a similar way.
This example used merlin
version 1.15.1.