# Linear quantile mixed effects models

The do file with all the code from this tutorial is available to download here

I’ll start this by making clear that I know pretty much nothing about quantile regression. This is very much a ‘here is the tool, do with it what you will!’ kind of tutorial.

In my search for things I could do with merlin, I came across a few papers on linear quantile mixed effects models (Geraci and Bottai (2007), Geraci (2014)). From Geraci (2014), we have a continuous random variable $y \in \mathcal{R}$ is said to follow an asymmetric Laplace density with parameters $\mu, \sigma, \tau$, $y ∼ \text{AL}(\mu, \sigma, \tau)$, if its density can be expressed as

$$p(y | \mu, \sigma, \tau) = \frac{\tau (1-\tau)}{\sigma} \exp \left(-\frac{1}{\sigma}\rho_{\tau} (y-\mu) \right)$$

with $\mu \in \mathcal{R}$, $\sigma \in \mathcal{R}_{+}$, $0<\tau<1$ and,

$$\rho_{\tau}(v) = \tau \text{max}(v,0) + (1-\tau) \text{max}(-v,0)$$

Given that the above density function is pretty simple, where $\mu$ is our complex linear predictor, we can of course incorporate it into the merlin world. How exciting.

Here’s the source code for the log likelihood evaluator, which I provide for reference, and to give you an idea of just how easy it is to add new models,

real matrix merlin_logl_qtile(gml)
{
y 		= merlin_util_depvar(gml)
mu 		= merlin_util_xzb(gml)
sigma 		= exp(logsigma)
tau 		= gml.qtiles[gml.model]
omtau		= 1-tau
v 		= y :- mu
ncol 		= cols(mu)
row 		= J(rows(mu),ncol,0)

for (c=1;c<=ncol;c++) {
index1 = selectindex(v[,c]:>0)
row[index1,c] = v[index1,c] :* tau
index2 = selectindex((-v[,c]):>0)
row[index2,c] = -v[index2,c] :* omtau
}

return(log(tau) :+ log(omtau) :- logsigma :- row :/ sigma)
}


I’ll replicate some of the examples from the Journal of Statistical Software paper on the R package lqmm (Geraci (2014)), using the orthodontic growth dataset. This dataset consists of repeated measurements of the distance between the center of the pituitary to the pterygomaxillary fissure, two points that are identified on x-ray exposures of the side of the head, in 27 children (16 boys, 11 girls). Measurements were taken by researchers of the University of North Carolina Dental School at four different ages (8, 10, 12, 14 years), giving 108 observations in total, to study growth patterns by sex.

I’ll load the data, generate centred age and only look at the subset of females.

. use "https://www.mjcrowther.co.uk/data/orth.dta",clear

. gen agec = age - 11

. keep if Sex==2
(64 observations deleted)



To start we can of course fit a simple Gaussian response model, modelling distance as a linear function of fixed centred age, agec, with a random intercept.

. merlin (distance agec M1[id]@1 , family(gaussian))

Fitting fixed effects model:

Fitting full model:

Iteration 0:   log likelihood = -89.532139  (not concave)
Iteration 1:   log likelihood = -76.630288
Iteration 2:   log likelihood = -69.132666
Iteration 3:   log likelihood = -69.015572
Iteration 4:   log likelihood = -69.015198
Iteration 5:   log likelihood = -69.015198

Mixed effects regression model                  Number of obs     =         44
Log likelihood = -69.015198
------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
distance:    |
agec |   .4795455   .0517869     9.26   0.000      .378045    .5810459
M1[id] |          1          .        .       .            .           .
_cons |   22.64773   .6051215    37.43   0.000     21.46171    23.83374
sd(resid.) |   .7681236   .0945495                      .6034696    .9777026
-------------+----------------------------------------------------------------
id:          |
sd(M1) |    1.96987   .4360396                      1.276499    3.039869
------------------------------------------------------------------------------



To fit our linear quantile model we use the new family(lquantile). The default quantile() is 0.5, but I show the full syntax here for clarity. I use Gauss-Hermite quadrature with 7 integration points, as was done in the JSS paper.

. merlin (distance agec M1[id]@1, family(lquantile, quantile(0.5))) , nolog

Fitting fixed effects model:

Fitting full model:

Mixed effects regression model                  Number of obs     =         44
Log likelihood = -70.308367
------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
distance:    |
agec |   .4484051   .0019427   230.82   0.000     .4445975    .4522128
M1[id] |          1          .        .       .            .           .
_cons |   23.06786    .004131  5584.06   0.000     23.05976    23.07595
sd(resid.) |   .2945794   .0481925                      .2137706    .4059353
-------------+----------------------------------------------------------------
id:          |
sd(M1) |   1.919401   .0047394                      1.910134    1.928712
------------------------------------------------------------------------------

. est store g7



We find the median distance for an 11 year old girl is 23.092 mm, with a median slope showing an increase of 0.448 mm per year. We get a random effect standard deviation of 1.915, and estimate of the scale of $\sigma = 0.294$. I’ve added the nolog option to suppress the rather long iteration log.

We should immediately consider increasing our intpoints() and comparing estimates.

. merlin (distance agec M1[id]@1, family(lquant, q(0.5))) , intpoints(75) nolog

Fitting fixed effects model:

Fitting full model:

Mixed effects regression model                  Number of obs     =         44
Log likelihood = -70.248436
------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
distance:    |
agec |   .4532894   .0018378   246.64   0.000     .4496873    .4568915
M1[id] |          1          .        .       .            .           .
_cons |   22.95262   .0051768  4433.75   0.000     22.94247    22.96276
sd(resid.) |   .2947333   .0482917                      .2137769    .4063476
-------------+----------------------------------------------------------------
id:          |
sd(M1) |    2.02848    .018345                      1.992841    2.064756
------------------------------------------------------------------------------

. est store g75



To compare we use est tab

. est tab g7 g75

----------------------------------------
Variable |     g7          g75
-------------+--------------------------
_cmp_1_1_1   |
_cons |  .44840513    .45328939
-------------+--------------------------
_cmp_1_2_1   |
_cons |          1            1
-------------+--------------------------
cons1        |
_cons |  23.067858    22.952618
-------------+--------------------------
dap1_1       |
_cons | -1.2222067   -1.2216844
-------------+--------------------------
lns1_1       |
_cons |  .65201311    .70728681
----------------------------------------



The above shows good agreement, but the iteration log is so poor in both model fits, I would definitely explore further. As I said in the beginning, my knowledge of these models is extremely limited, but in this case, I think the dataset is both small, highly influenced by the integration technique, and almost certainly the starting values (another thing for you to explore…).

Having said that, now that we’ve brought this model within the merlin realm, we get all our methodological extensions for free! Extensions including higher levels, time-dependent effects, non-linear covariate effects, joint models of various types…