The do file with all the code from this tutorial is available to download here
I’ll start this by making clear that I know pretty much nothing about quantile regression. This is very much a ‘here is the tool, do with it what you will!’ kind of tutorial.
In my search for things I could do with merlin
, I came across a few papers on linear quantile mixed effects models (Geraci and Bottai (2007), Geraci (2014)). From Geraci (2014), we have a continuous random variable $y \in \mathcal{R}$ is said to follow an asymmetric Laplace density with parameters $\mu, \sigma, \tau$, $y ∼ \text{AL}(\mu, \sigma, \tau)$, if its density can be expressed as
$$ p(y | \mu, \sigma, \tau) = \frac{\tau (1-\tau)}{\sigma} \exp \left(-\frac{1}{\sigma}\rho_{\tau} (y-\mu) \right) $$
with $\mu \in \mathcal{R}$, $\sigma \in \mathcal{R}_{+}$, $0<\tau<1$ and,
$$ \rho_{\tau}(v) = \tau \text{max}(v,0) + (1-\tau) \text{max}(-v,0) $$
Given that the above density function is pretty simple, where $\mu$ is our complex linear predictor, we can of course incorporate it into the merlin
world. How exciting.
Here’s the source code for the log likelihood evaluator, which I provide for reference, and to give you an idea of just how easy it is to add new models,
real matrix merlin_logl_qtile(gml)
{
y = merlin_util_depvar(gml)
mu = merlin_util_xzb(gml)
logsigma = merlin_util_ap(gml,1) //already exp'd
sigma = exp(logsigma)
tau = gml.qtiles[gml.model]
omtau = 1-tau
v = y :- mu
ncol = cols(mu)
row = J(rows(mu),ncol,0)
for (c=1;c<=ncol;c++) {
index1 = selectindex(v[,c]:>0)
row[index1,c] = v[index1,c] :* tau
index2 = selectindex((-v[,c]):>0)
row[index2,c] = -v[index2,c] :* omtau
}
return(log(tau) :+ log(omtau) :- logsigma :- row :/ sigma)
}
I’ll replicate some of the examples from the Journal of Statistical Software paper on the R package lqmm
(Geraci (2014)), using the orthodontic growth dataset. This dataset consists of repeated measurements of the distance
between the center of the pituitary to the pterygomaxillary fissure, two points that are identified on x-ray exposures of the side of the head, in 27 children (16 boys, 11 girls). Measurements were taken by researchers of the University of North Carolina Dental School at four different ages (8, 10, 12, 14 years), giving 108 observations in total, to study growth patterns by sex.
I’ll load the data, generate centred age and only look at the subset of females.
. use "https://www.mjcrowther.co.uk/data/orth.dta",clear
. gen agec = age - 11
. keep if Sex==2
(64 observations deleted)
To start we can of course fit a simple Gaussian response model, modelling distance
as a linear function of fixed centred age, agec
, with a random intercept.
. merlin (distance agec M1[id]@1 , family(gaussian))
Fitting fixed effects model:
Fitting full model:
Iteration 0: log likelihood = -89.532139 (not concave)
Iteration 1: log likelihood = -76.630287
Iteration 2: log likelihood = -69.132735
Iteration 3: log likelihood = -69.015569
Iteration 4: log likelihood = -69.015198
Iteration 5: log likelihood = -69.015198
Mixed effects regression model Number of obs = 44
Log likelihood = -69.015198
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
distance: |
agec | .4795455 .0517869 9.26 0.000 .378045 .5810459
M1[id] | 1 . . . . .
_cons | 22.64773 .6051215 37.43 0.000 21.46171 23.83374
sd(resid.) | .7681236 .0945495 .6034696 .9777026
-------------+----------------------------------------------------------------
id: |
sd(M1) | 1.96987 .4360396 1.276499 3.039869
------------------------------------------------------------------------------
To fit our linear quantile model we use the new family(lquantile)
. The default quantile()
is 0.5, but I show the full syntax here for clarity. I use Gauss-Hermite quadrature with 7 integration points, as was done in the JSS paper.
. merlin (distance agec M1[id]@1, family(lquantile, quantile(0.5))) , nolog
Fitting fixed effects model:
Fitting full model:
Mixed effects regression model Number of obs = 44
Log likelihood = -70.390949
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
distance: |
agec | .4475998 .0012088 370.28 0.000 .4452305 .449969
M1[id] | 1 . . . . .
_cons | 23.09247 .0020606 1.1e+04 0.000 23.08843 23.09651
sd(resid.) | .2946744 .0482398 .2137943 .4061521
-------------+----------------------------------------------------------------
id: |
sd(M1) | 1.860972 .0022719 1.856525 1.86543
------------------------------------------------------------------------------
. est store g7
We find the median distance for an 11 year old girl is 23.092 mm, with a median slope showing an increase of
0.448 mm per year. We get a random effect standard deviation of 1.915, and
estimate of the scale of $\sigma = 0.294$. I’ve added the nolog
option to suppress the rather long iteration log.
We should immediately consider increasing our intpoints()
and comparing estimates.
. merlin (distance agec M1[id]@1, family(lquant, q(0.5))) , intpoints(75) nolog
Fitting fixed effects model:
Fitting full model:
Mixed effects regression model Number of obs = 44
Log likelihood = -70.292734
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
distance: |
agec | .4544673 .0047569 95.54 0.000 .4451439 .4637907
M1[id] | 1 . . . . .
_cons | 23.00637 .0128758 1786.79 0.000 22.98113 23.0316
sd(resid.) | .2947012 .0482876 .213752 .4063064
-------------+----------------------------------------------------------------
id: |
sd(M1) | 2.040022 .0063875 2.027541 2.05258
------------------------------------------------------------------------------
. est store g75
To compare we use est tab
. est tab g7 g75
----------------------------------------
Variable | g7 g75
-------------+--------------------------
_cmp_1_1_1 |
_cons | .44759977 .45446728
-------------+--------------------------
_cmp_1_2_1 |
_cons | 1 1
-------------+--------------------------
cons1 |
_cons | 23.09247 23.006367
-------------+--------------------------
dap1_1 |
_cons | -1.2218842 -1.2217934
-------------+--------------------------
lns1_1 |
_cons | .62109898 .71296051
----------------------------------------
The above shows good agreement, but the iteration log is so poor in both model fits, I would definitely explore further. As I said in the beginning, my knowledge of these models is extremely limited, but in this case, I think the dataset is both small, highly influenced by the integration technique, and almost certainly the starting values (another thing for you to explore…).
Having said that, now that we’ve brought this model within the merlin
realm, we get all our methodological
extensions for free! Extensions including higher levels, time-dependent effects, non-linear covariate effects,
joint models of various types…
This example used merlin
version 2.0.2.