Recent Publications

A list of all my publications can be found here, with some recent ones described below:

merlin can do a lot of things, from simple stuff, like fitting a linear regression or a Weibull survival model, to a three-level logistic mixed effects model, or a multivariate joint model of multiple longitudinal outcomes (of different types) and a recurrent event and survival with non-linear effects. The list is almost endless. merlin can do things I haven’t even thought of yet. I’ll take a single dataset, and attempt to show you the full range of capabilities of merlin, and discuss some future directions for the implementation in Stata.

Simulation studies are computer experiments which involve creating data by pseudorandom sampling. The key strength of simulation studies is the ability to understand the behaviour of statistical methods because some ‘truth’ is known from the process of generating the data. This allows us to consider properties of methods, such as bias. While widely used, simulation studies are often poorly designed, analysed and reported. This article outlines the rationale for using simulation studies and offers guidance for design, execution, analysis, reporting and presentation. In particular, we provide: a structured approach for planning and reporting simulation studies; coherent terminology for simulation studies; guidance on coding simulation studies; a critical discussion of key performance measures and their computation; ideas on structuring tabular and graphical presentation of results; and new graphical presentations. With a view to describing current practice and identifying areas for improvement, we review 100 articles taken from Volume 34 of Statistics in Medicine which included at least one simulation study.
arXiv, 2017
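
As a rough illustration of the idea in the abstract above, here is a minimal Python sketch of a simulation study that estimates the bias of the sample mean and the Monte Carlo standard error of that bias. It is not code from the article: the function name `simulate_bias`, the normal data-generating mechanism, and all parameter values are assumptions chosen purely for illustration.

```python
import random
import statistics


def simulate_bias(true_mean=1.0, sd=2.0, n_obs=50, n_sim=1000, seed=12345):
    """Estimate the bias of the sample mean and its Monte Carlo standard error.

    All defaults are hypothetical values chosen for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is reproducible
    estimates = []
    for _ in range(n_sim):
        # Data-generating mechanism: n_obs draws from Normal(true_mean, sd)
        data = [rng.gauss(true_mean, sd) for _ in range(n_obs)]
        # Method under evaluation: the sample mean
        estimates.append(statistics.fmean(data))
    # Performance measure: bias = average estimate minus the known truth
    bias = statistics.fmean(estimates) - true_mean
    # Monte Carlo SE of the bias estimate = SD of the estimates / sqrt(n_sim)
    mcse = statistics.stdev(estimates) / n_sim ** 0.5
    return bias, mcse


bias, mcse = simulate_bias()
print(f"bias = {bias:.4f} (Monte Carlo SE {mcse:.4f})")
```

Because the true mean is fixed by the data-generating step, the estimated bias can be compared directly with zero, and the Monte Carlo standard error indicates how much of any apparent bias could be simulation noise.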

Multivariate data occurs in a wide range of fields, with ever more flexible model specifications being proposed, often within a multivariate generalised linear mixed effects (MGLME) framework. In this article, we describe an extended framework, encompassing multiple outcomes of any type, each of which could be repeatedly measured (longitudinal), with any number of levels, and with any number of random effects at each level. Many standard distributions are described, as well as non-standard user-defined non-linear models. The extension focuses on a complex linear predictor for each outcome model, allowing sharing and linking between outcome models in an extremely flexible way, either by linking random effects directly, or the expected value of one outcome (or function of it) within the linear predictor of another. Non-linear and time-dependent effects are also seamlessly incorporated into the linear predictor through the use of splines or fractional polynomials. We further propose level-specific random effect distributions and numerical integration techniques to improve usability, relaxing the normally distributed random effects assumption to allow multivariate $t$-distributed random effects. We consider some special cases of the general framework, describing some new models in the fields of clustered survival data and joint longitudinal-survival models, and discuss various potential uses of the implementation. User friendly, and easily extendable, software is provided.
arXiv, 2017

With the release of Stata 14 came the mestreg command to fit multilevel mixed effects parametric survival models, assuming normally distributed random effects, estimated with maximum likelihood utilising Gaussian quadrature. In this article, I present the user-written stmixed command, which serves as both an alternative and a complementary program to mestreg for the fitting of multilevel parametric survival models. The key extensions include incorporation of the flexible parametric Royston-Parmar survival model, and the ability to fit multilevel relative survival models. The methods are illustrated with a commonly used dataset of patients with kidney disease suffering recurrent infections, and a simulated example, illustrating a simple approach to simulating clustered survival data using survsim (Crowther and Lambert, 2012, 2013).
arXiv, 2017

Multi-state models are increasingly being used to model complex disease profiles. By modelling transitions between disease states, accounting for competing events at each transition, we can gain a much richer understanding of patient trajectories and how risk factors impact over the entire disease pathway. In this article we concentrate on parametric multi-state models, both Markov and semi-Markov, and develop a flexible framework where each transition can be specified by a variety of parametric models including exponential, Weibull, Gompertz, Royston-Parmar proportional hazards models or log-logistic, log-normal, generalised gamma accelerated failure time models, possibly sharing parameters across transitions. We also extend the framework to allow time-dependent effects. We then use an efficient and generalisable simulation method to calculate transition probabilities from any fitted multi-state model, and show how it facilitates the simple calculation of clinically useful measures, such as expected length of stay in each state, and differences and ratios of proportion within each state as a function of time, for specific covariate patterns. We illustrate our methods using a dataset of patients with primary breast cancer. User friendly Stata software is provided.
In Stats in Med, 2017


As part of my research I have developed a range of software packages in Stata. More details, including tutorials, can be found on the package-specific pages:

Each package can be installed by typing ssc install cmdname within Stata. Having said that, I’m starting to move things over to git repositories, so keep an eye on the package pages for installation instructions.

Recent Posts


merlin version 1.0.0 now released


A major update to the multistate package in Stata, and other news in my multistate world


Some details on the importance of good starting values with megenreg, and my plans to reduce the worry



Flexible AFT models

Flexible parametric accelerated failure time models


MRC New Investigator Research Grant (01/03/2017 - 29/02/2020)


Multi-state survival analysis


Extended multivariate generalised linear and non-linear mixed effects models


My core teaching is on the MSc Medical Statistics course at the University of Leicester.

I teach a number of short courses; some teaching material is made freely available on the course pages:

Recent & Upcoming Talks


My research group currently consists of:

Research Staff

  • Dr Emma Martin, Post-doctoral Research Associate in Biostatistics, University of Leicester. Emma is funded by my MRC New Investigator Research Grant to work on a variety of projects in multi-state survival models and joint models.

  • Jonathan Broomfield, NIHR Methods Fellow, University of Leicester.

PhD students

As main supervisor:

  • Alessandro Gasparini, University of Leicester (1st October 2016 - Present) [GitHub]
    Alessandro has been working on frailty survival models and an R Shiny app for use in summarising simulation studies. His main project centres on informative observations in joint modelling of longitudinal and survival data. More details on his PhD can be found here.

  • Nuzhat Ashra, University of Leicester (25th September 2017 - Present). Nuzhat is funded by an MRC IMPACT studentship and SPD Development Company, to work on joint modelling of biomarkers to predict miscarriage. She will also be working on some extensions to the stjm command in Stata, including dynamic predictions.

  • Micki Hill, University of Leicester (1st April 2019 - Present). Micki is funded by a Department of Health Sciences studentship, to work on methods in multi-state survival analysis with a focus on interval-censored data.

As co-supervisor:

  • Elinor Curnow, University of Bristol. Ellie has been working on methods for imputing survival times from interval-censored data. She’s studying part-time, and I joined her supervisory team in November 2018.
  • Nikolaos Skourlis, Karolinska Institutet (1st November 2018 - Present)

Previous students:

  • Sam Brilleman, Monash University (Awarded 2018)
    [Homepage] [GitHub]
    Sam’s been working on a variety of projects, but a core project has been development of an R package to fit an extensive array of Bayesian joint models using Stan. More details on his PhD can be found here.