I’m a Biostatistician (20% FTE) in the Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, and spend the rest of my time as a Consultant Stata Developer. Before moving to Stockholm in December 2020, I was an Associate Professor of Biostatistics at the University of Leicester. I have been an Associate Editor of the Stata Journal since January 2018, and was previously a Section Editor of the Journal of Statistical Software for over 5 years. My main research interests include survival analysis, multilevel and mixed effects models, and statistical software development. I lead a programme of research developing methodology for the analysis of complex survival data, motivated by applications to electronic health records.
After completing my PhD on complex survival and joint longitudinal-survival models, which can be downloaded here, I did a post-doc at the Department of Medical Epidemiology and Biostatistics, Karolinska Institutet in Stockholm, before returning to Leicester in March 2016 to take up a lectureship. I was promoted to Associate Professor in August 2018.
This website contains a collection of my work, including publications, software and teaching material. I hope you find it useful.
PhD in Medical Statistics, 2014
University of Leicester
MSc in Medical Statistics, 2010
University of Leicester
MMath in Mathematics and Statistics, 2009
University of St. Andrews
survsim command for simulating survival data. survsim can now simulate survival data from a parametric distribution, a custom/user-defined distribution, a fitted merlin model, a specified cause-specific hazards competing risks model, or a specified general multi-state model. I illustrate the command with some examples from each setting, demonstrating the huge flexibility that can be used to better evaluate statistical methods.
merlin can do a lot of things. From simple stuff, like fitting a linear regression or a Weibull survival model, to a three-level logistic mixed effects model, or a multivariate joint model of multiple longitudinal outcomes (of different types) and a recurrent event and survival with non-linear effects… the list is rather endless. merlin can do things I haven’t even thought of yet. I’ll take a single dataset, and attempt to show you the full range of capabilities of merlin, and discuss some future directions for the implementation in Stata.
As part of my research I have developed a range of software packages in Stata and R. More details, including tutorials, can be found on the package-specific pages:
merlin ~ mixed effects regression for linear and non-linear models
Tutorials in Stata, Stata version history (stable release), Stata version history (development release)
multistate ~ multi-state survival analysis
Stata version history (stable release), Stata version history (development release)
survsim ~ simulation of simple and complex survival data
Tutorials in Stata, Stata version history (stable release), Stata version history (development release)
stmixed ~ multilevel parametric survival models
Stata version history (stable release), Stata version history (development release)
staft ~ flexible parametric accelerated failure time models
Stata version history (stable release), GitHub repo
sankey ~ Sankey graphs in Stata using Python
Stata version history (stable release)
stjm ~ joint models of longitudinal and survival data
stgenreg ~ general parametric survival models
stmix ~ two-component mixture parametric survival models
extfunnel ~ extended funnel plots for meta-analysis
metapow ~ simulation-based sample size calculations for designing trials based on an existing meta-analysis
survsim can now simulate survival data from a semi-Markov or multiple timescale multi-state model, and is about 70% faster
survsim can now simulate survival data from a Markov multi-state model, defined by general transition-specific hazard functions
survsim can now simulate survival data from a parametric distribution, a user-defined distribution, a fitted merlin model, or a general cause-specific hazards competing risks model
Bringing together the strengths of stset and merlin
How merlin makes modelling of non-linear effects a whole lot simpler
I teach a number of short courses; some teaching material is made freely available on the course pages: