I had a question come in about how to alter a fitted model object to essentially conduct probabilistic sensitivity analysis within a multistate setting, using `predictms`

from my `multistate`

Stata package. So here goes.

$~$

### Probabilistic sensitivity analysis (PSA)

I’m not a health economist. I had to Google probabilistic sensitivity analysis when starting to write this to check it was what I thought it was…

Ok, disclaimers out the way.

This post is mainly about how to manipulate an existing model object in Stata, or to define one given a set of parameter values. Such things are needed when conducting PSA.

I’ll start with the `brcancer`

dataset:

```
. webuse brcancer, clear
(German breast cancer data)
. stset rectime, f(censrec) scale(365.25)
failure event: censrec != 0 & censrec < .
obs. time interval: (0, rectime]
exit on or before: failure
t for analysis: time/365.25
------------------------------------------------------------------------------
686 total observations
0 exclusions
------------------------------------------------------------------------------
686 observations remaining, representing
299 failures in single-record/single-failure data
2,111.978 total analysis time at risk and under observation
at risk from t = 0
earliest observed entry t = 0
last observed exit t = 7.279945
```

and fit a proportional hazards Weibull model adjusting for hormonal therapy.

```
. streg hormon, distribution(weibull) nohr nolog
failure _d: censrec
analysis time _t: rectime/365.25
Weibull PH regression
No. of subjects = 686 Number of obs = 686
No. of failures = 299
Time at risk = 2111.978097
LR chi2(1) = 10.36
Log likelihood = -694.53925 Prob > chi2 = 0.0013
------------------------------------------------------------------------------
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
hormon | -.3932402 .1248267 -3.15 0.002 -.637896 -.1485844
_cons | -2.195131 .1093755 -20.07 0.000 -2.409504 -1.980759
-------------+----------------------------------------------------------------
/ln_p | .250997 .0496958 5.05 0.000 .1535949 .348399
-------------+----------------------------------------------------------------
p | 1.285306 .0638744 1.166018 1.416798
1/p | .7780247 .0386646 .7058172 .8576193
------------------------------------------------------------------------------
```

and we get a log hazard ratio of -0.393. Now I’m going to stick to using the `predictms`

function from my `multistate`

package in this post, so that you can hopefully see how to extend it to a multistate setting. `predictms`

doesn’t necessarily need a complex multistate setting to work - we can use the `survival`

option to use its prediction engine with a single event, standard survival analysis. We first store our model object:

```
. estimates store m1
```

mainly because I prefer the `models()`

syntax of `predictms`

, but also because it’s easier to extend to more complex settings. I create a time variable at which to calculate predictions at,

```
. range tvar 0 5 2
(684 missing values generated)
. list tvar in 1/2
+------+
| tvar |
|------|
1. | 0 |
2. | 5 |
+------+
```

I’m going to predict the survival probability for a patient in the treated group, at $t=5$. I also need the time 0 in there, because `predictms`

currently uses the minimum of your `timevar`

(when you use one) to define the at risk enter time (this is behaviour I will improve in a future update).

We call `predictms`

, predicting the transition probabilities for a patient in the hormonal therapy group.

```
. predictms , models(m1) at1(hormon 1) survival timevar(tvar) seed(398471)
```

Nice and simple. We get two variables:

```
. list _prob_at1* in 1/2, abbrev(15)
+-------------------------------+
| _prob_at1_1_1 _prob_at1_1_2 |
|-------------------------------|
1. | 1 0 |
2. | .55411 .44589 |
+-------------------------------+
```

which is the probability of still being in the starting state, i.e. alive, and the probability of being dead, at time 0 and at 5 years.

Now let’s say we want to alter my log hazard ratio to assess its impact. Well, I want to set it to be -1, and recalculate my predicted survival at 5 years.

First I’ll make a copy of my estimated coefficient vector, which is stored in `e(b)`

.

```
. mat b1 = e(b)
```

My log hazard ratio is in the first element, so I’ll replace it with a -1, and check it’s worked:

```
. mat b1[1,1] = -1
. mat list b1
b1[1,3]
_t: _t: /:
hormon _cons ln_p
y1 -1 -2.1951315 .25099699
```

All good. Now the tricky part. There are lots of options for Stata’s maximum likelihood (`ml`

) engine, that most of the time feels like only I know about…but not many spend their time delving into Stata’s source code. But everything here is fully documented! Essentially I want a Weibull `streg`

model object, with coefficients set to the values in `b1`

. So, we can pass those as initial values using the `from()`

option, but tell it not to proceed with estimation beyond the initial setup step, using `iter(0)`

:

```
. streg hormon, dist(weib) from(b1) iter(0) nohr
failure _d: censrec
analysis time _t: rectime/365.25
Fitting full model:
Iteration 0: log likelihood = -708.81541
convergence not achieved
Weibull PH regression
No. of subjects = 686 Number of obs = 686
No. of failures = 299
Time at risk = 2111.978097
Wald chi2(1) = 40.87
Log likelihood = -708.81541 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
hormon | -1 .1564199 -6.39 0.000 -1.306577 -.6934227
_cons | -2.195131 .1132968 -19.38 0.000 -2.417189 -1.973074
-------------+----------------------------------------------------------------
/ln_p | .250997 .0526692 4.77 0.000 .1477673 .3542267
-------------+----------------------------------------------------------------
p | 1.285306 .067696 1.159243 1.425078
1/p | .7780247 .0409779 .7017159 .8626318
------------------------------------------------------------------------------
Warning: convergence not achieved
```

which gives us exactly what we want. We of course get the `convergence not achieved`

message, but our ‘estimated’ parameter vector is set to the values we specified! Check in `e(b)`

as well:

```
. mat list e(b)
e(b)[1,3]
_t: _t: /:
hormon _cons ln_p
y1 -1 -2.1951315 .25099699
```

So now we have our `streg`

model object, with our new hormone coefficient, we can simply store our new model object, and run `predictms`

:

```
. estimates store m1
. predictms , models(m1) at1(hormon 1) survival timevar(tvar)
```

`predictms`

doesn’t care if your model converged or not, it simply uses the values it finds in `e(b)`

. Even if I hadn’t fitted a first model, I could’ve just passed a matrix of my own values to `streg`

, the only thing to make sure you have in memory is some form of survival data, as it will check your data is `stset`

, and a variable called `hormon`

. Those things can be easily created with some dummy data.

We can now generalise this to assess the assumption that our treatment effect comes from $N(-1,0.1^2)$, by making a large number of draws from the normal distribution, and calculating survival at 5 years for each run, as follows:

```
. local Nsim = 200
. capture set obs `Nsim'
. //to store our predictions
. gen s5 = .
(686 missing values generated)
. //loop over Nsim draws
. set seed 13490 // <- don't forget!
. forvalues i=1/`Nsim' {
2. local draw = rnormal(-1,0.1)
3. mat b1[1,1] = `draw'
4. qui streg hormon, dist(weib) from(b1) iter(0) nohr
5. //double check our hormon coefficient matches the random draw
. assert [_t][hormon]==`draw'
6. //store new model object
. estimates store m1
7. //predict survival at 5 years
. predictms , models(m1) at1(hormon 1) survival timevar(tvar)
8. //store my new estimate
. qui replace s5 = _prob_at1_1_1[2] if _n==`i'
9. }
```

which gives us our distribution of estimates of $S(5)$:

```
. summarize s5
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
s5 | 200 .7226256 .0244609 .65435 .7742
```

Hopefully that shows the basis for how to manipulate a fitted model in Stata. Any parameter can be altered, and any prediction can be obtained, I simply tricked `streg`

into turning it back into an `estimates`

object, which means we can use a post-estimation function, such as `predict`

or `predictms`

.

- I didn’t bother changing
`n()`

in my`predictms`

calls. You should increase them to assess the impact of the simulation error, since the default method to calculate predictions is to use Monte Carlo simulation. - Next steps for PSA will be extracting the simulated times from
`predictms`

used to calculate the predictions, in order to apply costs to various components. I’ll add this in a future update.

**Reference**

Crowther MJ, Lambert PC. Parametric multi-state survival models: flexible modelling allowing transition-specific distributions with application to estimating clinically useful measures of effect differences. *Statistics in Medicine* 2017;36(29):4719-4742.