Survival Analysis
\[ \renewcommand{\P}{\mathsf{P}} \newcommand{\m}{\mathsf{m}} \newcommand{\p}{\mathsf{p}} \newcommand{\q}{\mathsf{q}} \newcommand{\bb}{\mathsf{b}} \newcommand{\g}{\mathsf{g}} \newcommand{\rr}{\mathsf{r}} \newcommand{\IF}{\mathbb{IF}} \newcommand{\dd}{\mathsf{d}} \newcommand{\Pn}{$\mathsf{P}_n$} \newcommand{\E}{\mathsf{E}} \]
Data considerations
As with binary outcomes, survival outcomes should be coded using 0’s and 1’s, where 1 indicates that the event has occurred and 0 that it has not.
Similar to how we encode censoring variables, we consider the outcome to be degenerate: once an observation experiences the event, all future outcome variables should also be coded with a 1 (“last-observation-carried-forward”).
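As a minimal sketch of this coding rule, the base R snippet below carries an event forward within a single observation's outcome sequence. The helper name carry_event_forward() is hypothetical and used only for illustration, and the sketch assumes the sequence is ordered by time and contains no missing values.

```r
# Hypothetical helper: once a 1 appears, every later entry becomes 1.
# Assumes `y` is a 0/1 vector ordered by time with no missing values.
carry_event_forward <- function(y) {
  cummax(y)
}

# An event first recorded at the third time point stays "on" afterwards.
carry_event_forward(c(0, 0, 1, 0, 0))
# returns 0 0 1 1 1
```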
Point-treatment survival problems
Up to this point, we’ve been ignoring that the covid dataset should be treated as a point-treatment survival problem. Let’s re-estimate the effect of the randomized treatment within a survival framework.
We need to transform the data from long to wide format and impute the outcome using last-observation-carried-forward.
Our modified dataset should look like this:
Use the function event_locf() to make sure the outcome variables are correctly recorded.
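The two steps above might be sketched as follows on a small made-up dataset. The data frame long and its columns id, day, and event are assumptions for illustration and are not the columns of the covid data; the reshaping uses pivot_wider() from tidyr, and event_locf() is called with the wide data and the outcome column names ordered by time.

```r
library(tidyr)
library(lmtp)

# Toy long-format data (hypothetical): one row per id and day.
long <- data.frame(
  id    = rep(1:3, each = 4),
  day   = rep(1:4, times = 3),
  event = c(0, 0, 1, 0,   # id 1: event on day 3, not yet carried forward
            0, 0, 0, 0,   # id 2: no event
            1, 0, 0, 0)   # id 3: event on day 1
)

# Long to wide: one row per id, one outcome column per day (Y_1, ..., Y_4).
wide <- pivot_wider(
  long,
  id_cols = id,
  names_from = day,
  values_from = event,
  names_prefix = "Y_"
)

# Outcome columns must be listed in time order.
outcomes <- paste0("Y_", 1:4)

# Carry the event forward so the outcome is degenerate after it occurs.
wide_locf <- event_locf(as.data.frame(wide), outcomes)
wide_locf
```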
Instead of estimating the effect of treatment on the outcome only at the last time point, we can estimate it at every follow-up interval.
Let’s estimate the effect of the treatment on intubation on each day. To do so, we can use the function lmtp_survival().
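Because the column names of the covid data are not shown here, the sketch below uses sim_point_surv, a simulated point-treatment survival dataset bundled with lmtp, as a stand-in. The variable names (trt, Y.1 to Y.6, C.0 to C.5, W1, W2) belong to that simulated dataset, and folds = 2 is chosen only to keep the example fast; an analysis of the covid data would substitute its own columns and settings.

```r
library(lmtp)

# Column names from the simulated dataset shipped with lmtp (a stand-in,
# not the covid data).
trt      <- "trt"
outcomes <- paste0("Y.", 1:6)   # one outcome column per time point
cens     <- paste0("C.", 0:5)   # censoring indicators
baseline <- c("W1", "W2")       # baseline covariates

set.seed(1)

# Estimates at every time point if everyone had been treated.
fit_treated <- lmtp_survival(
  data = sim_point_surv,
  trt = trt,
  outcomes = outcomes,
  baseline = baseline,
  cens = cens,
  shift = static_binary_on,
  estimator = "lmtp_tmle",
  folds = 2
)

# Estimates at every time point if no one had been treated.
fit_control <- lmtp_survival(
  data = sim_point_surv,
  trt = trt,
  outcomes = outcomes,
  baseline = baseline,
  cens = cens,
  shift = static_binary_off,
  estimator = "lmtp_tmle",
  folds = 2
)
```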
We can now visualize our results using a survival plot.
The main result of an lmtp object can be extracted using the tidy() function from the broom package.
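One way to turn tidied estimates into a plot is sketched below. The helper plot_survival_estimates() is hypothetical; it assumes its input has one row per follow-up time point, in time order, with columns estimate, conf.low, and conf.high, which is the layout tidy() is expected to return for lmtp fits. It builds its own time index rather than relying on a particular time column name.

```r
library(ggplot2)

# Hypothetical helper: plot point estimates and confidence intervals over time.
# Assumes `est` has columns `estimate`, `conf.low`, `conf.high`, with rows
# ordered by follow-up time.
plot_survival_estimates <- function(est) {
  est <- as.data.frame(est)
  est$time_index <- seq_len(nrow(est))  # 1, 2, ... in row order

  ggplot(est, aes(x = time_index, y = estimate)) +
    geom_step() +
    geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
    labs(x = "Follow-up time point", y = "Estimate")
}
```

For a fitted object named fit (a placeholder for the result of lmtp_survival()), usage would look like plot_survival_estimates(tidy(fit)).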
Time-varying treatment
Hoffman et al. (2024) demonstrated the use of modified treatment policies for survival outcomes to assess the effect of delaying invasive mechanical ventilation (IMV) on mortality among patients hospitalized with COVID-19 in New York City during the first COVID-19 wave. A synthetic version of the data used for that analysis has been loaded into R as intubation.
The data consist of \(n = 2000\) patients hospitalized with COVID-19 who were followed for \(\tau = 14\) days.
There are 10 baseline confounders and 4 time-varying confounders.
The outcome of interest is an indicator for death on day \(t\).
Observations are subject to loss-to-follow-up due to either hospital discharge or transfer.
Let’s consider the following intervention
\[ \dd_t(a_t, h_t) = \begin{cases} 1 & \text{if } a_t = 2 \text{ and } a_s \leq 1 \ \forall s < t \\ a_t & \text{otherwise}, \end{cases} \]
where \(A_t\) is a 3-level categorical variable: 0, no supplemental oxygen; 1, non-IMV supplemental oxygen support; 2, IMV.
In words, this function corresponds to an intervention where patients who were naturally observed as first receiving IMV on day \(t\) instead receive non-IMV supplemental oxygen on that day, so that IMV is delayed by one day to day \(t+1\). Let’s translate this policy to an R function that we can use with lmtp.
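A minimal sketch of such a function follows. lmtp shift functions take the data and the name of the current treatment column and return the modified treatment vector. Everything else here is an assumption for illustration: the factory make_delay_imv(), the treatment column names A_1 to A_14, and the numeric 0/1/2 coding are hypothetical and may differ from the actual intubation data.

```r
# Hypothetical factory: builds a shift function for a given, time-ordered
# vector of treatment column names. Assumes treatment is coded numerically
# as 0 (no oxygen), 1 (non-IMV oxygen), 2 (IMV).
make_delay_imv <- function(trt_cols) {
  function(data, trt) {
    t <- match(trt, trt_cols)      # position of the current time point
    stopifnot(!is.na(t))

    a_t <- data[[trt]]

    # Did the patient receive IMV at any earlier time point?
    had_imv_before <- rep(FALSE, length(a_t))
    for (col in trt_cols[seq_len(t - 1)]) {
      had_imv_before <- had_imv_before | (data[[col]] %in% 2)
    }

    # First day of IMV: replace IMV (2) with non-IMV oxygen (1).
    delay <- (a_t %in% 2) & !had_imv_before
    a_t[delay] <- 1
    a_t
  }
}

# Shift function for 14 days of follow-up (column names are assumed).
delay_imv <- make_delay_imv(paste0("A_", 1:14))
```

Checking earlier columns in the observed data mirrors the condition \(a_s \leq 1\) for all \(s < t\) in the definition above: treatment is only modified on the first day a patient is observed on IMV.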
We can now estimate the effect of delaying intubation by one day on 14-day mortality.