Survival Analysis
\[ \renewcommand{\P}{\mathsf{P}} \newcommand{\m}{\mathsf{m}} \newcommand{\p}{\mathsf{p}} \newcommand{\q}{\mathsf{q}} \newcommand{\bb}{\mathsf{b}} \newcommand{\g}{\mathsf{g}} \newcommand{\rr}{\mathsf{r}} \newcommand{\IF}{\mathbb{IF}} \newcommand{\dd}{\mathsf{d}} \newcommand{\Pn}{$\mathsf{P}_n$} \newcommand{\E}{\mathsf{E}} \]
Data considerations
Similar to when the outcome is binary, survival outcomes should be coded using 0’s and 1’s where 1 indicates the occurrence of an event and 0 otherwise.
As with censoring variables, we treat the outcome as degenerate: once an observation experiences the event, all subsequent outcome variables should also be coded as 1 (“last-observation-carried-forward”).
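The coding rule above can be illustrated with a small base R sketch. The helper name carry_event_forward() and the columns Y_1 to Y_3 are hypothetical and exist only for this illustration; this is not the package's implementation.

```r
# Carry an event indicator forward across time-ordered outcome columns:
# once a 1 is seen, every later column is set to 1.
carry_event_forward <- function(data, outcomes) {
  for (j in seq_along(outcomes)[-1]) {
    prev <- data[[outcomes[j - 1]]]
    had_event <- !is.na(prev) & prev == 1
    data[[outcomes[j]]][had_event] <- 1
  }
  data
}

# Toy wide-format data: observation 2 has the event on day 1,
# observation 3 on day 2; later values were left missing.
toy <- data.frame(
  id  = 1:3,
  Y_1 = c(0, 1, 0),
  Y_2 = c(0, NA, 1),
  Y_3 = c(0, NA, NA)
)

coded <- carry_event_forward(toy, c("Y_1", "Y_2", "Y_3"))
print(coded)
```

Only events are carried forward here; a 0 followed by a missing value is left missing, since the observation may simply have been censored.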
Point-treatment survival problems
Up to this point, we’ve been ignoring that the covid dataset should be treated as a point-treatment survival problem. Let’s re-estimate the effect of the randomized treatment with a survival framework.
We need to transform the data from long to wide format and impute the outcome using last-observation-carried-forward.
Our modified dataset should look like this:
Use the function event_locf() to make sure the outcome variables are correctly recorded.
- Instead of just estimating the effect of treatment on the outcome at the last time point, we can estimate the effect of a treatment on an outcome at all follow-up intervals.
Let’s estimate the effect of the treatment on intubation on each day. To do so, we can use the function lmtp_survival().
We can now visualize our results using a survival plot.
The main result of an lmtp object can be extracted using the tidy() function from the broom package.
Time-varying treatment
Hoffman et al. (2024) demonstrated the use of modified treatment policies for survival outcomes to assess the effect of delaying invasive mechanical ventilation (IMV) on mortality among patients hospitalized with COVID-19 in New York City during the first COVID-19 wave. A synthetic version of the data used for that analysis has been loaded into R as intubation.
The data consist of \(n = 2000\) patients hospitalized with COVID-19 who were followed for \(\tau = 14\) days.
There are 10 baseline confounders and 4 time-varying confounders.
The outcome of interest is an indicator for death on day \(t\).
Observations are subject to loss-to-follow-up due to either hospital discharge or transfer.
Let’s consider the following intervention:
\[ \dd_t(a_t, h_t) = \begin{cases} 1 & \text{if } a_t = 2 \text{ and } a_s \leq 1 \ \forall s < t, \\ a_t & \text{otherwise,} \end{cases} \]
where \(A_t\) is a 3-level categorical variable: 0, no supplemental oxygen; 1, non-IMV supplemental oxygen support; 2, IMV.
In words, this function corresponds to an intervention in which patients who were naturally observed to start IMV on day \(t\) instead receive non-IMV supplemental oxygen support on that day, so that IMV is delayed by one day to day \(t+1\). Let’s translate this policy into an R function that we can use with lmtp.
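As a minimal sketch of that translation, the function below follows the two-argument form (the data and the name of the current treatment column) that lmtp expects of a shift function. The column names A_1, A_2, … and the numeric 0/1/2 coding are assumptions made for illustration; the actual variable names in intubation may differ.

```r
# Delay IMV by one day: if IMV (level 2) is observed on day t and was not
# observed on any earlier day, replace it with non-IMV oxygen support (level 1).
# Assumes treatment columns are named A_<day> (hypothetical) and coded 0/1/2.
delay_imv <- function(data, trt) {
  trt_cols <- grep("^A_[0-9]+$", names(data), value = TRUE)
  day <- as.integer(sub("^A_", "", trt_cols))
  t <- as.integer(sub("^A_", "", trt))
  prior_cols <- trt_cols[day < t]

  # TRUE when a_s <= 1 for all s < t (missing values count as "no IMV")
  no_prior_imv <- rep(TRUE, nrow(data))
  for (col in prior_cols) {
    no_prior_imv <- no_prior_imv & !(data[[col]] %in% 2)
  }

  a_t <- data[[trt]]
  a_t[a_t %in% 2 & no_prior_imv] <- 1
  a_t
}

# Toy check: patient 1 starts IMV on day 2, patient 2 on day 1,
# patient 3 never receives IMV.
toy <- data.frame(
  A_1 = c(0, 2, 1),
  A_2 = c(2, 2, 1),
  A_3 = c(2, 2, 0)
)
shifted_day1 <- delay_imv(toy, "A_1")
shifted_day2 <- delay_imv(toy, "A_2")
shifted_day3 <- delay_imv(toy, "A_3")
```

Because the check for earlier IMV uses the observed treatment history, only the first day of IMV is changed; later days keep their observed value, which corresponds to a one-day delay.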
We can now estimate the effect of delaying IMV by 1 day on 14-day mortality.
Competing risks
In the context of survival analysis, competing risks refer to events that preclude the occurrence of the primary event of interest. In the previous example, we treated hospital discharge or transfer as censoring events. These events, however, are actually competing risks because we know that if a patient was discharged or transferred out of the ICU on day \(t\), they did not die on that day.
Treating competing risks as censoring events amounts to estimating the effect of an intervention that also eliminates the competing event. The identification assumptions for such an intervention are stronger than those for an intervention that does not eliminate the competing event.
In the presence of competing risks, lmtp can estimate cumulative incidence effects. A cumulative incidence effect can be interpreted as the total effect of treatment operating through pathways that include the competing events. Let’s re-evaluate the effect of delaying IMV by 1 day on 14-day mortality, but this time treat discharge or transfer as a competing risk. We first need to modify the data:
- Flip the columns corresponding to discharge or transfer so that a 1 indicates a discharge or transfer occurred
- Impute missing values for discharge and death using last-observation carried forward because they are both deterministic variables once they have occurred (i.e., if a patient died, their probability of discharge or transfer is 0 and vice-versa)
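The two steps above can be sketched in base R on a toy dataset. The column names (C_t for the original censoring indicators, D_t for the flipped discharge or transfer indicators, Y_t for death) and the helper locf() are hypothetical; the actual columns in intubation may be named differently.

```r
# Fill missing values in time-ordered columns with the last observed value.
locf <- function(data, cols) {
  for (j in seq_along(cols)[-1]) {
    cur <- data[[cols[j]]]
    missing <- is.na(cur)
    cur[missing] <- data[[cols[j - 1]]][missing]
    data[[cols[j]]] <- cur
  }
  data
}

# Toy data: C_t = 1 means still under observation, 0 means discharged or
# transferred. Patient 2 is discharged on day 2; patient 3 dies on day 2.
toy <- data.frame(
  C_1 = c(1, 1, 1), C_2 = c(1, 0, 1), C_3 = c(1, NA, NA),
  Y_1 = c(0, 0, 0), Y_2 = c(0, NA, 1), Y_3 = c(0, NA, NA)
)

cens_cols <- paste0("C_", 1:3)
comp_cols <- paste0("D_", 1:3)
out_cols  <- paste0("Y_", 1:3)

prepared <- toy
# Step 1: flip the indicators so that 1 marks a discharge or transfer.
for (j in seq_along(cens_cols)) {
  prepared[[comp_cols[j]]] <- 1 - toy[[cens_cols[j]]]
}
# Step 2: carry the last observed value forward for both event types.
prepared <- locf(prepared, comp_cols)
prepared <- locf(prepared, out_cols)
```

In the result, the discharged patient keeps a death indicator of 0 on all later days, and the patient who died keeps a discharge indicator of 0, matching the deterministic relationship described in the list.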
We can now re-estimate the effect of delaying IMV by 1 day. Instead of passing the discharge vector to the cens argument, we pass it to the compete argument.