Static Interventions

Modified

April 23, 2025

Synthetic COVID RCT Data

For our first example, we’ll use a synthetic dataset of \(n = 1000\) patients from a hypothetical clinical trial for a treatment to decrease intubation and death among patients hospitalized with COVID-19.

The data is based on a database of over 1,500 patients hospitalized at Weill Cornell Medicine New York Presbyterian Hospital prior to 15 May 2020 with COVID-19 confirmed through PCR.
To replicate a two-arm randomized clinical trial (RCT), we’ve simulated a hypothetical treatment variable (A) that is randomly assigned for each observation with probability 0.5.
The outcome of interest (event) is intubation or death by 15-days post-hospitalization.
Treatment is associated with increased survival.
Baseline covariates include: age, sex, BMI, smoking status, whether the patient required supplemental oxygen within 3 hours of presenting to the emergency department, number of comorbidities, number of relevant symptoms, presence of bilateral infiltrates on chest X-ray, dyspnea, and hypertension.

Population mean outcomes

Our goal is to estimate the average treatment effect (ATE) of the hypothetical treatment on intubation or death:

\[ \begin{align} \theta &= \E(Y^{A=1} - Y^{A=0}) \\ &= \textcolor{blue}{\E(Y^{A=1})} - \textcolor{red}{\E(Y^{A=0})} \end{align} \]

Notice that the ATE is composed of two parameters:

the proportion of patients who receive intubation or die in a hypothetical world where all patients receive treatment (\(\textcolor{blue}{\E(Y^{A=1})}\)), and
the proportion of patients who receive intubation or die in a hypothetical world where no patients receive treatment (\(\textcolor{red}{\E(Y^{A=0})}\)).

We refer to these parameters as population mean outcomes. Let’s rewrite the target parameter as a function of interventions using the framework we previously discussed. Let \(\dd_1(a, h) = 1\) and \(\dd_0(a, h) = 0\). Then,

\[ \theta = \textcolor{blue}{\E(Y^{\dd_1})} - \textcolor{red}{\E(Y^{\dd_0})}. \]

Remember, we refer to these interventions as static because they return the same value regardless of their input.

Writing shift functions

So, how do we translate the functions \(\dd_1\) and \(\dd_0\) into R code to be used with lmtp?

Policies can be specified using one of two arguments in lmtp estimators: shift or shifted.

If using shift, we supply a two-argument function of the form

d <- function(data, trt) {
  # Insert code here
}

The first argument should correspond to a dataset containing all the variables in \(O\).
The second argument should expect a string corresponding to the variable(s) \(A_t\) in \(O\).
This function should return a size \(n\) vector with the same class as \(A_t\) modified according to \(d_t\).

For \(\dd_1\) this function would be:

If we apply this function to our data we are returned a vector of 1s with the same length as the number of rows in the observed data.

Question: Which estimator should we use to estimate these parameters?

✅ Answer

Because this is a non-time-varying study, we should use TMLE.

Let’s now estimate \(\E(Y^{\dd_1})\) using lmtp.

Instead of specifying an intervention using shift, we could instead supply a modified version of data to the shifted argument where the variables \(A_t\) are replaced with \(A^d_t\).

Question: How can we interpret this result?

Answer

In a hypothetical world where all patients received treatment, the expected proportion of patients who receive invasive mechanical ventilation by day 15 of hospitalization is 0.2722 (95% CI: 0.24 to 0.30).

`lmtp` objects

A call to an lmtp estimator returns a list of class lmtp.

Values returned in an `lmtp` object.
Value	Description
estimator	The estimator used.
theta	The estimated population LMTP estimand.
standard_error	The estimated standard error of the LMTP estimand.
low	Lower bound of the 95% confidence interval of the LMTP estimand.
high	Upper bound of the 95% confidence interval of the LMTP estimand.
eif	The estimated, un-centered, influence function of the estimate.
shift	The shift function specifying the treatment policy of interest.
outcome_reg	An \(n \times \tau + 1\) matrix of outcome regression predictions. The mean of the first column is used for calculating theta.
density_ratios	An \(n \times \tau\) matrix of the estimated, non-cumulative, density ratios.
fits_m	A list the same length as `folds`, containing the fits at each time-point for each fold for the outcome regression.
fits_r	A list the same length as `folds`, containing the fits at each time-point for each fold of density ratio estimation.
outcome_type	The outcome variable type.

Let’s inspect some of these values.

Problem 1

Write a function to estimate the population mean outcome if no patients received treatment. Assign the result to an object named dont_treat.

✅ Solution

d0 <- function(data, trt) {
  rep(0, nrow(data))
}

dont_treat <- lmtp_tmle(
  data = covid, 
  trt = "A", 
  outcome = "event", 
  baseline = W, 
  outcome_type = "binomial", 
  shift = d0, 
  folds = 1, 
  learners_trt = "SL.glm", 
  learners_outcome = "SL.glm"
)

print(dont_treat)

lmtp already implements d1 and d0 as static_binary_on() and static_binary_off()!

Average treatment effect

With estimates of \(\E(Y^{\dd_1})\) and \(\E(Y^{\dd_0})\) we can now calculate the ATE. To do so, we’ll use the function lmtp_contrast().

In addition to the ATE, we can also calculate the causal risk ratio and the causal odds ratio.

Confirming what we already know, we’ve estimated the treatment as being associated with decreased intubation or death.

References

Goyal, Parag, Justin J Choi, Laura C Pinheiro, Edward J Schenck, Ruijun Chen, Assem Jabri, Michael J Satlin, et al. 2020. “Clinical Characteristics of Covid-19 in New York City.” New England Journal of Medicine 382 (24): 2372–74.

Williams, Nicholas, Michael Rosenblum, and Iván Dı́az. 2022. “Optimising Precision and Power by Machine Learning in Randomised Trials with Ordinal and Time-to-Event Outcomes with an Application to COVID-19.” Journal of the Royal Statistical Society Series A: Statistics in Society 185 (4): 2156–78.

Synthetic COVID RCT Data

Population mean outcomes

Writing shift functions

lmtp objects

Problem 1

Average treatment effect

References

`lmtp` objects