Multivariate exposures
\[ \renewcommand{\P}{\mathsf{P}} \newcommand{\m}{\mathsf{m}} \newcommand{\p}{\mathsf{p}} \newcommand{\q}{\mathsf{q}} \newcommand{\bb}{\mathsf{b}} \newcommand{\g}{\mathsf{g}} \newcommand{\rr}{\mathsf{r}} \newcommand{\IF}{\mathbb{IF}} \newcommand{\dd}{\mathsf{d}} \newcommand{\Pn}{$\mathsf{P}_n$} \newcommand{\E}{\mathsf{E}} \]
lmtp can estimate the effects of simultaneous interventions on multiple variables.
Practically, this is useful for assessing the effects of mixtures on environmental outcomes.
NIEHS Simulation Data
For our example of estimating the effects of simultaneous interventions on multiple variables, we will use simulated data from the 2015 NIEHS Mixtures Workshop. The data has already been loaded into R in the background as mixtures. You can view and download the raw data here.
The simulated data has \(n = 500\) observations and is intended to replicate a prospective cohort study.
The data is composed of 7 log-normally distributed and correlated exposure variables ("X1", "X2", "X3", "X4", "X5", "X6", "X7"), a single continuous outcome ("Y"), and one binary confounder ("Z"). There is no missing covariate data, no measurement error, and no censoring.
Only exposure variables X1, X2, X4, X5, and X7 have an effect on the outcome Y. However, the direction of the effects varies. X1, X2, and X7 are positively associated with the outcome. X4 and X5 are negatively associated with the outcome.
Multivariate shift functions
Only two things need to change when using lmtp estimators with multivariate treatments:
1. Instead of a vector, you should now pass a list to the trt argument.
2. The shift function should return a named list of vectors instead of a single vector.
Let’s use lmtp to estimate the effect on the outcome of a modified treatment policy that intervenes on all 7 exposures simultaneously:
\[ \dd(\mathbf{a}, h) = \begin{cases} \dd(a_1, h) = \begin{cases} a_1 - 0.2 &\text{ if } a_1 - 0.2 > 0 \\ a_1 &\text{ otherwise } \end{cases} \\ \dd(a_2, h) = \begin{cases} a_2 - 0.4 &\text{ if } a_2 - 0.4 > 0 \\ a_2 &\text{ otherwise } \end{cases} \\ \dd(a_3, h) = a_3 + 0.4 \\ \dd(a_4, h) = a_4 + 0.1 \\ \dd(a_5, h) = a_5 + 0.5 \\ \dd(a_6, h) = \begin{cases} a_6 - 0.2 &\text{ if } a_6 - 0.2 > 0 \\ a_6 &\text{ otherwise } \end{cases} \\ \dd(a_7, h) = \begin{cases} a_7 - 0.3 &\text{ if } a_7 - 0.3 > 0 \\ a_7 &\text{ otherwise } \end{cases} \end{cases} \]
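The policy above can be written as an R shift function along the following lines. This is a minimal sketch, not necessarily the workshop's own code: the names A and d are chosen to match the arguments used in the Problem 1 solution, the helper shift_down and the argument name trt are hypothetical, and it assumes lmtp calls the shift function with the data set and the character vector of exposure names for the time point.

```r
# Treatment columns for the single time point, wrapped in a list
# (one list element per time point).
A <- list(c("X1", "X2", "X3", "X4", "X5", "X6", "X7"))

# Hypothetical helper: subtract `delta` only where the shifted value stays
# positive; otherwise keep the observed value.
shift_down <- function(a, delta) ifelse(a - delta > 0, a - delta, a)

# Shift function for the policy above. `data` is the data set and `trt` is
# the character vector of exposure names (argument names are our choice).
d <- function(data, trt) {
  out <- list(
    shift_down(data[[trt[1]]], 0.2),
    shift_down(data[[trt[2]]], 0.4),
    data[[trt[3]]] + 0.4,
    data[[trt[4]]] + 0.1,
    data[[trt[5]]] + 0.5,
    shift_down(data[[trt[6]]], 0.2),
    shift_down(data[[trt[7]]], 0.3)
  )
  # Return a named list: one shifted vector per exposure.
  setNames(out, trt)
}

# Quick check on a two-row toy data frame (made-up values, not the NIEHS data):
# every exposure equals 1 in row 1 and 0.1 in row 2.
toy <- as.data.frame(matrix(c(1, 0.1), nrow = 2, ncol = 7,
                            dimnames = list(NULL, A[[1]])))
shifted <- d(toy, A[[1]])
```

In the toy check, the second row of X1 is left at its observed value because 0.1 - 0.2 is not positive, while X3 is shifted upward unconditionally.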
Problem 1
Using TMLE, estimate the population mean outcome under the simultaneous intervention we just defined. Fit both the treatment mechanism and the outcome regression using this set of learners: c("SL.mean", "SL.glm", "SL.gam", "SL.rpart", "SL.rpartPrune", "SL.step.interaction"). Assign the result to ans. To save time, don’t use crossfitting; lmtp has already been loaded into the R session.
✅ Solution
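set.seed(4363754)

# Candidate learners for both the treatment mechanism and the outcome regression
learners <- c("SL.mean",
              "SL.glm",
              "SL.gam",
              "SL.rpart",
              "SL.rpartPrune",
              "SL.step.interaction")

# A (the list of exposure columns) and d (the shift function for the policy
# above) are assumed to be defined already.
ans <- lmtp_tmle(data = mixtures,
                 trt = A,
                 outcome = "Y",
                 baseline = "Z",
                 shift = d,
                 mtp = TRUE,
                 outcome_type = "continuous",
                 learners_trt = learners,
                 learners_outcome = learners,
                 folds = 1)  # folds = 1: no crossfitting

print(ans)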
Problem 2
Compared to what was observed under the natural course of exposure, how did intervening upon the set of exposures affect the outcome? Estimate this effect using lmtp_contrast().
✅ Answer
obs_y <- mean(mixtures$Y)
lmtp_contrast(ans, ref = obs_y)