What is the best way to estimate the average treatment effect in a longitudinal study?



In a longitudinal study, outcomes $Y_{it}$ of units $i$ are measured repeatedly at time points $t$, with a fixed total of $m$ measurement occasions (fixed = the measurements are taken at the same times for all units).

Units are randomly assigned to either a treatment group ($G=1$) or a control group ($G=0$). I want to estimate and test the average treatment effect,

$$\text{ATE} = E(Y \mid G=1) - E(Y \mid G=0),$$

where the expectations are taken across time and across individuals. I am considering a multilevel (mixed-effects) model for fixed occasions for this purpose (model 1):

$$Y_{it} = \alpha + \beta G_i + u_{0i} + e_{it},$$

with $\alpha$ the intercept, $\beta$ the ATE, $u_{0i}$ a random intercept across units, and $e_{it}$ the residual.

Now I am considering an alternative model (model 2):

$$Y_{it} = \tilde{\beta} G_i + \sum_{j=1}^{m} \kappa_j d_{ij} + \sum_{j=1}^{m} \gamma_j d_{ij} G_i + \tilde{u}_{0i} + \tilde{e}_{it},$$

which contains a fixed effect $\kappa_j$ for each occasion, where the dummy $d_{ij} = 1$ if unit $i$'s measurement occasion $t$ equals $j$ and $0$ otherwise. In addition, this model contains an interaction between treatment and time with parameters $\gamma_j$. So this model takes into account that the effect of $G$ may differ across time. This is informative in itself, but I believe it should also increase the precision of the parameter estimates, because the heterogeneity in $Y$ is taken into account.

However, in this model the $\tilde{\beta}$ coefficient no longer seems to equal the ATE. Instead it represents the ATE at the first occasion ($t=1$). So the estimate of $\tilde{\beta}$ may be more efficient than that of $\beta$, but it does not represent the ATE anymore.

My questions are:

  • What is the best way to estimate the treatment effect in this longitudinal study design?
  • Do I have to use model 1, or is there a way to use the (perhaps more efficient) model 2?
  • Is there a way to make $\tilde{\beta}$ have the interpretation of the ATE and the $\gamma_j$ the occasion-specific deviations (e.g. using effect coding)?

In model 2, isn't the ATE equal to $\tilde{\beta}$ plus the average of the $\gamma_j$?
jujae

If your purpose is exclusively estimating the ATE, then model 1 will suffice, since it will be unbiased. Adding period or interaction terms to the model will reduce the variance of your estimate, I believe. And I think you might want to try coding the $\gamma_j$ with deviation coding (deviation from the average)?
jujae

@jujae The primary reason for model 2 is variance reduction, yes. But I wonder how to get the ATE out of model 2. Your first comment seems to be a pointer. Can you show this or elaborate? Then this would be close to an answer to my question!
tomka

When you fit model 2, $\tilde{\beta}$ has the interpretation of the ATE in period 1. The coefficients of the interaction term, for identifiability, are coded with the ATE at period 1 as the reference level. Therefore the $\gamma_j$ in the software output are actually the differences between the treatment effect at period $j$ and at period 1. So at each period $j$ the ATE is $\tilde{\beta} + \gamma_j$, and averaging the period-specific ATEs leads to the grand-mean ATE, which is $\beta$ in your model 1.
jujae

Answers:



Addressing your question "I wonder how to get the ATE out of model 2" in the comments:

First of all, in your model 2 not all $\gamma_j$ are identifiable, which leads to rank deficiency in the design matrix. It is necessary to drop one level, for instance by assuming $\gamma_j = 0$ for $j=1$. That is, use contrast coding and assume the treatment effect deviation at period 1 is 0. R will code the interaction term with the treatment effect at period 1 as the reference level, and that is also the reason why $\tilde{\beta}$ has the interpretation of the treatment effect at period 1. SAS will code the treatment effect at period $m$ as the reference level instead, so there $\tilde{\beta}$ has the interpretation of the treatment effect at period $m$, not period 1 anymore.
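To make the rank-deficiency point concrete, here is a small sketch (in Python for verifiability; toy sizes and a made-up design): with a treatment column, all $m$ occasion dummies, and all $m$ treatment-by-occasion dummies, the treatment column equals the sum of the interaction columns, so the design matrix is short of full column rank; dropping the $j=1$ interaction column restores it.

```python
import numpy as np

# Toy design: m = 3 occasions, 2 treated + 2 control units.
m = 3
trt_unit = np.array([0, 0, 1, 1])
trt = np.repeat(trt_unit, m)                           # G_i, one row per occasion
period = np.tile(np.arange(m), 4)
d = (period[:, None] == np.arange(m)).astype(float)    # occasion dummies d_ij
inter = d * trt[:, None]                               # interactions d_ij * G_i

# trt equals the row-sum of the interaction columns -> linear dependency.
X_full = np.column_stack([trt, d, inter])              # keeps gamma_1
X_drop = np.column_stack([trt, d, inter[:, 1:]])       # drops gamma_1

print(np.linalg.matrix_rank(X_full), X_full.shape[1])  # rank 6 < 7 columns
print(np.linalg.matrix_rank(X_drop), X_drop.shape[1])  # rank 6 == 6 columns
```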

Assuming the contrast is created the R way, the coefficient estimated for each interaction term (I will still denote it by $\gamma_j$, though it is not precisely what you defined in your model) has the interpretation of the difference in treatment effect between period $j$ and period 1. Denoting the ATE at each period by $\text{ATE}_j$, we have $\gamma_j = \text{ATE}_j - \text{ATE}_1$ for $j = 2, \dots, m$. Therefore an estimator for $\text{ATE}_j$ is $\tilde{\beta} + \gamma_j$ (ignoring, out of laziness, the notational difference between the true parameter and its estimator). And naturally your

$$\text{ATE} = \beta = \frac{1}{m}\sum_{j=1}^{m}\text{ATE}_j = \frac{\tilde{\beta} + (\tilde{\beta} + \gamma_2) + \cdots + (\tilde{\beta} + \gamma_m)}{m} = \tilde{\beta} + \frac{1}{m}(\gamma_2 + \cdots + \gamma_m).$$
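As a quick numeric sanity check of this identity (a Python sketch using the same hypothetical period-specific effects, 2 through 5, that the R simulation below uses):

```python
# Check ATE = beta_tilde + (gamma_2 + ... + gamma_m) / m numerically.
# Hypothetical period-specific ATEs, as in trt.period below: 2, 3, 4, 5.
ate_j = [2.0, 3.0, 4.0, 5.0]
m = len(ate_j)

beta_tilde = ate_j[0]                          # ATE at period 1
gamma = [a - ate_j[0] for a in ate_j]          # gamma_1 = 0 by the contrast
ate_from_model2 = beta_tilde + sum(gamma) / m  # beta_tilde + mean of deviations

ate_direct = sum(ate_j) / m                    # grand-mean ATE
print(ate_from_model2, ate_direct)             # both 3.5
```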

I did a simple simulation in R to verify this:

set.seed(1234)
time <- 4
n <- 2000
trt.period <- c(2, 3, 4, 5)  # period-specific treatment effects, ATE = 3.5
kj <- c(1, 2, 3, 4)          # period main effects
intercept <- rep(rnorm(n, 1, 1), each = time)  # unit-level random intercepts
eij <- rnorm(n * time, 0, 1.5)
trt <- rep(c(rep(0, n / 2), rep(1, n / 2)), each = time)
y <- intercept + trt * rep(trt.period, n) + rep(kj, n) + eij
sim.data <- data.frame(id = rep(1:n, each = time),
                       period = factor(rep(1:time, n)),
                       y = y, trt = factor(trt))

library(lme4)
fit.model1 <- lmer(y~trt+(1|id), data=sim.data)
beta <- getME(fit.model1, "fixef")["trt1"]

fit.model2 <- lmer(y~trt*period + (1|id), data=sim.data)
beta_t <- getME(fit.model2, "fixef")["trt1"]
gamma_j <- getME(fit.model2, "fixef")[c("trt1:period2","trt1:period3","trt1:period4")]

results <- c(beta, beta_t + sum(gamma_j) / time)
names(results) <- c("ATE.m1", "ATE.m2")
print(results)

And the results verify this:

  ATE.m1   ATE.m2 
3.549213 3.549213  

I don't know how to directly change the contrast coding in model 2 above, so to illustrate how one can directly use a linear function of the interaction terms, as well as how to obtain the standard error, I used the multcomp package:

sim.data$tp <- interaction(sim.data$trt, sim.data$period)
fit.model3 <- lmer(y~tp+ (1|id), data=sim.data)
library(multcomp)
# w= tp.1.1 + (tp.2.1-tp.2.0)+(tp.3.1-tp.3.0)+(tp.4.1-tp.4.0)
# tp.x.y=interaction effect of period x and treatment y
w <- matrix(c(0, 1,-1,1,-1,1,-1,1)/time,nrow=1)
names(w)<- names(getME(fit.model3,"fixef"))
xx <- glht(fit.model3, linfct=w)
summary(xx)

And here is the output:

 Simultaneous Tests for General Linear Hypotheses
Fit: lmer(formula = y ~ tp + (1 | id), data = sim.data)
Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)    
1 == 0  3.54921    0.05589   63.51   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

I think the standard error is obtained as $\sqrt{w \hat{V} w^{T}}$, with $w$ the linear combination above and $\hat{V}$ the estimated variance-covariance matrix of the coefficients from model 3.
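To illustrate that formula, here is a Python sketch with a made-up 2x2 covariance matrix (not the one estimated from model 3):

```python
import numpy as np

# SE of a linear combination w of coefficients: sqrt(w V w^T),
# using a small hypothetical covariance matrix V.
V = np.array([[0.04, 0.01],
              [0.01, 0.09]])        # made-up vcov of two coefficients
w = np.array([[0.5, 0.5]])          # linear combination: their average

se = float(np.sqrt(w @ V @ w.T))    # sqrt(w V w^T)
print(se)                           # sqrt(0.0375)
```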

Deviation coding

Another way to give $\tilde{\beta}$ directly the interpretation of the ATE is to use deviation coding, so that the remaining covariates represent the comparisons $\text{ATE}_j - \text{ATE}$:

# Deviation-coded dummies for treated units: +1 in period j, -1 in period 1
sim.data$p2vsmean <- 0
sim.data$p3vsmean <- 0
sim.data$p4vsmean <- 0
sim.data$p2vsmean[sim.data$period == 2 & sim.data$trt == 1] <- 1
sim.data$p3vsmean[sim.data$period == 3 & sim.data$trt == 1] <- 1
sim.data$p4vsmean[sim.data$period == 4 & sim.data$trt == 1] <- 1
sim.data$p2vsmean[sim.data$period == 1 & sim.data$trt == 1] <- -1
sim.data$p3vsmean[sim.data$period == 1 & sim.data$trt == 1] <- -1
sim.data$p4vsmean[sim.data$period == 1 & sim.data$trt == 1] <- -1

fit.model4 <- lmer(y ~ trt + p2vsmean + p3vsmean + p4vsmean + (1|id), data = sim.data)

Output:

Fixed effects:
            Estimate Std. Error t value
(Intercept)  3.48308    0.03952   88.14
trt1         3.54921    0.05589   63.51
p2vsmean    -1.14774    0.04720  -24.32
p3vsmean     1.11729    0.04720   23.67
p4vsmean     3.01025    0.04720   63.77
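As a cross-check on the deviation-coding idea, here is a compact Python sketch that mirrors the R simulation with plain OLS (ignoring the random intercept, which does not bias the fixed effects; sizes and effect values are copied from the setup above, the seed is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 2000                      # periods, units
ate_j = np.array([2., 3., 4., 5.])  # period-specific effects, ATE = 3.5
kj = np.array([1., 2., 3., 4.])     # period main effects

trt = np.repeat(np.r_[np.zeros(n // 2), np.ones(n // 2)], m)
period = np.tile(np.arange(m), n)
y = (np.repeat(rng.normal(1, 1, n), m)          # unit random intercept
     + kj[period] + trt * ate_j[period]
     + rng.normal(0, 1.5, n * m))

# Deviation-coded interaction columns: +1 in period j (treated),
# -1 in period 1 (treated), 0 otherwise -- as in the R code above.
dev = np.zeros((n * m, m - 1))
for j in range(1, m):
    dev[(period == j) & (trt == 1), j - 1] = 1
    dev[(period == 0) & (trt == 1), j - 1] = -1

X = np.column_stack([np.ones(n * m), trt, dev])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef[1])   # close to 3.5: the trt coefficient is the ATE directly
```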

Good - but how do I get a standard error estimate? And shouldn't it be possible to code the interactions / period effects in a way that $\tilde{\beta}$ (your beta_t) is the ATE directly (then with an S.E. estimate)?
tomka

@tomka, it is possible; I don't know how to directly change the contrast matrix of the interaction term in model 2, will do some research and come back later.
jujae

Thinking about your answer, I found this. I think the deviation coding does what I want. You could test it and include it in your answer. ats.ucla.edu/stat/sas/webbooks/reg/chapter5/…
tomka

@tomka: That is exactly what's on my mind; see my original comment to your question where I mentioned deviation coding :). I will try to implement this and update the answer later. (Having some trouble doing it in R without manually creating dummy variables for the coding, but it looks like that is the only way to do so.)
jujae

@tomka: sorry for the delay, I updated the deviation coding part.
jujae


For the first question, my understanding is that "fancy" ways are only needed when it's not immediately obvious that treatment is independent of potential outcomes. In these cases, you need to argue that some aspect of the data allows for an approximation of random assignment to treatment, which gets us to instrumental variables, regression discontinuity, and so forth.

In your case, units are randomly assigned to treatment, so it seems believable that treatment is independent of potential outcomes. Then we can just keep things simple: estimate model 1 with ordinary least squares, and you have a consistent estimate of the ATE. Since units are randomly assigned to treatment, this is one of the few cases where a random-effects assumption is believable.
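As a minimal illustration of that point (toy numbers, Python for brevity): with a randomized binary treatment, the OLS coefficient on $G$ in model 1 is just the treated-minus-control difference in means.

```python
import numpy as np

# Made-up toy sample: 3 control, 3 treated outcomes.
y = np.array([2.0, 1.0, 3.0, 6.0, 5.0, 7.0])
g = np.array([0, 0, 0, 1, 1, 1])

# OLS of y on an intercept and the binary treatment indicator.
X = np.column_stack([np.ones_like(y), g])
_, beta = np.linalg.lstsq(X, y, rcond=None)[0]

diff_means = y[g == 1].mean() - y[g == 0].mean()
print(beta, diff_means)   # both 4.0
```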

Licensed under cc by-sa 3.0 with attribution required.