What is the best way to estimate the average treatment effect in a longitudinal study?



In a longitudinal study, outcomes $Y_{it}$ of units $i$ are measured repeatedly at time points $t$, with a fixed total of $m$ measurement occasions (fixed = the measurements are taken at the same times for all units).

Units are randomly assigned to either a treatment group ($G=1$) or a control group ($G=0$). I want to estimate and test the average treatment effect,

$$\text{ATE} = E(Y \mid G=1) - E(Y \mid G=0),$$

where the expectations are taken across time and across individuals. I am considering a multilevel (mixed-effects) model for fixed occasions for this purpose (model 1):

$$Y_{it} = \alpha + \beta G_i + u_{0i} + e_{it},$$

with $\alpha$ the intercept, $\beta$ the ATE, $u_{0i}$ a random intercept across units, and $e_{it}$ the residual.

Now I am considering an alternative model (model 2):

$$Y_{it} = \tilde{\beta} G_i + \sum_{j=1}^{m} \kappa_j d_{ij} + \sum_{j=1}^{m} \gamma_j d_{ij} G_i + \tilde{u}_{0i} + \tilde{e}_{it},$$

which contains a fixed effect $\kappa_j$ for each occasion, where the dummy $d_{ij} = 1$ if unit $i$'s measurement occasion $t$ equals $j$ and $0$ otherwise. In addition, this model contains an interaction between treatment and time with parameters $\gamma_j$. So this model takes into account that the effect of $G$ may differ across time. This is informative in itself, but I believe it should also increase the precision of the parameter estimates, because the heterogeneity in $Y$ is taken into account.

However, in this model the $\tilde{\beta}$ coefficient no longer seems to equal the ATE. Instead it represents the ATE at the first occasion ($t=1$). So the estimate of $\tilde{\beta}$ may be more efficient than that of $\beta$, but it does not represent the ATE anymore.

My questions are:

  • What is the best way to estimate the treatment effect in this longitudinal study design?
  • Do I have to use model 1, or is there a way to use the (perhaps more efficient) model 2?
  • Is there a way to make $\tilde{\beta}$ have the interpretation of the ATE and the $\gamma_j$ the occasion-specific deviations (e.g. using effect coding)?

In model 2, isn't the ATE equal to $\tilde{\beta}$ plus the average of the $\gamma_j$?
jujae

If your purpose is exclusively estimating the ATE, then model 1 will suffice, since it will be unbiased. Adding period or interaction terms to the model will reduce the variance of your estimate, I believe. And I think you might want to try coding the $\gamma_j$ with deviation coding (deviation from the average)?
jujae

@jujae The primary reason for model 2 is variance reduction, yes. But I wonder how to get the ATE out of model 2. Your first comment seems to be a pointer. Can you show this or elaborate? Then this would be close to an answer to my question!
tomka

When you fit model 2, $\tilde{\beta}$ has the interpretation of the ATE in period 1. The coefficients of the interaction term, for identifiability, are coded with the ATE at period 1 as the reference level. Therefore the $\gamma_j$ in the software output are actually the differences between the treatment effect at period $j$ and at period 1. So at each period $j$ the ATE is $\tilde{\beta} + \gamma_j$, and averaging the period-specific ATEs leads to the grand-mean ATE, which is $\beta$ in your model 1.
jujae

Answers:



Addressing your question "I wonder how to get the ATE out of model 2" in the comments:

First of all, in your model 2 not all $\gamma_j$ are identifiable, which leads to rank deficiency in the design matrix. It is necessary to drop one level, for instance by assuming $\gamma_j = 0$ for $j=1$. That is, use contrast coding and assume the treatment effect deviation at period 1 is 0. R will code the interaction term with the treatment effect at period 1 as the reference level, and that is also the reason why $\tilde{\beta}$ has the interpretation of the treatment effect at period 1. SAS will code the treatment effect at period $m$ as the reference level instead, so there $\tilde{\beta}$ has the interpretation of the treatment effect at period $m$, not period 1 anymore.
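To make the rank-deficiency point concrete, here is a small sketch (in Python for verifiability; toy sizes and a made-up design): with a treatment column, all $m$ occasion dummies, and all $m$ treatment-by-occasion dummies, the treatment column equals the sum of the interaction columns, so the design matrix is short of full column rank; dropping the $j=1$ interaction column restores it.

```python
import numpy as np

# Toy design: m = 3 occasions, 2 treated + 2 control units.
m = 3
trt_unit = np.array([0, 0, 1, 1])
trt = np.repeat(trt_unit, m)                           # G_i, one row per occasion
period = np.tile(np.arange(m), 4)
d = (period[:, None] == np.arange(m)).astype(float)    # occasion dummies d_ij
inter = d * trt[:, None]                               # interactions d_ij * G_i

# trt equals the row-sum of the interaction columns -> linear dependency.
X_full = np.column_stack([trt, d, inter])              # keeps gamma_1
X_drop = np.column_stack([trt, d, inter[:, 1:]])       # drops gamma_1

print(np.linalg.matrix_rank(X_full), X_full.shape[1])  # rank 6 < 7 columns
print(np.linalg.matrix_rank(X_drop), X_drop.shape[1])  # rank 6 == 6 columns
```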

Assuming the contrast is created the R way, the coefficient estimated for each interaction term (I will still denote it by $\gamma_j$, though it is not precisely what you defined in your model) has the interpretation of the difference in treatment effect between period $j$ and period 1. Denoting the ATE at each period by $\text{ATE}_j$, we have $\gamma_j = \text{ATE}_j - \text{ATE}_1$ for $j = 2, \dots, m$. Therefore an estimator for $\text{ATE}_j$ is $\tilde{\beta} + \gamma_j$ (ignoring, out of laziness, the notational difference between the true parameter and its estimator). And naturally your

$$\text{ATE} = \beta = \frac{1}{m}\sum_{j=1}^{m}\text{ATE}_j = \frac{\tilde{\beta} + (\tilde{\beta} + \gamma_2) + \cdots + (\tilde{\beta} + \gamma_m)}{m} = \tilde{\beta} + \frac{1}{m}(\gamma_2 + \cdots + \gamma_m).$$
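As a quick numeric sanity check of this identity (a Python sketch using the same hypothetical period-specific effects, 2 through 5, that the R simulation below uses):

```python
# Check ATE = beta_tilde + (gamma_2 + ... + gamma_m) / m numerically.
# Hypothetical period-specific ATEs, as in trt.period below: 2, 3, 4, 5.
ate_j = [2.0, 3.0, 4.0, 5.0]
m = len(ate_j)

beta_tilde = ate_j[0]                          # ATE at period 1
gamma = [a - ate_j[0] for a in ate_j]          # gamma_1 = 0 by the contrast
ate_from_model2 = beta_tilde + sum(gamma) / m  # beta_tilde + mean of deviations

ate_direct = sum(ate_j) / m                    # grand-mean ATE
print(ate_from_model2, ate_direct)             # both 3.5
```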

I did a simple simulation in R to verify this:

set.seed(1234)
time <- 4
n <- 2000
trt.period <- c(2, 3, 4, 5)  # period-specific treatment effects, ATE = 3.5
kj <- c(1, 2, 3, 4)          # period main effects
intercept <- rep(rnorm(n, 1, 1), each = time)  # unit-level random intercepts
eij <- rnorm(n * time, 0, 1.5)
trt <- rep(c(rep(0, n / 2), rep(1, n / 2)), each = time)
y <- intercept + trt * rep(trt.period, n) + rep(kj, n) + eij
sim.data <- data.frame(id = rep(1:n, each = time),
                       period = factor(rep(1:time, n)),
                       y = y, trt = factor(trt))

library(lme4)
fit.model1 <- lmer(y~trt+(1|id), data=sim.data)
beta <- getME(fit.model1, "fixef")["trt1"]

fit.model2 <- lmer(y~trt*period + (1|id), data=sim.data)
beta_t <- getME(fit.model2, "fixef")["trt1"]
gamma_j <- getME(fit.model2, "fixef")[c("trt1:period2","trt1:period3","trt1:period4")]

results <- c(beta, beta_t + sum(gamma_j) / time)
names(results) <- c("ATE.m1", "ATE.m2")
print(results)

And the results verify this:

  ATE.m1   ATE.m2 
3.549213 3.549213  

I don't know how to directly change the contrast coding in model 2 above, so to illustrate how one can directly use a linear function of the interaction terms, as well as how to obtain the standard error, I used the multcomp package:

sim.data$tp <- interaction(sim.data$trt, sim.data$period)
fit.model3 <- lmer(y~tp+ (1|id), data=sim.data)
library(multcomp)
# w= tp.1.1 + (tp.2.1-tp.2.0)+(tp.3.1-tp.3.0)+(tp.4.1-tp.4.0)
# tp.x.y=interaction effect of period x and treatment y
w <- matrix(c(0, 1,-1,1,-1,1,-1,1)/time,nrow=1)
names(w)<- names(getME(fit.model3,"fixef"))
xx <- glht(fit.model3, linfct=w)
summary(xx)

And here is the output:

 Simultaneous Tests for General Linear Hypotheses
Fit: lmer(formula = y ~ tp + (1 | id), data = sim.data)
Linear Hypotheses:
       Estimate Std. Error z value Pr(>|z|)    
1 == 0  3.54921    0.05589   63.51   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

I think the standard error is obtained as $\sqrt{w \hat{V} w^{T}}$, with $w$ the linear combination above and $\hat{V}$ the estimated variance-covariance matrix of the coefficients from model 3.
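To illustrate that formula, here is a Python sketch with a made-up 2x2 covariance matrix (not the one estimated from model 3):

```python
import numpy as np

# SE of a linear combination w of coefficients: sqrt(w V w^T),
# using a small hypothetical covariance matrix V.
V = np.array([[0.04, 0.01],
              [0.01, 0.09]])        # made-up vcov of two coefficients
w = np.array([[0.5, 0.5]])          # linear combination: their average

se = float(np.sqrt(w @ V @ w.T))    # sqrt(w V w^T)
print(se)                           # sqrt(0.0375)
```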

Deviation coding

Another way to give $\tilde{\beta}$ directly the interpretation of the ATE is to use deviation coding, so that the remaining covariates represent the comparisons $\text{ATE}_j - \text{ATE}$:

# Deviation-coded dummies for treated units: +1 in period j, -1 in period 1
sim.data$p2vsmean <- 0
sim.data$p3vsmean <- 0
sim.data$p4vsmean <- 0
sim.data$p2vsmean[sim.data$period == 2 & sim.data$trt == 1] <- 1
sim.data$p3vsmean[sim.data$period == 3 & sim.data$trt == 1] <- 1
sim.data$p4vsmean[sim.data$period == 4 & sim.data$trt == 1] <- 1
sim.data$p2vsmean[sim.data$period == 1 & sim.data$trt == 1] <- -1
sim.data$p3vsmean[sim.data$period == 1 & sim.data$trt == 1] <- -1
sim.data$p4vsmean[sim.data$period == 1 & sim.data$trt == 1] <- -1

fit.model4 <- lmer(y ~ trt + p2vsmean + p3vsmean + p4vsmean + (1|id), data = sim.data)

Output:

Fixed effects:
            Estimate Std. Error t value
(Intercept)  3.48308    0.03952   88.14
trt1         3.54921    0.05589   63.51
p2vsmean    -1.14774    0.04720  -24.32
p3vsmean     1.11729    0.04720   23.67
p4vsmean     3.01025    0.04720   63.77
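As a cross-check on the deviation-coding idea, here is a compact Python sketch that mirrors the R simulation with plain OLS (ignoring the random intercept, which does not bias the fixed effects; sizes and effect values are copied from the setup above, the seed is made up):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 2000                      # periods, units
ate_j = np.array([2., 3., 4., 5.])  # period-specific effects, ATE = 3.5
kj = np.array([1., 2., 3., 4.])     # period main effects

trt = np.repeat(np.r_[np.zeros(n // 2), np.ones(n // 2)], m)
period = np.tile(np.arange(m), n)
y = (np.repeat(rng.normal(1, 1, n), m)          # unit random intercept
     + kj[period] + trt * ate_j[period]
     + rng.normal(0, 1.5, n * m))

# Deviation-coded interaction columns: +1 in period j (treated),
# -1 in period 1 (treated), 0 otherwise -- as in the R code above.
dev = np.zeros((n * m, m - 1))
for j in range(1, m):
    dev[(period == j) & (trt == 1), j - 1] = 1
    dev[(period == 0) & (trt == 1), j - 1] = -1

X = np.column_stack([np.ones(n * m), trt, dev])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef[1])   # close to 3.5: the trt coefficient is the ATE directly
```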

Good - but how do I get a standard error estimate? And shouldn't it be possible to code the interactions / period effects in a way that $\tilde{\beta}$ (your beta_t) is the ATE directly (then with an S.E. estimate)?
tomka

@tomka, it is possible; I don't know how to directly change the contrast matrix of the interaction term in model 2, will do some research and come back later.
jujae

Thinking about your answer, I found this. I think the deviation coding does what I want. You could test it and include it in your answer. ats.ucla.edu/stat/sas/webbooks/reg/chapter5/…
tomka

@tomka: That is exactly what's on my mind; see my original comment to your question where I mentioned deviation coding :). I will try to implement this and update the answer later. (Having some trouble doing it in R without manually creating dummy variables for the coding, but it looks like that is the only way to do so.)
jujae

@tomka: sorry for the delay, I updated the deviation coding part.
jujae


For the first question, my understanding is that "fancy" ways are only needed when it's not immediately obvious that treatment is independent of potential outcomes. In these cases, you need to argue that some aspect of the data allows for an approximation of random assignment to treatment, which gets us to instrumental variables, regression discontinuity, and so forth.

In your case, units are randomly assigned to treatment, so it seems believable that treatment is independent of potential outcomes. Then we can just keep things simple: estimate model 1 with ordinary least squares, and you have a consistent estimate of the ATE. Since units are randomly assigned to treatment, this is one of the few cases where a random-effects assumption is believable.
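As a minimal illustration of that point (toy numbers, Python for brevity): with a randomized binary treatment, the OLS coefficient on $G$ in model 1 is just the treated-minus-control difference in means.

```python
import numpy as np

# Made-up toy sample: 3 control, 3 treated outcomes.
y = np.array([2.0, 1.0, 3.0, 6.0, 5.0, 7.0])
g = np.array([0, 0, 0, 1, 1, 1])

# OLS of y on an intercept and the binary treatment indicator.
X = np.column_stack([np.ones_like(y), g])
_, beta = np.linalg.lstsq(X, y, rcond=None)[0]

diff_means = y[g == 1].mean() - y[g == 0].mean()
print(beta, diff_means)   # both 4.0
```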

Licensed under cc by-sa 3.0 with attribution required.