Adding lagged dependent variables to differenced models
Reader Christopher Ordowich asks:
In sections 5.3-5.4, there is a great discussion of using
fixed effects vs. a lagged dependent variable with panel data. I am
having trouble reconciling some of this discussion with a section in a
recent paper by Imbens and Wooldridge (2008) titled “Recent
Developments in the Econometrics of Program Evaluation.” On page 68 of
their paper (as published by IZA in 2008) they suggest that it might
be better in some circumstances with two periods of data to use first
differencing and a lag of the dependent variable (assuming
unconfoundedness given lagged outcomes). I understand your discussion
of instrumenting for lagged variables if you have more than two
periods, but with two periods, how do you react to adding a lag (the
baseline value of the dependent variable) after first differencing
with only two periods of data? I have had difficulty finding support
for this approach elsewhere and given that you have given much thought
to this issue, I was wondering what your opinion might be.
The way I see it, once you add a lagged dependent variable to a differenced model, you are really doing lagged-dep-var control and not fixed effects. Steve may disagree (he’s generally less dogmatic than me). This is not always exactly true, but it is a theorem for the simple example we use to contrast f.e. and lagged-dep-var control in Section 5.4.
Here’s that again:
two periods
no covariates
the treatment, D_it, is zero for everybody in period 1 and switched on for some in period 2 (think of a training program that some people participate in between periods; period 1 is before, period 2 is after, similar to Ashenfelter and Card, 1985)
ignoring constants, fixed effects estimation fits
(1) Y_it – Y_it-1 = aD_it + error
lagged dependent variable estimation fits
(2) Y_it = gY_it-1 + bD_it + error
As I understand it, the Imbens-Wooldridge proposal is to throw Y_it-1 into equation (1):
(3) Y_it – Y_it-1 = dY_it-1 + cD_it + error
But in this case, c is (algebraically) the same as b. Why? The coefficient c is
c= COV(Y_it – Y_it-1, D_it*)/V(D_it*)
where D_it* is the residual from a regression of D_it on Y_it-1. But this residual is orthogonal to Y_it-1, hence
c= COV(Y_it – Y_it-1, D_it*)/V(D_it*) = COV(Y_it, D_it*)/V(D_it*) = b in equation (2)
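The algebra above can be checked numerically. What follows is only a minimal sketch on simulated data, not anything from the book: the data-generating process, the sample size, and the names y_lag, y_now, d_treat and ols are all made up for illustration. It fits equations (1), (2) and (3) by OLS with a constant and compares the treatment coefficients.

```python
import numpy as np


def ols(y, *regressors):
    """Return OLS coefficients of y on a constant plus the given regressors."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Hypothetical two-period data (an assumption for illustration only):
# people with low period-1 outcomes are more likely to be treated.
rng = np.random.default_rng(0)
n = 5_000
y_lag = rng.normal(size=n)                       # Y_it-1
p_treat = 1.0 / (1.0 + np.exp(y_lag))            # treatment probability falls with Y_it-1
d_treat = (rng.random(n) < p_treat).astype(float)  # D_it
y_now = 0.6 * y_lag + 2.0 * d_treat + rng.normal(size=n)  # Y_it

# (1) differenced outcome on treatment only
_, a = ols(y_now - y_lag, d_treat)
# (2) level outcome on lagged outcome and treatment
_, g, b = ols(y_now, y_lag, d_treat)
# (3) differenced outcome on lagged outcome and treatment
_, d, c = ols(y_now - y_lag, y_lag, d_treat)

print(f"(1) a = {a:.4f}")
print(f"(2) g = {g:.4f}, b = {b:.4f}")
print(f"(3) d = {d:.4f}, c = {c:.4f}")
print("c equals b:", np.isclose(c, b))
print("d equals g - 1:", np.isclose(d, g - 1.0))
```

Up to floating-point error, c matches b in any sample, because moving Y_it-1 to the left-hand side only shifts the coefficient on Y_it-1 by one (so d = g – 1). The estimate a from equation (1) will generally differ whenever treatment is correlated with Y_it-1, as it is in this made-up design.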
So I say: “You wanna do fixed effects? No lagged dependent variable, please (or at least be prepared to instrument it if you include one). You wanna control for lagged dependent variables? Then just do it!”
— JDA