• Welcome to the MHE Blog. We'll use this space to post corrections and comments and to address any questions you might have about the material in Mostly Harmless Econometrics. We especially welcome questions that are likely to be of interest to other readers. The econometric universe is infinite and expanding, so we ask that questions be brief and relate to material in the book. Here is an example:
    "In Section 8.2.3 (on page 319), you suggest that 42 clusters is enough for the usual cluster variance formula to be fairly accurate. Is that a joke, or do you really think so?"
    To which we'd reply: "Yes."

Data up for Lee (2008)

Dave Lee has graciously contributed data and programs from his landmark RD study.  You can get the goods in the MHE Data Archive.


Just-identified IV

Gary Solon of Michigan State University pointed out to us that our claim on p. 209 that “just identified 2SLS is median unbiased” is not quite correct and should be qualified. Gary notes that if the first stage is really zero, the just-identified IV estimator is centered at the same point as the biased OLS estimator. Similarly, just-identified IV is biased when the instrument is extremely weak, as has been shown in the literature.

Gary is right, of course, and we thank him for pointing this out. Just-identified IV is approximately median unbiased, but if the instrument is weak enough you’ll certainly have bias. On the other hand, if a single instrument is really that weak, you’re unlikely to want to use it, since a very low first-stage t-statistic and large second-stage standard errors will warn you away. See the attached note for details.
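
Gary's point is easy to see in a small Monte Carlo. The sketch below is only an illustration under assumptions of our own choosing (one instrument, normal errors, no intercept, and hypothetical values for the true coefficient `beta`, the error correlation `rho`, and the first-stage coefficient `pi`); it is not taken from the book or from the attached note.

```python
import numpy as np

def simulate_iv(pi, n=200, reps=5000, beta=1.0, rho=0.8, seed=0):
    """Median just-identified IV and OLS estimates across simulated samples.

    pi   : first-stage coefficient (0 means the instrument is irrelevant)
    rho  : correlation between first-stage and structural errors (hypothetical)
    beta : true causal effect (hypothetical)
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((reps, n))   # instrument
    v = rng.standard_normal((reps, n))   # first-stage error
    w = rng.standard_normal((reps, n))
    e = rho * v + np.sqrt(1 - rho**2) * w  # structural error, correlated with v
    x = pi * z + v                       # endogenous regressor
    y = beta * x + e
    # All variables have mean zero by construction, so we skip the intercept.
    iv = (z * y).sum(axis=1) / (z * x).sum(axis=1)
    ols = (x * y).sum(axis=1) / (x * x).sum(axis=1)
    return np.median(iv), np.median(ols)

for pi in (0.0, 0.05, 1.0):
    med_iv, med_ols = simulate_iv(pi)
    print(f"pi={pi:4.2f}  median IV={med_iv:6.3f}  median OLS={med_ols:6.3f}")
```

In this design the true effect is 1.0 and OLS with an irrelevant instrument converges to `beta + rho`. With `pi = 0` the IV median should sit close to that OLS value, and with a strong first stage it should sit close to the truth.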


Comments on Bad Control

Derek Neal of the University of Chicago comments that our discussion of bad control in section 3.2.3 leaves the impression that more control is always better as long as the controls are pre-determined relative to the causal variable of interest. The leading counter-example is the case of within-family or twins estimates that we discuss as the “baby with the bathwater problem” on p. 226. Here you might indeed increase omitted variables bias even though the controls are not bad in the section 3.2.3 sense:

Hi Guys:

I agree that the issue I am raising is conceptually different, but as a practical matter, the “bad control” issues and “baby with the bathwater problem” both fall under a larger heading of “can more controls ever make things worse.” Your discussion of bad control may lead some students to believe that the answer is “only if the extra controls are endogenous.”

If you ever have a second edition, I think there is an argument for dealing with all aspects of the “can more controls ever make things worse” question all in one place.

Point taken! We hope to fix this in the next edition . . .


Typos and mistakes on pages 177 and 183

Careful reader Ian Gow from Stanford caught the following two typos/mistakes:

Assumption CA1 on p. 177 should read just like assumption A1 on p. 155, except conditional on X_i (the subscript 0 on Y_0i is incorrect).

On p. 183, in the paragraph beginning “The size of the group of compliers is given by . . .”: the statement that P(S_1i >= s >= S_0i) is non-negative by virtue of monotonicity is silly. Of course this is non-negative, since it’s a probability! Monotonicity is needed, however, for this probability to equal the difference in the CDFs of S_0i and S_1i (as the sentence following should read).

Thanks Ian!
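
The role of monotonicity in the p. 183 correction can be checked numerically. The sketch below uses made-up integer schooling data and assumes the convention P(S_1i >= s > S_0i) for the complier share and P(S < s) for the CDFs; the exact weak or strict inequalities in the book may differ, so treat those choices as our assumption.

```python
import numpy as np

def complier_share_check(s0, s1, s):
    """Return (P(S1 >= s > S0), P(S0 < s) - P(S1 < s)) in the sample."""
    s0 = np.asarray(s0)
    s1 = np.asarray(s1)
    direct = np.mean((s1 >= s) & (s0 < s))        # share moved across s
    cdf_gap = np.mean(s0 < s) - np.mean(s1 < s)   # difference in CDFs
    return direct, cdf_gap

rng = np.random.default_rng(0)
n = 10_000
s0 = rng.integers(8, 17, size=n)                 # hypothetical schooling, 8..16

# Monotone case: the instrument never reduces schooling (S1 >= S0 for all i).
s1_mono = s0 + rng.integers(0, 3, size=n)
# Non-monotone case: some people get less schooling when the instrument is on.
s1_mixed = s0 + rng.integers(-2, 3, size=n)

print("monotone:    ", complier_share_check(s0, s1_mono, 12))
print("non-monotone:", complier_share_check(s0, s1_mixed, 12))
```

The first quantity is a probability in both cases, so it is never negative. Only in the monotone case does it coincide with the CDF difference; without monotonicity the CDF difference also subtracts the share of people moved below s.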


Mistake on page 74

Careful reader Israel Arroyo caught this mistake:

Sir,

I’m reading the amazing “Mostly Harmless” and I’ve found what I believe to be a typo (though maybe it is not and I’m just getting dumber). In Chapter 3, p. 74, near the end of the 2nd paragraph, it says “[…] regression of Yi on Di and Xi is the same as the regression of Yi on E[Yi|Di,Xi]”. Shouldn’t it say “the same as the regression of E[Yi|Di,Xi] on Di and Xi”?

Thank you very much for your patience and once again for a fantastic book,
Israel Arroyo

Indeed it should! Thank you, Israel.
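
The corrected statement is easy to verify when the covariates are discrete, because the conditional expectation function can then be estimated by cell means. The sketch below uses simulated data with hypothetical coefficients and a deliberately nonlinear CEF; it shows that regressing Yi on Di and Xi gives the same coefficients as regressing the cell means of Yi on Di and Xi.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 5_000
d = rng.integers(0, 2, size=n)   # binary treatment
x = rng.integers(0, 4, size=n)   # discrete covariate with four values
# Hypothetical outcome whose CEF is nonlinear in x.
y = 1.0 + 2.0 * d + 0.5 * x**2 + rng.standard_normal(n)

# Sample analogue of E[Y | D, X]: the mean of y within each (d, x) cell.
cell = d * 4 + x
cell_sum = np.bincount(cell, weights=y, minlength=8)
cell_count = np.bincount(cell, minlength=8)
cef = cell_sum[cell] / cell_count[cell]

X = np.column_stack([np.ones(n), d, x])
b_y = ols(X, y)       # regression of Y on D and X
b_cef = ols(X, cef)   # regression of E[Y | D, X] on D and X

print(b_y)
print(b_cef)
```

The two coefficient vectors agree up to floating-point error, because the regressors are constant within cells and the deviations of Yi from its cell mean sum to zero in every cell.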


Josh found a confusing argument on page 67 (how did Steve let this slip in?!)

The bottom of page 67 discusses proxy control and suggests that if you regress the proxy on the variable of interest (schooling) and find a zero effect, you should be less inclined to worry about the bias from proxy control. This doesn’t make much sense because what we care about when assessing the bias from proxy control is equation 3.2.14, the regression of the proxy on both schooling and the correct control. But of course we can’t run that regression because we don’t have the correct control. The regression of the proxy control on schooling isn’t informative about 3.2.14 unless the correct control is uncorrelated with schooling. In that case, however, we wouldn’t have needed to control for it in the long regression (3.2.13) in the first place.

We’ll clear all this up in the next edition as well!
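
In the meantime, here is a small simulated example of why the short regression of the proxy on schooling can mislead. Everything in it is a hypothetical construction of ours: the variable names, the parameter values, and a noiseless proxy that depends on schooling (`pi1`) and on the correct control (`pi2`), in the spirit of 3.2.14. We pick `pi1` so that the short regression of the proxy on schooling has a population slope of zero even though `pi1` is not zero.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 200_000
rho, gamma = 0.10, 0.50          # hypothetical return to schooling and to ability

ability = rng.standard_normal(n)                 # the correct (unobserved) control
school = 0.5 * ability + rng.standard_normal(n)  # schooling, correlated with ability

# Population slope from regressing ability on schooling: cov / var = 0.5 / 1.25.
delta = 0.5 / 1.25
pi1, pi2 = -delta, 1.0           # chosen so that pi1 + pi2 * delta = 0
proxy = pi1 * school + pi2 * ability
wage = rho * school + gamma * ability + rng.standard_normal(n)

const = np.ones(n)
short_slope = ols(np.column_stack([const, school]), proxy)[1]
proxy_coef = ols(np.column_stack([const, school, proxy]), wage)[1]
long_coef = ols(np.column_stack([const, school, ability]), wage)[1]

print("proxy on schooling (short):    ", round(short_slope, 3))
print("schooling coef, proxy control: ", round(proxy_coef, 3))
print("schooling coef, correct control:", round(long_coef, 3))
```

In this design the short regression shows essentially no relationship between the proxy and schooling, yet controlling for the proxy gives a schooling coefficient near `rho - gamma * pi1 / pi2` (0.3 here) rather than the 0.1 recovered with the correct control.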
