The Electric Commentary

Wednesday, May 18, 2005

Karlson on Freakonomics

Like Gordon, he has a few problems with the section on parenting, and he explains regression analysis:

Sometimes the researcher will set up a more complicated model, in which there is one slide switch for size of library and another slide switch for the size of library in a Black household and another slide switch for the size of library in a Latino household; that's called "interaction" and that becomes hazardous for two reasons. First, additional terms in the regression analysis use up degrees of freedom, which can be fatal to the project if there are more effects to estimate than there are observations to infer from, and which increases the standard error of the estimate, which can be fatal to sign and significance, and that's hard to get even on your six columns of specifications that worked best. Second, all statistical inference using a computer involves approximating rational numbers in base 10 (that's true even with exponential and logarithmic specifications; Mr. Spock had the right way to distract a computer years ago) with integers in binary or some other power of two, and more complicated switchboards such as my multiple-slide-switches create what we call "sparse" matrices with lots of zero values. The effect on the machinery is a combination of rounding problems and conditioning problems. Specification is thus a tradeoff of sufficient richness against economy of computing resources. It's possible to make inferences all the same, but it is not as easy as Levitt and Dubner make it sound. And to compare it to a golf handicap -- which is not that easy to work out -- is still to oversimplify.
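For anyone who wants to see the "slide switch" picture in numbers, here is a rough sketch. It is not Karlson's or Levitt's actual model. The household count, the three groups, and every variable name are made up for illustration. It builds one design matrix with a single library-size switch and another with the extra interaction switches. It then compares the leftover degrees of freedom, the condition number, and the share of zero entries.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 40                                   # hypothetical number of households
books = rng.uniform(0, 200, size=n)      # hypothetical size of the home library
group = np.arange(n) % 3                 # 0 = baseline group, 1 and 2 = two other groups
d1 = (group == 1).astype(float)          # dummy for group 1
d2 = (group == 2).astype(float)          # dummy for group 2

# Simple model: an intercept plus one "slide switch" for library size.
X_simple = np.column_stack([np.ones(n), books])

# Interaction model: extra switches that let the library-size slope
# differ for groups 1 and 2.
X_inter = np.column_stack([np.ones(n), books, d1, d2, books * d1, books * d2])


def residual_dof(X):
    """Observations minus the number of independently estimable coefficients."""
    return X.shape[0] - np.linalg.matrix_rank(X)


dof_simple = residual_dof(X_simple)
dof_inter = residual_dof(X_inter)

# 2-norm condition number: larger values mean rounding error is amplified more.
cond_simple = np.linalg.cond(X_simple)
cond_inter = np.linalg.cond(X_inter)

# Share of entries that are exactly zero in the interaction design matrix.
zero_share = np.mean(X_inter == 0)

print("residual degrees of freedom:", dof_simple, "vs", dof_inter)
print("condition number:", cond_simple, "vs", cond_inter)
print("share of zeros in interaction matrix:", zero_share)
```

Each added switch costs one degree of freedom, and adding columns can never lower the condition number. That is the tradeoff the passage describes: a richer specification against fewer effective observations and shakier arithmetic.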

