There are two possibilities: either I’m crazy, or everyone else is wrong.

Well, I’m ’bout to prove I’m not wrong.

Yes, I know it’s hard for people to accept we should abandon the standard stats curriculum in favor of the GLM.

But you can’t deny you’ll get identical results doing a t-test as a GLM.

“Prove it!” you say?

Well, I am happy to oblige.

Let’s go ahead and run a regular old t-test:

require(flexplot)
data(avengers)
t.test(ptsd~north_south, data=avengers, var.equal=TRUE)
## 
##  Two Sample t-test
## 
## data:  ptsd by north_south
## t = -8.195, df = 810, p-value = 9.755e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.4112410 -0.2523058
## sample estimates:
## mean in group north mean in group south 
##            3.834729            4.166502

(Note that R’s default t-test, Welch’s, applies a correction for unequal variances, so its results won’t match a regression; hence the var.equal=TRUE. Thanks for the reminder, Michael!)
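If you want to see what that correction actually changes, here is a minimal sketch on made-up data. Everything in it (the `sim` data frame, the group labels, the sample sizes) is hypothetical and has nothing to do with the avengers dataset; I've deliberately given the two groups unequal sizes and unequal spreads so the difference is easy to spot.

```r
# Simulated data (hypothetical, not the avengers dataset):
# two groups with unequal sizes and unequal spreads.
set.seed(1)
sim <- data.frame(
  group = factor(rep(c("a", "b"), times = c(30, 90))),
  y     = c(rnorm(30, mean = 0, sd = 3), rnorm(90, mean = 1, sd = 1))
)

welch  <- t.test(y ~ group, data = sim)                    # R's default (Welch)
pooled <- t.test(y ~ group, data = sim, var.equal = TRUE)  # pooled-variance t-test

# Pooled df is n1 + n2 - 2; Welch df is estimated from the data and is smaller here.
unname(pooled$parameter)
unname(welch$parameter)

# The t statistics differ as well, because the standard errors are computed differently.
c(welch = unname(welch$statistic), pooled = unname(pooled$statistic))
```

Only the pooled version shares its standard error and degrees of freedom with ordinary least squares, which is why it is the one to compare against a regression.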

Now let’s do the same thing as a glm (here, a plain linear model fit with lm()):

mod = lm(ptsd~north_south, data=avengers)
summary(mod)
## 
## Call:
## lm(formula = ptsd ~ north_south, data = avengers)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.0347 -0.3665 -0.0347  0.3653  3.2335 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       3.83473    0.02863 133.954  < 2e-16 ***
## north_southsouth  0.33177    0.04048   8.195 9.76e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.5768 on 810 degrees of freedom
## Multiple R-squared:  0.07656,    Adjusted R-squared:  0.07542 
## F-statistic: 67.16 on 1 and 810 DF,  p-value: 9.755e-16

Oh, would you look at that. The t-statistic for the t-test (-8.195) is exactly the same as the t-statistic for the “slope” in the linear model (8.195), at least in absolute value. The p-value (9.755e-16) and the degrees of freedom (810) match too. (The t-test version subtracts south from north, while the glm version does the opposite.)
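If you’d rather see why than take R’s word for it, here is the algebra in a nutshell. This is a sketch that assumes north_south is dummy-coded with north = 0 and south = 1 (R’s default coding, which the coefficient name north_southsouth suggests). The symbols are my own notation, not anything R prints.

```latex
\begin{align*}
y_i &= b_0 + b_1 x_i + e_i, \qquad x_i \in \{0, 1\} \\
b_0 &= \bar{y}_{\text{north}}, \qquad b_1 = \bar{y}_{\text{south}} - \bar{y}_{\text{north}} \\
s_p^2 &= \frac{\sum_i e_i^2}{n_{\text{north}} + n_{\text{south}} - 2} \\
t_{b_1} &= \frac{b_1}{s_p \sqrt{\dfrac{1}{n_{\text{north}}} + \dfrac{1}{n_{\text{south}}}}}
\end{align*}
```

The residual variance of the regression is the pooled variance of the two groups, so the last line is the pooled-variance t statistic with the subtraction flipped. You can check it against the output: the intercept (3.83473) is the north mean, the slope (0.33177) is 4.166502 minus 3.834729, and the slope divided by its standard error (0.04048) gives the 8.195, up to rounding.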

It’s. The. Same. Thing.

Bazinga.

Except the GLM approach gives you more than the t-test and is easily expandable; the t-test is NOT.
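To make “expandable” concrete, here is a minimal sketch on simulated data. The `sim` data frame, the `age` covariate, and the effect sizes are all invented for illustration; none of it comes from the avengers dataset.

```r
# Simulated data (hypothetical stand-in, not the avengers dataset).
set.seed(42)
n <- 200
sim <- data.frame(
  group = factor(rep(c("north", "south"), each = n / 2)),
  age   = rnorm(n, mean = 30, sd = 5)
)
sim$y <- 4 + 0.3 * (sim$group == "south") + 0.05 * sim$age + rnorm(n, sd = 0.5)

# Step 1: the two-group model reproduces the pooled-variance t-test.
tt   <- t.test(y ~ group, data = sim, var.equal = TRUE)
mod1 <- lm(y ~ group, data = sim)
t_lm <- summary(mod1)$coefficients["groupsouth", "t value"]
all.equal(abs(unname(tt$statistic)), abs(t_lm))

# Step 2: extend the same model with a covariate.
# t.test() has no slot for one; lm() just takes another term in the formula.
mod2 <- lm(y ~ group + age, data = sim)
summary(mod2)$coefficients
```

Going from step 1 to step 2 takes one extra term in the formula. With the t-test you would have to switch to a different procedure entirely.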

And, of course, we might as well look at a graphic of the glm (which you cannot do with the t-test, btw).

visualize(mod, plot="model")
