|
|
Analysis of Covariance> cath <- read.table("/afs/stat.lsa.umich.edu/users/student/kutsyy/.public_html/classes/data/cathedral.data")
> cath
style x y
Durham r 75 502
Canterbury r 80 522
etc.
WinchesterG g 103 530
Old.St.Paul g 103 611
Salisbury g 84 473
> summary(cath)
style x y
g:16 Min. : 45.00 Min. :182.0
r: 9 1st Qu.: 67.00 1st Qu.:370.0
Median : 75.00 Median :425.0
Mean : 74.76 Mean :425.5
3rd Qu.: 83.00 3rd Qu.:519.0
Max. :103.00 Max. :611.0
Now plot the data:
> motif() > plot(cath$x,cath$y,type="n",xlab="Nave height",ylab="Length") > text(cath$x,cath$y,as.character(cath$s))Now fit the seperate regression lines model. Note that y ~ x*style is equivalent. > g <- lm(y ~ x+style+x:style, data=cath) > summary(g) Residuals: Min 1Q Median 3Q Max -172.7 -30.22 23.75 55.78 89.5 Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) 139.4718 173.6037 0.8034 0.4307 x 3.9728 2.3206 1.7120 0.1016 style 102.3610 173.6037 0.5896 0.5617 x:style -0.8347 2.3206 -0.3597 0.7227 Residual standard error: 79.11 on 21 degrees of freedom Multiple R-Squared: 0.5412 F-statistic: 8.257 on 3 and 21 degrees of freedom, the p-value is 0.0008072Because style is non-numeric, S-PLUS automatically treats it as a qualitative variables and sets up a coding - but which coding? > model.matrix(g) (Intercept) x style x:style Durham 1 75 1 75 Canterbury 1 80 1 80 etc. Old.St.Paul 1 103 -1 -103 Salisbury 1 84 -1 -84We can use another coding - for example: > dim(cath) [1] 25 3 > dummy <- rep(0,25) > dummy[cath$s == 'g'] <- 1 > dummy [1] 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1Refit the model - note the necessity of using I() so that * gets interpreted as multiply. > ga <- lm( y ~ x + dummy + I(x*dummy), data=cath) > summary(ga) Residuals: Min 1Q Median 3Q Max -172.7 -30.22 23.75 55.78 89.5 Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) 241.8327 336.4711 0.7187 0.4802 x 3.1381 4.5059 0.6964 0.4938 dummy -204.7220 347.2075 -0.5896 0.5617 I(x * dummy) 1.6695 4.6411 0.3597 0.7227 Residual standard error: 79.11 on 21 degrees of freedom Multiple R-Squared: 0.5412 F-statistic: 8.257 on 3 and 21 degrees of freedom, the p-value is 0.0008072Compare to the previous regression result - what's the same and what's different? We see that the model can be simplified to > g <- lm(y ~ x+style, data=cath) > summary(g) Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) 84.4942 80.6861 1.0472 0.3064 x 4.7116 1.0584 4.4516 0.0002 style 40.1963 16.1532 2.4884 0.0209 Residual standard error: 77.53 on 22 degrees of freedom Multiple R-Squared: 0.5384 F-statistic: 12.83 on 2 and 22 degrees of freedom, the p-value is 0.0002028Put the two parallel regression on the plot: > abline(84.4942+40.1963,4.7116) > abline(84.4942-40.1963,4.7116,lty=2)Check out the diagnostics: > par(mfrow=c(2,3)) > plot(g)What do you see? Our conclusion is that for cathedrals of the same height, Romanesque ones are about 80 (=2*40.2) feet longer. For each extra foot height, both types of cathedral are about 4.7 feet longer. Consider the summary subsetted by style > by(cath,cath$style,summary) cath$style:g style x y g:16 Min. : 45.00 Min. :182.0 r: 0 1st Qu.: 60.75 1st Qu.:298.8 Median : 73.50 Median :412.0 Mean : 74.94 Mean :397.4 3rd Qu.: 86.50 3rd Qu.:481.2 Max. :103.00 Max. :611.0 ------------------------------------------------------------ cath$style:r style x y g:0 Min. :64.00 Min. :344.0 r:9 1st Qu.:70.00 1st Qu.:425.0 Median :75.00 Median :502.0 Mean :74.44 Mean :475.4 3rd Qu.:80.00 3rd Qu.:530.0 Max. :83.00 Max. :551.0Notice that in this case the two groups have about the same average heights. Look at the difference in the lengths. With the alternative coding, we get: > ga <- lm( y ~ x + dummy, data=cath) > summary(ga) Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) 124.6905 82.9216 1.5037 0.1469 x 4.7116 1.0584 4.4516 0.0002 dummy -80.3925 32.3063 -2.4884 0.0209 Residual standard error: 77.53 on 22 degrees of freedom Multiple R-Squared: 0.5384 F-statistic: 12.83 on 2 and 22 degrees of freedom, the p-value is 0.0002028What does the coefficient for dummy tell us? |