|
| |
DOE - One Vay Anova
Let's construct one way anova table for Pulp experiment.
First lets read data in:
> data<-read.table("/afs/stat.lsa.umich.edu/users/student/kutsyy/.public_html/classes/data/Table1.2",header=T)
> data
A
B C D
1 59.8 59.8 60.7 61.0
2 60.0 60.2 60.7 60.8
3 60.8 60.4 60.5 60.6
4 60.8 59.9 60.9 60.5
5 59.8 60.0 60.3 60.5
Now we need this data as one row:
> Y<-c(as.matrix(data))
> Y
[1] 59.8 60.0 60.8 60.8 59.8 59.8 60.2
60.4 59.9 60.0 60.7 60.7 60.5 60.9 60.3
[16] 61.0 60.8 60.6 60.5 60.5
Now we need to construct our X matrix. I will use contrast assuming
tau_1=0
> x1<-rep(c(0,1,0,0),c(5,5,5,5))
> x2<-rep(c(0,0,1,0),c(5,5,5,5))
> x3<-rep(c(0,0,0,1),c(5,5,5,5))
> X<-cbind(1,x1,x2,x3)
> X
x1 x2 x3
[1,] 1 0 0 0
[2,] 1 0 0 0
[3,] 1 0 0 0
[4,] 1 0 0 0
[5,] 1 0 0 0
[6,] 1 1 0 0
[7,] 1 1 0 0
[8,] 1 1 0 0
[9,] 1 1 0 0
[10,] 1 1 0 0
[11,] 1 0 1 0
[12,] 1 0 1 0
[13,] 1 0 1 0
[14,] 1 0 1 0
[15,] 1 0 1 0
[16,] 1 0 0 1
[17,] 1 0 0 1
[18,] 1 0 0 1
[19,] 1 0 0 1
[20,] 1 0 0 1
And now using regular matrix multiplication we can construct ANOVA table.
(do it on your own).
There is an easier way to do it. We can run linear model on Splus,
and than ask for anova:
> fit<-lm(Y~x1+x2+x3)
> summary(fit)
Call: lm(formula = Y ~ x1 + x2 + x3)
Residuals:
Min
1Q Median 3Q Max
-0.44 -0.195 -0.07 0.175 0.56
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 60.2400
0.1458 413.2430 0.0000
x1 -0.1800 0.2062 -0.8731
0.3955
x2 0.3800 0.2062
1.8433 0.0839
x3 0.4400 0.2062
2.1343 0.0486
Residual standard error: 0.326 on 16 degrees
of freedom
Multiple R-Squared: 0.4408
F-statistic: 4.204 on 3 and 16 degrees of
freedom, the p-value is 0.02261
Correlation of Coefficients:
(Intercept)
x1 x2
x1 -0.7071
x2 -0.7071
0.5000
x3 -0.7071
0.5000 0.5000
> anova(fit)
Analysis of Variance Table
Response: Y
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value
Pr(F)
x1
1 0.770667 0.7706667 7.253333 0.0159931
x2
1 0.085333 0.0853333 0.803137 0.3834426
x3
1 0.484000 0.4840000 4.555294 0.0486371
Residuals 16 1.700000 0.1062500
Note I did not include constant in to lm formula. Also i could
use >fit<-lm(Y~X-1) to
get the same result.
Now. we can tell Splus to create X matrix for us. but first
we have to tell what kind of contrasts do we want to use.
>options(contrasts = c("contr.treatment",
"contr.poly"))
Would tell Splus to use treatment contrast (i.e. tau_1=0)
>options(contrasts = c("contr.sum", "contr.poly"))
Would tell Splus to use contrast with sum(tau_i)=0
>options(contrasts = c("contr.helmert", "contr.poly"))
Would be used for analyzing two level factorial design
(later in the course).
Now lets redo the same experiment using sum contrast:
> X<-rep(c(1,2,3,4),c(5,5,5,5))
> X
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4
4 4 4 4
> is.factor(X)
[1] F
> X<-as.factor(X)
> is.factor(X)
[1] T
> fit<-lm(Y~X)
> summary(fit)
Call: lm(formula = Y ~ X)
Residuals:
Min
1Q Median 3Q Max
-0.44 -0.195 -0.07 0.175 0.56
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 60.4000
0.0729 828.6812 0.0000
X1 -0.1600 0.1262 -1.2674
0.2232
X2 -0.3400 0.1262 -2.6932
0.0160
X3 0.2200 0.1262
1.7427 0.1006
Residual standard error: 0.326 on 16 degrees
of freedom
Multiple R-Squared: 0.4408
F-statistic: 4.204 on 3 and 16 degrees of
freedom, the p-value is 0.02261
Correlation of Coefficients:
(Intercept)
X1 X2
X1 0.0000
X2 0.0000 -0.3333
X3 0.0000 -0.3333
-0.3333
> anova(fit)
Analysis of Variance Table
Response: Y
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value
Pr(F)
X
3 1.34 0.4466667 4.203922 0.0226089
Residuals 16
1.70 0.1062500
> aov(fit)
Call:
aov(formula = fit)
Terms:
X Residuals
Sum of Squares 1.34
1.70
Deg. of Freedom 3
16
Residual standard error: 0.3259601
Estimated effects may be unbalanced
> model.matrix(fit)
(Intercept) X1 X2 X3
1
1 1 0 0
2
1 1 0 0
3
1 1 0 0
4
1 1 0 0
5
1 1 0 0
6
1 0 1 0
7
1 0 1 0
8
1 0 1 0
9
1 0 1 0
10
1 0 1 0
11
1 0 0 1
12
1 0 0 1
13
1 0 0 1
14
1 0 0 1
15
1 0 0 1
16
1 -1 -1 -1
17
1 -1 -1 -1
18
1 -1 -1 -1
19
1 -1 -1 -1
20
1 -1 -1 -1
Do the same analysis using treatment contrast.
Now lets compare this model to simple model:
> fit1<-lm(Y~1)
> summary(fit1)
Call: lm(formula = Y ~ 1)
Residuals:
Min 1Q Median
3Q Max
-0.6 -0.4 0.1 0.325
0.6
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 60.4000 0.0894
675.2925 0.0000
Residual standard error: 0.4 on 19 degrees
of freedom
Multiple R-Squared: 5.979e-28
F-statistic: Inf on 0 and 19 degrees of freedom,
the p-value is NA
Warning messages:
One or more nonpositive parameters
in: pf(x$fstatistic[1], x$fstatistic[2],
x$fstatistic[3])
> anova(fit1,fit)
Analysis of Variance Table
Response: Y
Terms Resid. Df RSS Test Df Sum
of Sq F Value Pr(F)
1 1
19 3.04
2 X
16 1.70 3
1.34 4.203922 0.0226089
|