Two level factorial design
We will work with yield experiment (BHH - 308)
Let's set up data first:
> ntt<-factor(rep(c(160,180),8))
> ncc<-factor(rep(c(20,20,40,40),4))
> nkk<-rep(rep(c("A","B"),c(4,4)),2)
> yield<-c(60,72,54,68,52,83,45,80,59,74,50,69,50,81,46,79)
> options(contrasts = c("contr.helmert",
"contr.poly"))
>
Now let's fit model:
> fit<-lm(yield~ntt*ncc*nkk)
> model.matrix(fit)
(Intercept) ntt ncc nkk ntt:ncc
ntt:nkk ncc:nkk ntt:ncc:nkk
1
1 -1 -1 -1 1
1 1
-1
2
1 1 -1 -1 -1
-1 1
1
3
1 -1 1 -1 -1
1 -1
1
4
1 1 1 -1
1 -1 -1
-1
5
1 -1 -1 1 1
-1 -1
1
6
1 1 -1 1 -1
1 -1
-1
7
1 -1 1 1 -1
-1 1
-1
8
1 1 1 1
1 1
1 1
9
1 -1 -1 -1 1
1 1
-1
10
1 1 -1 -1 -1
-1 1
1
11
1 -1 1 -1 -1
1 -1
1
12
1 1 1 -1
1 -1 -1
-1
13
1 -1 -1 1 1
-1 -1
1
14
1 1 -1 1 -1
1 -1
-1
15
1 -1 1 1 -1
-1 1
-1
16
1 1 1 1
1 1
1 1
> anova(fit)
Analysis of Variance Table
Response: yield
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
ntt
1 2256.25 2256.25 1128.125 0.0000000
ncc
1 100.00 100.00 50.000 0.0001050
nkk
1 6.25 6.25
3.125 0.1150771
ntt:ncc 1
9.00 9.00 4.500 0.0666880
ntt:nkk 1
306.25 306.25 153.125 0.0000017
ncc:nkk 1
4.00 4.00 2.000 0.1950155
ntt:ncc:nkk 1
0.00 0.00 0.000 1.0000000
Residuals 8
16.00 2.00
> summary(fit)
Call: lm(formula = yield ~ ntt * ncc * nkk)
Residuals:
Min 1Q
Median 3Q Max
-2 -0.625 -2.22e-16 0.625
2
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 63.8750
0.3536 180.6658 0.0000
ntt 11.8750 0.3536 33.5876
0.0000
ncc -2.5000 0.3536 -7.0711
0.0001
nkk 0.6250 0.3536
1.7678 0.1151
ntt:ncc
0.7500 0.3536 2.1213
0.0667
ntt:nkk
4.3750 0.3536 12.3744
0.0000
ncc:nkk
0.5000 0.3536 1.4142
0.1950
ntt:ncc:nkk 0.0000
0.3536 0.0000 1.0000
Residual standard error: 1.414 on 8 degrees
of freedom
Multiple R-Squared: 0.9941
F-statistic: 191.6 on 7 and 8 degrees of
freedom, the p-value is 2.869e-08
Correlation of Coefficients:
(Intercept) ntt ncc nkk ntt:ncc ntt:nkk ncc:nkk
ntt 0
ncc 0
0
nkk 0
0 0
ntt:ncc 0
0 0 0
ntt:nkk 0
0 0 0 0
ncc:nkk 0
0 0 0 0
0
ntt:ncc:nkk 0
0 0 0 0
0 0
Let's make a few plots:
> X11()
> qqnorm.aov(fit,label=5)
Another model:
> fit<-lm(yield~(ntt+ncc+nkk)^2)
> model.matrix(fit)
(Intercept) ntt ncc nkk ntt:ncc
ntt:nkk ncc:nkk
1
1 -1 -1 -1 1
1 1
2
1 1 -1 -1 -1
-1 1
3
1 -1 1 -1 -1
1 -1
4
1 1 1 -1
1 -1 -1
5
1 -1 -1 1 1
-1 -1
6
1 1 -1 1 -1
1 -1
7
1 -1 1 1 -1
-1 1
8
1 1 1 1
1 1
1
9
1 -1 -1 -1 1
1 1
10
1 1 -1 -1 -1
-1 1
11
1 -1 1 -1 -1
1 -1
12
1 1 1 -1
1 -1 -1
13
1 -1 -1 1 1
-1 -1
14
1 1 -1 1 -1
1 -1
15
1 -1 1 1 -1
-1 1
16
1 1 1 1
1 1
1
> anova(fit)
Analysis of Variance Table
Response: yield
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
ntt
1 2256.25 2256.250 1269.141 0.0000000
ncc
1 100.00 100.000 56.250 0.0000369
nkk
1 6.25 6.250
3.516 0.0935497
ntt:ncc 1
9.00 9.000 5.062 0.0510033
ntt:nkk 1
306.25 306.250 172.266 0.0000004
ncc:nkk 1
4.00 4.000 2.250 0.1678507
Residuals 9
16.00 1.778
> summary(fit)
Call: lm(formula = yield ~ (ntt + ncc + nkk)^2)
Residuals:
Min 1Q
Median 3Q Max
-2 -0.625 2.498e-16 0.625
2
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 63.8750
0.3333 191.6250 0.0000
ntt 11.8750 0.3333 35.6250
0.0000
ncc -2.5000 0.3333 -7.5000
0.0000
nkk 0.6250 0.3333
1.8750 0.0935
ntt:ncc
0.7500 0.3333 2.2500
0.0510
ntt:nkk
4.3750 0.3333 13.1250
0.0000
ncc:nkk
0.5000 0.3333 1.5000
0.1679
Residual standard error: 1.333 on 9 degrees
of freedom
Multiple R-Squared: 0.9941
F-statistic: 251.4 on 6 and 9 degrees of
freedom, the p-value is 1.687e-09
Correlation of Coefficients:
(Intercept) ntt ncc nkk ntt:ncc ntt:nkk
ntt 0
ncc 0
0
nkk 0
0 0
ntt:ncc 0
0 0 0
ntt:nkk 0
0 0 0 0
ncc:nkk 0
0 0 0 0
0
> qqnorm.aov(fit,label=5)
> par(mfrow=c(2,3))
> plot(fit)
> par(mfrow=c(1,1))
Now let's look what would happend if we don't
have replicates. I will use the same dataset, but only first 8 replications.
> yield1<-yield[1:8]
Error: Object "yeild" not found
> yield1<-yield[1:8]
> nkk1<-factor(nkk[1:8])
> ncc1<-factor(ncc[1:8])
> ntt1<-factor(ntt[1:8])
> fit<-lm(yield1~nkk1*ncc1*ntt1)
> model.matrix(fit)
(Intercept) nkk1 ncc1 ntt1 nkk1:ncc1
nkk1:ntt1 ncc1:ntt1 nkk1:ncc1:ntt1
1
1 -1 -1 -1
1 1
1
-1
2
1 -1 -1 1
1 -1
-1
1
3
1 -1 1 -1
-1 1
-1
1
4
1 -1 1 1
-1 -1
1
-1
5
1 1 -1 -1
-1 -1
1
1
6
1 1 -1 1
-1 1
-1
-1
7
1 1 1 -1
1 -1
-1
-1
8
1 1 1 1
1 1
1
1
> anova(fit)
Analysis of Variance Table
Response: yield1
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq
nkk1
1 4.5 4.5
ncc1
1 50.0 50.0
ntt1
1 1058.0 1058.0
nkk1:ncc1
1 0.0 0.0
nkk1:ntt1
1 200.0 200.0
ncc1:ntt1
1 4.5 4.5
nkk1:ncc1:ntt1 1
0.5 0.5
> summary(fit)
Call: lm(formula = yield1 ~ nkk1 * ncc1 *
ntt1)
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 64.25
NA NA
NA
nkk1 0.75 NA
NA NA
ncc1 -2.50 NA
NA NA
ntt1 11.50 NA
NA NA
nkk1:ncc1
0.00 NA
NA NA
nkk1:ntt1
5.00 NA
NA NA
ncc1:ntt1
0.75 NA
NA NA
nkk1:ncc1:ntt1 0.25
NA NA
NA
Residual standard error: NA on 0 degrees of
freedom
Multiple R-Squared: 1
F-statistic: NA on 7 and 0 degrees of freedom,
the p-value is NA
Correlation of Coefficients:
(Intercept) nkk1 ncc1 ntt1 nkk1:ncc1 nkk1:ntt1 ncc1:ntt1
nkk1 0
ncc1 0
0
ntt1 0
0 0
nkk1:ncc1
0 0
0 0
nkk1:ntt1
0 0
0 0 0
ncc1:ntt1
0 0
0 0 0
0
nkk1:ncc1:ntt1 0
0 0 0 0
0 0
Warning messages:
One or more nonpositive parameters
in: pf(x$fstatistic[1], x$fstatistic[2],
x$fstatistic[3])
> qqnorm.aov(fit,label=5)
> par(mfrow=c(2,3))
> plot(fit)
Error in .Fortran(if(!cmplx) "dbksl" else
"zbksl",: subroutine dbksl: 24 missing
value(s) in argument 4
Dumped
> par(mfrow=c(1,1))
> fit<-lm(yield1~(nkk1+ncc1+ntt1)^2)
> model.matrix(fit)
(Intercept) nkk1 ncc1 ntt1 nkk1:ncc1
nkk1:ntt1 ncc1:ntt1
1
1 -1 -1 -1
1 1
1
2
1 -1 -1 1
1 -1
-1
3
1 -1 1 -1
-1 1
-1
4
1 -1 1 1
-1 -1
1
5
1 1 -1 -1
-1 -1
1
6
1 1 -1 1
-1 1
-1
7
1 1 1 -1
1 -1
-1
8
1 1 1 1
1 1
1
> summary(fit)
Call: lm(formula = yield1 ~ (nkk1 + ncc1 +
ntt1)^2)
Residuals:
1
2 3 4 5
6 7 8
-0.25 0.25 0.25 -0.25 0.25 -0.25 -0.25
0.25
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 64.2500
0.2500 257.0000 0.0025
nkk1
0.7500 0.2500 3.0000
0.2048
ncc1
-2.5000 0.2500 -10.0000
0.0635
ntt1
11.5000 0.2500 46.0000
0.0138
nkk1:ncc1 0.0000
0.2500 0.0000 1.0000
nkk1:ntt1 5.0000
0.2500 20.0000 0.0318
ncc1:ntt1 0.7500
0.2500 3.0000 0.2048
Residual standard error: 0.7071 on 1 degrees
of freedom
Multiple R-Squared: 0.9996
F-statistic: 439 on 6 and 1 degrees of freedom,
the p-value is 0.03652
Correlation of Coefficients:
(Intercept) nkk1 ncc1 ntt1 nkk1:ncc1 nkk1:ntt1
nkk1 0
ncc1 0
0
ntt1 0
0 0
nkk1:ncc1 0
0 0 0
nkk1:ntt1 0
0 0 0 0
ncc1:ntt1 0
0 0 0 0
0
> anova(fit)
Analysis of Variance Table
Response: yield1
Terms added sequentially (first to last)
Df Sum of Sq Mean Sq F Value Pr(F)
nkk1
1 4.5 4.5
9 0.2048328
ncc1
1 50.0 50.0
100 0.0634510
ntt1
1 1058.0 1058.0 2116 0.0138374
nkk1:ncc1 1
0.0 0.0 0 1.0000000
nkk1:ntt1 1
200.0 200.0 400 0.0318045
ncc1:ntt1 1
4.5 4.5 9 0.2048328
Residuals 1
0.5 0.5
>
|