Example:
    > fertA
    [1] 29.9 11.4 25.3 16.5 21.1
    > fertB
    [1] 26.6 23.7 28.5 14.2 17.9 24.3
    > y_c(fertA,fertB)
    > one_rep(1,11)
    > one
     [1] 1 1 1 1 1 1 1 1 1 1 1
    > dumAB_c(rep(0,5),rep(1,6))
    > x_cbind(one,dumAB)
    > x
          one dumAB
    > x
          one dumAB
     [1,]   1     0
     [2,]   1     0
     [3,]   1     0
     [4,]   1     0
     [5,]   1     0
     [6,]   1     1
     [7,]   1     1
     [8,]   1     1
     [9,]   1     1
    [10,]   1     1
    [11,]   1     1
    > bhat_solve(t(x)%*%x)%*%t(x)%*%y
    > bhat
               [,1]
      one 20.840000
    dumAB  1.693333
    > mean(fertA)
    [1] 20.84
    > mean(fertB)-mean(fertA)
    [1] 1.693333
    > fit_x%*%bhat
    > fit
              [,1]
     [1,] 20.84000
     [2,] 20.84000
     [3,] 20.84000
     [4,] 20.84000
     [5,] 20.84000
     [6,] 22.53333
     [7,] 22.53333
     [8,] 22.53333
     [9,] 22.53333
    [10,] 22.53333
    [11,] 22.53333
    > regss_t(fit)%*%fit-11*mean(y)^2
    > regss
             [,1]
    [1,] 7.820121
    > resss_t(y-fit)%*%(y-fit)
    > resss
             [,1]
    [1,] 357.5253
    > fstat_(regss/1)/(resss/9)
    > fstat
              [,1]
    [1,] 0.1968562
    > 1-pf(.1968562,1,9)
    [1] 0.6677453
    > sqrt(fstat)
              [,1]
    [1,] 0.4436848
    > 2*(1-pt(.4436848,9))
    [1] 0.6677453

     

Problem to do:
    Using data from Pulp Experiment (in your notes, or see lab 2) compare operator A and operator B:
  1. Compute fitted values for each operator, using linear model approach.
  2. Compute F-test for that linear model.
Useful formulas, Functions:
  • Formulas
    • bhat=(X^t*X)^(-1)*X^t*Y - coefficient for linear model (1.10)
    • yhat=X*bhat -fitted values (1.8)
    • Fobs=((bhat^t*X^t*X*bhat-N*Ybar^2)/p)/((Y-X*bhat)^t*(Y-X*bhat)/(N-(p+1)))  Formula for f-observed (1.14)
  • Functions:
    • cbind(a,b) - bind columns a and b in to matrix
    • t(A) - transpose of matrix A
    • solve(A) - inverse of matrix A
    • A%*%B - matrix multiplication (also works on scalars)
    • pf(q, df1, df2) - probability function of F distribution with df1, df2 degrees of freedom (used to find p-value)
    • qf(p, df1, df2) -  quintile function of F distribution with df1, df2 degrees of freedom (used to find critical value)