Using S+ to fit a linear model

Let's use the Galapagos Islands tortoise data as an example. The variables are:

  • Species: the number of species of tortoise found on the island
  • Endemics: the number of endemic species
  • Area: the area of the island (km²)
  • Elevation: the highest elevation of the island (m)
  • Nearest: the distance from the nearest island (km)
  • Scruz: the distance from Santa Cruz island (km)
  • Adjacent: the area of the adjacent island (km²)

If you did last week's lab, you should still be able to type

> gala

and the data will still be there. If not, you'll have to read the data in again as described in the last lab.

Regression modelling

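Fitting a linear model in S+ is done using the lm() command. Note the syntax for specifying the predictors in the model: the response goes on the left of the ~ and the predictors, separated by +, go on the right. Since all the variables are in the gala data frame, we use the data= argument so that lm() knows where to find them.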

First let's do a simple regression:

> gfit <- lm(Species ~ Area, data=gala)
> summary(gfit)

Now a more complex regression with several predictors:

> gfit <- lm(Species ~ Area + Elevation + Nearest + Scruz + Adjacent, data=gala)
> summary(gfit)

Call: lm(formula = Species ~ Area + Elevation + Nearest + Scruz + Adjacent, data = 
        gala)
Residuals:
    Min    1Q Median    3Q   Max 
 -111.7 -34.9 -7.862 33.46 182.6

Coefficients:
               Value Std. Error  t value Pr(>|t|) 
(Intercept)   7.0682  19.1542     0.3690   0.7154
       Area  -0.0239   0.0224    -1.0676   0.2963
  Elevation   0.3195   0.0537     5.9532   0.0000
    Nearest   0.0091   1.0541     0.0087   0.9932
      Scruz  -0.2405   0.2154    -1.1166   0.2752
   Adjacent  -0.0748   0.0177    -4.2262   0.0003

Residual standard error: 60.98 on 24 degrees of freedom
Multiple R-Squared: 0.7658 
F-statistic: 15.7 on 5 and 24 degrees of freedom, the p-value is 6.838e-07 

Correlation of Coefficients:
          (Intercept)    Area Elevation Nearest   Scruz 
     Area  0.3271                                      
Elevation -0.5650     -0.8014                          
  Nearest -0.0431      0.2035 -0.2333                  
    Scruz -0.3389     -0.0378  0.0991   -0.6257        
 Adjacent  0.2533      0.4328 -0.6420    0.2839 -0.1910
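
To see where some of the numbers in this summary come from, here is a short sketch that recomputes a few of them by hand. The inputs are copied from the output above and are already rounded, so the results should only agree with the printout to about two decimal places. The names est, se, df, tval, pval, r2, p, fstat and fpval are made up for this illustration; pt() and pf() are the usual t and F distribution functions.

```r
# Values copied (rounded) from the Elevation row of the coefficient table
est <- 0.3195    # Value
se  <- 0.0537    # Std. Error
df  <- 24        # residual degrees of freedom

# t value = estimate divided by its standard error
tval <- est / se

# two-sided p-value from the t distribution on df degrees of freedom
pval <- 2 * (1 - pt(abs(tval), df))

# The overall F-statistic can be recovered from R-squared:
# F = (R2 / p) / ((1 - R2) / df), where p is the number of predictors
r2 <- 0.7658     # Multiple R-Squared
p  <- 5          # predictors in the model (the intercept is not counted)
fstat <- (r2 / p) / ((1 - r2) / df)

# p-value for the F-statistic on p and df degrees of freedom
fpval <- 1 - pf(fstat, p, df)

print(c(tval, pval, fstat, fpval))
```

Here tval should come out close to the printed 5.9532 and fstat close to the printed 15.7; the small differences are due to rounding in the copied inputs. Try the same calculation for the Area row.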


Some Questions

  1. What was the value of the largest residual?
  2. What is the value of the regression coefficient associated with Scruz?