|
|
Residual Plots
> plot(g$fit,g$res) > abline(h=0) > plot(g$fit,abs(g$res))
Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) 4.8397 1.1865 4.0790 0.0002 g$fit -0.2035 0.1185 -1.7166 0.0925 Residual standard error: 2.163 on 48 degrees of freedom Multiple R-Squared: 0.05784 F-statistic: 2.947 on 1 and 48 degrees of freedom, the p-value is 0.09251 It's often had to judge residual plots without prior experience so let's generate some of the artificial variety. The following four panels show
> par(mfrow=c(3,3)) Now look at the residuals against predictor plots:
[1] 2.785184 > length(g$res[p15 > 35]) [1] 23 > 1-pf(2.785184,22,26) [1] 0.006786117 Now try the partial regression (added variable) plot for p15: > d <- lm(sav ~ p75 + inc + gro)$res > m <- lm(p15 ~ p75 + inc + gro)$res > plot(m,d)Compare the slope on the plot to the original regression and show the line on the plot: > lm(d ~ m)$coef (Intercept) m -1.231709e-16 -0.461205 > g$coef (Intercept) p15 p75 inc gro 28.56661 -0.461205 -1.691576 -0.0003367615 0.4096998 > abline(0,g$coef['p15'])A partial residual plot is easier to do: > plot(p15,g$res+g$coef['p15']*p15) > abline(0,g$coef['p15'])Might there be a different relationship in the two groups: > g1 <- lm(sav ~ p15 + p75 + inc + gro,subset=(p15 > 35)) > g2 <- lm(sav ~ p15 + p75 + inc + gro,subset=(p15 < 35)) > summary(g1) Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) -2.4340 21.1550 -0.1151 0.9097 p15 0.2739 0.4392 0.6235 0.5408 p75 -3.5485 3.0333 -1.1698 0.2573 inc 0.0004 0.0050 0.0842 0.9339 gro 0.3955 0.2901 1.3632 0.1896 Residual standard error: 4.454 on 18 degrees of freedom Multiple R-Squared: 0.1558 F-statistic: 0.8302 on 4 and 18 degrees of freedom, the p-value is 0.5233 > summary(g2) Coefficients: Value Std. Error t value Pr(>|t|) (Intercept) 23.9638 8.0836 2.9645 0.0072 p15 -0.3860 0.1954 -1.9755 0.0609 p75 -1.3279 0.9260 -1.4339 0.1657 inc -0.0005 0.0007 -0.6339 0.5327 gro 0.8844 0.2953 2.9945 0.0067 Residual standard error: 2.772 on 22 degrees of freedom Multiple R-Squared: 0.5073 F-statistic: 5.663 on 4 and 22 degrees of freedom, the p-value is 0.002733Can you see the difference? The graphical analysis has shown a relationship in the data that a purely numerical analysis might easily have missed.
|