AP Statistics Section 3.2 B Residuals


One of the purposes of drawing a regression line is to predict y from x. Because errors in prediction are errors in y, we would like to find the line that makes the vertical distances from the data points to the line as small as possible. The predicted response will usually not be exactly the same as the actual observed response.

One of the first principles of data analysis is to look for an overall pattern and for striking deviations from that pattern.

Residuals show how well the least-squares line (LSL) fits the data. A residual is the difference between an observed value of the response variable and the value predicted by the regression line.

$\text{residual} = y - \hat{y}$

Example: Refer back to Section 3.2 A and find the residual for the subject whose NEA rose by 135 calories.

$\hat{y} = 3.505 - 0.0034(135)$

$\hat{y} = 3.046$

$\text{residual} = 2.7 - 3.046 = -0.346$
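The arithmetic in this example can be checked with a short Python sketch. It assumes the rounded coefficients used in these notes (intercept 3.505, slope -0.0034) and the observed fat gain of 2.7; the function names are made up for illustration.

```python
# Least-squares line from the notes, using the rounded coefficients shown there:
# predicted fat gain = 3.505 - 0.0034 * (NEA change)
INTERCEPT = 3.505
SLOPE = -0.0034


def predict_fat_gain(nea_change):
    """Predicted response (y-hat) for a given NEA change in calories."""
    return INTERCEPT + SLOPE * nea_change


def residual(observed, predicted):
    """Residual = observed y minus predicted y."""
    return observed - predicted


y_hat = predict_fat_gain(135)   # predicted fat gain for NEA change of 135
res = residual(2.7, y_hat)      # observed fat gain in the notes is 2.7

print(round(y_hat, 3), round(res, 3))
```

A negative result here means the observed value is smaller than the prediction.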

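The residual is negative because the observed value (2.7) is less than the predicted value (3.046), so the data point lies below the regression line.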

The sum of the least-squares residuals is always 0.

The graph at the right below is the residual plot of the NEA vs. Fat Gain example in Section 3.2 A. A residual plot makes it easier to study the residuals by plotting them against the explanatory variable. Because the mean of the residuals is always 0, the horizontal line at zero helps orient us. This “residual = 0” line corresponds to the regression line we drew in the Section 3.2 A notes.
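The claim that the residuals of a least-squares line average to 0 can be illustrated with a small Python sketch. The data below are invented for illustration (they are not the NEA data), and `least_squares_line` is a hypothetical helper that uses the standard slope and intercept formulas.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data, chosen only to illustrate the property.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)

# Residual = observed y minus predicted y, one per data point.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Pairs (x, residual) are the points a residual plot would show.
plot_points = list(zip(xs, residuals))

# The sum (and so the mean) is zero, up to floating-point rounding.
print(sum(residuals))
```

Individual residuals are positive for points above the line and negative for points below it; only their total is forced to 0.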

The residual plot magnifies the deviations from the line to make patterns easier to see. If the regression line captures the overall pattern of the data, there should be no pattern in the residual plot, as in the graph to the left.

CALCULATOR:

Put the data in 2 lists and find the LSL.

Press 2nd Y= (STAT PLOT).

Xlist: explanatory variable

Ylist: RESID (press 2nd STAT (LIST), choose NAMES, scroll down to RESID, and press ENTER)

Press GRAPH, then ZOOM 9.

Here are two important things to look for when you examine a residual plot.

1. The residual plot should show no obvious patterns, such as a curved or fan-shaped pattern.

2. The residuals should be relatively small in size. Since smallness is relative, we could find the standard deviation of the residuals, which is given by the equation

$s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}}$

The standard deviation of the residuals represents the amount of error that could “consistently” occur using the LSL to make predictions.
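The standard deviation of the residuals divides the sum of the squared residuals by n - 2 and then takes the square root. A minimal Python sketch of that arithmetic follows; the residual values are made up for illustration only, and `residual_std_dev` is a hypothetical helper name.

```python
import math


def residual_std_dev(residuals):
    """Standard deviation of residuals: sqrt(sum of squared residuals / (n - 2))."""
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than two residuals to divide by n - 2")
    return math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))


# Made-up residuals (not from the NEA data), used only to show the calculation.
example_residuals = [-0.5, 0.3, 0.4, -0.2, 0.0]

s = residual_std_dev(example_residuals)
print(round(s, 3))
```

A smaller s means the predictions from the LSL tend to land closer to the observed responses.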