Econ 3780: Business and Economics Statistics


Instructor: Yogesh Uppal Email: [email protected]

Chapter 14

• Covariance and Simple Correlation Coefficient
• Simple Linear Regression

Covariance

• The covariance between x and y is a measure of the relationship between x and y.

$$\operatorname{cov}(x, y) = \frac{SS_{xy}}{n - 1} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1}$$

Covariance  Example: Reed Auto Sales Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.

Covariance  Example: Reed Auto Sales Number of TV Ads 1 3 2 1 3 Number of Cars Sold 14 24 18 17 27

Covariance

x           y            x - x̄    y - ȳ    (x - x̄)(y - ȳ)
1           14           -1       -6        6
3           24            1        4        4
2           18            0       -2        0
1           17           -1       -3        3
3           27            1        7        7
Total = 10  Total = 100                      SS_xy = 20

$$\operatorname{cov}(x, y) = \frac{SS_{xy}}{n - 1} = \frac{20}{4} = 5$$

Simple Correlation Coefficient

Simple Population Correlation Coefficient

$$\rho = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}, \qquad -1 \le \rho \le 1$$

• If ρ < 0, there is a negative relationship between x and y.
• If ρ > 0, there is a positive relationship between x and y.

Simple Correlation Coefficient

• Since the population standard deviations of x and y are not known, we use their sample estimates to compute an estimate of ρ:

$$r = \frac{\operatorname{cov}(x, y)}{s_x s_y}, \qquad -1 \le r \le 1$$

Simple Correlation Coefficient

• Example: Reed Auto Sales

x           y            x - x̄    y - ȳ    (x - x̄)²    (y - ȳ)²
1           14           -1       -6        1           36
3           24            1        4        1           16
2           18            0       -2        0            4
1           17           -1       -3        1            9
3           27            1        7        1           49
Total = 10  Total = 100                      SS_x = 4    SS_y = 114

Simple Correlation Coefficient

$$s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{4}{4}} = 1$$

$$s_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}} = \sqrt{\frac{114}{4}} = 5.34$$

$$r_{xy} = \frac{\operatorname{cov}(x, y)}{s_x s_y} = \frac{5}{1 \times 5.34} = 0.936$$
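The same calculation in a short Python sketch (again, only an illustration, not code from the course); full precision gives a value very close to the slide's rounded 0.936.

```python
import math

# Sketch: sample correlation coefficient for the Reed Auto data.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)  # 5.0
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))  # sqrt(4/4)   = 1.0
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))  # sqrt(114/4) ≈ 5.34

r = cov_xy / (s_x * s_y)   # ≈ 0.937 at full precision; the slides round s_y and report 0.936
print(r)
```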

Chapter 14 Simple Linear Regression

• Simple Linear Regression Model
• Residual Analysis
• Coefficient of Determination
• Testing for Significance
• Using the Estimated Regression Equation for Estimation and Prediction

Simple Linear Regression Model

• The equation that describes how y is related to x and an error term is called the regression model.

• The simple linear regression model is:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where β0 and β1 are the parameters of the model and ε is a random variable called the error term.
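For intuition, the sketch below simulates observations from such a model. The particular values β0 = 10, β1 = 5, and an error standard deviation of 2 are assumptions chosen for illustration; they are not given as population values in the slides.

```python
import random

# Illustrative sketch: simulate y = beta0 + beta1 * x + e with assumed parameter values.
random.seed(1)
beta0, beta1, sigma = 10.0, 5.0, 2.0   # assumed values, for illustration only

x_values = [1, 3, 2, 1, 3]
y_values = [beta0 + beta1 * x + random.gauss(0, sigma) for x in x_values]
print(y_values)
```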

Simple Linear Regression Equation

• Positive Linear Relationship
[Figure: regression line with intercept β0 and positive slope β1; y on the vertical axis, x on the horizontal axis]

Simple Linear Regression Equation

• Negative Linear Relationship
[Figure: regression line with intercept β0 and negative slope β1]

Simple Linear Regression Equation

• No Relationship
[Figure: horizontal regression line with intercept β0 and slope β1 = 0]

Interpretation of β0 and β1

• β0 (intercept parameter): the value of y when x = 0.
• β1 (slope parameter): the change in y when x changes by 1 unit.

Estimated Simple Linear Regression Equation

• The estimated simple linear regression equation is:

$$\hat{y} = b_0 + b_1 x$$

• The graph of this equation is called the estimated regression line.
• b0 is the y-intercept of the line.
• b1 is the slope of the line.
• ŷ is the estimated value of y for a given value of x.

Estimation Process

Regression Model: y = β0 + β1 x + ε
Regression Equation: E(y|x) = β0 + β1 x
Unknown Parameters: β0, β1

Sample Data: (x1, y1), ..., (xn, yn)

Estimated Regression Equation: ŷ = b0 + b1 x
The sample statistics b0 and b1 provide point estimates of β0 and β1.

Least Squares Method

• Slope for the Estimated Regression Equation:

$$b_1 = \frac{SS_{xy}}{SS_x}$$

where

$$SS_{xy} = \sum (x - \bar{x})(y - \bar{y}), \qquad SS_x = \sum (x - \bar{x})^2$$

Least Squares Method  y-Intercept for the Estimated Regression Equation

b

0

Estimated Regression Equation

 Example: Reed Auto Sales  Slope for the Estimated Regression Equation

b

1 

SS xy SS x

 20  5 4 

y

-Intercept for the Estimated Regression Equation

b

0 

y

b

1

x

 20  5 * ( 2 )  10  Estimated Regression Equation  10  5 *

x
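These estimates can be reproduced with a few lines of Python; this is a sketch for the Reed Auto data, not code from the course.

```python
# Sketch: least squares estimates for the Reed Auto data.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # 20
ss_x = sum((xi - x_bar) ** 2 for xi in x)                         # 4

b1 = ss_xy / ss_x        # 20 / 4 = 5.0
b0 = y_bar - b1 * x_bar  # 20 - 5 * 2 = 10.0
print(f"y_hat = {b0} + {b1} * x")
```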

Scatter Diagram and Regression Line

[Figure: scatter plot of number of cars sold versus number of ads, with the fitted regression line ŷ = 10 + 5x]

Estimate of Residuals

x    y     ŷ     e = y - ŷ
1    14    15      -1.0
3    24    25      -1.0
2    18    20      -2.0
1    17    15       2.0
3    27    25       2.0
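A minimal sketch computing the fitted values and residuals shown in the table above.

```python
# Sketch: fitted values and residuals for the Reed Auto data.
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0                                  # estimates from the previous slides

y_hat = [b0 + b1 * xi for xi in x]                  # [15, 25, 20, 15, 25]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]   # [-1, -1, -2, 2, 2]
print(residuals)
```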

Decomposition of total sum of squares

• Relationship Among SST, SSR, SSE:

SST = SSR + SSE

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

Decomposition of total sum of squares

ŷ     e = y - ŷ    (y - ŷ)²    ŷ - ȳ    (ŷ - ȳ)²
15       -1           1         -5        25
25       -1           1          5        25
20       -2           4          0         0
15        2           4         -5        25
25        2           4          5        25
                   SSE = 14             SSR = 100

• Check that SST = SSR + SSE: 100 + 14 = 114, which matches SST = 114.
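The decomposition can also be verified numerically; the sketch below recomputes SST, SSR, and SSE for the Reed Auto data.

```python
# Sketch: verify SST = SSR + SSE for the Reed Auto data.
y = [14, 24, 18, 17, 27]
y_hat = [15, 25, 20, 15, 25]                               # fitted values from y_hat = 10 + 5x
y_bar = sum(y) / len(y)                                    # 20

sst = sum((yi - y_bar) ** 2 for yi in y)                   # 114
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)               # 100
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))      # 14
print(sst, ssr + sse)                                      # 114.0 114.0
```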

Coefficient of Determination  The coefficient of determination is:

r

2 = SSR/SST

r

2 = SSR/SST = 100/114 = 0.8772

• The regression relationship is very strong; about 88% of the variability in the number of cars sold can be explained by the number of TV ads.

• The coefficient of determination (r²) is also the square of the correlation coefficient (r).

Sample Correlation Coefficient

$$r = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}} = (\text{sign of } b_1)\sqrt{r^2}$$

$$r = +\sqrt{0.8772} = 0.936$$
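A short sketch computing r² and recovering r from it; at full precision the value sits just above the slide's rounded 0.936.

```python
import math

# Sketch: coefficient of determination and the sample correlation recovered from it.
ssr, sst = 100.0, 114.0
b1 = 5.0                                 # slope estimate (positive)

r_squared = ssr / sst                    # ≈ 0.8772
sign = 1 if b1 >= 0 else -1
r = sign * math.sqrt(r_squared)          # ≈ 0.937 (the slides report 0.936)
print(round(r_squared, 4), r)
```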

Sampling Distribution of b1

• Estimate of σ²

• The mean square error (MSE) provides the estimate of σ²:

$$s^2 = \text{MSE} = \frac{SSE}{n - 2}, \qquad \text{where } SSE = \sum (y_i - \hat{y}_i)^2$$
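For the Reed Auto data, a quick sketch of this estimate:

```python
# Sketch: estimate of sigma^2 via the mean square error for the Reed Auto data.
sse = 14.0
n = 5
mse = sse / (n - 2)        # s^2 = 14 / 3 ≈ 4.667
s = mse ** 0.5             # ≈ 2.16
print(round(mse, 3), round(s, 2))
```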

Interval Estimate of β1

• Example: Reed Auto Sales

$$b_1 \pm t_{\alpha/2}\, SE(b_1) = 5 \pm 3.182 \times 1.08 = 5 \pm 3.44$$
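The sketch below reproduces this interval. The slides report SE(b1) ≈ 1.08 without showing a formula; the expression SE(b1) = s/√SS_x used here is the standard one and should be treated as an assumption about how that value was obtained.

```python
import math

# Sketch: 95% interval estimate of the slope for the Reed Auto data.
sse, ss_x, n = 14.0, 4.0, 5
s = math.sqrt(sse / (n - 2))        # ≈ 2.16
se_b1 = s / math.sqrt(ss_x)         # ≈ 1.08 (standard formula; assumed, not shown on the slides)
t_crit = 3.182                      # t(.025) with n - 2 = 3 degrees of freedom

b1 = 5.0
margin = t_crit * se_b1             # ≈ 3.44
print(b1 - margin, b1 + margin)     # ≈ 1.56, 8.44
```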

Testing for Significance: t Test

• Hypotheses:

$$H_0\colon \beta_1 = 0 \qquad H_a\colon \beta_1 \ne 0$$

• Test Statistic:

$$t = \frac{b_1 - 0}{SE(b_1)}$$

where b1 is the slope estimate and SE(b1) is the standard error of b1.

Testing for Significance: t Test  Rejection Rule Reject H 0 if p-value < or t < -t   or t > t    where:

t

  is based on a t distribution with n - 2 degrees of freedom

Testing for Significance: t Test

1. Determine the hypotheses: H0: β1 = 0, Ha: β1 ≠ 0

2. Specify the level of significance: α = .05

3. Select the test statistic: t = b1 / SE(b1)

4. State the rejection rule: Reject H0 if p-value < .05, or if t ≤ -3.182 or t ≥ 3.182 (t.025 with n - 2 = 3 degrees of freedom).

Testing for Significance: t Test

5. Compute the value of the test statistic:

$$t = \frac{b_1}{SE(b_1)} = \frac{5}{1.08} = 4.63$$

6. Determine whether to reject H0: since t = 4.63 > t_{α/2} = 3.182, we reject H0.
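The test statistic and rejection decision in a short sketch:

```python
# Sketch: t test for the significance of the slope, Reed Auto data.
b1 = 5.0
se_b1 = 1.08                # standard error of b1 reported on the slides
t_stat = b1 / se_b1         # ≈ 4.63
t_crit = 3.182              # t(.025) with 3 degrees of freedom

reject_h0 = t_stat <= -t_crit or t_stat >= t_crit
print(round(t_stat, 2), reject_h0)   # 4.63 True
```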

Some Cautions about the Interpretation of Significance Tests

• Rejecting H0: β1 = 0 and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.

• Just because we are able to reject H0: β1 = 0 and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.