Chapter 2 – Simple Linear Regression – How


Transcript: Chapter 2 – Simple Linear Regression – How

Slide 1

Chapter 2 – Simple Linear Regression – How

Here is a perfect scenario of what we want reality to look like for simple linear regression. Our two variables are not perfectly related, as we can see, but nonetheless there is a relationship. The means of each distribution are connected by a straight line.

[Figure: scatter of Y versus X; the line ŷ = a + bx connects the means of the y-distributions. The variables are x and y, where y is currently denoted by ŷ.]

In order to discover what the actual equation of the straight line is, we need to sample from the population.

And as far as we can tell there is a scatter of ordered pairs. From these ordered pairs we need to determine the equation of the line, that is, the y-intercept a and the slope b.

How do we determine the actual equation of the line? The formula used to determine the actual line will not be very informative, so instead I will tell you what the formula will achieve.

The error is defined as the difference between the observed y-value (the blue dots represent the observed y-values) and the predicted y-value, which is located on the line.

[Figure: for an observed point (x₁, y₁), the corresponding point on the line is (x₁, ŷ₁), where ŷ₁ = a + b(x₁) and error = y₁ − ŷ₁.]
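As a small numeric illustration (the line and the data point below are invented, not from the slides), the error for a single observation is simply the observed y-value minus the predicted one:

# Hypothetical numbers, just to illustrate the definition of the error.
a, b = 1.0, 2.0            # a candidate line: y-hat = 1 + 2x
x1, y1 = 3.0, 6.5          # one observed point (x1, y1)
y1_hat = a + b * x1        # predicted y-value on the line: 7.0
error = y1 - y1_hat        # observed minus predicted: -0.5
print(error)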

Calculate the error for every observed y-value, square all the results, and add them up:

Σ (yᵢ − ŷᵢ)² = Σ errorᵢ²

The least squares regression line has the property that no other line will have a smaller sum of squared errors.
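To make this concrete, here is a minimal Python sketch (not part of the original slides; the five data points are invented) that computes the least-squares estimates of a and b from their standard closed-form formulas and reports the resulting sum of squared errors:

# Fit y-hat = a + b*x by least squares; the data are made up for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Standard closed-form least-squares estimates:
#   b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2),  a = y_bar - b*x_bar
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
    / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b * x_bar

# Sum of squared errors; no other (a, b) pair gives a smaller value.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
print(f"a = {a:.3f}, b = {b:.3f}, SSE = {sse:.3f}")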

Keep in mind that since we are going to attempt to find the linear equation using a sample from our population, the linear equation that we calculate is an approximation of the actual linear equation. In other words, the slope and y-intercept are estimates of the actual slope and y-intercept.
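As a sketch of this last point (again not from the slides; the true line and the noise level are assumed purely for illustration), drawing repeated samples from a known population shows that the fitted intercept and slope vary from sample to sample around the true values:

import random

TRUE_A, TRUE_B = 1.5, 0.8   # hypothetical population line y = 1.5 + 0.8x

def draw_sample(n=30):
    """Sample n points from the population: true line plus random noise."""
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [TRUE_A + TRUE_B * x + random.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def least_squares(xs, ys):
    """Closed-form least-squares estimates of the intercept and slope."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
        / sum((x - x_bar) ** 2 for x in xs)
    return y_bar - b * x_bar, b

# Each sample gives a different estimate of the same underlying line.
for i in range(5):
    a, b = least_squares(*draw_sample())
    print(f"sample {i + 1}: a = {a:.2f}, b = {b:.2f}")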

