Chapter 4: More on Two-Variable (Bivariate) Data
4.1 Transforming Relationships
Animal’s Brain Weight vs. Weight of Body
With outliers: r = 0.86
Outliers dropped: r = 0.50
Logarithm of brain weight vs. logarithm of body weight: r = 0.96
The vertical spread about the LSRL is similar everywhere,
so the predictions of brain weight from body weight will be
pretty precise (high $r^2$) in the LOG SCALE.
Working with a function of our original
measurements can greatly simplify
statistical analysis.
Transforming: How?
Recall: in Chapter 1 we did linear transformations.
We took a set of data and transformed it linearly.
Called: SHIFTING
Examples: C to F, meters to miles
A Linear Transformation CANNOT make
a curved relationship between 2
variables “straight”
Resort to common non-linear functions
like the logarithm, positive & negative
powers
We can transform either the explanatory variable,
the response variable, OR BOTH.
When we do, we will call the variable being transformed “t”.
Real World Example:
We measure fuel consumption of a car in
miles per gallon
Engineers measure it in gallons per mile
(how many gallons of fuel the car
needs to travel 1 mile)
Reciprocal Transformation: $f(t) = 1/t$
My car: 25 miles per gallon
1/25 = 0.04 gallons per mile
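The mpg-to-gallons-per-mile conversion above can be sketched in a few lines. The function name `gallons_per_mile` is made up for this illustration.

```python
def gallons_per_mile(miles_per_gallon: float) -> float:
    """Reciprocal transformation f(t) = 1/t, defined only for t > 0."""
    if miles_per_gallon <= 0:
        raise ValueError("miles per gallon must be positive")
    return 1.0 / miles_per_gallon


my_car = gallons_per_mile(25)  # the 25 mpg car from the slide
print(my_car)                  # 0.04
```

A car with higher mpg gets a smaller gallons-per-mile value, so the reciprocal reverses the order of the data.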
Monotonic Function
A monotonic function f(t) moves in one
direction as its argument “t” increases
Monotonic Increasing
Monotonic Decreasing
Monotonic Increasing:
$a + bt$ with slope $b > 0$
$t^2$ for positive t
$\log t$
Monotonic Decreasing:
$a + bt$ with slope $b < 0$
$1/t$ and $1/\sqrt{t}$ for positive t
Nonlinear monotonic
transformations change data
enough to change form or
relations between 2 variables,
yet preserve order and allow
recovery of original data.
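A minimal sketch of "preserve order and allow recovery", using made-up positive values (not data from the chapter). It checks that the logarithm keeps the order, the reciprocal reverses it, and the inverse function gives the original data back.

```python
import math

t = [0.5, 1.0, 2.0, 4.0, 8.0]  # hypothetical positive data, already sorted

logged = [math.log10(v) for v in t]  # monotonic increasing transformation
recip = [1.0 / v for v in t]         # monotonic decreasing transformation

# Increasing transformations keep the order; decreasing ones reverse it.
order_kept = logged == sorted(logged)
order_reversed = recip == sorted(recip, reverse=True)

# Undo the logarithm with a power of 10 to recover the original data.
recovered = [10 ** v for v in logged]
round_trip_ok = all(math.isclose(a, b) for a, b in zip(t, recovered))

print(order_kept, order_reversed, round_trip_ok)
```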
Strategy:
1. If the variable that you want to
transform has values that are 0 or
negative, apply a linear transformation
(add a constant) to make all values positive.
2. Choose the power or logarithmic
transformation that approximately
straightens the scatterplot.
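Step 1 of the strategy can be sketched as below. The helper `shift_to_positive`, its `margin` argument, and the sample values are all assumptions for illustration. The slides do not say which constant to add.

```python
import math


def shift_to_positive(values, margin=1.0):
    """Add a constant so the smallest value becomes `margin` (> 0).

    Values that are already all positive are returned unshifted.
    """
    lowest = min(values)
    shift = margin - lowest if lowest <= 0 else 0.0
    return [v + shift for v in values], shift


raw = [-3.0, 0.0, 2.0, 10.0]             # hypothetical data with 0 and a negative
shifted, shift = shift_to_positive(raw)  # step 1: linear transformation
logged = [math.log10(v) for v in shifted]  # step 2: now the log is defined

print(shift, shifted)
```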
Ladder of Power Transformations:
Power Function: $t^p$ (for example, $t^2$)
Monotonic Power Functions
For t > 0:
1. Power functions with positive p are monotonic increasing.
2. Power functions with negative p are monotonic decreasing
(for example, $1/t$ and $1/\sqrt{t}$).
Monotonic decreasing transformations are hard to interpret
because they reverse the order of the original data points.
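We therefore want to make every $t^p$ monotonic increasing.
We can apply a LINEAR TRANSFORMATION:
$\frac{t^p - 1}{p}$
When p is negative, dividing by p reverses the order a second time,
so the result is increasing again.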
Linear Transformation:
Original Data (t) | Power Function $1/t$ | Linear Trans. $\frac{t^{-1} - 1}{-1} = 1 - \frac{1}{t}$
0 | undefined | undefined
1 | 1 | 0
2 | ____ | ____
3 | ____ | ____
4 | ____ | ____
Linearly transformed power functions $\frac{t^p - 1}{p}$:
p = -1: $1 - \frac{1}{t}$
p = 1: $t - 1$ (this is a line)
p = 0: this is $\log t$
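A small sketch of the linearly transformed power function $(t^p - 1)/p$. The function name `ladder` is made up here, and treating p = 0 as the natural logarithm is the standard limiting case, stated as an assumption about what the slide's "log t" refers to.

```python
import math


def ladder(t: float, p: float) -> float:
    """Linearly transformed power function (t**p - 1) / p for t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    if p == 0:
        return math.log(t)  # limiting case as p -> 0 (natural log)
    return (t ** p - 1.0) / p


# Reciprocal (p = -1): 1/t is decreasing, but 1 - 1/t is increasing.
for t in (1, 2, 3, 4):
    print(t, 1 / t, ladder(t, -1))

# p = 1 gives the line t - 1; powers near 0 approach the logarithm.
line_value = ladder(3, 1)
near_zero = ladder(2, 1e-6)
```

The printed rows correspond to the table of original data, $1/t$, and $1 - 1/t$.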
Concavity of Power
Functions:
p greater than 1:
- Pushes out the right tail & pulls in the left tail
- Gets stronger as the power p moves up
away from 1
p less than 1:
- Pushes out the left tail & pulls in the right tail
- Gets stronger as the power p moves down
away from 1
Country’s GDP vs.
Life Expectancy
p = ____, p = ____, p = ____
Use $1/x$
How do you know what
transformation will make the
scatterplot straight?
** DO NOT just push buttons!! **
We will develop methods of
selection
1. Logarithmic Transformation
2. Power Transformation
1. Logarithmic
Transformations
Exponential Growth
A variable grows…
Linearly: $y = a + bx$
Exponentially: $y = a \cdot b^x$
The King’s Chess Board…
King’s Offer: 1,000,000 grains - 30 days
Wise Man: 1 grain on the first day, doubled each day for 30 days
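A quick check of the wise man's request, assuming it means 1 grain on day 1 with the daily amount doubling through day 30 (one reading of the slide).

```python
days = 30

# Grains received on each day: 1, 2, 4, ... (doubling daily)
grains_on_day = [2 ** (d - 1) for d in range(1, days + 1)]

last_day = grains_on_day[-1]
total = sum(grains_on_day)

print(last_day)  # 536870912
print(total)     # 1073741823
```

Under that reading, the last day alone is worth far more than 1,000,000 grains.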
Cell Phone Growth
Suspect Exponential Growth…
1. Calculate Ratios of Consecutive Terms
- IF approximately the same… continue
2. Apply a Transformation that:
a. Transforms exponential growth into
linear growth
b. Transforms non-exponential growth
into non-linear growth
Model: $y = a \cdot b^x$
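The two checks above can be sketched on made-up data. The counts below are generated from $y = 100 \cdot 1.5^x$ and are not the cell phone data. Step 1 looks at ratios of consecutive terms. Step 2 takes logs and looks at the differences, which are constant when growth is linear.

```python
import math

years = [0, 1, 2, 3, 4, 5]
counts = [100.0, 150.0, 225.0, 337.5, 506.25, 759.375]  # hypothetical

# Step 1: ratios of consecutive terms (about equal => suspect exponential)
ratios = [b / a for a, b in zip(counts, counts[1:])]

# Step 2: log transformation; equal steps in log(y) mean linear growth
log_counts = [math.log10(y) for y in counts]
log_steps = [b - a for a, b in zip(log_counts, log_counts[1:])]

print(ratios)
print(log_steps)
```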
Logarithm Review…
$\log_b x = y$ if and only if $b^y = x$
1. $\log(AB) =$
2. $\log(A/B) =$
3. $\log X^p =$
The Transformation…
We hypothesize an exponential model of the
form $y = a \cdot b^x$
To gain linearity, use the (x, log(y))
transformation
$\log y = \log(a \cdot b^x)$
$\log y = \log a + \log b^x$
$\log y = \log a + x \log b$
Form? $\log y = \log a + (\log b)\,x$
When our data is growing exponentially… if
we plot the log of y versus x, we should
observe a straight line for the
transformed data!
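A minimal sketch of fitting the LSRL to (x, log y). The helper `lsrl` is written out by hand for illustration, and the data are generated from $y = 3 \cdot 2^x$. They are not the cell phone data.

```python
import math


def lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2 ** x for x in xs]  # hypothetical exponential data

# Regress log10(y) on x: intercept estimates log a, slope estimates log b.
intercept, slope = lsrl(xs, [math.log10(y) for y in ys])
a = 10 ** intercept
b = 10 ** slope

print(a, b)
```

Because the made-up data are exactly exponential, the fit recovers a and b up to rounding. Real data would scatter about the line.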
LOG (Y) = -263 + 0.134 (year)
R-sq = 98.2%
Eliminate first 4 years & perform regression
LOG (New Y) = -189 + 0.0970(New X)
R-sq = 99.99%
Predictions in
Exponential Growth Model
Regression is often used for predictions
In exponential growth, ________ rather
than actual values follow a linear pattern
To make a prediction of Exp. Growth we
must thus “undo” the logarithmic
transformation.
The inverse operation of a logarithm is
_____________________
LOG (New Y) = -189 + 0.0970(New X)
R-sq = 99.99%
Predict the number of cell phone
users in 2000.
$\hat{y} = 10^{\log(\text{New Y})}$
$\hat{y} = 10^{-189 + 0.0970(\text{New X})}$
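A sketch of the prediction, assuming New X is the calendar year and taking the rounded coefficients from LOG (New Y) = -189 + 0.0970(New X) at face value. The function name `predict_users` is made up, and the result is in whatever units the original counts use.

```python
def predict_users(new_x: float) -> float:
    """Undo the log transformation: y-hat = 10 ** (predicted log y)."""
    log_y_hat = -189 + 0.0970 * new_x  # fitted line from the slide
    return 10 ** log_y_hat


y_hat_2000 = predict_users(2000)
print(y_hat_2000)  # about 10**5
```

Because the prediction is a power of 10, small rounding changes in the slope or intercept change the answer noticeably.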
If a variable grows exponentially…
its ___________ grow linearly!
In other words… if (x, y) is
exponential, then (x, log(y)) is
linear!
Read and do Technology
Toolbox- Page 210-211
on your own!!
2. Power
Transformations
Power Law Model
Example:
Pizza Shop- order pizza by diameter
10 inch
12 inch
14 inch
Diameter x, radius $r = x/2$:
$\text{area} = \pi r^2 = \pi \left(\frac{x}{2}\right)^2 = \frac{\pi}{4} x^2$
Amount you get depends
on the area of the pizza
Area circle = pi times the square of the radius
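A short sketch of the pizza areas for the three diameters on the slide. The function name `pizza_area` is made up for the illustration.

```python
import math


def pizza_area(diameter: float) -> float:
    """Area of a circle from its diameter: (pi / 4) * diameter**2."""
    return math.pi / 4 * diameter ** 2


areas = {d: pizza_area(d) for d in (10, 12, 14)}  # square inches
ratio_14_to_10 = areas[14] / areas[10]            # equals (14 / 10) ** 2

for d, area in areas.items():
    print(d, round(area, 1))
print(round(ratio_14_to_10, 2))
```

The 14 inch pizza is only 1.4 times as wide as the 10 inch pizza but has about 1.96 times the area, which is the power law at work.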
Power Law
Model
Power Laws
We expect area to go up with the square of a
dimension
We expect volume to go up with the cube of a
dimension
Real Examples: Many Characteristics of Living
Things
Kleiber’s Law- The rate at which animals use
energy goes up as the ¾ power of their body
weight (works from bacteria to whales).
Power Laws Become Linear
Exponential growth becomes linear when
we apply the logarithm to the response
variable (y).
Power Laws become linear when we
apply the logarithm transformation to
BOTH variables.
To Achieve Linearity…
1. The power law model is $y = a \cdot x^p$
2. Take the logarithm of both sides of
equation (this straightens scatterplot)
3. Power p in the power law becomes the
slope of the straight line that links log(y)
to log(x)
4. Undo transformation to make prediction
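The four steps can be written out as a short derivation. The symbols $b_0$ and $b_1$ for the fitted intercept and slope are introduced here only for illustration.

```latex
\begin{align*}
y &= a \cdot x^{p} \\
\log y &= \log\left(a \cdot x^{p}\right) = \log a + p \log x \\
\widehat{\log y} &= b_0 + b_1 \log x
  \qquad (b_0 \text{ estimates } \log a,\; b_1 \text{ estimates } p) \\
\hat{y} &= 10^{b_0} \cdot x^{b_1}
\end{align*}
```

The last line is the "undo" step: raising 10 to both sides turns the fitted line back into a power law.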
Fish Example…
Read Example 4.9 page 216
Model: $\text{weight} = a \times \text{length}^3$
$\log(\text{weight}) = \log a + 3 \log(\text{length})$
Yes, the plot appears very linear, so perform LSRL
on [log(length), log(weight)]
LSRL:
log(weight)= -1.8994 + 3.0494log(length)
r = .9926, $r^2$ = .9985
$\log(\text{weight}) = -1.8994 + 3.0494 \log(\text{length})$
$= -1.8994 + \log(\text{length}^{3.0494})$
$\text{weight} = 10^{-1.8994} \times \text{length}^{3.0494}$
This is the final power equation for the
original data (note: look at the value of the power p)!
Prediction…
Why did we do this?
Predict the weight of a 36 cm fish
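A sketch of the prediction from the power equation above, with the coefficients copied from the LSRL on the slide. The function name `predict_weight` is made up, and the result is in whatever weight units Example 4.9 uses.

```python
def predict_weight(length_cm: float) -> float:
    """Power model: weight = 10**(-1.8994) * length**3.0494."""
    return (10 ** -1.8994) * (length_cm ** 3.0494)


weight_36 = predict_weight(36)
print(round(weight_36))
```

The fitted power 3.0494 is close to the 3 expected when weight goes up with the cube of length.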
Summary- Order of
Checking…
1. Look to see if there is a
___________________if so use LSRL
2. If points are ____________plot (x, log
y) or (x, ln y) to gain linearity
3. If there is a _________ relationship
(power model) plot (logx, logy)
4. If the scatterplot looks ________ plot
(logx, y)