Chapter 4: More on Two-Variable (Bivariate) Data

4.1 Transforming Relationships
 Animal’s Brain Weight vs. Weight of Body
With outliers: r = .86
Drop outliers: r = .50
Plot logarithm vs. logarithm: r = .96
The vertical spread about the LSRL is similar everywhere, so the predictions of brain weight from body weight will be pretty precise (high r²) in LOG SCALE.
 Working with a function of our original
measurements can greatly simplify
statistical analysis.
 Transforming: How?
Recall…
 In Chapter 1 we did Linear Transformations
 Took a set of data and transformed it linearly
Called: SHIFTING
Examples: C to F, Meters to Miles
 A Linear Transformation CANNOT make
a curved relationship between 2
variables “straight”
 Resort to common non-linear functions
like the logarithm, positive & negative
powers
 We can transform either one of the explanatory/response variables OR BOTH
 When we do, we will call the variable being transformed “t”
Real World Example:
We measure fuel consumption of a car in
miles per gallon
Engineers measure it in gallons per mile
(how many gallons of fuel the car
needs to travel 1 mile)
Reciprocal Transformation: f(t) = 1/t
My Car- 25 miles per gallon
1/25=.04 gallons per mile
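The miles-per-gallon example can be checked in a few lines. This is only an illustrative sketch; the function name `gallons_per_mile` is made up for the example and is not from the slides.

```python
def gallons_per_mile(miles_per_gallon: float) -> float:
    """Reciprocal transformation f(t) = 1/t, defined here for t > 0."""
    if miles_per_gallon <= 0:
        raise ValueError("miles per gallon must be positive")
    return 1.0 / miles_per_gallon


my_car = gallons_per_mile(25)   # 25 miles per gallon, as on the slide
print(my_car)
```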
Monotonic Function
 A monotonic function f(t) moves in one
direction as its argument “t” increases
 Monotonic Increasing
 Monotonic Decreasing
Monotonic Increasing:
$a + bt$, slope $b > 0$
$t^2$, positive “t”
$\log t$
Monotonic Decreasing:
$a + bt$, slope $b < 0$
$1/\sqrt{t} = t^{-1/2}$, positive “t”
$1/t = t^{-1}$
 Nonlinear monotonic
transformations change data
enough to change form or
relations between 2 variables,
yet preserve order and allow
recovery of original data.
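A small sketch of the two properties just mentioned, order and recovery. The data values are made up for illustration; the logarithm stands in for a monotonic increasing transformation and the reciprocal for a monotonic decreasing one.

```python
import math

data = [0.5, 2.0, 8.0, 32.0]              # made-up positive values, already sorted

logged = [math.log10(t) for t in data]    # monotonic increasing transformation
recip = [1.0 / t for t in data]           # monotonic decreasing transformation

# An increasing transformation keeps the order; a decreasing one reverses it.
order_kept = logged == sorted(logged)
order_reversed = recip == sorted(recip, reverse=True)

# Both transformations can be undone to recover the original data.
recovered_from_log = [10 ** v for v in logged]
recovered_from_recip = [1.0 / v for v in recip]

print(order_kept, order_reversed)
print(all(math.isclose(a, b) for a, b in zip(data, recovered_from_log)))
print(all(math.isclose(a, b) for a, b in zip(data, recovered_from_recip)))
```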
Strategy:
1. If the variable that you want to transform has values that are 0 or negative, apply a linear transformation (add a constant) to make all values positive.
2. Choose a power or logarithmic transformation that approximately straightens the scatterplot.
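Step 1 of the strategy might look like the sketch below. The helper `shift_to_positive`, its `margin` parameter, and the data are all assumptions made for illustration; the slides only say to add a constant.

```python
import math


def shift_to_positive(values, margin=1.0):
    """Add a constant so the smallest value becomes `margin` (assumed > 0).

    Values that are already all positive are left unchanged.
    """
    smallest = min(values)
    shift = margin - smallest if smallest <= 0 else 0.0
    return [v + shift for v in values], shift


raw = [-3.0, 0.0, 2.0, 10.0]               # made-up data with a 0 and a negative
shifted, shift = shift_to_positive(raw)    # step 1: linear transformation
logged = [math.log10(v) for v in shifted]  # step 2: now the logarithm is defined

print(shift, shifted)
```

The size of the constant is a judgment call; different shifts give different transformed plots, so it is worth looking at the scatterplot after each choice.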
Ladder of Power Transformations:
Power Function: $t^p$
 Monotonic Power Function
For t > 0…
1. Positive p: monotonic increasing (e.g. $t^2$)
2. Negative p: monotonic decreasing (e.g. $1/\sqrt{t} = t^{-1/2}$)
 Monotonic decreasing is hard to interpret because it reverses the order of the original data points
 We therefore want to make all $t^p$ monotonic increasing:
$\frac{t^p - 1}{p}$
We can apply a LINEAR TRANSFORMATION
Original Data (t)    Power Function 1/t    Linear Trans: 1 - 1/t
0                    undefined             undefined
1                    1                     0
2                    1/2                   1/2
3                    1/3                   2/3
4                    1/4                   3/4
t p 1
p
1 1
t
t
t 1
p
p
1 1
t
t
1 1
t
t
1 1
t
t
t 1
p
p
This is a line
This is log t
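To see why the logarithm sits at p = 0 on the ladder, the shifted power function $(t^p - 1)/p$ can be evaluated for powers close to 0. This is a numerical sketch only; the function name `ladder` and the test value t = 3 are chosen for illustration. The limit is the natural logarithm, and a base-10 logarithm differs from it only by a constant factor, which is itself a linear transformation.

```python
import math


def ladder(t: float, p: float) -> float:
    """Linearly transformed power function (t**p - 1) / p, for t > 0 and p != 0."""
    return (t ** p - 1.0) / p


t = 3.0
for p in (1.0, 0.5, 0.1, 0.001, -0.001, -0.5, -1.0):
    print(f"p = {p:>6}: {ladder(t, p):.4f}")
print(f"ln(t)     : {math.log(t):.4f}")
```

At p = 1 the function is the line t - 1, at p = -1 it is 1 - 1/t, and the values for small positive and small negative p close in on ln(t) from either side.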
Concavity of Power Functions:
p greater than 1:
- Pushes out the right tail & pulls in the left tail
- Gets stronger as power p moves up away from 1
p less than 1:
- Pushes out the left tail & pulls in the right tail
- Gets stronger as power p moves down away from 1
Country’s GDP vs. Life Expectancy
[Figure: scatterplots of the transformed data for three values of p]
Use
1
x
How do you know what
transformation will make the
scatterplot straight?
** DO NOT just push buttons!! **
 We will develop methods of
selection
1. Logarithmic Transformation
2. Power Transformation
1. Logarithmic
Transformations
Exponential Growth
A variable grows…
Linearly: $y = a + bx$
Exponentially: $y = a \cdot b^x$
The King’s Chess Board…
King’s Offer: 1,000,000 grains - 30 days
Wise Man: 1 grain on the first day, doubled each day for 30 days
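The wise man's plan is easy to tabulate. The sketch below assumes the doubling starts from 1 grain on day 1 and uses the 1,000,000-grain figure from the slide purely as a yardstick; the variable names are illustrative.

```python
KING_OFFER = 1_000_000    # grains, the figure quoted in the king's offer

# Wise man's plan: 1 grain on day 1, doubling every day for 30 days.
daily = [2 ** (day - 1) for day in range(1, 31)]

total = sum(daily)        # equals 2**30 - 1
first_big_day = next(day for day, grains in enumerate(daily, start=1)
                     if grains > KING_OFFER)

print(total)
print(first_big_day)
```

Under these assumptions a single day's grains pass 1,000,000 on day 21, and the 30-day total is 1,073,741,823 grains.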
Cell Phone Growth
Suspect Exponential Growth…
1. Calculate ratios of consecutive terms
- IF approximately the same… continue
2. Apply a transformation that:
a. Transforms exponential growth ($y = a \cdot b^x$) into linear growth
b. Transforms non-exponential growth into non-linear growth
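Step 1, the ratio check, could be sketched as follows. The function names, the 10% tolerance, and both data sets are assumptions made for illustration; the slides only say the ratios should be "approximately the same".

```python
def consecutive_ratios(values):
    """Ratio of each term to the one before it."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


def looks_exponential(values, tolerance=0.10):
    """True if every ratio is within `tolerance` (as a fraction) of the mean ratio."""
    ratios = consecutive_ratios(values)
    mean_ratio = sum(ratios) / len(ratios)
    return all(abs(r - mean_ratio) <= tolerance * mean_ratio for r in ratios)


exponential = [100, 150, 225, 337.5, 506.25]   # made-up: each term is 1.5 times the last
linear = [100, 150, 200, 250, 300]              # made-up: each term adds 50

print(consecutive_ratios(exponential), looks_exponential(exponential))
print(consecutive_ratios(linear), looks_exponential(linear))
```

A passing check is only a reason to continue to step 2, not proof of exponential growth.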
Logarithm Review…
$\log_b x = y$ if and only if $b^y = x$
1. $\log(AB) = \log A + \log B$
2. $\log(A/B) = \log A - \log B$
3. $\log X^p = p \log X$
The Transformation…
We hypothesize an exponential model of the form $y = a \cdot b^x$
To gain linearity, use the (x, log(y)) transformation
$\log y = \log(a \cdot b^x)$
$\log y = \log a + \log b^x$
$\log y = \log a + x \log b$
Form? $\log y = \log a + (\log b)\,x$
When our data is growing exponentially… if
we plot the log of y versus x, we should
observe a straight line for the
transformed data!
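Here is a minimal sketch of that idea on made-up data that follows $y = a \cdot b^x$ exactly (a = 5, b = 1.25 are arbitrary choices). The helper `fit_line` is a hand-written least-squares fit, not the calculator or software routine used in class.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data: exact exponential growth with a = 5 and b = 1.25.
xs = list(range(10))
ys = [5 * 1.25 ** x for x in xs]

# Fit the straight line to (x, log y) instead of (x, y).
intercept, slope = fit_line(xs, [math.log10(y) for y in ys])

a_hat = 10 ** intercept    # the intercept is log a
b_hat = 10 ** slope        # the slope is log b

print(a_hat, b_hat)
```

Because the data are exactly exponential, the fit recovers a and b up to rounding; real data would scatter about the line.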
LOG (Y) = -263 + 0.134 (year)
R-sq = 98.2%
Eliminate first 4 years & perform regression
LOG (New Y) = -189 + 0.0970(New X)
R-sq = 99.99%
Predictions in
Exponential Growth Model
 Regression is often used for predictions
 In exponential growth, ________ rather
than actual values follow a linear pattern
 To make a prediction of Exp. Growth we
must thus “undo” the logarithmic
transformation.
 The inverse operation of a logarithm is
_____________________
LOG (New Y) = -189 + 0.0970(New X)
R-sq = 99.99%
Predict the number of cell phone users in 2000.
$10^{\log(\text{New } Y)} = 10^{-189 + 0.0970(\text{New } X)}$
$\hat{y} = 10^{-189 + 0.0970(\text{New } X)}$
$\hat{y} = 10^{-189 + 0.0970(2000)}$
If a variable grows exponentially…
its ___________ grow linearly!
In other words… if (x, y) is
exponential, then (x, log(y)) is
linear!
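Undoing the logarithm for the cell phone prediction can be sketched like this, using the rounded coefficients from the regression output LOG (New Y) = -189 + 0.0970(New X). The function name is made up, and the units of the result are whatever units the original counts were recorded in, which the slides do not state.

```python
def predict_new_y(new_x: float) -> float:
    """Back-transform the fitted line to the original scale."""
    log_y_hat = -189 + 0.0970 * new_x    # fitted line in log scale
    return 10 ** log_y_hat               # the inverse of log base 10 is 10**


y_hat_2000 = predict_new_y(2000)
print(y_hat_2000)
```

With these rounded coefficients the exponent for the year 2000 works out to 5, so the prediction is about 10^5. The result is very sensitive to rounding: changing the slope by 0.0001 moves the exponent by 0.2.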
Read and do Technology
Toolbox- Page 210-211
on your own!!
2. Power
Transformations
Power Law Model
Example:
Pizza Shop: order pizza by diameter (10 inch, 12 inch, 14 inch)
Amount you get depends on the area of the pizza
Area circle = pi times the square of the radius
$\text{area} = \pi r^2 = \pi \left(\frac{x}{2}\right)^2 = \pi \frac{x^2}{4} = \frac{\pi}{4} x^2$ (x = diameter)
This is a Power Law Model
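The pizza sizes from the slide make a quick numerical illustration of a power law with p = 2. The function name `pizza_area` is illustrative.

```python
import math


def pizza_area(diameter: float) -> float:
    """Area as a power law in the diameter x: (pi / 4) * x**2."""
    return math.pi / 4 * diameter ** 2


areas = {d: pizza_area(d) for d in (10, 12, 14)}   # diameters in inches
ratio_14_to_10 = areas[14] / areas[10]             # equals (14 / 10)**2

for d, area in areas.items():
    print(f"{d} inch: {area:.1f} square inches")
print(f"{ratio_14_to_10:.2f}")
```

The diameter goes up by 40% from 10 to 14 inches, but the amount of pizza nearly doubles, because area goes up with the square of the diameter.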
Power Laws
 We expect area to go up with the square of a dimension
 We expect volume to go up with the cube of a dimension
Real Examples: Many Characteristics of Living
Things
Kleiber’s Law- The rate at which animals use
energy goes up as the ¾ power of their body
weight (works from bacteria to whales).
Power Laws Become Linear
 Exponential growth becomes linear when
we apply the logarithm to the response
variable (y).
 Power Laws become linear when we
apply the logarithm transformation to
BOTH variables.
To Achieve Linearity…
1. The power law model is $y = a \cdot x^p$
2. Take the logarithm of both sides of the equation (this straightens the scatterplot): $\log y = \log a + p \log x$
3. The power p in the power law becomes the slope of the straight line that links log(y) to log(x)
4. Undo the transformation to make a prediction
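The four steps can be sketched on made-up data that follows a power law exactly (a = 2 and p = 1.5 are arbitrary choices). As before, `fit_line` is a hand-written least-squares helper used only for illustration.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Step 1: made-up data from the power law y = 2 * x**1.5.
xs = [1, 2, 4, 8, 16, 32]
ys = [2 * x ** 1.5 for x in xs]

# Step 2: take logarithms of BOTH variables and fit a line.
intercept, slope = fit_line([math.log10(x) for x in xs],
                            [math.log10(y) for y in ys])

# Step 3: the slope is the power p; the intercept is log a.
p_hat = slope
a_hat = 10 ** intercept


# Step 4: undo the transformation to predict in the original units.
def predict(x: float) -> float:
    return a_hat * x ** p_hat


print(a_hat, p_hat, predict(10))
```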
Fish Example…
Read Example 4.9 page 216
Model: $\text{weight} = a \cdot \text{length}^3$
 $\log(\text{weight}) = \log a + 3 \log(\text{length})$
 Yes, appears very linear: perform LSRL on [log(length), log(weight)]
LSRL:
$\log(\text{weight}) = -1.8994 + 3.0494 \log(\text{length})$
r = .9926, r² = .9853
Prediction…
Why did we do this?
$\log(\text{weight}) = -1.8994 + 3.0494 \log(\text{length})$
$= -1.8994 + \log(\text{length}^{3.0494})$
$\text{weight} = 10^{-1.8994} \times \text{length}^{3.0494}$
This is the final power equation for the original data (note: look at the value of p)!
Predict the weight of a 36 cm fish
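The fish prediction can be worked through with the fitted power equation, using the intercept -1.8994 and slope 3.0494 from the LSRL. The function name is illustrative, and the weight comes out in whatever units the original data used, which the slides do not state.

```python
def predict_weight(length_cm: float) -> float:
    """Power model in the original scale: weight = 10**(-1.8994) * length**3.0494."""
    return 10 ** -1.8994 * length_cm ** 3.0494


weight_36 = predict_weight(36)
print(round(weight_36, 1))
```

The fitted power 3.0494 is close to the 3 suggested by the volume argument, and the prediction for a 36 cm fish comes out near 700.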
Summary- Order of
Checking…
 1. Look to see if there is a
___________________if so use LSRL
 2. If points are ____________plot (x, log
y) or (x, ln y) to gain linearity
 3. If there is a _________ relationship
(power model) plot (logx, logy)
 4. If the scatterplot looks ________ plot
(logx, y)
