Chapter 10: Re-expressing Data
It’s easier than you think!
Goals of Re-expression
Goal 1:
• Make the distribution of a variable more
symmetric.
• Easier to summarize the center, using mean
and standard deviation.
• If the distribution is also unimodal, use the 68-95-99.7
Rule.
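A minimal sketch of Goal 1, using made-up data rather than anything from the chapter: values that double at each step are strongly skewed to the right, and taking logs spaces them evenly. The helper `skewness` is a hypothetical name; it computes the average cubed z-score, which is 0 for a perfectly symmetric batch.

```python
import math
import statistics


def skewness(values):
    """Return the moment-based skewness: the mean cubed z-score."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


# Hypothetical right-skewed data: each value doubles the previous one.
raw = [2 ** k for k in range(11)]          # 1, 2, 4, ..., 1024
logged = [math.log10(v) for v in raw]      # evenly spaced after re-expression

raw_skew = skewness(raw)
log_skew = skewness(logged)
print(f"skewness of raw values:   {raw_skew:.2f}")
print(f"skewness of log10 values: {log_skew:.2f}")
```

The raw skewness is clearly positive, while the logged values are symmetric about their mean, so the mean and standard deviation summarize them well.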
Goal 2:
• Make the spread of several groups more
alike, even if their centers differ.
Goal 3:
• Make the form of the scatterplot more nearly
linear.
Goal 4:
• Make the scatter in the scatterplot spread out
evenly rather than following a fan shape.
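Goals 2 and 4 share one idea: spread that grows with the size of the values can be evened out. A small sketch with three invented groups (the group names and numbers are assumptions for illustration) compares the standard deviations before and after taking logs.

```python
import math
import statistics

# Hypothetical groups whose spread grows in proportion to their center.
groups = {
    "A": [8, 10, 12.5],
    "B": [80, 100, 125],
    "C": [800, 1000, 1250],
}

raw_sd = {name: statistics.stdev(vals) for name, vals in groups.items()}
log_sd = {
    name: statistics.stdev([math.log10(v) for v in vals])
    for name, vals in groups.items()
}

for name in groups:
    print(f"group {name}: raw SD = {raw_sd[name]:8.2f}   log10 SD = {log_sd[name]:.4f}")
```

The raw standard deviations differ by a factor of about 100, while the log10 standard deviations match, even though the three centers stay different.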
The Ladder of Powers
Power   Name         Comment
2       y^2          Unimodal, skewed left
1       Raw data     Data that takes on +/- values
½       y^(1/2)      Counted data
0       logarithm    Measurements that cannot be negative
-½      -1/y^(1/2)   The minus sign preserves the direction of the relationship
-1      -1/y         Ratios of two quantities
Attack of the Logarithms
Model Name    x-axis   y-axis   Comment
Exponential   x        log(y)   Useful with values that grow by % increase.
Logarithmic   log(x)   y        Useful with a wide range of x values or a scatterplot descending rapidly, then leveling off.
Power         log(x)   log(y)   When one of the ladder’s powers is too big and the other is too small.
Let’s Try It! (Pg 192)
Shutter speed and
f/stop of the lens
• L1: shutter speed
• L2: f/stop
Curved stat plot
• Try logarithms
• Take log of L1→L3
• Take log of L2→L4
Scatterplot #1:
• Xlist→L3, Ylist→L2
Scatterplot #2:
• Xlist→L1, Ylist→L4
Use Scatterplot #3:
• LinReg L3, L4
Multiple Benefits
A single re-expression may serve
several of our goals at the same time.
Re-expression certainly simplifies efforts
to analyze and understand relationships.
Simpler explanations and simpler
models tend to give a truer picture of the
relationship. (Occam’s Razor)
TI Tips
Regressions that automatically and
appropriately re-express the data:
Equivalent Models
Type of Model   Re-expression Equation   Calculator’s Command   Curve Equation
Logarithmic     ŷ = a + b log(x)         LnReg                  ŷ = a + b ln(x)
Exponential     log(ŷ) = a + bx          ExpReg                 ŷ = a·b^x
Power           log(ŷ) = a + b log(x)    PwrReg                 ŷ = a·x^b
What Can Go Wrong?!?
Beware of multiple modes.
• Re-expression cannot pull separate modes
together.
Watch out for scatterplots that turn
around.
• Re-expression can straighten a bend, but not a
curve that goes up and then down.
Watch out for negative data values.
• Negative values cannot be re-expressed by the
square root (½), the logarithm (0), or the
reciprocal square root (-½).
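A quick check of which re-expressions accept a negative value. This is a sketch with a hypothetical helper, `try_reexpress`, which reports a result or notes that Python's `math` functions raised a domain error.

```python
import math


def try_reexpress(name, func, value):
    """Apply func to value and report whether the re-expression is defined."""
    try:
        return f"{name}({value}) = {func(value):.4f}"
    except ValueError:
        return f"{name}({value}) is undefined"


results = [
    try_reexpress("sqrt", math.sqrt, -4.0),
    try_reexpress("log10", math.log10, -4.0),
    try_reexpress("reciprocal", lambda v: -1 / v, -4.0),
    try_reexpress("square", lambda v: v ** 2, -4.0),
]
for line in results:
    print(line)
```

The square root and the logarithm fail on the negative input, while the reciprocal and the square are still defined.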
Watch for data far from one.
• Re-expressing data with a range from 1 to
1000 is far more effective than re-expressing
data with a range of 100,000 to 100,100.
Don’t stray too far from the ladder.
• Stick to powers between -2 and 2.
• Stick to the simpler powers contained in the
“ladder.”