Statistics Overview - Supplemental Teaching Resources

Statistics Overview
©2010 Dr. B. C. Paul
Why Are Statistics Important to
Engineers

Engineers build models (often mathematical
models) of systems and things that we
cannot screw-up on and learn the hard way.
Modeling


We build a mathematical model of the
situation and then do the math to see if it is
going to work for us in the real world
We may not think of it but most of our
engineering design equations are
mathematical models that were fit to actual
data long ago
- Newtonian physics (we call them laws now)
- Darcy's law and the Bernoulli Equation
How do You Decide if a
Mathematical Model Fits What You
See?

Because you usually can't measure with 100% accuracy, and you don't think of or can't consider every minor effect
- Real results tend to be distributed around our potential mathematical models
Statistical models consider a distribution of answers around an underlying trend
- You can know the shape and spread of the variation without knowing the cause
Example


If I have a random number generator that
produces numbers between 1 and 100, what
value is most likely?
If I take 25 of those random numbers what
will the average value most likely be close
to?
What Did You Assume to Get
Those Answers?

You assumed how those values were distributed
- You considered what is called a uniform distribution (all numbers are equally likely to come up)
Statistics begins with a series of standard mathematical distributions
- We try to pick the one that most nearly matches our reality
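The random number question can be tried out directly. This is a minimal sketch, assuming whole numbers from 1 to 100 and a fixed seed and repetition count that are arbitrary choices made only to keep the run repeatable. No single value is more likely than another, but the average of 25 draws clusters near the center of the range.

```python
import random
from statistics import mean

# Assumption: seed 0 and 10,000 repetitions are arbitrary illustration choices.
rng = random.Random(0)

def average_of_draws(n_draws=25, low=1, high=100):
    """Average of n_draws whole numbers drawn uniformly from low..high."""
    return mean(rng.randint(low, high) for _ in range(n_draws))

averages = [average_of_draws() for _ in range(10_000)]
typical_average = mean(averages)   # lands near the center of the range, (1 + 100) / 2
print(round(typical_average, 1))
```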
Getting Your Answers

You also assumed that the numbers were taken from that distribution at random
- i.e. no one is cherry-picking any values in preference to any other
- This is one of the reasons that statisticians get so crazy if they think someone is cherry-picking the sample
The root of all statistics is that you assume reality follows a standard mathematical distribution and that the part we see was picked at random from that distribution
How Do We Come Up With What
Distribution Closely Resembles Our
Reality?



The process starts with figuring out which of our standard model distributions it is
Three Levels of Effort
Level 1: Say "I Believe" and assume one
- Most commonly done with the "Normal Distribution" ("Bell Curve")
- Many things tend to be normally distributed
- Strength of past experience becomes the rationale
- There are also people who do it without having any idea what they have done
- Standard statistics is built around the normal distribution
Levels of Effort

Level 2
- Study the distribution to see if we are doing something terrible
- A common approach is called a "Histogram"
  - It's a bar graph that we plot our data on so we can look at it
- There are also things like probability paper, where you plot your data and see if you get a straight line
Effort Level 3

Use statistical techniques to test whether our sample data looks like a set that could reasonably be pulled from some standard distribution
- Often these are our goodness-of-fit tests
All three levels of effort are customary in some areas of practice
Measuring Properties of
Distributions

Put sample data into a standard equation that generates a number
- We often actually call that number a statistic
- It measures some property of the distribution that the data was taken from
Some statistics have an obvious tangible meaning
- Example: Mean, the mathematical average value of the sample or population
Calculating a Mean (or simple average)


Add up all the numbers and then divide by however many numbers you added
Example
- Numbers 5, 10, 15, 20, 25
- What is the Mean?
Calculate
- (5 + 10 + 15 + 20 + 25)/5
- Numerator totals to 75
- Denominator is the number of values I put in
- Divide the total by the number of values put in
- Answer is 15 (the Mean or Average Value)
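The same arithmetic as a short Python sketch; the variable names are only illustrative.

```python
values = [5, 10, 15, 20, 25]
total = sum(values)          # numerator: 75
count = len(values)          # denominator: how many values went in
mean_value = total / count   # 15.0
print(mean_value)
```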
Statisticians Need Confusing Ways to
Write Equations

X_i means a sample value
- The i subscript tells you whether it was the first, second, third, etc. sample
- From the example on the last slide we know X_2 was the second number we looked at, which was 10
Σ means the sum of a series of values
n means the number of samples considered
Thus we write the formula for the mean as
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
We of course also have a special symbol for a mean
- $\bar{X}$
More Measurements

Mode
- The value that has the greatest chance of coming up
Example
- If I have 10 people who are 5'10"
- 2 people who are 4'3"
- 2 people who are 6'10"
- If you pick a person at random from my group, what height will that person most likely be?
More Measures

Median
- Half of the values are higher, half are lower
Mean, Median, and Mode all seem to have somewhat obvious physical meanings
Other statistics are less obvious
- Variance
  - A number that comes out of a formula that tells you how spread out the distribution is
- The square root of the variance is the Standard Deviation
  - Roughly the typical difference between a sample and the mean value
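The three "obvious" measures can be checked on the group of 14 people from the mode example. This is a small sketch using Python's standard `statistics` module, with the heights converted to inches (5'10" = 70, 4'3" = 51, 6'10" = 82).

```python
from statistics import mean, median, mode

# Heights in inches: ten people at 70, two at 51, two at 82
heights = [70] * 10 + [51] * 2 + [82] * 2

most_likely = mode(heights)   # the height a randomly picked person most likely has
middle = median(heights)      # half of the values at or above, half at or below
average = mean(heights)       # pulled slightly below 70 by the two short people
print(most_likely, middle, average)
```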
The Standard Deviation

Standard Deviation is roughly the typical difference between individual samples and the mean
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
What does it mean?
Take each sample number, subtract the average sample value from it, and square the result. Do this for every number and add up the results, then divide the total by one less than the number of samples you took, and then take the square root of that value.
As a Practical Matter That’s a Pain


I have to compute the average before I can do the
math for standard deviation
Alternative Formula
$$s = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} X_i\right)^2}{n - 1}}$$
Tells you to keep track of two numbers
1- Take each number, square it, and then add the squares up
2- Take each number, add them up, and then square the total
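As a quick check that the two-totals shortcut gives the same answer as the definition, here is a sketch that runs both on the five numbers from the mean example (5, 10, 15, 20, 25). The variable names are illustrative only.

```python
import math

values = [5, 10, 15, 20, 25]
n = len(values)

# Definition: compute the average first, then the squared differences from it
mean_value = sum(values) / n
s_definition = math.sqrt(sum((x - mean_value) ** 2 for x in values) / (n - 1))

# Shortcut: keep only two running totals
sum_of_squares = sum(x * x for x in values)   # 1- square each number, add the squares
square_of_sum = sum(values) ** 2              # 2- add the numbers, square the total
s_running = math.sqrt((sum_of_squares - square_of_sum / n) / (n - 1))

print(s_definition, s_running)
```

Both routes give about 7.91. The shortcut is what lets a calculator accumulate totals as numbers are keyed in, without storing the data.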
Getting Standard Deviation

Statistical calculators have multiple memories
- They add up numbers in one memory
- They square and add up numbers in another
- They count the entries in another
- They then apply the standard deviation formula
Of course you can also use SPSS
Types of Distributions

The idea is that we try to approximate reality with a mathematically defined distribution
- Then we can use mathematical operations to predict our answers
Distributions that often fit reality
- Normal Distribution (developed in 1733)
  - Bell Curve
- Uniform Distribution
- Binomial Distribution
- T Distribution
- Chi Square Distribution
- Lognormal Distribution
Derived Distributions

T distribution, Chi Square, and Lognormal
Distributions are all derived from the Normal
Distribution for specific types of situations
Normal Distribution

Shaped Like
Formula
$$Y = f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Symmetric Distributions with a Central
Tendency

Normal Distribution is the classic example
- Most of the chances are right near the center of the distribution
  - Frequency drops off to the sides
  - Mode is at the center of the distribution
- Distribution is a mirror image about its center
  - Allows us to just compute one side
  - Median is the Mean is the Mode
A lot of reality has central tendency with relatively symmetric sides
- T distribution is like that too
  - Sides slope off a little differently
Why the Normal Distribution

One of the first mathematically defined distributions that was a really good fit
- People developed other formulas and distributions from calculations done on the normal distribution
  - T distribution and Chi Square distribution both result from performing mathematical operations on samples of a normal distribution
Normal Distribution was first to press as a distribution that was heavy at the center and symmetric
Reality 101 for Statistical Distributions



There is probably no such thing as a real normal distribution in life
Even if there were, we almost never count each and every member of the population, so you'd never know if it was
Statistical distributions let us take limited data and see what it approximately is
- Then use the defined mathematical model to suddenly know everything about it
Back to Why the Normal Distribution


A big part of the real world has central tendency and is symmetric
Calculations done with a normal distribution were found to be robust
- Minor lack of fit in real world data doesn't change the answers much
- Thus it works on almost anything with central tendency that is nearly symmetric
Most Common Lack of Fit

Not Symmetric
- Robustness covers a little skewness
- This type of shape can be fit with a distribution adapted from normal called lognormal
- Taking logarithms of the data will make the transformed distribution normal
- Taking the square root will normalize a few others
- If you take averages of about 25 samples from this, the averages will be approximately normal (averaging normalizes)
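Both normalizing tricks can be seen in a small simulation. This is a sketch under assumptions: the lognormal parameters (0 and 0.5), the seed, and the sample counts are arbitrary illustration values, and the moment-based `skewness` helper is defined here rather than taken from any library. A skewness near 0 means a symmetric shape.

```python
import math
import random
from statistics import mean, pstdev

def skewness(data):
    """Simple moment-based skewness: about 0 for a symmetric sample."""
    m, s = mean(data), pstdev(data)
    return mean(((x - m) / s) ** 3 for x in data)

rng = random.Random(42)   # fixed seed, an arbitrary choice

# A skewed (lognormal) sample, standing in for lopsided real-world data
raw = [rng.lognormvariate(0.0, 0.5) for _ in range(5000)]

# Taking logarithms of the data
logged = [math.log(x) for x in raw]

# Averages of 25 samples at a time
averaged = [mean(rng.lognormvariate(0.0, 0.5) for _ in range(25))
            for _ in range(2000)]

raw_skew = skewness(raw)
log_skew = skewness(logged)
avg_skew = skewness(averaged)
print(raw_skew, log_skew, avg_skew)
```

The logged data comes out essentially symmetric, while the averages are much less skewed than the raw data but not perfectly symmetric, which is why "approximately normal" is the safer wording.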
Multi-Modal Distributions
These types of distributions are often 3 different normally
Distributed families over-lying each other
Finding what is causing the three families often helps us
To better understand our world
Uniform Distribution



All values within some range (which may or
may not be plus or minus infinity) are equally
likely
Distribution has no central tendency
Tends to be associated with truly random
events (or at least events where the
underlying cause is eluding our mathematical
modeling)
Characteristics of Uniform
Distribution





Because all values are equally likely it has no mode
Mean is at the center of the range
Uniform is still symmetric about the Mean, so the Median and Mean are the same
Standard Deviation is the range divided by the square root of 12, about 0.29 times the range (if the range is infinite, obviously that's not defined)
Variance is Standard Deviation squared
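A quick numerical check of these properties: for a continuous uniform distribution the standard deviation works out to the range divided by √12. The range of 0 to 100, the seed, and the sample size below are hypothetical values picked for illustration.

```python
import math
import random
from statistics import mean, stdev

low, high = 0.0, 100.0                        # hypothetical range
expected_mean = (low + high) / 2              # center of the range
expected_sd = (high - low) / math.sqrt(12)    # about 0.289 times the range

rng = random.Random(1)                        # fixed seed, an arbitrary choice
sample = [rng.uniform(low, high) for _ in range(50_000)]
sample_mean = mean(sample)
sample_sd = stdev(sample)
print(expected_mean, expected_sd, sample_mean, sample_sd)
```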
Binomial Distribution

Outcomes that are either off or on
- Clearly describes computers and digital data
Many things either work or they don't
- Mining: whether our trucks are in working order
- Water treatment plant: the water purification train is working or not working
- Coin tosses are heads or tails
New Problem


Can't talk about means, modes, and medians because the outcome has no continuous distribution
Want to know what fraction of the outcomes are "yes"
- P = 0.85 means 85% of the members of a binomial population are positive
Usually interested in what the chances are that we can take 5 members out of the population and have them all positive
- Example: if I have 5 mining trucks, how much of the time will all 5 be running?
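The truck question can be worked with the binomial formula. This is a minimal sketch, assuming each truck is running 85% of the time and that trucks break down independently of one another (an assumption the slide does not state). The `at_least_four` line is an extra illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Chance of exactly k positives among n independent members."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_running = 0.85   # fraction of the time a single truck is running
trucks = 5

all_running = binomial_pmf(trucks, trucks, p_running)   # same as 0.85 ** 5
at_least_four = all_running + binomial_pmf(4, trucks, p_running)
print(all_running, at_least_four)
```

Under these assumptions all 5 trucks run only about 44% of the time, even though each one individually runs 85% of the time.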
The Ordinate Problem

How continuously distributed are our outcomes?
- Our number line is continuous, so at first glance we almost assumed everything was continuous
- When are they not, and what if they are not?
- This usually doesn't take a very smart statistician to figure out
Some things are yes or no distributed
- Use the binomial distribution model. Duh!
Some Things are Integer
Distributed

Continuity really is a function of observational scale
- According to quantum physics everything is made of integer numbers of discrete quanta
- At our observation scale the little integer jumps are perhaps so small we cannot even measure them
- Many times integer continuity is negligible
What If Integer Continuity is Not
Negligible?

Happens when we have small numbers or integer distributed data
- How does one deal with teacher rankings in classes of 5 students?
  - Our scale of observation is integer
  - Our sample size is small enough we can't mask it
  - If it was a class of 500 students we could probably model outcomes rather well as if continuous
- Non-Parametric Statistical Models
Using Statistics




Confidence Intervals and Hypothesis Tests
What would you say if we did a coin toss and I came up heads and won?
What if I did it to you 50 times in a row?
If something differs too much from the expected value, you question the things you assumed
- The null hypothesis is that nothing is going on
- Rejecting the null hypothesis means you question the fundamental assumptions.
Statistical Tests

What is a confidence interval?
If I take a sample, where is it most likely to come from?
Suppose I pull a sample and its value is from way out here (in the tail)?
What do I know? That was pretty unlikely to happen. In fact, at some point I'm going to wonder whether I really got it from that population.
Confidence interval problems all have the flavor of deciding how far out in the tails, how rare, the sample is or would be if you could get it.
Too Many Normal Distributions

Normal distribution is defined by its mean and standard deviation
- There are endless possibilities
We start by standardizing our results to a standard normal distribution with a mean of 0 and a stdev of 1.
- Has the form
$$Z = \frac{X - \mu}{\sigma}$$
[Figure: our value X on just any normal distribution, mapped to the equal point on the standard normal distribution (mean 0, stdev 1)]
Our formula converts that point to an equal point on the standard normal distribution.
Once we are on a standard normal distribution we look at how extreme a value we have
What % of the values are more extreme than this?
Preparing for Rainfall

Wendy Wetone has just designed a storm sewer system for a new housing project
- Culverts and intakes will handle a 2.5 inch rainfall in 24 hours
- The average big rainfall event in the area is only 1.25 inches
- Wendy is ok, right?
If the roads and homes in an area are going to wash out, maybe being ready for an average rain isn't good enough
Reality for Major Rainfall Events

Average is 1.25 inches, but suppose there is a 1 inch standard deviation
How would we know something like this? We built a model from weather records.
[Figure: normal curve with μ = 1.25 and σ = 1]
Is there enough of a chance up here (in the upper tail) that I should be getting heartburn over this design?
We Know How to Solve This One


Normal Distribution is fully defined by a formula
We only need to know the average (in this case 1.25) and the variance (standard deviation squared, easy when the standard deviation is one)
$$Y = f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
What That Formula Does



Y is a probability density value (relative chance of occurrence)
X in this case is a rainfall event
Rather obviously we are interested in rainfall events greater than 2.5 inches
- Guess that means x is 2.5
Problem: the formula gives a value for only a single point, i.e. it describes exactly a 2.5 inch rain event
- We are in fact worried about any event that exceeds our design capacity
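To make the point concrete, here is the normal formula written out as a function and evaluated at the 2.5 inch design value. The function name `normal_pdf` is an illustrative choice. The number that comes back is the height of the curve at that one point, not the chance of exceeding it.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x (a density, not a tail probability)."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

y_at_design = normal_pdf(2.5, mu=1.25, sigma=1.0)   # curve height at 2.5 inches
y_at_mean = normal_pdf(1.25, mu=1.25, sigma=1.0)    # curve height at the average
print(y_at_design, y_at_mean)
```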
That’s not a Problem for Us Smart
Engineers

Just integrate the function from 2.5 inches on up
- In fact most statistical modeling is done on cumulative probability distributions (i.e. integrated areas under the probability density function)
Just one little problem
- The normal probability density function is one of those beasts that the math teachers don't like to talk about: you can't get an analytical integrated solution
That’s Only a Problem for
Mathematicians


We have numeric integration
OK, maybe that is a problem if we have to integrate that thing
- Remember, desktop computers are of recent vintage
  - Do you have a numeric integration package on your computer even now?
The Normal Distribution dates from 1733, so you know someone created tables of the numeric integration
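For readers who do have a computer handy, this is a rough sketch of the numeric integration using the trapezoid rule, applied to the rainfall numbers (mean 1.25, standard deviation 1, design capacity 2.5). The cut-off at 10 standard deviations above the start and the step count are assumptions chosen for illustration, not anything prescribed by the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def upper_tail_area(start, mu, sigma, width_in_sigmas=10.0, steps=20_000):
    """Trapezoid-rule area under the normal curve from start upward.

    The infinite upper limit is cut off width_in_sigmas standard deviations
    above start, where the remaining area is negligible for this example.
    """
    stop = start + width_in_sigmas * sigma
    h = (stop - start) / steps
    total = 0.5 * (normal_pdf(start, mu, sigma) + normal_pdf(stop, mu, sigma))
    total += sum(normal_pdf(start + i * h, mu, sigma) for i in range(1, steps))
    return total * h

chance_over_capacity = upper_tail_area(2.5, mu=1.25, sigma=1.0)
print(round(chance_over_capacity, 4))
```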
Normal Distribution Table
Converting to a Value on Standard
Normal Distribution

What we want to know is the chance of a rainfall event exceeding our drainage system design
- i.e. what percentage of big rainstorms will exceed 2.5 inches (on a distribution with a mean of 1.25 and a standard deviation of 1)
Convert 2.5 inches to an equivalent value on the standard normal distribution
- The area above that value under the curve will be the same as in our actual distribution.
Magic Conversion Formula
$$Z = \frac{x - \mu}{\sigma}, \qquad 1.25 = \frac{2.5 - 1.25}{1}$$
Now Its Look Up Time
Prob = 0.8944
Results

Table shows that from minus infinity to Z = 1.25 there is 0.8944
- i.e. 0.1056 is above 1.25
English Translation
- There is a 10.56% chance that a large rainfall event will exceed the design capacity of our drainage system
- Sounds like Wendy might be doing some design work over
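The table look up can be reproduced with `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later). This is a sketch of the same steps, with variable names chosen for illustration.

```python
from statistics import NormalDist

mu, sigma, capacity = 1.25, 1.0, 2.5

z = (capacity - mu) / sigma           # convert to the standard normal: 1.25
area_below = NormalDist().cdf(z)      # what the printed table lists, about 0.8944
chance_exceeding = 1.0 - area_below   # upper tail, about 0.1056
print(z, round(area_below, 4), round(chance_exceeding, 4))
```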
Basis for Rainfall Events




10% chance called a 10 year storm (distribution of each year's largest storms)
5% chance called a 20 year storm
1% chance called a 100 year storm
When we say it is designed for a 100 year flood it doesn't mean it only happens every 100 years
- It means a 1% chance in any given year
- The problem with the other thinking is that if you had a big flood 5 years ago, that must mean there is no chance it will ever happen again in your lifetime (Wrong!)
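A short sketch of what "1% chance in any given year" adds up to over time, assuming each year is independent of the others. The 30 year and 100 year horizons are hypothetical examples.

```python
def chance_of_at_least_one(annual_chance, years):
    """Chance of one or more exceedances, assuming independent years."""
    return 1.0 - (1.0 - annual_chance) ** years

hundred_year_in_30 = chance_of_at_least_one(0.01, 30)     # roughly 26%
hundred_year_in_100 = chance_of_at_least_one(0.01, 100)   # roughly 63%
print(round(hundred_year_in_30, 3), round(hundred_year_in_100, 3))
```

Under that assumption, a flood 5 years ago changes nothing: next year's chance is still 1%.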
Ore Grade Control

Orville Orman is planning a truck fleet to haul
his copper ore out of his mine
–
Some rock will have so little copper in it that it
would cost more to process than its worth

–

This stuff is going to get put aside
Other pay rock will be carried to the processing
plant
Commonly have ore and waste truck fleets
but need to know how much of each type of
rock you will have to design.
Orville’s Ore




Orville knows average grade is 0.95% Cu
Standard Deviation is say 0.5% Cu
Cut-Off Grade (point at which ore costs more
to process than Cu will sell for) is 0.25%
What percentage of Orville’s ore is below cutoff grade?
The Situation
[Figure: normal curve with μ = 0.95 and σ = 0.5; how much of my rock is down here, below 0.25?]
Oh We Are Hot




Our critical x value is 0.25
We will convert this to a “Z score” from the
standard normal distribution
We will then look up in the table how much of
our distribution is from minus infinity to our Z
We will then tell our truck planners how much
rock to prepare for
Crunch Away
$$-1.4 = \frac{0.25 - 0.95}{0.5}$$
Go to the Table
Table Says! 0.0808
About 8.1% of our Rock is Below Cut-Off
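The lower tail look up for Orville's ore, sketched with `NormalDist` from the standard `statistics` module. Grades are in % Cu and the variable names are illustrative.

```python
from statistics import NormalDist

grade = NormalDist(mu=0.95, sigma=0.5)   # ore grade in % Cu
cutoff = 0.25

z = (cutoff - grade.mean) / grade.stdev     # about -1.4
fraction_below_cutoff = grade.cdf(cutoff)   # area from minus infinity up to the cut-off
print(round(z, 2), round(fraction_below_cutoff, 4))
```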
Previous Examples

Called One Tailed Tests
- Our Civil Engineers were concerned about events larger than some amount
  - An upper tail test
- Our Mining Engineers were concerned about tonnage below cut-off
  - A lower tail test
What if the interest is in either too much or too little?
- Typical of a machine tolerance problem
Tolerance

Benjamin Bidwell would like to bid on a DOD order for machined shafts
- The spec says 1 inch +/- 0.005 inches
- Benjamin knows his men and equipment can hit any chosen part size with a standard deviation of 0.0025 inches
- He figures he can put in a winning bid provided no more than 3% of the pieces he makes have to be rejected
Can Benjamin put in a winning bid on this order?
The Situation
[Figure: normal curve with μ = 1 and σ = 0.0025; how many products are out here in the tails, below 0.995 or above 1.005?]
We Know What to Do


Convert those limits to Z scores
Start with the top limit
$$2 = \frac{1.005 - 1}{0.0025}$$
Table look up says 0.9773, so 2.27% will be too large
Now we use our knowledge: this distribution and tolerance are symmetric, i.e. 2.27% on the bottom end too
That sucks: about 4.54% of products will be out of spec, which is more than the 3% Benjamin can afford
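The two tailed version of the look up, sketched with `NormalDist` from the standard `statistics` module. The 3% limit comes from Benjamin's bid assumption; the variable names are illustrative.

```python
from statistics import NormalDist

shaft = NormalDist(mu=1.0, sigma=0.0025)   # shaft size in inches

too_small = shaft.cdf(0.995)               # lower tail, Z = -2
too_large = 1.0 - shaft.cdf(1.005)         # upper tail, Z = +2
reject_fraction = too_small + too_large    # about 4.55%
can_bid = reject_fraction <= 0.03          # Benjamin's 3% limit
print(round(reject_fraction, 4), can_bid)
```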
The Hypothesis Test

Hubbert's Hammers makes clobber balls for use in a doll recycling plant. Hardness is important in determining the longevity of the hammer balls. Herby has been getting some customer complaints about his balls not holding up and pulls a few off the assembly line for testing. He gets values of 3, 3.6, 4.2, 4.1, 2.7, 4.7, and 4.3. The balls are supposed to have a hardness of 4.5. Does Herby have a problem?

Herby Runs to SPSS, enters his
sample data
He gets an average of 3.8 and a Stdev of
0.73.
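For anyone without SPSS, the same two numbers come out of Python's standard `statistics` module. This is a sketch; `stdev` uses the n - 1 form of the standard deviation.

```python
from statistics import mean, stdev

hardness = [3, 3.6, 4.2, 4.1, 2.7, 4.7, 4.3]   # Herby's 7 grab samples

sample_mean = mean(hardness)   # about 3.8
sample_sd = stdev(hardness)    # about 0.73, with n - 1 in the denominator
print(round(sample_mean, 2), round(sample_sd, 2))
```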
Interpreting the Data


Everyone knows not every ball will be 4.5
hardness, but on average they need to be.
Herby knows that if he ran to his assembly
line and grabbed another 7 balls at random
he would get a different number.
Herby’s World
Herby knows that 95% of the time the average of a sample of 7 grab balls will be within 1.96 standard deviation units of the true mean.
$$\mu = \bar{X} \pm 1.96\,\sigma$$
(He's spent too much time looking at normal distribution tables)
Herby Formulates a “Hypothesis Test”


Herby thinks the endurance of his balls has gone down.
The "null hypothesis" is that this one sad looking sample is not enough to conclude the mean ball hardness on the assembly line has changed
- If the sample falls within 1.96 standard deviation units of the target mean of 4.5, Herby can be 95% certain the spec on his assembly line is still in tolerance
- If not, Herby will reject the "null hypothesis" and conclude that his assembly line is screwed
Oh gosh, get out the crosses and garlic: we're starting to sound like statisticians.
The “Alpha Level”

In reality Herby could grab 7 balls on a perfectly normal assembly line and get any value
- Yet Herby is going to declare a disaster if he does not come out within 1.96 standard deviation units of his target value
Because in the real world a sample could come from anywhere, one of the decisions we have to make is how willing we are to be wrong.
- This is called setting our Alpha Level
- How great is the chance that we will reject the null hypothesis when we shouldn't have
OK – Lets Get on With Herby’s Test

Plug into the Equation
$$\mu = 3.8 \pm 1.96\,\sigma$$
Holy Marshmallows! What do we use for standard deviation?
- Our standard deviation was the standard deviation for individual samples, not averages
What’s the Big Deal About Individual
Samples and Averages?



In a large general ed class what kind of range do you get on people's test scores?
Ever noticed that a certain professor's average test scores tend to come out about the same value year after year?
Point: in a random sample, the standard deviation of an average will always be less than the standard deviation of individual values.
OK- I Believe – Now Get Me the
Doggone Standard Deviation

For a random sample the standard deviation of the mean is
$$\sigma_{Mean} = \frac{\sigma_{samples}}{\sqrt{n}}$$
where n = # of samples used in the mean
If you think I'm going to try showing you the proof, you're out of your mind.
OK, Let's Roll

Our standard deviation of the mean is
$$0.276 = \frac{0.73}{\sqrt{7}}$$
Plug into the magic equation
$$4.34 = 3.8 + 1.96 \times 0.276$$
Oh Crud: even the top of the 95% range, 4.34, is below the 4.5 target. The Assembly Line is Turning Out Weak Balls!
What if We Had Set A Lower Alpha
Level

Plug and Chug for 1% Alpha Level
$$4.51 = 3.8 + 2.575 \times 0.276$$
Now we look ok (the 4.5 target falls inside the range)
Note from the standard deviation formula that larger samples shrink the standard deviation of the mean
- If there really is a problem with Herby's balls, how big a sample will it take to see the problem?
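Both versions of Herby's test can be run side by side. This is a sketch that follows the slides in using normal Z values (a t value would be the stricter choice for a sample of 7). The helper `upper_limit` is an illustrative name, and `NormalDist.inv_cdf` from the standard library stands in for the table look up.

```python
import math
from statistics import NormalDist

sample_mean, sample_sd, n, target = 3.8, 0.73, 7, 4.5
sd_of_mean = sample_sd / math.sqrt(n)   # about 0.276

def upper_limit(alpha):
    """Upper end of the two-sided (1 - alpha) range around the sample mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for 5%, about 2.576 for 1%
    return sample_mean + z * sd_of_mean

limit_95 = upper_limit(0.05)   # about 4.34
limit_99 = upper_limit(0.01)   # about 4.51

reject_at_5_percent = target > limit_95   # target outside the range: reject
reject_at_1_percent = target > limit_99   # target inside the range: cannot reject
print(round(limit_95, 2), round(limit_99, 2), reject_at_5_percent, reject_at_1_percent)
```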
Figuring Out a Required Sample Size

Herby's assembly line is supposed to turn out balls of 4.5 hardness
- How far out of spec can Herby tolerate things?
  - Suppose Herby decides he needs his estimates to be good to within 0.5 hardness units.
Next Herby has to decide how much of a chance he is willing to take that he will shut down the line and issue recalls when nothing is really wrong at all.
- Suppose Herby wants 99% confidence (i.e. alpha level is 1%)
  - 99% of a normal distribution is within +/- 2.575 standard deviation units of the true mean
Herby’s Task


Herby needs to detect a 0.5 hardness unit departure from the 4.5 target hardness but still have a less than 1% chance of shutting the line down by mistake.
Formula is
$$L = \frac{Z\,\sigma}{\sqrt{n}}$$
where L is the minimum error that must be detected
Z is the Z value for our alpha level
Note that this is just the plus or minus part of our confidence interval formula
Doing the Math

First solve for our sample size needed
$$n = \frac{Z^2 \sigma^2}{L^2}$$
Then plug into the equation and solve
$$n = \frac{2.575^2 \times 0.73^2}{0.5^2}$$
n = 14.13, which as a practical matter means we need a sample of 15 to actually achieve the desired accuracy with an acceptable risk.
Note: this also implies that higher confidence requires more money spent on sampling and testing.
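The sample size formula as a small function; a sketch with illustrative names, rounding up because you cannot test a fraction of a ball.

```python
import math

def required_sample_size(z, sigma, max_error):
    """n = (Z * sigma / L)^2, also rounded up to a whole number of samples."""
    n_exact = (z * sigma / max_error) ** 2
    return n_exact, math.ceil(n_exact)

n_exact, n_needed = required_sample_size(z=2.575, sigma=0.73, max_error=0.5)
print(round(n_exact, 2), n_needed)
```

Halving the tolerable error L to 0.25 would quadruple the required sample, since L is squared in the denominator.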
Herby’s Assembly Line Analysis to
Date




Herby has grabbed a sample of 7 balls off the assembly line
With this sample Herby is 95% sure he has a problem with the
hardness of the balls being produced
When Herby checked for only a 1% chance that he was going
to shut the line down for no reason at all Herby’s sample could
not furnish him enough certainty
To detect a 0.5 unit departure from the target hardness of 4.5
and doing so with no more than a 1% chance of stopping the
line for a quirk of sampling Herby must take a grab of 15 balls
off the assembly line
Comparing Two Samples


Red Rooster Carburetor company would like
to claim that their carburetors improve fuel
economy by 20% when their replacement
carburetors are used.
Red Rooster assembles teams of drivers to
drive two sets of cars: one that has been
retrofitted with Red Rooster Carburetors and
one that uses the manufacturer's original
carburetors
Data Begins Coming In


The standard vehicles came in with an
average of 21.4 mpg and stdev of 6.1 from
60 car and driver combinations
The Rooster Carburetor Vehicles came in
with 29.5 mpg and stdev of 6.2 from 41 car
and driver combinations
Setting Up A Test


If the average gas mileage for the no Rooster
set is improved 20% its adjusted mean is
25.68
The Null Hypothesis is that the mean of cars
gas mileage is the same (after the 20%
adjustment)
–
Set the test up to reject and conclude the Rooster
Carburetor set is more than 20% better if the test
statistic is extreme enough
The Test Statistic
$$Z = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
We will let Y1 be our Rooster carburetor vehicles
We will let Y2 be our standard vehicles with the 20% improvement
If Y1 is bigger than Y2 it will cause Z to become increasingly large. If Z is so far out in the upper tail that there is little chance it could be a random event, we will reject the null hypothesis and conclude that the Red Rooster Carburetors do improve fuel economy by more than 20%
A Note on Our Test Statistic
$$Z = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
The denominator is what we call a pooled estimate of variance
Strictly speaking the test is assuming that the two populations have the same variance. If the variances are close, it is accepted practice to let the lie stand as close enough.
How much different can the variances be and still be about the same?
Actually a bit of a judgment call, but I'm not worried about 6.1 and 6.2
Plug and Chug
$$Z = \frac{29.5 - 25.68}{\sqrt{\frac{6.2^2}{41} + \frac{6.1^2}{60}}}$$
Z = 3.06. Go to the table to look up how much of the normal distribution is beyond 3.06 standard deviation units
Do A Table Look Up
Area under the curve is 0.99889, so 0.00111 (i.e. 0.111% of the distribution) is further out. There is about a 1/10th of 1% chance that the observed result is a fluke.
Action: Reject the null hypothesis and conclude that the Red Rooster Carburetor does improve fuel economy by more than 20%
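The whole two sample comparison in a few lines. This is a sketch that follows the slides in scaling only the standard vehicles' mean by 20% and leaving their standard deviation at 6.1. `two_sample_z` is an illustrative helper name.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Difference of two sample means in units of its standard error."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

adjusted_standard_mean = 21.4 * 1.20   # 25.68 mpg after the 20% adjustment
z = two_sample_z(29.5, 6.2, 41, adjusted_standard_mean, 6.1, 60)
upper_tail = 1.0 - NormalDist().cdf(z)   # chance of a Z this large by luck alone
print(round(z, 2), round(upper_tail, 5))
```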
Paired Experiments

What if Red Rooster Carburetors is a group of students who designed their carburetor in the machine shop at school
- The idea that they can go out and build 41 carburetors and send 101 cars and drivers out to burn up a bunch of gas is kind of "iffy"
One way to get the sample size down is to get rid of some of that random variance
- What if we used the same car and driver with and without the Red Rooster Carburetor?
  - We just took out two sources of scatter in the data
- This is called a Paired Experiment

Paired Experiments
There needs to be a solid basis for pairing
- Can make the numbers crunch pairing up anything
  - Experiment: I want to show that students from Illinois are smarter than students from Missouri. I give a test to 40 SIU seniors that are Illinois residents. I then give the same test to 40 Kindergarteners from Missouri. I match the students up in the order in which tests were turned in and do my test.
  - If my test statistic shows that my Illinois students scored higher, are you willing to believe that Illinois students are smarter than Missouri students?
OK that last one raises some concerns
about the Intelligence of whoever
designed that experiment

The basis for pairing should be that we are pairing like items to eliminate variation from whatever we are trying to "write out" of the experiment by pairing.
Suppose we make one Red Rooster Carburetor to go on a Dodge Neon and I have 10 students drive the vehicle over the same road course before adding the carburetor. I then add the carburetor and have the same 10 students drive the same car over the same course. I will then pair the results before and after adding the carburetor
Looking at My Results

Driver             Standard Dodge Neon (mpg)   Neon with RR Carb (mpg)
Don Dork           26.5                        32.1
Kurt Kurtosis      25.7                        30.1
Angela Airhead     25.2                        31.8
Mark Maniac        23.9                        29.8
Katty Careful      28.1                        34.2
Jim Junkyard       26.2                        30.6
Steve Stickshift   25.9                        31.2
Burt Bunion        27.1                        33.2
Saedy Sadist       26.7                        32.8
Melvin Mizer       28.2                        34.5
The test requires us to get the
differences within our pairing



Don Dork Result – 32.1- 26.5 = 5.6
Kurt Kurtosis - 30.1 – 25.7 = 4.4
And so on through the pairing.

Tuning in a Little More
Red Rooster actually wants to claim a 20% increase in gas mileage, so we may be able to normalize out some more variance by directly measuring % improvement.
- Results 21.13%, 17.12%, 26.19%, 24.68%, 21.71%, 16.79%, 20.48%, 22.51%, 22.85%, 22.34%
We also are interested in how much these values differ from 20% improvement, so we can subtract 20% from each value
- 1.13%, -2.88%, 6.19%, 4.68%, 1.71%, -3.21%, 0.48%, 2.51%, 2.85%, 2.34%
Plug the data into SPSS to get Mean and Standard Deviation
- Could also use Excel and the functions =average(data range) for the mean and =stdev(data range) for standard deviation
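The same bookkeeping can be sketched in Python, starting from the raw mileage of the 10 drivers before and after the carburetor swap. Because the percentages are recomputed from the raw values, one or two may differ from the listed figures in the last digit.

```python
from statistics import mean, stdev

# Mileage (mpg) for the same 10 drivers, in the same order
standard = [26.5, 25.7, 25.2, 23.9, 28.1, 26.2, 25.9, 27.1, 26.7, 28.2]
with_rr = [32.1, 30.1, 31.8, 29.8, 34.2, 30.6, 31.2, 33.2, 32.8, 34.5]

# Percent improvement within each pair, then the amount over the 20% claim
percent_gain = [100.0 * (after - before) / before
                for before, after in zip(standard, with_rr)]
excess_over_20 = [gain - 20.0 for gain in percent_gain]

mean_excess = mean(excess_over_20)   # about 1.58
sd_excess = stdev(excess_over_20)    # about 2.95
print(round(mean_excess, 2), round(sd_excess, 2))
```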
The Hypothesis

Ho = there is no difference between our set of numbers and 0
- Specifically means we cannot be sure we have over 20% improvement
Rejecting the null hypothesis means we are confident we have over 20% improvement
The Test Statistic for a Paired
Experiment
$$t = \frac{\bar{d}}{s_d / \sqrt{n}}$$
d with the bar over it is the average difference (in this case 1.58%)
s_d is the standard deviation of the individual differences as calculated (in this case 2.95%)
n is of course the number of samples (in this case 10)
Crunching the numbers we get 1.69
Looking Up Our Result
We have n-1 degrees of freedom (in this case 9)
1.69 is between 90% and 95% significant. We cannot reject the null hypothesis at the 95% level, but we can at about 93% confidence.
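The paired t calculation and the table comparison, as a sketch. The two critical values are the one-tailed entries for 9 degrees of freedom copied from a standard t table; they are typed in by hand here rather than computed.

```python
import math

mean_diff, sd_diff, n = 1.58, 2.95, 10

t_stat = mean_diff / (sd_diff / math.sqrt(n))   # about 1.69
degrees_of_freedom = n - 1

# One-tailed critical t values for 9 degrees of freedom (from a standard table)
t_crit_90 = 1.383
t_crit_95 = 1.833

significant_at_90 = t_stat > t_crit_90
significant_at_95 = t_stat > t_crit_95
print(round(t_stat, 2), significant_at_90, significant_at_95)
```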
Limitations of Our Results

93% confidence that we have over 20% improvement may fall short of the proof some people would demand
- One way to strengthen the conclusion is more samples (the standard deviation of the mean shrinks with more samples, and since it is in the denominator that makes t bigger)
We may also be concerned that all our tests were on a Dodge Neon, which furnishes no data on whether the result would hold on other cars as well