Lecture 4: Measures of Variation
Slide 1
Review of Lecture 3: Measures of Center
• Given a stem-and-leaf plot

Stem (tens) | Leaves (units)
4           | 02
5           | 000122
6           | 47

Be able to find
» Mean
• (40+42+3*50+51+2*52+64+67)/10=51.8
» Median
• average of the 5th and 6th ordered values: (50+51)/2=50.5
» Mode
• 50
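A minimal sketch for checking the three measures of center, assuming the ten values are read off the stem-and-leaf plot as tens (stem) plus units (leaf). The variable names are illustrative only.

```python
from statistics import mean, median, mode

# Stem-and-leaf plot from the slide: stem = tens digit, leaves = units digits.
stem_and_leaf = {4: "02", 5: "000122", 6: "47"}

# Rebuild the raw data: each leaf contributes one value 10*stem + leaf.
values = [10 * stem + int(leaf)
          for stem, leaves in stem_and_leaf.items()
          for leaf in leaves]

center_mean = mean(values)      # sum of the values divided by n
center_median = median(values)  # average of the 5th and 6th ordered values
center_mode = mode(values)      # most frequent value

print(values)
print(center_mean, center_median, center_mode)
```

With n = 10 the median is the average of the two middle values, which is why the 5th and 6th positions matter.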
• Given a regular frequency distribution

# of phones (x)   f    fx   Cum Freq
4                 2    8    2
3                 4    12   6
2                 5    10   11
1                 16   16   27    (median group)
0                 13   0    40=n

Be able to find
» Sample size
• 2+4+5+16+13=40
» Mean
• (8+12+10+16+0)/40=1.15
» Median
• average of the two middle values (the 20th and 21st, both in the x=1 group) = 1
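The same review quantities can be checked from the frequency table. This is a sketch only; the dictionary layout and names are assumptions made for the illustration.

```python
from statistics import median

# Frequency table from the slide: number of phones (x) -> frequency (f).
freq = {4: 2, 3: 4, 2: 5, 1: 16, 0: 13}

n = sum(freq.values())                                   # sample size
mean_phones = sum(x * f for x, f in freq.items()) / n    # sum(f*x) / n

# Expanding the table into raw values lets the median be read off directly.
raw = [x for x, f in freq.items() for _ in range(f)]
median_phones = median(raw)

print(n, mean_phones, median_phones)
```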
2.5 Measures of Variation
Slide 2
Measure of Variation (Measure of Dispersion):
A measure that describes the spread
of a data set.
Candidates: Range
Standard Deviation, Variance
Coefficient of Variation
Statistics is about handling variation. Thus this section is one of
the most important sections in the entire book.
Definition
Slide 3
The range of a set of data is the
difference between the highest
value and the lowest value
Range=(Highest value) – (Lowest value)
Example: Range of {1, 3, 14} is 14-1=13.
Standard Deviation
The standard deviation of a set of
values is a measure of variation of
values about the mean
We introduce two standard deviations:
• Sample standard deviation
• Population standard deviation
Slide 4
Sample Standard
Deviation Formula
Formula 2-4
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
where x is a data value, $\bar{x}$ is the sample mean, and n is the sample size
Slide 5
Sample Standard Deviation
(Shortcut Formula)
Formula 2-5
$s = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}}$
Slide 6
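Example: Publix check-out waiting
times in minutes
Slide 7
Data: 1, 4, 10. Find the sample mean and sample
standard deviation.

x       $x - \bar{x}$   $(x - \bar{x})^2$   $x^2$
1       -4              16                  1
4       -1              1                   16
10      5               25                  100
Total
15                      42                  117

n=3
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{15}{3} = 5.0$ min

Using Formula 2-4:
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{42}{3 - 1}} = \sqrt{21} \approx 4.6$ min

Using the shortcut formula:
$s = \sqrt{\dfrac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}} = \sqrt{\dfrac{3(117) - 15^2}{3(3 - 1)}} = \sqrt{\dfrac{351 - 225}{6}} = \sqrt{\dfrac{126}{6}} = \sqrt{21} \approx 4.6$ min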
Standard Deviation Key Points
Slide 8
• The standard deviation is a measure of variation of
all values from the mean
• The value of the standard deviation s is never negative. It is
zero only when all data values are equal, and positive otherwise
• The value of the standard deviation s can increase
dramatically with the inclusion of one or more
outliers (data values far away from all others)
• The units of the standard deviation s are the same as
the units of the original data values
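The effect of an outlier is easy to see numerically. In this sketch the value 60 is a hypothetical outlier added for illustration; it is not part of the lecture data.

```python
from statistics import stdev

waits = [1, 4, 10]                   # waiting times in minutes
waits_with_outlier = waits + [60]    # 60 is a made-up outlier (assumption)

s_before = stdev(waits)              # sample standard deviation, n - 1 divisor
s_after = stdev(waits_with_outlier)

print(round(s_before, 1), round(s_after, 1))
```

One extra value far from the rest raises s from about 4.6 min to about 27.8 min. Both results are still in minutes, the units of the data.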
Population Standard
Deviation
Slide 9
$\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
This formula is similar to Formula 2-4, but
instead the population mean µ and population
size N are used
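To see the difference the divisor makes, the sketch below treats the three values 1, 4, 10 first as a sample and then, purely as an assumption for illustration, as an entire population. It uses `stdev` and `pstdev` from Python's standard `statistics` module.

```python
from statistics import pstdev, stdev

data = [1, 4, 10]

s = stdev(data)       # sample SD: divides the sum of squares by n - 1
sigma = pstdev(data)  # population SD: divides the sum of squares by N

print(round(s, 1), round(sigma, 1))
```

The sum of squared deviations is 42 in both cases; dividing by 2 gives 21, dividing by 3 gives 14, so the population value is the smaller of the two.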
Variance
Slide 10
• The variance of a set of values is a measure of
variation equal to the square of the standard
deviation.
• Sample variance $s^2$: Square of the sample standard
deviation s
• Population variance $\sigma^2$: Square of the population
standard deviation σ
Variance - Notation
Slide 11
Notation (standard deviation squared)
$s^2$: Sample variance
$\sigma^2$: Population variance
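A short check of the notation, again using the waiting times 1, 4, 10. Treating them as a population for the second value is an assumption made only to show both symbols.

```python
from statistics import pvariance, stdev, variance

data = [1, 4, 10]          # minutes

s2 = variance(data)        # sample variance s^2, in minutes squared
sigma2 = pvariance(data)   # population variance sigma^2, in minutes squared
s = stdev(data)            # sample standard deviation, in minutes

print(s2, sigma2, round(s ** 2, 6))
```

Note the units: the variance is in squared units (min²), while the standard deviation is in the original units (min).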
Round-off Rule
for Measures of Variation
Slide 12
Carry one more decimal place than
is present in the original set of
data.
Round only the final answer, not values in
the middle of a calculation.
Definition
Slide 13
The coefficient of variation (or CV) for a set of
sample or population data, expressed as a
percent, describes the standard deviation relative
to the mean
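Sample: $CV = \dfrac{s}{\bar{x}} \times 100\%$
Population: $CV = \dfrac{\sigma}{\mu} \times 100\%$
• A measure well suited to comparing variation between populations
• Because the CV has no units, it can compare quantities measured in different units (apples and pears).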
Example: How to compare the variability
in heights and weights of men?
Slide 14
Sample: 40 males were randomly selected. The
summary statistics are given below.

          Sample mean   Sample standard deviation
Height    68.34 in      3.02 in
Weight    172.55 lb     26.33 lb

Solution: Use CV to compare the variability
Heights: $CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{3.02}{68.34} \times 100\% = 4.42\%$
Weights: $CV = \dfrac{s}{\bar{x}} \times 100\% = \dfrac{26.33}{172.55} \times 100\% = 15.26\%$
Conclusion:
Heights (with
CV=4.42%) have
considerably less
variation than
weights (with
CV=15.26%)
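The comparison can be reproduced as follows. This is a sketch; the helper name `coefficient_of_variation` is chosen for the illustration.

```python
def coefficient_of_variation(sd, mean):
    """Return the CV as a percent: (sd / mean) * 100."""
    return sd / mean * 100

# Summary statistics for the 40 sampled males, from the slide.
cv_height = coefficient_of_variation(3.02, 68.34)     # inches / inches
cv_weight = coefficient_of_variation(26.33, 172.55)   # pounds / pounds

print(f"heights: {cv_height:.2f}%  weights: {cv_weight:.2f}%")
```

The units cancel inside each ratio, which is what allows inches and pounds to be compared on the same scale.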
Standard Deviation from a
Frequency Distribution
Slide 15
Formula 2-6
$s = \sqrt{\dfrac{n \left[\sum (f \cdot x^2)\right] - \left[\sum (f \cdot x)\right]^2}{n(n - 1)}}$
Use the class midpoints as the x values
Example: Number of TV sets
Owned by households
Slide 16
• A random sample of 80 households was selected
• The number of TV sets owned by each household is given below.

TV sets (x)   # of Households (f)   $fx$   $fx^2$
0             4                     0      0
1             33                    33     33
2             28                    56     112
3             10                    30     90
4             5                     20     80
Total         80                    139    315

Compute:
(a) the sample mean
(b) the sample standard deviation

(a) $\bar{x} = \dfrac{\sum fx}{n} = \dfrac{139}{80} = 1.7$ sets
(b) $s = \sqrt{\dfrac{n \left[\sum (f x^2)\right] - \left[\sum (f x)\right]^2}{n(n - 1)}} = \sqrt{\dfrac{80(315) - (139)^2}{80(80 - 1)}} = \sqrt{\dfrac{5879}{6320}} = 1.0$ sets
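Formula 2-6 translates directly into code. The sketch below assumes the table is given as (x, f) pairs; the function name `freq_mean_sd` is made up for the illustration.

```python
from math import sqrt

def freq_mean_sd(table):
    """Mean and sample SD from (x, f) pairs, using Formula 2-6 for the SD."""
    n = sum(f for _, f in table)                  # sample size = sum of f
    sum_fx = sum(f * x for x, f in table)         # sum of f*x
    sum_fx2 = sum(f * x * x for x, f in table)    # sum of f*x^2
    mean = sum_fx / n
    sd = sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))
    return mean, sd

# TV sets (x) and number of households (f), from the slide.
tv_sets = [(0, 4), (1, 33), (2, 28), (3, 10), (4, 5)]
mean_tv, sd_tv = freq_mean_sd(tv_sets)

print(round(mean_tv, 1), round(sd_tv, 1))
```

For grouped data with class intervals, the x in each pair would be the class midpoint, as the formula slide notes.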
Estimation of Standard
Deviation
Range Rule of Thumb
Slide 17
For estimating a value of the standard deviation s,
use
$s \approx \dfrac{\text{Range}}{4}$
where range = (highest value) – (lowest value)
Estimation of Standard
Deviation
Range Rule of Thumb
Slide 18
For interpreting a known value of the standard deviation s,
find rough estimates of the minimum and maximum
“usual” values by using:
Minimum “usual” value ≈ (mean) – 2 × (standard deviation)
Maximum “usual” value ≈ (mean) + 2 × (standard deviation)
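Both uses of the range rule of thumb fit in a few lines. This is a sketch; the function names are illustrative, and the height statistics (mean 68.34 in, s = 3.02 in) are reused from the CV example as sample input.

```python
def estimate_sd(lowest, highest):
    """Rough estimate of s from the range: (highest - lowest) / 4."""
    return (highest - lowest) / 4

def usual_values(mean, sd):
    """Rough minimum and maximum 'usual' values: mean -/+ 2 * sd."""
    return mean - 2 * sd, mean + 2 * sd

# Heights of men: mean 68.34 in, standard deviation 3.02 in.
low, high = usual_values(68.34, 3.02)
print(round(low, 2), round(high, 2))

# Going the other way, the range of the usual values gives back s.
print(round(estimate_sd(low, high), 2))
```

The two rules are the same statement read in opposite directions: the usual values span about 4 standard deviations, so the range divided by 4 approximates s.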
Definition
Slide 19
Empirical (68-95-99.7) Rule
For data sets having a distribution that is approximately
bell shaped, the following properties apply:
• About 68% of all values fall within 1 standard
deviation of the mean
• About 95% of all values fall within 2 standard
deviations of the mean
• About 99.7% of all values fall within 3 standard
deviations of the mean
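The three percentages can be checked against the normal distribution, which is assumed here as the model for "approximately bell shaped". The sketch uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of values within k standard deviations of the mean, for k = 1, 2, 3.
shares = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

Real data are only approximately bell shaped, so the observed percentages will be close to these values rather than equal to them.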
The Empirical Rule
FIGURE 2-13 (figure not reproduced in this transcript)
Slides 20-22
Recap
Slide 23
In this section we have looked at:
• Range
• Standard deviation of a sample and population
• Variance of a sample and population
• Coefficient of Variation (CV)
• Standard deviation using a frequency distribution
• Range Rule of Thumb
• Empirical (68-95-99.7) Rule
Homework Assignment 4
Slide 24
• Problems 2.5: 1, 3, 7, 9, 11, 17, 23, 25, 27, 31
• Read: Section 2.6: Measures of Relative Standing.