Chapter 3, Part A - Cameron University


Slides by John Loucks, St. Edward’s University
© 2009 Thomson South-Western. All Rights Reserved
Slide 1
Chapter 3, Part A
Descriptive Statistics: Numerical Measures


Measures of Location
Measures of Variability
Slide 2
Measures of Location

• Mean
• Median
• Mode
• Percentiles
• Quartiles

If the measures are computed for data from a sample, they are called sample statistics.
If the measures are computed for data from a population, they are called population parameters.
A sample statistic is referred to as the point estimator of the corresponding population parameter.
Slide 3
Mean


The mean of a data set is the average of all the data values.
The sample mean $\bar{x}$ is the point estimator of the population mean $\mu$.
Slide 4
Sample Mean $\bar{x}$
$$\bar{x} = \frac{\sum x_i}{n}$$
where $\sum x_i$ is the sum of the values of the n observations and n is the number of observations in the sample.
Slide 5
Population Mean $\mu$
$$\mu = \frac{\sum x_i}{N}$$
where $\sum x_i$ is the sum of the values of the N observations and N is the number of observations in the population.
Slide 6
Sample Mean
• Example: Apartment Rents
Seventy efficiency apartments were randomly sampled in a small college town. The monthly rent prices for these apartments are listed on the next slide.
Slide 7
Sample Mean
• Apartment Rent Sample Data
445 440 465 450 600 570 510 615 440 450
470 485 515 575 430 440 525 490 580 450
490 590 525 450 472 470 445 435 435 425
450 475 490 525 600 600 445 460 475 500
535 435 460 575 435 500 549 475 445 600
445 460 480 500 550 435 440 450 465 570
500 480 430 615 450 480 465 480 510 440
Slide 8
Sample Mean
$$\bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80$$
Slide 9
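The sample mean calculation can be checked with a few lines of Python. This is a minimal sketch: the variable names (`rents`, `total`, `sample_mean`) are illustrative and do not come from the slides; the 70 values are the apartment rent sample data.

```python
# Apartment rent sample data (70 monthly rents) as listed in the slides.
rents = [
    445, 440, 465, 450, 600, 570, 510, 615, 440, 450,
    470, 485, 515, 575, 430, 440, 525, 490, 580, 450,
    490, 590, 525, 450, 472, 470, 445, 435, 435, 425,
    450, 475, 490, 525, 600, 600, 445, 460, 475, 500,
    535, 435, 460, 575, 435, 500, 549, 475, 445, 600,
    445, 460, 480, 500, 550, 435, 440, 450, 465, 570,
    500, 480, 430, 615, 450, 480, 465, 480, 510, 440,
]

n = len(rents)            # number of observations in the sample
total = sum(rents)        # sum of the values of the n observations
sample_mean = total / n   # x-bar = (sum of x_i) / n

print(n, total, round(sample_mean, 2))  # 70 34356 490.8
```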
Median
• The median of a data set is the value in the middle when the data items are arranged in ascending order.
• Whenever a data set has extreme values, the median is the preferred measure of central location.
• The median is the measure of location most often reported for annual income and property value data.
• A few extremely large incomes or property values can inflate the mean.
Slide 10
Median
• For an odd number of observations:
26 18 27 12 14 27 19   (7 observations)
12 14 18 19 26 27 27   (in ascending order)
The median is the middle value.
Median = 19
Slide 11
Median
• For an even number of observations:
26 18 27 12 14 27 30 19   (8 observations)
12 14 18 19 26 27 27 30   (in ascending order)
The median is the average of the middle two values.
Median = (19 + 26)/2 = 22.5
Slide 12
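The odd and even cases can be sketched as one small Python function. The function name `median` and the variable names are illustrative assumptions; the two data sets are the 7- and 8-observation examples from the slides.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values when n is even."""
    ordered = sorted(values)          # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the middle two

odd_data = [26, 18, 27, 12, 14, 27, 19]
even_data = [26, 18, 27, 12, 14, 27, 30, 19]

print(median(odd_data))    # 19
print(median(even_data))   # 22.5
```

Python's standard library offers `statistics.median`, which follows the same rule.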
Median
Averaging the 35th and 36th data values:
Median = (475 + 475)/2 = 475
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Note: Data are in ascending order, reading across each row.
Slide 13
Mode
• The mode of a data set is the value that occurs with greatest frequency.
• The greatest frequency can occur at two or more different values.
• If the data have exactly two modes, the data are bimodal.
• If the data have more than two modes, the data are multimodal.
Slide 14
Mode
450 occurred most frequently (7 times)
Mode = 450
Slide 15
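A frequency count makes the mode easy to verify. The sketch below is illustrative only: the helper name `modes` is an assumption, and it returns every value tied for the greatest frequency so that bimodal and multimodal data are handled as well.

```python
from collections import Counter

# Apartment rents from the slides, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

def modes(values):
    """Return (values with the greatest frequency, that frequency)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top), top

mode_values, frequency = modes(rents)
print(mode_values, frequency)    # [450] 7

# A small made-up data set with two modes (bimodal).
print(modes([1, 1, 2, 2, 3]))    # ([1, 2], 2)
```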
Percentiles
• A percentile provides information about how the data are spread over the interval from the smallest value to the largest value.
• Admission test scores for colleges and universities are frequently reported in terms of percentiles.
Slide 16
Percentiles

The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.
Slide 17
Percentiles
Arrange the data in ascending order.
Compute index i, the position of the pth percentile:
i = (p/100)n
If i is not an integer, round up. The pth percentile is the value in the ith position.
If i is an integer, the pth percentile is the average of the values in positions i and i+1.
Slide 18
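The procedure above can be sketched in Python as follows. This is a minimal illustration, not a library routine: the function name `percentile` is an assumption, p is assumed to be a whole-number percentage strictly between 0 and 100, and integer arithmetic (`divmod`) is used to decide whether i is an integer so that floating-point rounding cannot affect the result. Statistical software often uses other interpolation rules and may return slightly different values.

```python
def percentile(values, p):
    """pth percentile using the index rule i = (p/100)n from the slides.

    Assumes p is an integer with 0 < p < 100 and values is non-empty.
    """
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)                 # arrange the data in ascending order
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)    # i = whole + remainder/100
    if remainder != 0:
        # i is not an integer: round up to position whole + 1 (0-based index `whole`).
        return ordered[whole]
    # i is an integer: average the values in positions i and i + 1.
    return (ordered[whole - 1] + ordered[whole]) / 2

# Apartment rents from the slides, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(percentile(rents, 80))   # 542.0  (i = 56, average of 56th and 57th values)
print(percentile(rents, 75))   # 525    (i = 52.5, rounded up to the 53rd value)
```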
80th Percentile
i = (p/100)n = (80/100)70 = 56
Averaging the 56th and 57th data values:
80th Percentile = (535 + 549)/2 = 542
Slide 19
80th Percentile
“At least 80% of the items take on a value of 542 or less.”   56/70 = .8 or 80%
“At least 20% of the items take on a value of 542 or more.”   14/70 = .2 or 20%
Slide 20
Quartiles
• Quartiles are specific percentiles.
• First Quartile = 25th Percentile
• Second Quartile = 50th Percentile = Median
• Third Quartile = 75th Percentile
Slide 21
Third Quartile
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, which rounds up to 53
Third quartile = 525 (the value in the 53rd position)
Slide 22
Measures of Variability
• It is often desirable to consider measures of variability (dispersion), as well as measures of location.
• For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.
Slide 23
Measures of Variability

• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation
Slide 24
Range
• The range of a data set is the difference between the largest and smallest data values.
• It is the simplest measure of variability.
• It is very sensitive to the smallest and largest data values.
Slide 25
Range
Range = largest value - smallest value
Range = 615 - 425 = 190
Slide 26
Interquartile Range
• The interquartile range of a data set is the difference between the third quartile and the first quartile.
• It is the range for the middle 50% of the data.
• It overcomes the sensitivity to extreme data values.
Slide 27
Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
Slide 28
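The range and interquartile range for the rent data can be reproduced with a short Python sketch. The variable names are illustrative. The quartile positions use i = (p/100)n rounded up, which is valid here only because neither 17.5 nor 52.5 is an integer; a general routine would also need the averaging rule for integer i.

```python
import math

# Apartment rents from the slides, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

ordered = sorted(rents)
n = len(ordered)

# Range = largest value - smallest value
data_range = ordered[-1] - ordered[0]

# Quartile positions (1-based); both indexes are non-integers for n = 70.
q1_position = math.ceil(25 * n / 100)   # 17.5 rounds up to 18
q3_position = math.ceil(75 * n / 100)   # 52.5 rounds up to 53
q1 = ordered[q1_position - 1]
q3 = ordered[q3_position - 1]
iqr = q3 - q1

print(data_range, q1, q3, iqr)   # 190 445 525 80
```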
Variance
The variance is a measure of variability that utilizes all the data.
It is based on the difference between the value of each observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).
Slide 29
Variance
The variance is the average of the squared differences between each data value and the mean.
The variance is computed as follows:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$ for a sample
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$ for a population
Slide 30
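The two variance formulas differ only in the divisor, which a short Python sketch makes concrete. The function names are illustrative assumptions, and the small data set is the 8-observation example from the median slides, reused here purely for illustration.

```python
def sample_variance(values):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1)"""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

def population_variance(values):
    """sigma^2 = sum((x_i - mu)^2) / N"""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N

data = [26, 18, 27, 12, 14, 27, 30, 19]

# The sum of squared deviations is the same (317.875); only the divisor changes.
print(sample_variance(data))       # 317.875 / 7
print(population_variance(data))   # 317.875 / 8 = 39.734375
```

Python's `statistics.variance` and `statistics.pvariance` implement the sample and population versions, respectively.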
Standard Deviation
The standard deviation of a data set is the positive square root of the variance.
It is measured in the same units as the data, making it more easily interpreted than the variance.
Slide 31
Standard Deviation
The standard deviation is computed as follows:
s  s2
  2
for a
sample
for a
population
© 2009 Thomson South-Western. All Rights Reserved
Slide 32
Coefficient of Variation
The coefficient of variation indicates how large the standard deviation is in relation to the mean.
The coefficient of variation is computed as follows:
$$\left( \frac{s}{\bar{x}} \times 100 \right)\%$$ for a sample
$$\left( \frac{\sigma}{\mu} \times 100 \right)\%$$ for a population
Slide 33
Variance, Standard Deviation, and Coefficient of Variation

Variance
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16$$

Standard Deviation
$$s = \sqrt{s^2} = \sqrt{2996.16} = 54.74$$

Coefficient of Variation
$$\left( \frac{s}{\bar{x}} \times 100 \right)\% = \left( \frac{54.74}{490.80} \times 100 \right)\% = 11.15\%$$
The standard deviation is about 11% of the mean.
Slide 34
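These three results can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) divisor. This is a sketch for verification only; the variable names are illustrative and the data are the 70 apartment rents from the slides.

```python
import statistics

# Apartment rents from the slides, in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = statistics.mean(rents)           # sample mean x-bar
variance = statistics.variance(rents)   # sample variance s^2 (divides by n - 1)
std_dev = statistics.stdev(rents)       # sample standard deviation s
cv = std_dev / mean * 100               # coefficient of variation, in percent

print(round(variance, 2), round(std_dev, 2), round(cv, 2))   # 2996.16 54.74 11.15
```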
End of Chapter 3, Part A
Slide 35