Transcript L3-Percentiles
McGraw-Hill/Irwin
Describing Data: Percentiles
Copyright © 2011 by the McGraw-Hill Companies, Inc. All rights reserved.
LEARNING OBJECTIVES
LO1. Compute and understand quartiles, deciles, and percentiles.
LO2. Construct and interpret box plots.
Quartiles, Deciles and Percentiles
Learning Objective 2: Compute and understand quartiles, deciles, and percentiles.
The median splits the data into two equal-sized halves. Quartiles split the data into quarters, deciles split it into tenths, and percentiles into hundredths, so we can pick any split of our choosing. These measures include quartiles, deciles, and percentiles.
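[Figure: Quartiles. Q1, Q2 (the median), and Q3 divide the ordered data, from the lowest to the highest data value, into four parts of 25% each; 50% of the values lie on either side of the median.]
[Figure: Deciles. The ordered data are divided into ten parts, each holding 1/10 (10%) of the values.]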
Percentile Computation
To formalize the computational procedure, let L_p refer to the location of a desired percentile. So if we wanted to find the 33rd percentile we would use L_33, and if we wanted the median, the 50th percentile, then L_50.
The number of observations is n, so if we want to locate the median, its position is at (n + 1)/2, or we could write this as (n + 1)(P/100), where P is the desired percentile. In general:
L_p = (n + 1)(P/100)
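The location formula can be sketched in a few lines of Python. The function name percentile_location is an illustrative choice, not something from the slides; it simply evaluates (n + 1)(P/100) and returns a 1-based position in the ordered data.

```python
def percentile_location(n, p):
    """Return L_p = (n + 1) * P / 100, a 1-based position in the ordered data.

    n is the number of observations and p is the desired percentile (0-100).
    The function name is hypothetical, chosen for this sketch.
    """
    return (n + 1) * p / 100


# With n = 15 observations, the median (50th percentile) sits at position 8,
# which is the same as (n + 1) / 2.
print(percentile_location(15, 50))  # 8.0
print(percentile_location(15, 33))  # 5.28
```

A fractional result such as 5.28 means the percentile falls between two ordered values.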
Percentiles - Example
Listed below are the commissions earned last month by a sample of 15 brokers at Salomon Smith Barney’s Oakland, California, office.
$2,038 $1,758 $1,721 $1,637 $2,097
$2,047 $2,205 $1,787 $2,287 $1,940
$2,311 $2,054 $2,406 $1,471 $1,460
Locate the median, the first quartile, and the third quartile for the commissions earned.
Percentiles – Example (cont.)
Step 1: Organize the data from the smallest to the largest value.
$1,460 $1,471 $1,637 $1,721 $1,758
$1,787 $1,940 $2,038 $2,047 $2,054
$2,097 $2,205 $2,287 $2,311 $2,406
Percentiles – Example (cont.)
Step 2: Compute the first and third quartiles. Locate L_25 and L_75 using:
L_25 = (15 + 1)(25/100) = 4
L_75 = (15 + 1)(75/100) = 12
Therefore, the first and third quartiles are located at the 4th and 12th positions, respectively:
L_25 = $1,721
L_75 = $2,205
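As a check on the two steps, here is a minimal Python sketch using the 15 commissions from the example. The helper value_at is a hypothetical name, and it assumes the location comes out as a whole number, which holds for this data set. The slides do not show the median; it is computed here with the same formula at P = 50.

```python
commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]

ordered = sorted(commissions)   # Step 1: smallest to largest
n = len(ordered)                # 15 observations


def value_at(p):
    """Return the ordered value at location L_p = (n + 1) * p / 100.

    Assumes the location is a whole number (true for p = 25, 50, 75 here).
    """
    location = (n + 1) * p / 100
    assert location.is_integer(), "fractional locations need interpolation"
    return ordered[int(location) - 1]   # positions are 1-based


q1, median, q3 = value_at(25), value_at(50), value_at(75)
print(q1, median, q3)  # 1721 2038 2205
```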
Learning Objective 3: Construct and interpret box plots.
Boxplots
A box plot is a graphical display, based on quartiles, that helps us picture a set of data. To construct a box plot, we need only five statistics:
1. the minimum value,
2. Q1 (the first quartile),
3. the median,
4. Q3 (the third quartile), and
5. the maximum value.
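The five statistics can be gathered with a short Python sketch. It relies on statistics.quantiles from the standard library with method="exclusive", which places the quartiles at the (n + 1)(P/100) positions and interpolates between neighbours when needed. The function name five_number_summary is an illustrative choice, and the data are the brokers' commissions from the earlier example.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return min(values), q1, median, q3, max(values)


commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]

print(five_number_summary(commissions))
# (1460, 1721.0, 2038.0, 2205.0, 2406)
```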
Boxplot - Example
Boxplot Example
Step 1: Create an appropriate scale along the horizontal axis.
Step 2: Draw a box that starts at Q1 (15 minutes) and ends at Q3 (22 minutes). Inside the box we place a vertical line to represent the median (18 minutes).
Step 3: Extend horizontal lines from the box out to the minimum value (13 minutes) and the maximum value (30 minutes).
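The three drawing steps can be imitated with a rough text sketch in Python. This is not a charting routine: the function ascii_boxplot and its one-character-per-minute scale are assumptions made for illustration, and it expects whole-number inputs.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum):
    """Draw a one-line text box plot, one character per unit of the scale.

    '[' and ']' mark Q1 and Q3, '=' fills the box, '|' marks the median and
    the two extremes, and '-' draws the whiskers. Integer inputs assumed.
    """
    cells = []
    for x in range(minimum, maximum + 1):    # Step 1: the horizontal scale
        if x == median:
            cells.append("|")                # Step 2: line for the median
        elif x == q1:
            cells.append("[")                # Step 2: box starts at Q1
        elif x == q3:
            cells.append("]")                # Step 2: box ends at Q3
        elif x in (minimum, maximum):
            cells.append("|")                # Step 3: whisker ends
        elif q1 < x < q3:
            cells.append("=")                # inside the box
        else:
            cells.append("-")                # Step 3: whiskers
    return "".join(cells)


# Values from the example: min 13, Q1 15, median 18, Q3 22, max 30 minutes.
print(ascii_boxplot(13, 15, 18, 22, 30))
# |-[==|===]-------|
```

The long right-hand whisker reflects the distance from Q3 (22 minutes) to the maximum (30 minutes).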
Example: Draw a Box & Whisker Plot
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)
Q1 (L_25) is in the (9+1)*25/100 = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = 12.5
Q1 and Q3 are measures of non-central location; Q2, the median, is a measure of central tendency.
Quartile Measures: Calculating the Quartiles (Example)
Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)
Q1 is in the (9+1)*25/100 = 2.5 position of the ranked data, so Q1 = (12+13)/2 = 12.5
Q2 is in the (9+1)*50/100 = 5th position of the ranked data, so Q2 = median = 16
Q3 is in the (9+1)*75/100 = 7.5 position of the ranked data, so Q3 = (18+21)/2 = 19.5
Quartile Measures: Calculation Rules
When calculating the ranked position, use the following rules:
― If the result is a whole number, then it is the ranked position to use.
― If the result is a fractional half (e.g. 2.5, 7.5, 8.5, etc.), then average the two corresponding data values.
― If the result is not a whole number or a fractional half, then interpolate between the data points.
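The three rules can be folded into one small Python function, sketched below. The name percentile_value is hypothetical, and the sketch assumes the computed position lies between 1 and n. Averaging two neighbouring values is the same as interpolating with a fraction of 0.5, so one interpolation branch covers both the second and third rules.

```python
import math


def percentile_value(ordered, p):
    """Return the p-th percentile of already-sorted data.

    Uses the ranked position (n + 1) * p / 100 and assumes it falls
    between 1 and n. Hypothetical helper written for this sketch.
    """
    location = (len(ordered) + 1) * p / 100
    lower = math.floor(location)      # whole part of the position
    fraction = location - lower       # 0 for a whole number, 0.5 for a half

    if fraction == 0:
        return ordered[lower - 1]     # rule 1: use that ranked position

    below = ordered[lower - 1]        # value at the lower position
    above = ordered[lower]            # value at the next position
    # Rules 2 and 3: a fraction of 0.5 gives the plain average.
    return below + fraction * (above - below)


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print([percentile_value(sample, p) for p in (25, 50, 75)])
# [12.5, 16, 19.5]
```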
Quartile Measures: The Interquartile Range (IQR)
― The IQR is Q3 - Q1 and measures the spread in the middle 50% of the data
― The IQR is a measure of variability that is not influenced by outliers or extreme values
― Measures like Q1, Q3, and IQR that are not influenced by outliers are called resistant measures
The Interquartile Range
Example:
X minimum   Q1     Median (Q2)   Q3     X maximum
11          12.5   16            19.5   22
Each of the four intervals between these values contains 25% of the data.
Interquartile range = 19.5 - 12.5 = 7
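To illustrate why the IQR is called a resistant measure, the sketch below computes it for the sample data and again after replacing the largest value with an extreme one. The second data set is hypothetical, made up only for this comparison, and the function name iqr is an illustrative choice. The quartiles come from statistics.quantiles with method="exclusive", which uses the (n + 1)(P/100) positions.

```python
import statistics


def iqr(values):
    """Return the interquartile range Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
# Hypothetical variant: the maximum 22 is replaced by an outlier, 220.
with_outlier = [11, 12, 13, 16, 16, 17, 18, 21, 220]

print(iqr(sample), max(sample) - min(sample))                    # 7.0 11
print(iqr(with_outlier), max(with_outlier) - min(with_outlier))  # 7.0 209
```

The range jumps from 11 to 209, while the IQR stays at 7.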
Distribution Shape and The Boxplot
[Figure: three box plots, labelled Negatively-Skewed, Symmetrical, and Positively-Skewed, each marked with Q1, Q2, and Q3.]
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.
Interpolation
If you found that the first quartile was the 13.75th value, then you interpolate like this:
1. Take the 13th and 14th data values.
2. Find the difference between them, |14th - 13th|.
3. Multiply the difference by 0.75.
4. Add the calculated value to the 13th value.
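The four interpolation steps are traced one at a time in the Python sketch below. The data set is hypothetical (20 evenly spaced values), chosen only so that a 13th and 14th value exist; the variable names are illustrative.

```python
# Hypothetical ordered data: 10, 14, 18, ..., 86 (20 values).
ordered = list(range(10, 90, 4))

location = 13.75                   # the quartile is the 13.75th value
whole = int(location)              # 13
fraction = location - whole        # 0.75

value_13th = ordered[whole - 1]    # step 1: the 13th value (1-based)
value_14th = ordered[whole]        # step 1: the 14th value
difference = abs(value_14th - value_13th)   # step 2
adjustment = difference * fraction          # step 3
result = value_13th + adjustment            # step 4

print(value_13th, value_14th, result)  # 58 62 61.0
```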
Exercises – To Do
Page 116: Q4-21, Q4-23, Q4-25
Stem and Leaf Diagrams
Raw data: 16 27 29 22 37 28 18 27 17 31 25 33 25 32 20 41 27 40
[Figure: two stem-and-leaf diagrams of the raw data with stems 1 to 4, first with the leaves unordered, then with the leaves in order.]
Key: 1 | 8 means 18
Stem and Leaf Diagrams
Raw data:
5.2 7.5 6.6 3.8 6.6 8.6 5.8 2.5 4.3 7.1
3.5 2.7 8.3 7.8 7.5 8.8 5.1 2.2 6.1 4.8
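A stem-and-leaf diagram for the decimal raw data can be sketched in Python as follows. The slide does not give a key for this data set, so the sketch assumes the stem is the whole-number part and the leaf is the tenths digit (so 5 | 2 means 5.2). The function name stem_and_leaf is an illustrative choice.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of an ordered stem-and-leaf diagram.

    Assumes one decimal place: stem = units, leaf = tenths.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        # Work in whole tenths to avoid floating-point surprises.
        stem, leaf = divmod(round(v * 10), 10)
        rows[stem].append(leaf)
    return [f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
            for stem, leaves in sorted(rows.items())]


raw = [5.2, 7.5, 6.6, 3.8, 6.6, 8.6, 5.8, 2.5, 4.3, 7.1,
       3.5, 2.7, 8.3, 7.8, 7.5, 8.8, 5.1, 2.2, 6.1, 4.8]

for row in stem_and_leaf(raw):
    print(row)
print("Key: 5 | 2 means 5.2")
```

Sorting the values first is what puts the leaves in each row in ascending order.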
The following data were collected on the ages of cyclists involved in road accidents:
66 6 62 19 20 26 35 26 61 13 15 61 21 8 28 21 21 63 44 10 44 7 10 52 13 52 19 22 64 11 39 22 9 13 9 17 64 32 8 62 28 36 37 18 138 16 67 45 10 55 14 66 49 9 18 20 23 12 11 25 9 7 18 15 18 17 31 37 42 14 7 29 36 6 9 88 46 12 59 60 60 16 50 16 22 14 34 20 9 67 61 34
Total 92
Ages of cyclists in road accidents
Always include a key.
1 | 0 0 0 1 1 2 2 3 3 3 4 4 4 5 5 6 6 6 7 7 8 8 8 8 9 9
2 | 0 0 0 1 1 1 2 2 2 3 5 6 6 8 8 9
3 | 1 2 4 4 5 6 6 7 7 9
4 | 2 4 4 5 6 9
5 | 0 2 2 5 9
8 | 8
Key: 6|7 means 67 years