L3-Percentiles


McGraw-Hill/Irwin

Describing Data: Percentiles

Copyright © 2011 by the McGraw-Hill Companies, Inc. All rights reserved.

LEARNING OBJECTIVES

LO1. Compute and understand quartiles, deciles, and percentiles.

LO2. Construct and interpret box plots.

Quartiles, Deciles and Percentiles

Learning Objective 2

Compute and understand quartiles, deciles, and percentiles.

- The median splits the data into equal-sized halves.
- Quartiles split the data into quarters.
- Deciles split the data into tenths.
- Percentiles can be any split of our choosing.
- These measures include quartiles, deciles, and percentiles.

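[Figure: Quartiles. Q1, Q2 (the median, the 50% value), and Q3 divide the ordered data, from the lowest to the highest data value, into four parts of 25% each.]

[Figure: Deciles. Nine values divide the ordered data into ten parts of 10% each (1/10 of the observations per part).]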

Percentile Computation

To formalize the computational procedure, let \(L_p\) refer to the location of a desired percentile. So if we wanted to find the 33rd percentile we would use \(L_{33}\), and if we wanted the median, the 50th percentile, then \(L_{50}\).

The number of observations is \(n\), so if we want to locate the median, its position is at \((n + 1)/2\), or we could write this as \((n + 1)(P/100)\), where \(P\) is the desired percentile:

\[ L_p = (n + 1)\frac{P}{100} \]
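As a minimal sketch, the location formula can be written as a small Python function. The name `percentile_location` and the range check on `p` are assumptions made for this illustration; they are not part of the slides.

```python
def percentile_location(n: int, p: float) -> float:
    """Return L_p = (n + 1) * P / 100, the 1-based position in the ordered data."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    return (n + 1) * p / 100


# With n = 15 observations the median (P = 50) sits at position 8.
print(percentile_location(15, 50))  # 8.0
print(percentile_location(15, 33))  # 5.28
```

A result such as 5.28 is not a whole position, so the value has to be interpolated between the 5th and 6th ordered observations.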


Percentiles - Example

Listed below are the commissions earned last month by a sample of 15 brokers at Salomon Smith Barney’s Oakland, California, office.

$2,038 $1,758 $1,721 $1,637 $2,097 $2,047 $2,205 $1,787
$2,287 $1,940 $2,311 $2,054 $2,406 $1,471 $1,460

Locate the median, the first quartile, and the third quartile for the commissions earned.

Percentiles – Example (cont.)

Step 1: Organize the data from the lowest to the highest value.

$1,460 $1,471 $1,637 $1,721 $1,758 $1,787 $1,940 $2,038
$2,047 $2,054 $2,097 $2,205 $2,287 $2,311 $2,406

Percentiles – Example (cont.)


Step 2: Compute the first and third quartiles. Locate \(L_{25}\) and \(L_{75}\) using:

\[ L_{25} = (15 + 1)\frac{25}{100} = 4 \qquad L_{75} = (15 + 1)\frac{75}{100} = 12 \]

Therefore, the first and third quartiles are located at the 4th and 12th positions, respectively: Q1 = $1,721 (the value at position \(L_{25}\)) and Q3 = $2,205 (the value at position \(L_{75}\)).
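The two steps of this example can be checked with a short Python sketch. The helper `whole_location` is a hypothetical name introduced here; it handles only the case where the location is a whole number, which is true for all three percentiles in this example.

```python
# Commissions from the example, in the order listed on the slide.
commissions = [2038, 1758, 1721, 1637, 2097, 2047, 2205, 1787,
               2287, 1940, 2311, 2054, 2406, 1471, 1460]

# Step 1: order the data from lowest to highest.
ordered = sorted(commissions)
n = len(ordered)


def whole_location(n, p):
    """Return L_p = (n + 1) * p / 100 when it is a whole number."""
    numerator = (n + 1) * p
    if numerator % 100 != 0:
        raise ValueError("location is not a whole number; interpolation needed")
    return numerator // 100


# Step 2: read off the values at the 1-based positions L_25, L_50 and L_75.
results = {p: ordered[whole_location(n, p) - 1] for p in (25, 50, 75)}
print(results)  # {25: 1721, 50: 2038, 75: 2205}
```

The sketch also locates the median, which the example asks for: it sits at position 8 of the ordered data.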

Learning Objective 3

Construct and interpret box plots.

Boxplots

A box plot is a graphical display, based on quartiles, that helps us picture a set of data. To construct a box plot, we need only five statistics:

1. the minimum value,
2. Q1 (the first quartile),
3. the median,
4. Q3 (the third quartile), and
5. the maximum value.
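A minimal sketch of collecting these five statistics in Python follows. It assumes Python 3.8 or later and uses `statistics.quantiles` with `method="exclusive"`, which places quartiles by the same (n + 1)P/100 rule used in these slides. The function name `five_number_summary` is an assumption of this sketch, and the data are the nine-value sample used in the quartile example in this deck.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(data)
    # n=4 asks for the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, median, q3, ordered[-1]


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(five_number_summary(sample))  # (11, 12.5, 16.0, 19.5, 22)
```

Other software may use a different quartile convention, so results can differ slightly from the hand calculation when a location is not a whole number.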


Boxplot Example

Step 1: Create an appropriate scale along the horizontal axis.

Step 2: Draw a box that starts at Q1 (15 minutes) and ends at Q3 (22 minutes). Inside the box, place a vertical line to represent the median (18 minutes).

Step 3: Extend horizontal lines from the box out to the minimum value (13 minutes) and the maximum value (30 minutes).

Example: Draw a Box & Whisker for

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)

Q1 (\(L_{25}\)) is in the (9+1)*25/100 = 2.5 position of the ranked data, so use the value halfway between the 2nd and 3rd values: Q1 = 12.5

Q1 and Q3 are measures of non-central location; Q2, the median, is a measure of central tendency.

Quartile Measures: Calculating the Quartiles (Example)

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)

Q1 is in the (9+1)*25/100 = 2.5 position of the ranked data, so Q1 = (12+13)/2 = 12.5

Q2 is in the (9+1)*50/100 = 5th position of the ranked data, so Q2 = median = 16

Q3 is in the (9+1)*75/100 = 7.5 position of the ranked data, so Q3 = (18+21)/2 = 19.5

Q1 and Q3 are measures of non-central location; Q2, the median, is a measure of central tendency.

Quartile Measures: Calculation Rules

When calculating the ranked position, use the following rules:

- If the result is a whole number, then it is the ranked position to use.
- If the result is a fractional half (e.g. 2.5, 7.5, 8.5), then average the two corresponding data values.
- If the result is neither a whole number nor a fractional half, then interpolate between the data points.
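The three rules can be folded into one routine, since averaging two neighbours is linear interpolation with a fraction of 0.5. The following is a sketch under that reading; `percentile_value` is a hypothetical name, and the choice to raise an error when the location falls outside positions 1 to n is an assumption of this sketch.

```python
import math


def percentile_value(data, p):
    """Value at the p-th percentile, located at (n + 1) * p / 100."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100
    if location < 1 or location > n:
        raise ValueError("location falls outside the data")
    lower = math.floor(location)   # 1-based position just below the location
    fraction = location - lower    # how far to move towards the next value
    if fraction == 0:
        return ordered[lower - 1]  # rule 1: whole-number position
    below, above = ordered[lower - 1], ordered[lower]
    # Rules 2 and 3: a fraction of 0.5 gives the average of the two values.
    return below + fraction * (above - below)


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(percentile_value(sample, 25))  # 12.5
print(percentile_value(sample, 50))  # 16
print(percentile_value(sample, 75))  # 19.5
```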


Quartile Measures: The Interquartile Range (IQR)

- The IQR is Q3 - Q1 and measures the spread in the middle 50% of the data.
- The IQR is a measure of variability that is not influenced by outliers or extreme values.
- Measures like Q1, Q3, and IQR that are not influenced by outliers are called resistant measures.

The Interquartile Range

Example:

X minimum   Q1     Median (Q2)   Q3     X maximum
11          12.5   16            19.5   22

Each of the four intervals between these values contains 25% of the data.

Interquartile range = 19.5 - 12.5 = 7
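To illustrate why the IQR is called a resistant measure, the sketch below recomputes it after replacing the largest value with a much larger one. The value 220 is a hypothetical outlier invented for this illustration, and `iqr` is an assumed helper name. The code relies on `statistics.quantiles` (Python 3.8 or later) with `method="exclusive"`, which follows the (n + 1)P/100 location rule.

```python
import statistics


def iqr(data):
    """Interquartile range Q3 - Q1 using the (n + 1) * P / 100 location rule."""
    q1, _, q3 = statistics.quantiles(sorted(data), n=4, method="exclusive")
    return q3 - q1


sample = [11, 12, 13, 16, 16, 17, 18, 21, 22]
with_outlier = sample[:-1] + [220]  # hypothetical extreme value replaces 22

# The IQR is unchanged, while the range (maximum - minimum) grows sharply.
print(iqr(sample), max(sample) - min(sample))                    # 7.0 11
print(iqr(with_outlier), max(with_outlier) - min(with_outlier))  # 7.0 209
```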

Distribution Shape and The Boxplot

[Figure: three box plots, each marked with Q1, Q2, and Q3, illustrating negatively skewed, symmetrical, and positively skewed distributions.]

Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc.

Interpolation

If you found that the first quartile was the 13.75th value, then you interpolate like this:

1. Take the 13th and 14th data values.
2. Find the difference between them, |14th - 13th|.
3. Multiply the difference by 0.75.
4. Add the calculated value to the 13th value.
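The interpolation steps can be condensed into a single expression. Here \(x_{(k)}\), the k-th value of the ordered data, is notation introduced for this sketch and does not appear on the slides.

```latex
\[
Q_1 = x_{(13)} + 0.75\,\bigl(x_{(14)} - x_{(13)}\bigr)
\]
```

For instance, with hypothetical values \(x_{(13)} = 40\) and \(x_{(14)} = 44\), this gives \(40 + 0.75 \times 4 = 43\).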

Exercises - To Do

Page 116: Q4-21, Q4-23, Q4-25

Stem and Leaf Diagrams

Raw data:

16 27 29 22 37 28 18 27 17 31 25 33 25 32 20 41 27 40

Unordered:

1 | 6 8 7
2 | 7 9 2 8 7 5 5 0 7
3 | 7 1 3 2
4 | 1 0

Ordered:

1 | 6 7 8
2 | 0 2 5 5 7 7 7 8 9
3 | 1 2 3 7
4 | 0 1

Key: 1 | 8 means 18
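An ordered stem-and-leaf diagram for this data can be built with a short Python sketch. It assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of an ordered stem-and-leaf diagram as strings."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # tens digit(s) and units digit
        rows[stem].append(leaf)
    lines = []
    # Walk every stem from smallest to largest so that gaps stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


data = [16, 27, 29, 22, 37, 28, 18, 27, 17, 31, 25, 33, 25, 32, 20, 41, 27, 40]
for line in stem_and_leaf(data):
    print(line)
# 1 | 6 7 8
# 2 | 0 2 5 5 7 7 7 8 9
# 3 | 1 2 3 7
# 4 | 0 1
```

Sorting before splitting is what makes the diagram ordered; dropping the `sorted` call would give the unordered first pass.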

Stem and Leaf Diagrams

5.2 7.5 6.6 3.8 6.6 8.6 5.8 2.5 4.3 7.1
3.5 2.7 8.3 7.8 7.5 8.8 5.1 2.2 6.1 4.8
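The slide gives only the raw values. One plausible way to chart them, sketched below, is to take the whole-number part as the stem and the first decimal as the leaf, so that 5 | 2 means 5.2. That choice, and the helper name `decimal_stem_and_leaf`, are assumptions of this sketch and are not stated in the slides.

```python
from collections import defaultdict


def decimal_stem_and_leaf(values):
    """Ordered stem-and-leaf rows for values given to one decimal place."""
    rows = defaultdict(list)
    for value in sorted(values):
        # Work in tenths as whole numbers to avoid floating-point surprises.
        stem, leaf = divmod(round(value * 10), 10)
        rows[stem].append(leaf)
    lines = []
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


data = [5.2, 7.5, 6.6, 3.8, 6.6, 8.6, 5.8, 2.5, 4.3, 7.1,
        3.5, 2.7, 8.3, 7.8, 7.5, 8.8, 5.1, 2.2, 6.1, 4.8]
for line in decimal_stem_and_leaf(data):
    print(line)
# 2 | 2 5 7
# 3 | 5 8
# 4 | 3 8
# 5 | 1 2 8
# 6 | 1 6 6
# 7 | 1 5 5 8
# 8 | 3 6 8
```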

Raw data

The following data were collected on the ages of cyclists involved in road accidents:

66 6 62 19 20 26 35 26 61 13 15 61 21 8 28 21 21 63 44 10 44 7 10
52 13 52 19 22 64 11 39 22 9 13 9 17 64 32 8 62 28 36 37 18 138 16
67 45 10 55 14 66 49 9 18 20 23 12 11 25 9 7 18 15 18 17 31 37 42
14 7 29 36 6 9 88 46 12 59 60 60 16 50 16 22 14 34 20 9 67 61 34

Total 92

Ages of cyclists in road accidents

Always include a key.

 0 | 6 6 7 7 7 8 8 9 9 9 9 9 9
 1 | 0 0 0 1 1 2 2 3 3 3 4 4 4 5 5 6 6 6 7 7 8 8 8 8 9 9
 2 | 0 0 0 1 1 1 2 2 2 3 5 6 6 8 8 9
 3 | 1 2 4 4 5 6 6 7 7 9
 4 | 2 4 4 5 6 9
 5 | 0 2 2 5 9
 6 | 0 0 1 1 1 2 2 3 4 4 6 6 7 7
 7 |
 8 | 8
 9 |
10 |
11 |
12 |
13 | 8

Key: 6 | 7 means 67 years