Measures of Central Tendency

Statistics
Statistics deal with collecting,
organizing, and interpreting data.
A survey is a method of collecting
information.
→ Surveys use a small sample to
represent a large population.
•Population: the whole group; the
group being studied.
•Sample: part of the population; the
group being surveyed.
For each survey topic, determine which
group represents the population and which
represents a sample of the population.
Making Predictions and Drawing
Inferences
You can use survey results to predict
the actions of a larger group or draw
inferences on the entire population.
•Prediction: A hypothesis made
based on survey results or past actions.
•Inference: A prediction that is made
using observations, prior knowledge,
and experience.
•Use proportions to help calculate
your predictions and inferences.
A survey found that 6 out of 10 students at
IMS have an iPod. Predict how many
students have iPods if there are 650
students at IMS.
iPods / total: 6/10 = x/650
10x = 3,900, so x = 390
About 390 students have iPods
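The proportion above can also be checked with a short script. This is only a sketch: the function name predict_from_sample and its parameter names are made up for illustration and do not come from the slides.

```python
def predict_from_sample(sample_count, sample_size, population_size):
    """Scale a sample result up to the whole population.

    Solves sample_count / sample_size = x / population_size for x.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return sample_count * population_size / sample_size


# 6 out of 10 surveyed students have an iPod; IMS has 650 students.
print(predict_from_sample(6, 10, 650))  # 390.0
```

The answer is still a prediction: it is only as good as the sample it is based on.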
A researcher catches 60 fish from different
locations in a lake. He then tags the fish and
puts them back in the lake. Two weeks later,
the researcher catches 40 fish from the same
locations. 8 of these 40 fish are tagged. Predict
the number of fish in the lake.
tagged / total: 60/x = 8/40
8x = 2,400, so x = 300
About 300 fish
A middle school has 1,800 students. A
random sample of 80 shows that 24 have
cell phones. Predict the number of
students in the middle school who have
cell phones.
phones / total: x/1,800 = 24/80
80x = 43,200, so x = 540
About 540 students have cell phones
A tilapia fish hatchery selectively releases fish
when the populations have increased beyond a
certain target level. In order to estimate the current
fish population, workers at the hatchery catch 110 fish
and mark them with special paint. Then a little while
later, they catch 530 fish, among which 11 are
marked. To the nearest whole number, what is the
best estimate for the fish population?
marked / total: 110/x = 11/530
11x = 58,300, so x = 5,300
About 5,300 fish
In a random sample, 3 of 400 computer
chips are found to be defective. Based on
the sample, about how many chips out of
100,000 would you expect to be
defective?
defective / total: x/100,000 = 3/400
400x = 300,000, so x = 750
About 750 chips will be defective
Mali is starting her own beehive so that she can have
fresh honey straight from the hive. Mali decides to
check the current population of bees in the hive by
marking 52 bees with special bee-marking paint.
Later, Mali collects 190 bees and observes that 26 of
them are marked. To the nearest whole number, what
is the best estimate for the bee population?
marked / total: 26/190 = 52/x
26x = 9,880, so x = 380
About 380 bees
For a research project on rodents, 21
chipmunks were tagged and released. Later,
researchers counted 100 chipmunks in the
area. Of the chipmunks they counted, 14 had
tags. To the nearest whole number, what is the
best estimate for the chipmunk population?
tagged / total: 14/100 = 21/x
14x = 2,100, so x = 150
About 150 chipmunks
While studying a gecko population, a group of
university scientists marked and released 38
geckos. Later, the group counted a total of 240
geckos, of which 24 were marked. To the
nearest whole number, what is the best
estimate for the gecko population?
marked / total: 24/240 = 38/x
24x = 9,120, so x = 380
About 380 geckos
To determine the jackrabbit population in a
wildlife preserve, researchers tagged 110
jackrabbits. Later, they counted 200
jackrabbits. Out of the jackrabbits they counted,
22 had tags. To the nearest whole number,
what is the best estimate for the jackrabbit
population?
tagged / total: 110/x = 22/200
22x = 22,000, so x = 1,000
About 1,000 jackrabbits
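All of the tag-and-recapture problems above use the same proportion, so one small function can check them. This is a sketch only; the name estimate_population and its parameter names are illustrative assumptions, not part of the lesson.

```python
def estimate_population(tagged_first, caught_second, tagged_in_second):
    """Estimate a population from a tag-and-recapture survey.

    Solves tagged_first / x = tagged_in_second / caught_second for x
    and rounds to the nearest whole number.
    """
    if tagged_in_second <= 0:
        raise ValueError("at least one tagged animal must be recaptured")
    return round(tagged_first * caught_second / tagged_in_second)


print(estimate_population(60, 40, 8))     # lake fish: 300
print(estimate_population(110, 530, 11))  # tilapia: 5300
print(estimate_population(52, 190, 26))   # bees: 380
print(estimate_population(110, 200, 22))  # jackrabbits: 1000
```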
Sampling
•Biased Sample: A sample that
doesn’t truly represent the
population.
◦ Example: Surveying only 6th graders about the
height of IMS students.
•Random Sample: A sample where
every member of the population has an
equal chance of being picked.
◦ Example: Surveying students whose locker numbers
end in 2 about the height of IMS students.
Practice Problems
Tell if each sample is biased or random.
Explain your answer.
An airline surveys passengers from a flight that
is on time to determine if passengers on all
flights are satisfied.
Biased
If they are on-time, they are likely satisfied with
their experience.
A newspaper randomly chooses 100 names
from its subscriber database and then surveys
those subscribers to find if they read the
restaurant reviews.
Random
The names were randomly chosen in such a way
that everyone in the population has an equal
chance of being picked
The manager of a bookstore sends a survey to
150 customers who were randomly selected
from a customer list.
Random
The customers were randomly chosen so
everyone in the population has an equal chance
of being picked.
A team of researchers surveys 200 people at
a multiplex movie theater to find out how
much money state residents spend on
entertainment.
Biased
People who go to the movies likely spend more
money on entertainment than randomly selected
people.
Types of Random Sampling
•Simple Random Sample: An
unbiased sample where each item or
person in the population is as likely to
be chosen as any other.
◦ Example: Each student’s name is on a piece of
paper in a bowl; names are picked without looking.
•Systematic Random Sample: A
sample where the items or people are
selected according to a specific time or
item interval.
◦ Example: Every 20th person is chosen from an
alphabetical list of all students attending IMS.
•Stratified Random Sample: A
sample where the population is divided
into groups, and a certain number is
chosen at random from each group.
◦ Example: An alphabetical list of all students at IMS
is divided into boys and girls. Then every
20th person is sampled from each list.
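The three random sampling methods can be sketched in code. Everything here is an assumption made for illustration: the roster of 200 made-up student IDs, the group split, and the function names are not from the slides.

```python
import random


def simple_random_sample(population, k, rng):
    """Every member is as likely to be chosen as any other."""
    return rng.sample(population, k)


def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every step-th member."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(groups, k_per_group, rng):
    """Choose k_per_group members at random from each group."""
    return {name: rng.sample(members, k_per_group)
            for name, members in groups.items()}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
students = [f"student_{i:03d}" for i in range(200)]  # hypothetical roster
groups = {"boys": students[:100], "girls": students[100:]}  # hypothetical split

print(simple_random_sample(students, 10, rng))
print(systematic_sample(students, 20, rng))
print(stratified_sample(groups, 5, rng))
```

With a step of 20 and 200 names, the systematic sample always has 10 students, no matter where it starts.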
Types of Biased Sampling
•Convenience Sample: A biased
sample which consists of members of a
population that are easily accessed.
◦ Example: Only surveying one math class about
IMS students’ favorite letter day
•Voluntary Sample: A biased sample
which involves only those who want to
participate in the sampling.
◦ Example: Students at IMS who wish to
participate in the survey can fill it out online
Try the Following
Use your knowledge of types of Random and
Biased Sampling methods to solve the
following problems.
To find how much money the average American
family spends to cool their home, 100 Alaskan
families are surveyed at random. Of the families,
85 said that they spend less than $75 per month on
cooling. The researcher concluded that the
average American spends less than $75 on cooling
per month. Is this conclusion valid? Explain.
The conclusion is not valid. This is a
biased convenience sample since people
elsewhere in the United States would likely
spend much more on cooling than those in Alaska.
Zach is trying to decide which golf course is the
best of three golf courses. He randomly surveyed
people at a sports store and recorded the results
in the table. Which type of sampling method did
Zach use?
A simple Random Sample
Suppose Zach surveyed 150 more people. How
many people would be expected to vote for Rolling
green?
42 more people
Adults in every 100th household in the
phonebook are surveyed about which
candidate they plan to vote for. Which type of
sampling method is being described?
Systematic Random Sample
Use the organizer to determine whether the conclusion is
valid.
A computer program selects telephone
numbers at random for a survey on which
candidate people plan to vote for. Which type
of sampling method is being described?
Simple Random Sample
The researchers send a mail survey to apple
farmers asking them to please record the
number of their trees that are infected and
send the survey back. Which type of
sampling method is being described?
Biased – Voluntary Response Sample
To determine what people in California think
about a proposed law, 5,000 people from the
state are randomly surveyed. Of the people
surveyed, 5.8% are against the law. The
legislature concludes that the law should not
be passed. Which type of sampling method
is being described? Is this a valid
conclusion?
Yes it is valid. A Simple Random
Sample was used.
Types of Graphs
First, what is a graph?
Types of Graphs
►Pictographs
►Histograms
►Bar Graphs
►Double Bar Graphs
►Line Graphs
►Double Line Graphs
►Circle (Pie) Graphs
►Line Plots
►Stem-and-Leaf Plots
►Box-and-Whisker Plots
Pictographs
►Use pictures.
►What does this graph represent?
►How many students play hockey?
◦ 20
►How many more students played soccer than hockey?
◦ 40
Histograms
►Show how often something occurs in equal intervals.
►This histogram shows:
◦ The distance of long jumps at a track meet
►What range occurred the most? Least?
◦ 5’7” – 6’, 6’7” – 7’
►How many long jumps were from 5’1” to 6’?
◦ 25 long jumps
►How many more students jumped 5’7” – 6’ than 5’ – 5’6”?
◦ 5 students
Bar Graphs
►Use bars of different lengths to display and
compare data in specific categories.
►This bar graph shows:
◦ The amount of money raised in a charity walk by
each of the grades.
►Which grade raised twice as much money as 8th?
◦ 10th grade
►How much more money did 7th grade raise than 8th?
◦ $30
[Bar graph: “Money Raised in Charity Walk”; x-axis: Grades (Grade 6 to Grade 10); y-axis: Amount Raised ($), 0 to 120]
Double Bar Graphs
►Use pairs of bars to compare two sets of
categorical data
►This graph compares:
◦ Number of Sports & History books in 3 different
school libraries
►Which school has the greatest difference
between sports & history?
◦ Oak
►Does the Maple School have more sports or history
books? How many?
◦ History books, 11
[Double bar graph: “Sports & History Books in Three School Libraries”; x-axis: School (Chestnut, Oak, Maple); y-axis: Number of Books, 0 to 30; legend: Sports Books, History Books]
Line Graphs
►Show a change in data over time.
►What data does this line graph present?
◦ Number of rainy days from May to December
►Between which 2 months was there the
greatest increase in the number of rainy days?
◦ August & September
[Line graph: “Number of Rainy Days: May through December”; x-axis: Months (May to Dec.); y-axis: Rainy Days, 0 to 7]
Double Line Graphs
►Use two lines to compare two sets of
data over time
►What is this double line graph comparing?
◦ Temperatures for first ten days of winter for two
different years
►On what day were the temperatures the closest?
◦ Day 6
►On what day were the temperatures the furthest apart?
◦ Day 10
[Double line graph: “Temperature on First 10 Days of Winter”; x-axis: First 10 Days of Winter (1 to 10); y-axis: Degrees at 8am, 0 to 60; legend: First Year, Second Year]
Circle Graphs
►Compare parts of a whole.
Each sector, or slice, is one
part of the entire data set.
►This graph compares:
◦ The results of Leo’s survey on pet ownership
►How many people do not own pets?
◦ 15 (50% of 30)
►How many people have cats?
◦ 6 people (20% of 30)
Line Plots
►A graph that uses x’s and a number line to show
frequency of data
[Line plot: “Number of miles The Lorax ran per day during training”]
►How many days did The Lorax train?
◦ 18 days
►Which number of miles did he run most often?
Least often?
◦ 5 miles; 2, 8, 16
►How often did The Lorax run 6 mi?
◦ 3 days
►What is the range of the miles?
◦ 14
►What is the median miles?
◦ 5 miles
Stem-and-Leaf Plots
►A graph that uses digits of each number to
organize and display data
►A Stem: represents the left-hand digit of the data value
►A Leaf: represents the remaining right-hand digits
►What’s the greatest amount of time spent doing homework?
◦ 64 minutes
►How many students were surveyed?
◦ 18 students
►How many students studied for 32 min?
◦ 2 students
►How many studied for more than 43 min?
◦ 7 students
Box-and-Whisker Plots
►Use a number line to show the distribution of a data set and
measures of variation. Also useful for large sets of data.
►These plots are divided into four parts called quartiles
►The median is the middle of the entire data set
►The lower quartile is the median of the lower half of the data set
►The upper quartile is the median of the upper half of the data set
►The range is the difference between the highest and lowest data
points
►The interquartile range is the difference between the upper and
lower quartile
Measures of Central Tendency
•Measures of central tendency
show what the middle of a data
set looks like.
•The measures of central tendency
are the mean, median, and
mode.
•The range is NOT a measure of
central tendency.
Find the mean, median, mode, and
range of the following data set:
The ages of Mrs. Long’s
grandchildren: 8, 3, 5, 4, 2, 3, 1, and 4.
Mean is the average.
1 + 2 + 3 + 3 + 4 + 4 + 5 + 8 = 30
30/8 = 3.75
The mean is 3.75
Range is max minus min,
or largest minus smallest.
List in order: 1, 2, 3, 3, 4, 4, 5, 8
8 – 1 = 7
The range is 7
Mode is the number that
occurs most often.
There can be several modes
or no mode.
List in order: 1, 2, 3, 3, 4, 4, 5, 8
The modes here are 3 and 4
Median is the middle
data value when in order.
List in order: 1, 2, 3, 3, 4, 4, 5, 8
The middle two numbers
are 3 and 4
(3 + 4)/2 = 3.5
The median is 3.5
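The same four answers can be checked with Python’s built-in statistics module. This is just one way to do it; the variable names are chosen for illustration.

```python
import statistics

# Ages of Mrs. Long's grandchildren
ages = [8, 3, 5, 4, 2, 3, 1, 4]

mean = statistics.mean(ages)        # 30 / 8
median = statistics.median(ages)    # average of the two middle values, 3 and 4
modes = statistics.multimode(ages)  # every value tied for most frequent
data_range = max(ages) - min(ages)  # largest minus smallest

print(mean, median, modes, data_range)  # 3.75 3.5 [3, 4] 7
```

multimode is used instead of mode because this data set has two modes.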
Often one measure of Central Tendency is
more appropriate for describing a data
set. Think about what each measure tells
you about the data.
Find the median, mode, mean and
range of each data set. Determine the
measure of Central Tendency that
best describes the data set.
6, 5, 3, 6, 8
List in order: 3, 5, 6, 6, 8
Median: 6
Mode: 6
Mean: 28/5 = 5.6
Range: 5
Best measure of center: 6
(median & mode)
7, 6, 13, 16, 15, 9
List in order: 6, 7, 9, 13, 15, 16
Median: 13+9 = 22
22/2 =11
Mode: none
Mean: 66/6 = 11
Range: 10
Best measure of center: 11
(median & mean)
12, 15, 17, 9, 17
List in order: 9, 12, 15, 17, 17
Median: 15
Mean: 70/5 = 14
Mode: 17
Range: 8
Best measure of center: 15 (possibly 14)
(median and possibly mean)
51, 62, 68, 55, 68, 62
List in order: 51, 55, 62, 62, 68, 68
Median: 62
Mean: 366/6 = 61
Mode: 62 & 68
Range: 17
Best measure of center: 62
(median, mean & mode)
List in order: 36, 41, 42, 44, 47
Median: 42
Mean: 210/5 = 42
Mode: none
Range: 11
Best measure of center: 42
(median and mean)
•An outlier is an extreme value:
either much less than or much
greater than the other values
in the data set.
Use the data set to answer the questions below:
4, 6, 3, 6, 25, 3, 2
List in order: 2, 3, 3, 4, 6, 6, 25
Is there an outlier? Yes
If so, what is it? 25
How does the outlier affect the mean and median?
With outlier: Median = 4; Mean: 49/7 = 7
Without outlier: Median = 3.5; Mean: 24/6 = 4
Which measure of central tendency is most
affected by an outlier in a data set? Mean!
Which measure of central tendency best describes the data?
Explain.
Median – it is not dramatically affected by outliers
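To see the effect of the outlier directly, the mean and median can be computed with and without it. This is a small sketch using the data set above; the variable names are illustrative.

```python
import statistics

data = [4, 6, 3, 6, 25, 3, 2]
without_outlier = [x for x in data if x != 25]  # drop the outlier, 25

mean_with = statistics.mean(data)                    # 49 / 7 = 7
median_with = statistics.median(data)                # 4
mean_without = statistics.mean(without_outlier)      # 24 / 6 = 4
median_without = statistics.median(without_outlier)  # 3.5

# How far each measure moves when the outlier is removed
mean_shift = mean_with - mean_without        # 3
median_shift = median_with - median_without  # 0.5
print(mean_shift, median_shift)
```

The mean moves by 3 while the median moves by only 0.5, which is why the median describes this data set better.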
What does misleading mean?
• To lead in the wrong direction.
• To manipulate statistics without lying.
• Misleading = Dishonesty
• To intentionally deceive someone.
[Figure: two line graphs of Mrs. Long’s salaries for years 1 to 6. The graph titled “Salaries Stable!!” uses a salary axis from 0 to 40,000; the graph titled “Salaries Rising!!” uses a salary axis from 34,000 to 36,000.]
What is the difference between the two graphs?
Do these two graphs appear to show the same
information?
Why do you think someone would want to present the
same information in different ways?
[Figure: two bar graphs titled “A Day’s Activities”; x-axis: Activity (Sleep, Eat, School, Homework, Play); y-axis: Hours, up to 10]
Which pet is most popular?
[Pictograph key: each symbol = 5 pets]
What is misleading about this bar graph?
[Figure: two bar graphs titled “Batting Average”; x-axis: Year; y-axis: Average, 0 to 400. The first lists the years in reverse order, 2004 back to 1998; the second lists them from 1998 to 2004.]
[Figure: “Eye Color” circle graph with sectors Blue, Hazel, Brown, and Green]
What eye color is the most frequent?
[Figure: “Eye Color” circle graph, repeated, then labeled with percentages: Green 9%, Blue 27%, Brown 28%, Hazel 36%]
Why would someone want to mislead
you?
• To make it appear that they are correct.
• To change the way the data is interpreted
• To persuade someone
• To influence an opinion
Ways to Manipulate Statistics
• Change the values on the x- or y-axis.
• Do not start the graph at zero.
• Use different bar widths on a bar graph
• Change the way you conduct your survey
◦ Example: Survey only 6th graders when you are
collecting data on the height of middle school students
at IMS.
◦ Surveys should be random.
Try the examples in your notes.
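A little arithmetic shows why not starting the graph at zero is so effective. The sketch below compares how tall two values are drawn for different axis starting points. The two salary values are hypothetical (the real data is not given here), and the function name apparent_ratio is made up for illustration.

```python
def apparent_ratio(value_a, value_b, axis_start):
    """How many times taller value_b is drawn than value_a
    when the y-axis starts at axis_start instead of 0."""
    if value_a == axis_start:
        raise ValueError("value_a sits on the axis start, so it has no height")
    return (value_b - axis_start) / (value_a - axis_start)


# Hypothetical salaries for a first and a last year
first_year, last_year = 35_000, 35_500

print(apparent_ratio(first_year, last_year, 0))       # about 1.014
print(apparent_ratio(first_year, last_year, 34_000))  # 1.5
```

With the axis starting at 0 the two points look almost level. With the axis starting at 34,000 the same raise is drawn 1.5 times as tall.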
•Graphs let readers analyze data
easily, but are sometimes made to
influence conclusions by
misrepresenting the data.
• Explain how the graphs differ.
◦ Which graph appears to show a sharper
increase in price? Graph B
◦ Which graph might the Student Council use
to show that while ticket prices have risen,
the increase is not significant? Why?
They might use Graph A. The y-axis scale
makes the increase appear less significant.
• The line graphs show monthly profits of a
company from October to March. Which graph
suggests that the business is extremely
profitable? Is this a valid conclusion? Explain.
Although both graphs show a profit, Graph A’s
profit increase is exaggerated due to the y-axis
scale beginning with $500 intervals and
changing to $100 intervals.
•Statistics can also be used to
influence conclusions.
An amusement park boasts that the average
height of their roller coasters is 170 feet.
Explain how this might be misleading.
◦Mean: 850/5 = 170
◦Median: 126
◦Mode: No mode
The mean has been affected by the
outlier of 365, therefore using the
average to describe this data set is
misleading.

How is this graph misleading?
The y-axis scale does not have equal spacing.
• How could you redraw the graph so it would
not be misleading?
Draw the y-axis scale
starting at 0 with equal
spacing so that the distance
between 0 and 18,000 equals
the distance between 18,000 and
36,000.
• How is this graph misleading?
The y-axis scale has a break so the differences in
jump distances appear greater
• How could you redraw the graph so it
would not be misleading?
Draw the y-axis scale
starting at 0 and
continuing to 7.5 using
equal spacing.
• How is this graph misleading?
The y-axis scale has a break so the differences in
fare appear greater.
• How could you redraw the graph so it
would not be misleading?
Draw the y-axis scale
starting at 0 and
continuing to 13 using
equal spacing.
[Graph: “Taxicab Fares”; x-axis: Mon to Fri; y-axis: 10 to 13]
• How is this graph misleading?
The y-axis scale does not start at zero so the
differences in water consumed seem greater.
• How could you redraw the graph so it
would not be misleading?
Draw the y-axis scale
starting at 0 and
continuing to 48 using
equal spacing.
[Graph: “Water Consumed”; x-axis: Mark, Frank, Mila, Yvonne; y-axis: Ounces of Water, 16 to 48]
Mean Absolute Deviation
•Mean Absolute Deviation: The
average distance between each data
value and the mean of a data
set.
Step 1. Find the mean.
Step 2. Find the absolute value of the difference
between each data value and the mean.
Step 3. Find the average of those differences.
1. Find the mean:
52 + 48 + 60 + 55 + 59 + 54 + 58 + 62 = 448
448/8 = 56
2. Find the absolute differences between the mean and the data points:
|56 – 52| = 4  |56 – 48| = 8  |56 – 60| = 4  |56 – 55| = 1
|56 – 59| = 3  |56 – 54| = 2  |56 – 58| = 2  |56 – 62| = 6
3. Find the mean of the differences:
4 + 8 + 4 + 1 + 3 + 2 + 2 + 6 = 30
30/8 = 3.75
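The three steps translate directly into a few lines of code. This is a minimal sketch; the function name mean_absolute_deviation is chosen here for illustration.

```python
def mean_absolute_deviation(values):
    """Average distance between each data value and the mean."""
    if not values:
        raise ValueError("need at least one data value")
    mean = sum(values) / len(values)              # Step 1: find the mean
    distances = [abs(v - mean) for v in values]   # Step 2: absolute differences
    return sum(distances) / len(distances)        # Step 3: average them


print(mean_absolute_deviation([52, 48, 60, 55, 59, 54, 58, 62]))  # 3.75
```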
Try one on your own
1. 58 + 88 + 40 + 60 + 72 + 66 + 80 + 48 = 512
512/8 = 64
2. |64 – 58| = 6  |64 – 88| = 24  |64 – 40| = 24  |64 – 60| = 4
|64 – 72| = 8  |64 – 66| = 2  |64 – 80| = 16  |64 – 48| = 16
3. 6 + 24 + 24 + 4 + 8 + 2 + 16 + 16 = 100
100/8 = 12.5
The top five salaries and bottom five salaries for the 2010
New York Yankees are shown in the table below. Salaries
are in millions of dollars and are rounded to the nearest
hundredth.
Top five salaries:
1. 33 + 24.29 + 22.6 + 20.63 + 16.5 = 117.02
117.02/5 = 23.404 (23.4)
2. 9.6 + 0.89 + 0.8 + 2.77 + 6.9 = 20.96
3. 20.96/5 = 4.192
$4.19 million
Bottom five salaries:
1. 0.45 + 0.44 + 0.43 + 0.41 + 0.41 = 2.14
2.14/5 = 0.428 (0.43)
2. 0.02 + 0.01 + 0 + 0.02 + 0.02 = 0.07
3. 0.07/5 = 0.014
$0.01 million
The table shows the running time in minutes for two
kinds of movies. Find the mean absolute deviation for
each set of data. Round to the nearest hundredth. Then
write a few sentences comparing their variation.
First kind of movie:
1. 90 + 95 + 88 + 100 + 98 = 471
471/5 = 94.2
2. 4.2 + 0.8 + 6.2 + 5.8 + 3.8 = 20.8
3. 20.8/5 = 4.16
4.16 minutes
Second kind of movie:
1. 115 + 120 + 150 + 135 + 144 = 664
664/5 = 132.8
2. 17.8 + 12.8 + 17.2 + 2.2 + 11.2 = 61.2
3. 61.2/5 = 12.24
12.24 minutes
Find the mean absolute deviation. Round to the nearest
hundredth if necessary. Then describe what the mean
absolute deviation represents.
1. 112 + 145 + 108 + 160 + 122 = 647
647/5 = 129.4
2. 17.4 + 15.6 + 21.4 + 30.6 + 7.4 = 92.4
3. 92.4/5 = 18.48
18.48 daily visitors
The MAD is large. On average, each data point
is about 18 away from the mean.
Find the mean absolute deviation. Round to the nearest
hundredth if necessary. Then describe what the mean
absolute deviation represents.
1. 9.50 + 9.00 + 8.25 + 9.25 + 8.00 + 8.50 = 52.50
52.50/6 = $8.75
2. 0.75 + 0.25 + 0.5 + 0.5 + 0.75 + 0.25 = $3.00
3. $3.00/6 = $0.50
$0.50 difference in
admission prices
The MAD is small. The difference is only $0.50
The table shows the height of waterslides at two different
water parks. Find the mean absolute deviation for each
set of data. Round to the nearest hundredth. Then write
a few sentences comparing their variation.
First water park:
1. 75 + 95 + 80 + 110 + 88 = 448
448/5 = 89.6
2. 14.6 + 5.4 + 9.6 + 20.4 + 1.6 = 51.6
3. 51.6/5 = 10.32
10.32 feet
Second water park:
1. 120 + 108 + 94 + 135 + 126 = 583
583/5 = 116.6
2. 3.4 + 8.6 + 22.6 + 18.4 + 9.4 = 62.4
3. 62.4/5 = 12.48
12.48 feet
The water slides at Splash Lagoon are closer
together in terms of height. There is less
variability in the height at Splash Lagoon when
compared to Wild Water Bay.
Box and Whisker Plots
•A box-and-whisker plot uses a
number line to show the
distribution of a data set.
•To make a box-and-whisker plot,
first divide the data into four equal
parts using quartiles.
•The median, or 2nd quartile,
divides the data into a lower half
and an upper half.
•The median of the lower half is
the lower quartile, and the
median of the upper half is the
upper quartile.

Example: Use the data to make a box-and-whisker plot: 73 67 75 81 67 75 85 69
Order the data from least to greatest: 67, 67, 69, 73, 75, 75, 81, 85
Calculate/determine the following:
Median: 74
Lower Quartile: 68
Upper Quartile: 78
Lowest Value: 67
Greatest Value: 85
Draw a box from the lower to the upper quartile.
Inside the box, draw a vertical line through the
median.
Then draw the “whiskers” from the box to the least
and greatest values.
Be sure to title and label your graph.
[Box-and-whisker plot with placeholders for Title and Label]
42 22 31 27 24 38 35
List in order: 22, 24, 27, 31, 35, 38, 42
Median: 31
Lower Quartile: 24
Upper Quartile: 38
Lowest Value: 22
Greatest Value: 42
[Box-and-whisker plot with placeholders for Title and Label]
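The five values needed for a box-and-whisker plot can be found with a short function. This sketch follows the method used in these notes (the quartiles are the medians of the lower and upper halves, leaving out the middle value when the count is odd). Other tools, such as statistics.quantiles, may use a slightly different rule and give different quartiles. The function name is an assumption made for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, greatest)."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    lower_half = data[:half]
    # Skip the middle value when the number of data points is odd.
    upper_half = data[half + 1:] if n % 2 else data[half:]
    return (data[0],
            statistics.median(lower_half),
            statistics.median(data),
            statistics.median(upper_half),
            data[-1])


print(five_number_summary([73, 67, 75, 81, 67, 75, 85, 69]))
# (67, 68.0, 74.0, 78.0, 85)
print(five_number_summary([42, 22, 31, 27, 24, 38, 35]))
# (22, 24, 31, 38, 42)
```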
Measures of Variability
•Measures of Variability: How
spread out a group of data is.
•Measures of Variability are range
and interquartile range.
•Interquartile range (IQR): This is
the difference between the upper
quartile and the lower quartile.
22, 24, 27, 31, 35, 38, 42
•What is the range for the above
data set?
•42 – 22 = 20
•What is the interquartile range for
the above data set?
•38 – 24 = 14
•Measures of Variation are range,
interquartile range, upper
quartile, and lower quartile.
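Range and IQR can be computed together. This is a sketch that uses the same median-of-each-half rule for quartiles as these notes; the function name range_and_iqr is made up for illustration.

```python
import statistics


def range_and_iqr(values):
    """Return (range, interquartile range) for a data set."""
    data = sorted(values)
    half = len(data) // 2
    if half == 0:
        raise ValueError("need at least two data values")
    lower_quartile = statistics.median(data[:half])   # median of the lower half
    upper_quartile = statistics.median(data[-half:])  # median of the upper half
    return data[-1] - data[0], upper_quartile - lower_quartile


print(range_and_iqr([22, 24, 27, 31, 35, 38, 42]))  # (20, 14)
```

Taking the first and last half-length slices leaves out the middle value automatically when the number of data points is odd.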
Practice
16, 19, 19, 23, 24, 25, 31, 37, 42, 46, 47
Median: 25
Lower Quartile: 19
Upper Quartile: 42
Lowest Value: 16
Greatest Value: 47
Range: 47 – 16 = 31
IQR: 42 – 19 = 23
[Box-and-whisker plot with placeholders for Title and Label]
Halle’s basketball team
scored the following points
in their past games.
38, 42, 26, 32, 40, 28,
36, 27, 29, 30
List in order: 26, 27, 28, 29, 30, 32, 36, 38, 40, 42
Median: 31
Lower Quartile: 28
Upper Quartile: 38
Lowest Value: 26
Greatest Value: 42
Range: 42 – 26 = 16
IQR: 38 – 28 = 10
[Box-and-whisker plot: “Points Scored in a Game”; axis: Points]
Kyle is helping your school
librarian conduct a survey of
how many books students
read during the year. He gets
the following results: 12, 24,
10, 12, 4, 35, 10, 8, 12, 15,
20, 18, 25, 21, and 9.
List in order: 4, 8, 9, 10, 10, 12, 12, 12, 15, 18, 20, 21, 24, 25, 35
Median: 12
Lower Quartile: 10
Upper Quartile: 21
Lowest Value: 4
Greatest Value: 35
Range: 35 – 4 = 31
IQR: 21 – 10 = 11
[Box-and-whisker plot: “Books Students Read in a Year”; axis: Number of Books]
•Describe the center, shape, spread,
and outliers of the distribution.
◦ The typical student reads about 12
books.
◦ There is a slight right skew.
◦ The IQR is 11, so there is a lot of
variability in the number of books read.
Ms. Carpenter asked each
of her students to record how
much time it takes them to get
from school to home this
afternoon. The next day,
students came back with this
data, in minutes: 15, 12, 5,
55, 6, 9, 47, 8, 35, 3, 22, 26,
46, 54, 17, 42, 43, 42, 15, 5.
List in order: 3, 5, 5, 6, 8, 9, 12, 15, 15, 17, 22, 26, 35, 42, 42, 43, 46, 47, 54, 55
Median: 19.5
Lower Quartile: 8.5
Upper Quartile: 42.5
Lowest Value: 3
Greatest Value: 55
Range: 55 – 3 = 52
IQR: 42.5 – 8.5 = 34
[Box-and-whisker plot: “Time to Get Home from School”; axis: Minutes]
24, 20, 18, 25, 22, 32,
30, 29, 35, 30, 28,
24, 38
Create a box-and-whisker plot for the data.
List in order: 18, 20, 22, 24, 24, 25, 28, 29, 30, 30, 32, 35, 38
Range: 38 – 18 = 20
3rd Quartile: 31
Comparing Populations
•A double box plot consists of two
box plots graphed on the same number line.
→ You can draw inferences about the two
populations in the double box plot by
comparing their centers and variations.
Ian surveyed a different group of students in his science and
math classes. The double box plot shows the results for both
classes. Compare their centers and variations. Write an
inference you can draw about the two populations.
•What does the plot show?
◦ Number of times each class
posted a blog this month
•Is either plot symmetric?
◦ No
•Which measure of center should you
use to compare the data?
◦ Median (Math: 10; Science: 20)
•Which measure of variation should
you use to compare the data?
◦ IQR (Math: 15; Science: 10)
•Which class posts more blogs?
◦ Science
•Which class has a greater spread
of data around the median?
◦ Math
•Use the comparisons to write an
inference:
Science students posted more blogs
than the math students. The median for
science is twice the median for math.
There is also a greater spread of
data around the median for the Math
class than for the Science class.
The double dot plot shows the daily high temperatures for two
cities for thirteen days. Compare the centers and variations for
the two populations. Write an inference you can draw about the
two populations.
•Is either plot symmetric?
◦ No
•Which measure of center should you
use?
◦ Mean
◦ (Springfield: 81; Lake City: 84)
•Which measure of variation should
you use?
◦ MAD
◦ (Springfield: 1.4; Lake City: 1.4)
•Use the comparisons to write an
inference:
Both cities have the same variation
or spread around their means.
Lake City has a greater mean
temperature than Springfield.
Reading Box-and-Whisker Plots
The students at Dolan Middle School are competing in after-school activities in
which they earn points for helping out around the school. Each team consists of
the 30 students in a homeroom. Halfway through the competition, here are the
scores from the students in two of the teams.
The champion team is the one with the most points when the scores of the 30
students on the team are added. Which team would you rather be on? Explain.
•Is either plot symmetric?
◦ No
•Which measure of center should you
use?
◦ Median
◦ (Team 1: 100; Team 2: ≈115)
•Which measure of variation should
you use?
◦ IQR
◦ (Team 1: ≈55; Team 2: ≈105)
•Use the comparisons to write an
inference:
Team 1 is more consistent and has
fewer low scores. Although Team
2 has a slightly higher median,
Team 1 is the better choice.
•Which group has a larger interquartile range?
◦ Basketball
•Which group of players has more predictability in their height?
◦ Baseball – the range and IQR are smaller, and the plot is symmetric
•Which shoe store has a greater median?
◦ Sage’s
•Which shoe store has a greater interquartile range?
◦ Maroon’s
•Which shoe store appears to be more predictable in the
number of shoes sold per week?
◦ Sage’s – the range and IQR are smaller
1.The table below shows the golf scores for two people. Make
two box and whisker plots of the data on the same number line.
•Which golfer has the lower median score?
◦ Henry
•Which golfer has the lesser interquartile range of scores?
◦ Trish
•Which golfer appears to be more consistent?
◦ Trish – her range and IQR are smaller
Now that was easy!