Section 3.2 ~ Picturing Distributions of Data

Introduction to Probability and Statistics
Ms. Young
Sec. 3.2
Objective

In this section, we will look at the most common
methods for displaying distributions of data

A distribution of data values refers to the way the values
are spread out over the chosen categories

You will be able to create and interpret:

  Basic bar graphs
  Dotplots
  Pie charts
  Histograms
  Stem-and-leaf plots
  Line charts
  Time-series diagrams

Bar Graphs

A bar graph is typically used for qualitative (categorical) data
Each bar represents the frequency (or relative frequency) of
one category
The bars can be either vertical or horizontal

Characteristics of Bar Graphs

When the data is qualitative, the widths of the bars have no special
meaning, so the bars should be drawn with uniform widths and there is
no reason for them to touch
The graph should have a title or caption that explains what is being
shown
The vertical axis should be labeled and scaled appropriately
  The tick marks should be evenly spaced and the range of values between
  each mark should be the same
The horizontal axis should be labeled and each category should be
indicated (there is no need for tick marks if the data is qualitative)
The graph should include a legend if multiple data sets are displayed
on a single graph

Example 1

Create a vertical bar graph from the essay grade data in section 3.1.

Grade   Frequency
A       4
B       7
C       9
D       3
F       2
Total   25
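
As a rough check on Example 1, the sketch below draws the grade table as a text bar graph. It is only an illustration: the bars are horizontal to keep the code short (the slides note that bars can go either way), and the function name bar_rows is made up for this example.

```python
# Essay grade frequencies from the table in Example 1.
grade_counts = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}


def bar_rows(counts):
    """Return one text row per category: label, bar, frequency, relative frequency."""
    total = sum(counts.values())
    width = max(counts.values())  # pad every bar to the longest one
    rows = []
    for label, freq in counts.items():
        bar = "#" * freq  # bar length equals the frequency
        rows.append(f"{label} | {bar:<{width}} {freq:>2} ({freq / total:.0%})")
    return rows


for row in bar_rows(grade_counts):
    print(row)
```

Each row shows one category, a bar whose length is the frequency, and the relative frequency as a percent of the 25 essays.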

Dotplots

A dotplot is a variation of a bar graph in which each
dot represents one data value and the total number
of dots in a category represents its frequency

Dotplots are convenient when making graphs of raw data,
because you can tally the data by making a dot for each
value, and then you can choose to convert the graph to a bar
graph so it looks more formal
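
The tallying idea can be sketched in a few lines. The raw list below is hypothetical: it is 25 grades in a made-up order, chosen only so that the counts match the essay grade table. The function name dotplot_lines is also just for illustration.

```python
from collections import Counter

# Hypothetical raw data: 25 essay grades in an arbitrary order
# (counts chosen to match the essay grade table: A 4, B 7, C 9, D 3, F 2).
raw_grades = list("CBACDBCFACBCDBCABCFBCDABC")
categories = ["A", "B", "C", "D", "F"]


def dotplot_lines(values, categories):
    """Stack one '*' per data value above its category label."""
    counts = Counter(values)
    height = max(counts[c] for c in categories)
    lines = []
    for level in range(height, 0, -1):  # draw from the top row down
        row = " ".join("*" if counts[c] >= level else " " for c in categories)
        lines.append(row.rstrip())
    lines.append(" ".join(categories))  # category labels along the bottom
    return lines


print("\n".join(dotplot_lines(raw_grades, categories)))
```

Counting the dots in a column gives that category's frequency, which is exactly the height the bar would have in a bar graph.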

Pareto Chart

A Pareto chart is a bar graph with the bars arranged in
frequency order (either high to low or low to high)

Pareto charts make sense only for data at the nominal level

Ex. ~ It wouldn’t make sense to create a Pareto chart for the
essay grade data, because the grades have a natural order and
sorting by frequency would put them out of order (C, B, A, D, F)

[Figures: bar graph; Pareto chart]
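
To see where the order C, B, A, D, F comes from, here is a small sketch that sorts the essay grade frequencies from high to low. The helper name pareto_order is hypothetical.

```python
# Essay grade frequencies from section 3.1.
grade_counts = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}


def pareto_order(counts, descending=True):
    """Return (category, frequency) pairs sorted by frequency."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=descending)


print([label for label, _ in pareto_order(grade_counts)])
# ['C', 'B', 'A', 'D', 'F']
```

Sorting by frequency breaks the natural A-to-F order, which is why the grade data is better left as an ordinary bar graph.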

Pie Charts

A pie chart is a circle divided so that each wedge represents
the relative frequency of a particular category
  The wedge size is proportional to the relative frequency
  The entire pie represents the total relative frequency of 100%

When the wedge sizes represent simple fractions, it’s easy to
create a pie chart, but when there are numerous categories, a
pie chart may not be the best way to represent the data

Example 2

Create a pie chart from the essay grade data in section 3.1. Each
sector marked on the blank circle is 10 degrees.

Grade   Frequency
A       4
B       7
C       9
D       3
F       2
Total   25
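
Since wedge size is proportional to relative frequency, each wedge angle is the category's share of 360 degrees. The sketch below works this out for the grade table; the function name wedge_angles is made up for the example.

```python
# Essay grade frequencies from the table in Example 2.
grade_counts = {"A": 4, "B": 7, "C": 9, "D": 3, "F": 2}


def wedge_angles(counts):
    """Return the central angle in degrees for each category's wedge."""
    total = sum(counts.values())
    return {label: 360 * freq / total for label, freq in counts.items()}


for label, angle in wedge_angles(grade_counts).items():
    print(f"{label}: {angle:.1f} degrees")
# A: 57.6 degrees
# B: 100.8 degrees
# C: 129.6 degrees
# D: 43.2 degrees
# F: 28.8 degrees
```

The angles add up to 360 degrees. With 10-degree sectors, the A wedge would cover a little under six sectors, so the wedges have to be estimated rather than drawn exactly on sector lines.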

Histograms

A histogram is a bar graph that shows a distribution of quantitative
(numerical) data
Not only does the y-axis have numerical meaning, but so does the x-axis
(therefore the bar widths have meaning)
  Just like the tick marks on the y-axis, the tick marks on the x-axis must
  be evenly spaced and represent the same range of values between each one
  The bars touch each other because there are no gaps between the
  categories
  Each bar includes its starting value but not its ending value (ex. the
  red bar is from 0 up to 20, but not including 20, the pink bar is from
  20 up to 40, but not including 40, and so on)
Refer to table 3.4 on p.91
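
The "from 0 up to 20, but not including 20" rule can be sketched with integer division. The sample values and the function names below are hypothetical; only the bin width of 20 comes from the slide.

```python
def bin_start(value, width=20):
    """Return the left edge of the bin holding value; each bin is [start, start + width)."""
    return (value // width) * width


def bin_counts(values, width=20):
    """Count how many values fall into each bin, keyed by the bin's left edge."""
    counts = {}
    for v in values:
        start = bin_start(v, width)
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


sample = [0, 5, 19, 20, 39, 40]  # hypothetical values chosen to sit near bin edges
print(bin_counts(sample))
# {0: 3, 20: 2, 40: 1}
```

Notice that 20 lands in the bin that starts at 20, not in the bin from 0 up to 20, so every value belongs to exactly one bar.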

Example 3

Create a histogram from the data below that shows the ages of
Academy Award-winning actresses from 1970 to 2007.

Age     Number of Actresses
20-29   8
30-39   18
40-49   8
50-59   0
60-69   2
70-79   1
80-89   1

Stem-and-leaf Plot for Qualitative Data

A stem-and-leaf plot looks somewhat like a histogram turned
sideways, except in place of bars we see a listing of data

This gives us a more detailed look at the data

Ex. ~ The histogram for the state energy use only shows us how many
states fall into each category, but the stem-and-leaf plot shows both
how many states fall into each category and the names of those states

Stem-and-leaf Plot for Quantitative Data

A stem-and-leaf plot for quantitative data lists the data
values using the first digit as the stem and the remaining
digit(s) as the leaf

You must also include either a decimal point in the stem where
it’s appropriate, or a key stating what a stem and a leaf mean
when put together (ex. 3 | 2 means 32)

Example 4

The following data values represent the ages of recent Academy
Award-winning male actors at the time when they won the award.
Make a stem-and-leaf plot for the data.
32 37 36 32 51 53 33 61 35 45
55 39 76 37 42 40 32 60 38 56
48 48 40 43 62 43 42 44 41 56
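
One way to check a hand-drawn plot for Example 4 is the sketch below. It assumes two-digit whole-number ages, so the tens digit is the stem and the ones digit is the leaf; the function name stem_and_leaf is made up for the example.

```python
# Ages of the 30 actors from Example 4.
ages = [32, 37, 36, 32, 51, 53, 33, 61, 35, 45,
        55, 39, 76, 37, 42, 40, 32, 60, 38, 56,
        48, 48, 40, 43, 62, 43, 42, 44, 41, 56]


def stem_and_leaf(values):
    """Return plot rows; assumes whole numbers, tens digit = stem, ones digit = leaf."""
    stems = {}
    for v in sorted(values):  # sorting first keeps the leaves in order
        stem, leaf = divmod(v, 10)
        stems.setdefault(stem, []).append(leaf)
    lines = []
    for stem in range(min(stems), max(stems) + 1):  # keep stems with no leaves
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


print("\n".join(stem_and_leaf(ages)))
print("Key: 3 | 2 = 32 years")
# 3 | 2 2 2 3 5 6 7 7 8 9
# 4 | 0 0 1 2 2 3 3 4 5 8 8
# 5 | 1 3 5 6 6
# 6 | 0 1 2
# 7 | 6
# Key: 3 | 2 = 32 years
```

The row lengths play the role of histogram bars, but every original age can still be read off the plot.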

Line Charts

A line chart shows a distribution of data using a series of dots
connected by lines
  When the data is qualitative, each dot is positioned horizontally on
  the tick mark of its category and vertically at the value that
  corresponds to the frequency
  When the data is quantitative, each dot is positioned horizontally in
  the middle of its bin and vertically at the value that corresponds
  to the frequency
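
For quantitative data, the dot positions can be worked out from the bins. The sketch below uses the actress age table from Example 3 and assumes ages are whole years, so the bin "20-29" is treated as 20 up to (but not including) 30. The function name line_chart_points is hypothetical.

```python
# Age bins and counts from the actress table in Example 3.
binned = [((20, 29), 8), ((30, 39), 18), ((40, 49), 8), ((50, 59), 0),
          ((60, 69), 2), ((70, 79), 1), ((80, 89), 1)]


def line_chart_points(bins):
    """Return (x, y) dots: x is the middle of the bin, y is the frequency."""
    points = []
    for (low, high), freq in bins:
        # Assumption: whole-year ages, so "20-29" spans 20 up to (not including) 30.
        midpoint = (low + high + 1) / 2
        points.append((midpoint, freq))
    return points


for x, y in line_chart_points(binned):
    print(f"age {x}: {y}")
```

Connecting these dots in order, left to right, gives the line chart. The first dot sits at age 25.0 with a frequency of 8.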

Time-Series Diagrams

A time-series diagram is a histogram or line chart in which the
horizontal axis represents time

Summary

Bar graph: each bar represents the frequency of a category
Dotplot: similar to a bar graph, but there is a dot for each
piece of data that falls into a certain category
  All the dots added up give the frequency for that category
Pareto chart: a bar graph arranged in frequency order
  Remember that this would only make sense for a nominal level of
  measurement
Pie chart: a circle that is divided into wedges that represent
the relative frequency of each category
Histogram: a bar graph in which the data is quantitative
Stem-and-leaf plot: a table that represents either
qualitative data or quantitative data by dividing that data into
two parts (a stem and a leaf)
Line chart: a series of points connected by line segments, in
which each point represents the frequency of a category
Time-series diagram: a histogram or line chart in which the
x-axis represents time