Transcript Chapter 1

What is Statistics
Section 1.1, Page 4
1
Definition: Statistics
Statistics: The science of collecting, describing and interpreting
data.
Why Study Statistics?
Statistics helps us make better decisions as businesses,
governments and individuals.
Section 1.1, Page 4
2
Definitions
Population: A collection, or set, of individuals, objects, or
events whose properties are to be analyzed.
Sample: A subset of the population.
We want knowledge about an entire population, but studying
every member is usually prohibitively expensive, so we
select a representative sample from the population and
study the individual items in the sample.
Descriptive Statistics: The collection, presentation, and
description of the sample data.
Inferential Statistics: The technique of interpreting the
values resulting from the descriptive techniques and
making decisions and drawing conclusions about the
population.
Section 1.1, Page 4
3
Definitions
Parameter: A numerical value summarizing all the data of a
population. For example, the average high school grade
point of all Shoreline Students is 3.20. We often use Greek
letters to identify parameters, μ = 3.20.
Statistic: A numerical value summarizing the sample data.
For example, the average grade point of a sample of
Shoreline Students is 3.18. We would use the symbol,
x̄ = 3.18.
The statistic corresponds to the parameter. We usually don’t
know the value of the parameter, so we take a sample and
estimate it with the corresponding statistic.
Sampling Variation: While the parameter of a population is
considered a fixed number, the corresponding statistic will
vary from sample to sample. Also, different populations give
rise to more or less sampling variability. Considering the
variable age, samples of 60 students from a community
college would vary less than samples of 60 residents of a
Seattle neighborhood, because student ages are more alike.
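The difference between a fixed parameter and a varying statistic can be sketched in a few lines of Python. This is only an illustration: the population of grade points below is made-up data, and the seed, population size, and number of samples are assumptions chosen for the example.

```python
import random
import statistics

rng = random.Random(1)  # assumed seed, used only so the run is repeatable

# Hypothetical population: 5,000 made-up grade points between 2.0 and 4.0.
population = [round(rng.uniform(2.0, 4.0), 2) for _ in range(5000)]

# Parameter: computed from the whole population, so it is one fixed number.
mu = statistics.mean(population)

# Statistic: one sample mean per sample of 60; the value changes with the sample.
sample_means = [statistics.mean(rng.sample(population, 60)) for _ in range(5)]

print(f"parameter mu = {mu:.3f}")
for i, x_bar in enumerate(sample_means, start=1):
    print(f"sample {i}: x-bar = {x_bar:.3f}")
```

Each sample mean lands near mu but not exactly on it, which is the sampling variation described above.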
Section 1.1, Page 4
4
Problems
Objective 1.1, Page 18
5
Problems
Objective 1.1, Page 18
6
Variables
Variable: A characteristic of interest about each
element of a population.
Data: The set of values collected for the variable from
each of the elements that belong to the sample.
Numerical or Quantitative Variable: A variable that
quantifies an element of the population. The HS
grade point of a student is a numerical variable.
Numerical variables are numbers for which math
operations make sense. The average grade point of a
sample makes sense.
Continuous Numerical Variable: The variable can take
on an uncountable number of values between
two points on the number line. An example is the
weight of people.
Discrete Numerical Variable: The variable can take on
a countable number of values between two points on
a number line. An example is the price of statistics
textbooks.
Section 1.1, Page 8
7
Variables (2)
Categorical or Qualitative Variable: A variable that describes or
categorizes an element of a population. The gender of a person
would be a categorical variable. The categories are male and
female.
Nominal Categorical Variable: A categorical variable that uses a
number to describe or name an element of a population. An
example is a telephone area code. It is a number, but not a
numerical variable used in math operations. The average area
code does not make sense.
Ordinal Categorical Variable: A categorical variable that
incorporates an ordered position or ranking. An example would
be a survey response that ranks “very satisfied” ahead of
“satisfied” ahead of “somewhat satisfied.” Limited math
operations may be done with ordinal variables.
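A short Python sketch of which summaries make sense for each kind of variable. All values are made-up data for illustration, and the choice of summary (mean, mode, middle category) is an assumption about what "limited math operations" might look like in practice.

```python
import statistics

# Numerical: averaging grade points is meaningful.
grade_points = [3.2, 2.8, 3.6, 3.0]
print(statistics.mean(grade_points))

# Nominal: area codes are labels, so count them instead of averaging them.
area_codes = [206, 425, 206, 360]
print(statistics.mode(area_codes))  # most common label: 206

# Ordinal: categories have an order, so a middle (median) category is meaningful.
order = ["somewhat satisfied", "satisfied", "very satisfied"]
ratings = ["satisfied", "very satisfied", "somewhat satisfied",
           "satisfied", "very satisfied"]
ranked = sorted(ratings, key=order.index)
median_rating = ranked[len(ranked) // 2]
print(median_rating)  # satisfied
```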
Section 1.1, Page 8
8
Problems
Identify each of the following examples of variables as
categorical or numerical. If categorical, indicate the
categories. If numerical, indicate discrete or continuous.
Objective 1.1, Page 18
9
Problems
Objective 1.1, Page 19
10
Data Collection
Section 1.2, Page 11
11
Data Collection Process
Section 1.2, Page 12
12
Data Collection Process
Section 1.2, Page 12
13
Observational Studies and
Experiments
Observational Study: Researchers collect data without
modifying the environment or controlling the process being
observed. Surveys and polls are observational studies.
Observational studies cannot establish causality.
Example: For a randomly selected high school,
researchers collect data on each student's grade point and
whether the student has music training, to see if there is a
relationship between the two variables.
Experiments: Researchers collect data in a controlled
environment. The investigator controls or modifies the
environment and observes the effect of a variable under
study. Experiments can establish causality.
Example: Randomly divide a sample of people with
migraine headaches into a control group and a treatment group.
Give the treatment group an experimental medication and
the control group a placebo, and then measure and
compare the reduction in frequency and severity of
headaches for both groups.
Section 1.3, Page 12
14
Sampling Frame
Sampling Frame: A list, or set, of the elements
belonging to the population from which the sample will be
drawn. Ideally, the sampling frame is equal to the
population.
Example: For a 1936 Presidential Election Poll,
Literary Digest sent out 10 million “straw ballots” prior to
the election and got back 2.4 million.
                      Straw Ballots    Actual Results
Franklin Roosevelt    43%              62%
Alf Landon            57%              37%
The sampling frame used was telephone records. What
could have gone so wrong to misjudge the final result?
Section 1.3, Page 13
15
Sample Designs
Convenience Samples
Volunteer Samples
Judgmental samples (chosen for some specific
reason), volunteer samples (respondents select
themselves), and convenience samples (chosen
because convenient) are usually not acceptable methods
for formal statistical procedures!
Probability Samples: The elements are drawn on the
basis of probability – randomly. Each element of the
population has a certain probability of being selected.
Section 1.3, Page 13
16
Single-Stage Sampling Methods
Single-stage sampling: A sample design in which the
elements of the sampling frame are treated equally and there
is no subdividing or partitioning of the frame.
Simple Random Sample: Sample selected in such a way
that every element of the population has an equal
probability of being selected and all samples of size n have
an equal probability of being selected.
Example: Select a simple random sample of 6
students from a class of 30.
1. Number the students from 1 to 30 on the roster.
2. Get 6 non-recurring random numbers between 1
and 30.
3. The six students who match the six random
numbers are the sample.
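The three steps above can be sketched in Python. The seed is an assumption added only so the example is repeatable; `random.sample` draws without replacement, which gives the non-recurring numbers the example calls for.

```python
import random

rng = random.Random(2024)  # assumed seed, only for repeatability

roster = list(range(1, 31))      # step 1: students numbered 1 to 30
sample = rng.sample(roster, 6)   # step 2: six random numbers, no repeats
print(sorted(sample))            # step 3: these six students are the sample
```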
Section 1.3, Page 13
17
Single-Stage Sampling Methods
Systematic Sample: A sample in which every k-th
item of the sampling frame is selected, starting from an
item chosen at random from the first k elements.
Example: Select a systematic sample of six
students from a class of 30.
1. k = 30/6 = 5
2. Select a random number between 1 and
5. Say 3 is selected.
3. The sample will include the 3rd, 8th, 13th,
18th, 23rd, and 28th students on the roster.
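A minimal Python sketch of the systematic sample above. The function name `systematic_sample` and its parameters are hypothetical, introduced only for this illustration. It assumes the population size is a multiple of the sample size, as in the 30/6 example.

```python
import random

def systematic_sample(n_population, n_sample, start=None, rng=random):
    """Return the roster positions chosen by a systematic sample."""
    k = n_population // n_sample          # step 1: the sampling interval
    if start is None:
        start = rng.randint(1, k)         # step 2: random start from 1 to k
    # step 3: take every k-th position, beginning at the random start
    return list(range(start, n_population + 1, k))[:n_sample]

print(systematic_sample(30, 6, start=3))  # [3, 8, 13, 18, 23, 28]
```

Calling `systematic_sample(30, 6)` without `start` picks the starting position at random, as in step 2.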
Section 1.1, Page 8
18
Multistage Sampling Designs
Multistage Sampling: A sample design in which the
elements of the sampling frame are subdivided and the
sample is chosen in more than one stage.
Stratified Random Sampling: A sample is selected by
stratifying the population, or sampling frame, and then
selecting a number of items from each of the strata by
means of a simple random sampling technique.
The strata are usually subgroups of the sampling frame
that are homogeneous but different from each other.
Example: Select a sample of six students from a
class of 30 so that the sample contains an equal
number of males and females.
1. List the males and females separately.
2. Take a simple random sample of 3 students
from each group.
3. The six students selected are the sample.
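The stratified example can be sketched in Python as follows. The split of the roster into 14 males and 16 females is a made-up assumption for illustration, as is the seed.

```python
import random

rng = random.Random(7)  # assumed seed, only for repeatability

# Step 1: list the two strata separately (hypothetical roster numbers).
males = list(range(1, 15))     # students 1-14
females = list(range(15, 31))  # students 15-30

# Step 2: take a simple random sample of 3 from each stratum.
male_sample = rng.sample(males, 3)
female_sample = rng.sample(females, 3)

# Step 3: the six students selected are the sample.
sample = male_sample + female_sample
print(sample)
```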
Section 1.3, Page 15
19
Multi-Stage Sampling Designs
Cluster Sample: A sample obtained by stratifying the
population, or sampling frame, and then selecting
some or all of the items from some, but not all, of the
strata.
The strata are usually easily identified subgroups of
the sampling frame that are similar to each other.
This is often the most economical way to sample a
large population.
Example: Take a sample of 300 Catholics in
the Seattle Area.
1. Get a list of the Catholic Parishes in the
Seattle area.
2. Take a random sample of 3 parishes.
3. In each parish, select a simple random
sample of 100 parishioners.
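A rough Python sketch of the cluster example. The sampling frame here is hypothetical: 12 invented parishes with 400 invented parishioner IDs each, plus an assumed seed. Only the two-stage structure (pick clusters, then sample within them) comes from the example.

```python
import random

rng = random.Random(11)  # assumed seed, only for repeatability

# Step 1: hypothetical list of parishes, each with 400 parishioner IDs.
parishes = {
    f"parish_{p}": [f"p{p}-{i}" for i in range(1, 401)]
    for p in range(1, 13)
}

# Step 2: take a random sample of 3 parishes (the clusters).
chosen = rng.sample(sorted(parishes), 3)

# Step 3: in each chosen parish, take a simple random sample of 100 people.
sample = [person for name in chosen
          for person in rng.sample(parishes[name], 100)]

print(chosen, len(sample))
```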
Section 1.3, Page 16
20
Problems
Section 1.3, Page 20
21
Problems
Section 1.3, Page 20
22
Problems
a. What kind of study was this – experiment or
observational study?
b. What sampling method was used?
c. Can these results be used for statistical inference?
Why or why not?
Problems, Page 20
23
Probability vs. Statistics
If a chip is drawn at random from a bag
containing these chips, the probability that
it will be green is 20/60 =1/3.
A sample of 10 chips is drawn from the bag.
There were 3 green chips. We are 95%
sure that the true proportion of green chips
is between .25 and .35.
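The two directions of reasoning can be put side by side in a small Python sketch. It only reproduces the arithmetic of the two statements above and makes no attempt to derive the 95% interval. The variable names are chosen for illustration.

```python
from fractions import Fraction

# Probability: the bag's contents are known (20 green chips out of 60),
# so we reason from the population to the chance of an outcome.
p_green = Fraction(20, 60)
print(p_green)          # 1/3

# Statistics: the bag's contents are unknown; we reason from a sample
# (3 green chips out of 10 drawn) back to the population proportion.
p_hat = Fraction(3, 10)
print(float(p_hat))     # 0.3
```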
Section 1.4, Page 16
24
Problems
Problems, Page 21
25
Karl Pearson
Father of Modern Statistics
“In the 20th century, the role of mathematics has become
increasingly decisive, and studies of these new statistical tools
and practices are gradually being written, episode by episode
discipline by discipline. In the end, a picture will emerge of a
powerful body of mathematics, allied to schemes for data gathering
and designing experiments, that has become one of the most
important sources of scientific expertise and guarantors of
objectivity in the modern world. It is the narrow gate through
which must pass new pharmaceuticals, manufacturing processes,
official measures of all descriptions, and empirical findings of
psychologists, economists, biologists and many others. In that
sense, its import goes far beyond the history of a mathematical
discipline. Statistics has functioned as no narrow specialty, but
as a vital if often invisible element of the cultural history of
government, business, and the professions, as well as science.”
“Karl Pearson: The Scientific Life in a Statistical Age” by Theodore Porter, 2004, page 4.
26