Math 227 Elementary Statistics Bluman 6th edition CHAPTER 1 The Nature of Probability and Statistics.


Math 227 Elementary Statistics
Bluman 6th edition
CHAPTER 1
The Nature of Probability and Statistics
Objectives
• Demonstrate knowledge of statistical
terms.
• Differentiate between the two branches of
statistics.
• Identify types of data.
• Identify the measurement level for each
variable.
Objectives (cont.)
• Identify the five basic sampling techniques.
• Explain the difference between an
observational and an experimental study.
• Explain how statistics can be used and
misused.
• Explain the importance of computers and
calculators in statistics.
Section 1.1 Descriptive and
Inferential Statistics
Statistics is the science of conducting
studies to collect, organize, summarize,
analyze, and draw conclusions from data.
Variables and Types
of Data
• In order to gain knowledge about seemingly random
events, statisticians collect information for variables that
describe the events.
• A variable is a characteristic or attribute that can assume
different values.
• Data are the values that variables can assume.
• A data set is a collection of data values.
• Random variables have values that are determined by
chance.
• Descriptive statistics consists of the collection,
organization, summarization, and presentation
of data.
• Inferential statistics consists of generalizing from
samples to populations, performing estimations
and hypothesis tests, determining relationships
among variables, and making predictions.
Example: Determine whether the results given are
examples of descriptive or inferential statistics.
a)
In the 1996 presidential election, voters in Massachusetts cast
1,571,763 votes for Bill Clinton, 718,107 for Bob Dole, and
227,217 for H. Ross Perot.
Answer: Descriptive Statistics
b)
Allergy therapy makes bees go away.
Answer: Inferential Statistics
Basic Vocabulary
• A population consists of all subjects that are being
studied.
• A sample is a group of subjects selected from a
population.
• A parameter is a characteristic or measure obtained by
using all the data values for a specific population.
• A statistic is a characteristic or measure obtained by
using the data values from a sample.
Example:
Consider the problem of estimating the average grade
point average (GPA) of the 750 seniors at a college.
a) What is the population? How many data
values are in the population?
Answer: Population – seniors at a college
Data values – 750
b) What is the parameter of interest?
Answer: The average (mean) GPA of all 750 seniors
c)
Suppose that a sample of 10 seniors is selected, and their GPAs
are 2.72, 2.81, 2.65, 2.69, 3.17, 2.74, 2.57, 2.17, 3.48, 3.10.
Calculate a statistic that you would use to estimate the parameter.
Answer: The sample mean GPA = 28.10 / 10 = 2.81
d)
Suppose that another sample of 10 seniors was selected. Would it
be likely that the value of the statistic is the same as in part (c)?
Why or why not? Would the value of the parameter remain the
same?
Answer: No, it is unlikely, because another group of 10 seniors would
have different GPAs.
Answer: Yes, the parameter would be the same because
it is still the average GPA of all 750 seniors.
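The statistic in part (c) can be checked with a short calculation. The following is a minimal sketch in Python; the variable names are illustrative and not part of the handout.

```python
from statistics import mean

# GPAs of the 10 sampled seniors, copied from part (c).
sample_gpas = [2.72, 2.81, 2.65, 2.69, 3.17, 2.74, 2.57, 2.17, 3.48, 3.10]

# The sample mean is a statistic: it is computed from the sample only.
sample_mean = mean(sample_gpas)
print(round(sample_mean, 2))  # 2.81
```

A different sample of 10 seniors would generally give a different sample mean, while the mean GPA of all 750 seniors stays fixed.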
1.2 Variables and Types of Data
Variables can be classified as qualitative
(categorical) or quantitative (numerical).
• Qualitative variables can be placed into distinct
categories according to some characteristic or
attribute.
• Quantitative variables are numerical in nature
and can be ordered or ranked.
Classification of Variables
(Cont.)
Quantitative variables can be further classified into
two groups.
• Discrete variables assume values that can be
counted. (e.g. # of books, # of desks)
• Continuous variables can assume all values
between any two specific values.
(e.g. length, time, etc)
Classification of Variables
Data
  Qualitative
  Quantitative
    Discrete
    Continuous
Example:
Classify each variable as qualitative or quantitative. If the variable
is quantitative, further classify it as discrete or continuous.
a) Number of people in a classroom
Answer: Quantitative – Discrete because # of people
can be counted.
b) Weights of new born babies in a hospital
Answer: Quantitative – Continuous because weight can
assume any value within a range.
c) Eye colors of students in Math 227
Answer: Qualitative
Boundaries of Continuous Data
Continuous data must be rounded because of the limits
of the measuring device. Answers are rounded to the
nearest given unit.
Ex) Heights might be rounded to the nearest inch.
*73 inches could mean any measure from 72.5
inches up to but not including 73.5 inches. So
the boundaries of 73 inches are given as
72.5 – 73.5 inches.
(All values up to but not including 73.5 inches)
Recorded Values and Boundaries
Variable      Recorded Value              Boundaries
Length        15 centimeters (cm)         14.5-15.5 cm
Temperature   86 degrees Fahrenheit (F)   85.5-86.5 F
Time          0.43 second (sec)           0.425-0.435 sec
Mass          1.6 grams (g)               1.55-1.65 g
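The boundaries above follow one rule: go half a unit below and half a unit above the last recorded decimal place. The sketch below illustrates that rule in Python. The function name `boundaries` is hypothetical, and passing the recorded value as text is an assumption made here so that the number of recorded decimal places is preserved.

```python
from decimal import Decimal

def boundaries(recorded: str) -> tuple[Decimal, Decimal]:
    """Return (lower, upper) boundaries for a recorded value given as text."""
    value = Decimal(recorded)
    # The exponent tells how many decimal places were recorded (0.43 -> -2).
    exponent = value.as_tuple().exponent
    # Half of one unit in the last recorded place (0.43 -> 0.005).
    half_unit = Decimal(5).scaleb(exponent - 1)
    return value - half_unit, value + half_unit

for recorded in ["15", "86", "0.43", "1.6"]:
    lower, upper = boundaries(recorded)
    print(f"{recorded}: {lower} up to but not including {upper}")
```

Decimal arithmetic is used instead of floating point so that results such as 0.425 are exact.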
Levels of Measurement:
• Variables can also be classified by how they are
categorized, counted, or measured.
• The level of measurement of the data is useful in
deciding what procedure to take to apply statistics to
real problems.
Ex) Can the data be organized into specific categories, such as area
of residence (rural, suburban, or urban)?
Can the data values be ranked, such as first place, second place,
etc.?
Are the values obtained from measurement, such as heights,
IQs, or temperature?
• Four common types of measurement scales are used to
classify variables: nominal, ordinal, interval, and ratio.
Levels of Measurement:
• Nominal—classifies data into mutually exclusive
(nonoverlapping), exhaustive categories in which no
order or ranking can be imposed on the data.
• Ordinal—classifies data into categories that can be
ranked; however, precise differences between the ranks
do not exist.
• Interval—ranks data, and precise differences between
units of measure do exist; however, there is no
meaningful zero.
• Ratio—possesses all the characteristics of interval
measurement, and there exists a true zero.
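The four definitions build on one another, so a variable's level can be found by asking three yes/no questions in order. The sketch below is only an illustration of that reasoning; the function `measurement_level` and its parameter names are hypothetical.

```python
def measurement_level(can_rank, precise_differences, true_zero):
    """Return the level of measurement implied by three yes/no properties."""
    if not can_rank:
        return "nominal"    # categories only, no order
    if not precise_differences:
        return "ordinal"    # ranked, but gaps between ranks are not precise
    if not true_zero:
        return "interval"   # precise differences, no meaningful zero
    return "ratio"          # precise differences and a true zero

# Properties are (can_rank, precise_differences, true_zero).
examples = {
    "eye color": (False, False, False),
    "letter grade": (True, False, False),
    "IQ": (True, True, False),
    "weight": (True, True, True),
}
for variable, properties in examples.items():
    print(variable, "->", measurement_level(*properties))
```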
Levels of Measurement:
Nominal-level data   Ordinal-level data   Interval-level data   Ratio-level data
Zip code             Grade                SAT score             Height
Gender               Rating               IQ                    Weight
Eye color            Ranking              Temperature           Time
**Table is missing from the handout (book page 8)
Example 1: Classify each as nominal-level, ordinal-level, interval-level, or ratio-level data.
a) Sizes of cars
Answer: Categorical – ordinal
b) Nationality of each student
Answer: Categorical – nominal
c) IQ of each student
Answer: Numerical – interval
d) Weights of new born babies
Answer: Numerical – ratio
Section 1.3 Data Collection and
Sampling Techniques
Surveys are the most common method of
collecting data. Three methods of
surveying are:
• Telephone surveys
• Mailed questionnaire surveys
• Personal interviews
Methods to obtain samples:
Random samples are selected using
chance methods or random methods.
e.g. Lottery
Random Sampling - selection so that each member of the
population has an equal chance of being selected
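A random sample can be drawn by computer instead of by lottery. The sketch below uses Python's standard `random` module; the population of 750 numbered subjects and the seed value are assumptions chosen only for illustration.

```python
import random

# Hypothetical population: subjects numbered 1 through 750.
population = list(range(1, 751))

# A fixed seed makes the run repeatable; omit it for a fresh sample each time.
rng = random.Random(227)

# sample() draws without replacement, so no subject is picked twice and
# every subject has the same chance of being selected.
sample = rng.sample(population, k=10)
print(sorted(sample))
```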
Methods to obtain samples (cont.):
• Systematic samples are obtained by
numbering each subject of the population
and then selecting every kth number.
e.g. A quality control engineer selects every
200th TV remote control from an assembly
line and tests its quality.
• Systematic Sampling
– Select some starting point and then select
every kth element in the population
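Systematic sampling can be sketched in a few lines. In the Python example below, the function name `systematic_sample`, the random starting point among the first k subjects, and the run of 1000 numbered remote controls are all assumptions made for illustration.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k subjects, then every kth after it."""
    start = rng.randrange(k)  # index 0 through k - 1
    return population[start::k]

# Hypothetical production run: remote controls numbered 1 through 1000.
remotes = list(range(1, 1001))

rng = random.Random(227)  # fixed seed so the run is repeatable
tested = systematic_sample(remotes, 200, rng)
print(tested)
```

With 1000 items and k = 200, the sample always contains 5 items spaced exactly 200 apart.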
Methods to obtain samples (cont.):
• Stratified samples are obtained by
dividing the population into groups
according to some characteristic that is
important to the study, then sampling from
each group.
– e.g. A General Motors researcher has
partitioned all registered cars into categories
of subcompact, compact, mid-size, and full-size.
He is surveying 200 car owners from
each category.
• Stratified Sampling - subdivide the
population into at least two different
subgroups that share the same
characteristics, then draw a sample from
each subgroup (or stratum)
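The two steps of stratified sampling, grouping and then sampling within each group, can be sketched as follows. The function `stratified_sample`, the 40 made-up car owners, and the sample size of 3 per category are hypothetical and kept small for readability.

```python
import random
from collections import defaultdict

def stratified_sample(subjects, stratum_of, n_per_stratum, rng):
    """Group subjects by stratum, then draw a random sample from each group."""
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical data: 40 car owners, 10 in each size category.
categories = ["subcompact", "compact", "mid-size", "full-size"]
owners = [(f"owner_{i}", categories[i % 4]) for i in range(40)]

rng = random.Random(227)  # fixed seed so the run is repeatable
sample = stratified_sample(owners, lambda owner: owner[1], 3, rng)
for category, chosen in sample.items():
    print(category, [name for name, _ in chosen])
```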
Methods to obtain samples (cont.):
• Cluster samples are obtained by using
intact groups called clusters.
– e.g. Two of the nine colleges in the L.A.
district are randomly selected, then all faculty
from the two selected colleges are interviewed.
• Cluster Sampling - divide the population
into sections (or clusters); randomly
select some of those clusters; choose all
members from selected clusters
• Convenience samples consist of subjects chosen
because they are easy to obtain.
– e.g. An NBC television news reporter gets a
reaction to a breaking story by polling people
as they pass the front of his studio.
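Cluster sampling differs from stratified sampling in that whole groups are chosen at random and every member of a chosen group is included. The sketch below mirrors the college example; the function `cluster_sample` and the nine colleges with four faculty members each are made-up illustrations.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members

# Hypothetical data: nine colleges with four faculty members each.
colleges = {
    f"college_{c}": [f"college_{c}_faculty_{f}" for f in range(1, 5)]
    for c in range(1, 10)
}

rng = random.Random(227)  # fixed seed so the run is repeatable
chosen, interviewed = cluster_sample(colleges, 2, rng)
print(chosen)
print(interviewed)
```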
Example 1: Identify which of these types of sampling is
used: random, systematic, convenience, stratified, or
cluster.
a) A marketing expert for MTV is planning a survey in which 500
people will be randomly selected from each age group of 10-19,
20-29, and so on.
Answer: Stratified
b) A news reporter stands on a street corner and obtains a
sample of city residents by asking five passing adults about
their smoking habits.
Answer: Convenience
c) In a Gallup poll of 1059 adults, the interview subjects were
selected by using a computer to randomly generate
telephone numbers that were then called.
Answer: Random
Example 1: Identify which of these types of sampling is
used: random, systematic, convenience, stratified, or
cluster. (Cont.)
d) At a police sobriety checkpoint at which every 10th driver was
stopped and interviewed.
Answer: Systematic
e) A market researcher randomly selects 10 blocks in the Village
of Newport, then asks all adult residents of the selected
blocks whether they own a DVD player.
Answer: Cluster
f) Foods plans to conduct a marketing survey of 100 men and
100 women in Orange County.
Answer: Stratified
Example 1: Identify which of these types of sampling is
used: random, systematic, convenience, stratified, or
cluster. (Cont.)
g) CNN is planning an exit poll in which 100 polling stations will
be randomly selected and all voters will be interviewed as
they leave the premises.
Answer: Cluster
h) An executive mixes all the returned surveys in a bin, then
obtains a sample group by pulling 50 of those surveys.
Answer: Random
i) The Dutchess County Commissioner of Jurors obtains a list of
42,763 car owners and constructs a pool of jurors by selecting
every 150th name on that list.
Answer: Systematic
Section 1-4 Observational and
Experimental Studies
• Observational Study – The researcher merely observes and
records outcomes, without intervening or controlling the variables.
• Experimental Study – The experimenter intervenes by
administering treatment to the subjects in order to study
its effects on the subject.
– An Independent Variable – the variable that is being manipulated
by the researcher.
– A Dependent Variable – the outcome variable.
– A Treatment Group – the group that is being treated.
– A Control Group – the group that is not being treated.
– Confounding Factors – factors other than the treatment that can
influence a study.
Example 1: Lipitor is a drug that is supposed to lower
the cholesterol level. To test the effectiveness of the
drug, 100 patients were randomly selected and 50
were randomly chosen to use Lipitor. The other 50
were given a placebo that contained no drug at all.
a) What is the treatment?
Answer: Lipitor
b) Identify the treatment group and the control group.
Answer: Treatment group – The group given Lipitor
Control group – The group given a placebo
c) Is this an observational or experimental study?
Answer: Experimental
d) What factor could confound the result?
Answer: Changes in eating habits, diet, exercise, smoking, genes.
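The random split into a treatment group and a control group can be sketched as below. The patient labels and the seed are hypothetical; only the group sizes (100 patients, 50 per group) come from the example.

```python
import random

# Hypothetical labels for the 100 randomly selected patients.
patients = [f"patient_{i}" for i in range(1, 101)]

rng = random.Random(0)  # fixed seed so the run is repeatable
shuffled = patients[:]  # copy, so the original list keeps its order
rng.shuffle(shuffled)

# After shuffling, the first half receives Lipitor and the rest a placebo.
treatment_group = shuffled[:50]
control_group = shuffled[50:]
print(len(treatment_group), len(control_group))  # 50 50
```

Assigning patients by chance tends to balance confounding factors such as diet and exercise across the two groups.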
Section 1.5 Uses and Misuses of
Statistics
Statistics can be misused in ways that are
deceptive:
1) using samples that are not representative of the
population;
2) using a flawed questionnaire or interview
process;
3) basing conclusions on samples that are far
too small;
4) using graphs that produce a misleading
impression; etc.
Section 1.6 Computers and
Calculators
• In the past, statistical calculations were done
with pencil and paper. However, with the advent
of calculators, numerical computations became
easier.
• Excel, MINITAB, and the TI-83 graphing
calculator can be used to perform statistical
computations.
• Students should realize that the computer and
calculator merely give numerical answers and
save time and effort of doing calculations by
hand.
SUMMARY
• The two major areas of statistics are descriptive and
inferential.
• When the populations to be studied are large,
statisticians use subgroups called samples.
• The five basic methods for obtaining samples are:
random, systematic, stratified, cluster, convenience.
• Data can be classified as qualitative or quantitative.
• The four basic types of measurement are nominal,
ordinal, interval, and ratio.
• The two basic types of statistical studies are
observational and experimental.