Unit 7
Statistical Inference - 1
Estimation
FPP Chapters 21, 23, 26-29
Point Estimation
Margin of Error
Interval Estimation
- Confidence Intervals
Sample Size Computations
Next:
Statistical Tests of Hypotheses
- Hypothesis Testing
7-1
Estim
A.05
Estimation
Box models:
If we know what goes in the box, then we
can say how likely various outcomes are.
In practice,
we do not know what is in the box.
We do not know the population
parameters.
Instead
we use data to estimate the population
parameters, such as average, %, sd, …
That is, we infer the population
parameters, based on the sample of data.
We make INFERENCE from the SAMPLE
(data) to the POPULATION.
A Model for Estimation
Sample Value =
Parameter Value + (Bias) + Chance Error
Thus,
Estimate =
Parameter Value + (Bias) + Chance Error
Size of Chance Errors depends partly upon the
sampling procedure
-----------------------
Recall:
Population
Parameter
Sample
Statistic
For now, assume random sampling,
100% response rate, and correct responses.
Margin of Error
Point estimate:
To estimate the population average (mean) with a
single value, use the sample average.
The likely size of your estimation error is
Margin of Error = some multiple of SE
Ex 1: Margin of error for estimating the
population average by the sample average is
proportional to SE(avg).
Ex 2: Margin of error for estimating the
population percent by the sample percent is
proportional to SE(percent).
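The two examples above can be sketched numerically. This is only an illustration: the helper names, the multiplier of 2, and the input numbers (SD 10, 400 draws, fraction 0.5) are assumptions chosen for the sketch, not values from the slides.

```python
import math

def se_average(sd, n):
    """SE of the sample average for n draws from a box with the given SD."""
    return sd / math.sqrt(n)

def se_percent(fraction_of_ones, n):
    """SE of the sample percent (in percentage points) for n draws from a 0-1 box."""
    sd_box = math.sqrt(fraction_of_ones * (1 - fraction_of_ones))
    return sd_box / math.sqrt(n) * 100

def margin_of_error(se, multiplier=2):
    """Margin of error = some multiple of the SE (multiplier 2 is an assumption)."""
    return multiplier * se

# Hypothetical inputs, for illustration only.
moe_avg = margin_of_error(se_average(sd=10, n=400))                  # about 1.0
moe_pct = margin_of_error(se_percent(fraction_of_ones=0.5, n=400))   # about 5 points
print(moe_avg, moe_pct)
```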
Newspaper Survey
Example
About the poll
This poll was conducted for
The Seattle Times by Elway
Research of Seattle. Pollsters
contacted 403 randomly
selected adults across the
state by telephone April 6-11.
The geographic distribution of
the respondents was reflective
of the population statewide.
The poll has a margin of
error of 5 percent, meaning
that, in theory, results have a
95 percent chance of coming
within 5 percentage points of
results that would have been
obtained had all adults in the
state been interviewed.
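The quoted margin of error can be checked roughly. The sketch below assumes a simple random sample, a multiplier of 2 for 95%, and the worst-case fraction of 0.5 in the 0-1 box; the newspaper does not say how its figure was computed, so this is only one plausible reconstruction.

```python
import math

n = 403                  # adults contacted, from the poll description
worst_case = 0.5         # assumption: the fraction that gives the largest SD

sd_box = math.sqrt(worst_case * (1 - worst_case))
se_percent = sd_box / math.sqrt(n) * 100      # in percentage points
margin = 2 * se_percent                       # multiplier 2 for about 95%

print(round(margin, 2))  # close to the 5 points quoted in the article
```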
Interval Estimation
Combining Point Estimation & Margin of Error
Interval estimate:
Rather than give a single estimated value for
the parameter, give instead an estimated
interval of values.
This combines
point estimation
margin of error
Approximate level 68% confidence interval:
sample estimate +/- 1 SE(estimate)
Approximate level 95% confidence interval:
sample estimate +/- 2 SE(estimate)
Approximate level ____% confidence interval:
sample estimate +/- ___ SE(estimate)
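The link between the multiplier and the confidence level comes from the normal curve. A minimal sketch, assuming the normal approximation holds; the function names are hypothetical, and `statistics.NormalDist` is from the Python standard library.

```python
from statistics import NormalDist

def confidence_level(multiplier):
    """Area under the standard normal curve within +/- multiplier SEs."""
    z = NormalDist()
    return z.cdf(multiplier) - z.cdf(-multiplier)

def multiplier_for(level):
    """Multiplier of the SE that gives the requested confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def interval(estimate, se, multiplier):
    """sample estimate +/- multiplier * SE(estimate)"""
    return (estimate - multiplier * se, estimate + multiplier * se)

for k in (1, 2, 3):
    print(k, round(confidence_level(k) * 100, 1))
print(round(multiplier_for(0.95), 2))
```

The multipliers 1 and 2 give roughly 68% and 95%, which is why the slides call those levels approximate.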
Confidence Intervals
A confidence interval is used when
estimating an unknown parameter from
sample data.
The interval gives a range for the
parameter - and a confidence level that
the range covers the true value.
The width of the interval depends upon
how confident you want to be that your
interval includes the population parameter
value.
Chances are in the sampling procedure,
not in the parameter.
Confidence Interval
Example-1
Course credits
Confidence Intervals- Ex 1
Point estimate:
Our group’s point estimate of the
population average is:
The likely size of our estimation error is:
Interval estimate:
Our approximate level 68% confidence
interval for the population average is:
Confidence Interval
Example- 2
Upper division
Confidence Intervals- Ex 2
Point estimate:
Our group’s point estimate of the
population proportion is:
The likely size of our estimation error is:
Interval estimate:
Our approximate level 68% confidence
interval for the population proportion is:
The Bootstrap
When estimating a population
percentage (i.e. when sampling from
a 0-1 box), the fraction of 0’s and 1’s
in the box is unknown.
The SD of the box can be estimated
by substituting the fraction of 0’s and
1’s in the sample for the unknown
fractions in the box.
The estimate is good when the
sample is reasonably *large*.
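The substitution described above might look like this in code. The sample counts (100 ones in 400 draws) are hypothetical, and the function name is an assumption made for this sketch.

```python
import math

def bootstrap_sd(ones, n):
    """Estimate the SD of a 0-1 box from the sample fractions of 1's and 0's."""
    frac_ones = ones / n
    frac_zeros = 1 - frac_ones
    return math.sqrt(frac_ones * frac_zeros)

# Hypothetical sample: 400 draws, of which 100 are 1's.
sd_est = bootstrap_sd(ones=100, n=400)
se_percent = sd_est / math.sqrt(400) * 100   # SE of the sample percent, in points
print(round(sd_est, 3), round(se_percent, 2))
```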
Confidence Interval
Example- 3
40,000 students enrolled
Simple random sample of 900,
of whom 630 “grew up in Washington”
Want to estimate
% UW students who “grew up in WA”
Use as estimate:
Confidence Intervals- Ex 3
The Box Model
Confidence Intervals- Ex 3
If we draw another sample of 900
students, what is the chance that the
observed sample % is between 65% and
75%?
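One way to approximate this chance is with the normal curve. The sketch assumes the box fraction is close to the observed 630 out of 900, and it ignores the correction factor for drawing without replacement, which is small here because 900 is a small part of 40,000.

```python
import math
from statistics import NormalDist

n = 900
frac = 630 / n                         # observed fraction, used in place of the unknown one

sd_box = math.sqrt(frac * (1 - frac))
se_percent = sd_box / math.sqrt(n) * 100    # about 1.5 percentage points

# Convert 65% and 75% to standard units around the expected 70%.
z_low = (65 - frac * 100) / se_percent
z_high = (75 - frac * 100) / se_percent
chance = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)
print(round(se_percent, 2), round(chance, 4))
```

Since 65% and 75% are each a little more than 3 SEs from 70%, the chance comes out very close to 100%.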
The Bootstrap again
We don’t know what fraction of the
population “grew up in WA”.
So we estimate it
& substitute our estimate into the
formulae in place of the actual “true”
fraction.
Bootstrapping
Confidence Intervals- Ex 3
An approximate level ____ % confidence
interval for the percent of UW students who
“grew up in Washington” is …
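As a sketch of the arithmetic for a level of about 95%, assuming a multiplier of 2 and the bootstrap estimate of the SD (both choices are assumptions for this illustration, since the slide leaves the level blank):

```python
import math

n = 900
grew_up_in_wa = 630

frac = grew_up_in_wa / n                     # sample fraction, 0.70
sd_box = math.sqrt(frac * (1 - frac))        # bootstrap estimate of the box SD
se_percent = sd_box / math.sqrt(n) * 100     # SE of the sample percent

estimate = frac * 100
low = estimate - 2 * se_percent
high = estimate + 2 * se_percent
print(round(low), round(high))               # endpoints rounded to whole percents
```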
Assumptions for a
Confidence Interval
Before using this procedure for
constructing an approx level ____ % CI,
check that the following conditions are
met.
• simple random sample,
• either the population histogram is
approximately normal
or
the sample size is sufficiently large for the
Central Limit Theorem to give us
approximate normality
Interpreting a
Confidence Interval
IF
the appropriate conditions are met, & we
construct an approximate ____%
confidence interval,
THEN
we can be about ___% confident that this
interval contains the true population
parameter value.
Interpreting a
Confidence Interval
The CI depends on the sample.
The confidence level depends upon the
procedure used (multiplier for the SE).
For about 95% of all samples, the interval
sample ____ +/- 2 SE(____)
covers the population _____, and for the
other 5% it fails.
The chances are in the sampling
procedure, not in the parameter.
The parameter is a fixed number.
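A small simulation can make the "95% of all samples" statement concrete. This is a sketch under stated assumptions: the true fraction (0.46), the sample size (1000), the number of repetitions, and the seed are all hypothetical, and draws are made with replacement for simplicity.

```python
import math
import random

def coverage(true_fraction, n, trials, multiplier=2, seed=0):
    """Fraction of simulated samples whose interval covers the fixed parameter."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw n tickets from a 0-1 box with the given fraction of 1's.
        ones = sum(rng.random() < true_fraction for _ in range(n))
        p_hat = ones / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)   # bootstrap SE for this sample
        if p_hat - multiplier * se <= true_fraction <= p_hat + multiplier * se:
            hits += 1
    return hits / trials

rate = coverage(true_fraction=0.46, n=1000, trials=2000)
print(rate)
```

The parameter never changes inside the loop; only the sample, and therefore the interval, varies from one repetition to the next.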
No !! NO !! NO !!
NO !! NO !! NO !! NO !! NO !! NO !!
“There is a 95% chance that the
parameter (population %) falls inside
my interval, 46% +/- 3%.”
NO !! NO !! NO !! NO !! NO !! NO !!
Warning !
Check carefully that the appropriate
conditions are met BEFORE applying any
statistical procedure - including the
construction of Confidence Intervals.
These confidence interval methods are for
simple random samples, and should NOT
be used for other kinds of samples !
Interpreting a
Confidence Interval
One of these is WRONG and one of these
is CORRECT.
UW enrollment - % “Washingtonians” ex.
Approx level 95% Confidence interval
[67%, 73%]
“There is a 95% chance that the
population % is between 67% and 73%.”
“For about 95% of all samples, the interval
sample % +/- 2 SE(%)
covers the population percentage.
For the other 5% it fails.”
Confidence Levels
are Approximate
Because
(1) Using the Normal Approximation
(2) SE is estimated (the true SE is unknown)
To use these procedures, check
- good simple random sample
- If the percentage is near 0% or 100%,
then a much larger sample is needed than if
the percentage is near 50%.
CHECK THESE CONDITIONS !!!
Sample Size
Computations
Another Confidence
Interval Example
A manufacturing process for bricks is
known to give an output whose weights
have sd 0.12 pounds, regardless of the
mean weight. A random sample of 100
bricks is selected from today’s output.
The sample mean is 4.07 pounds.
Construct an approximate level 95%
confidence interval for the mean weight
of today’s brick production.
Newb.88.295
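A sketch of the brick computation, assuming a multiplier of 2 for the approximate 95% level. The SD of 0.12 pounds is given as known, so no bootstrap estimate is needed here.

```python
import math

sd = 0.12            # known SD of brick weights, in pounds
n = 100              # bricks sampled
sample_mean = 4.07   # pounds

se_mean = sd / math.sqrt(n)          # SE of the sample average
low = sample_mean - 2 * se_mean
high = sample_mean + 2 * se_mean
print(round(low, 3), round(high, 3))  # roughly 4.046 to 4.094 pounds
```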
And Another Confidence
Interval Example
A personnel manager knows that historically, the
scores on aptitude tests given to candidates
for trainee positions have followed a normal
distribution with SD 28.2.
A simple random sample of 30 test scores for the
current year’s applicants was taken and found
to have a sample mean of 122.0.
Construct an approximate level 95% confidence
interval for the mean score for all of this year’s
applicants.
Newb.8 295
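A sketch of the aptitude-score computation. It assumes the historical SD of 28.2 still applies to this year's applicants and uses a multiplier of 2 for the approximate 95% level; the function name is hypothetical.

```python
import math

def interval_for_mean(sample_mean, sd, n, multiplier=2):
    """sample mean +/- multiplier * SD / sqrt(n), with the SD taken as known."""
    se = sd / math.sqrt(n)
    return sample_mean - multiplier * se, sample_mean + multiplier * se

low, high = interval_for_mean(sample_mean=122.0, sd=28.2, n=30)
print(round(low, 1), round(high, 1))
```

With only 30 scores, the normal curve is reasonable here because the scores themselves are stated to follow a normal distribution.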
Bootstrap Example
The cloze readability procedure is
designed to measure the effectiveness
of a written communication. Research
has indicated that a score of 0.57 or
more on the cloze test demonstrates
adequate understandability of the
written material. A random sample of
352 certified public accountants was
asked to read financial report
messages. The sample mean score
was 0.6041 and the sample SD was
0.1128.
Construct an approximate level 95%
confidence interval for the population
mean score.
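A sketch of the cloze computation. Here the population SD is unknown, so the sample SD stands in for it (the bootstrap idea), and a multiplier of 2 is assumed for the approximate 95% level.

```python
import math

n = 352
sample_mean = 0.6041
sample_sd = 0.1128       # used in place of the unknown population SD
threshold = 0.57         # score that indicates adequate understandability

se_mean = sample_sd / math.sqrt(n)
low = sample_mean - 2 * se_mean
high = sample_mean + 2 * se_mean

print(round(low, 4), round(high, 4))
print(low > threshold)   # is the whole interval above the threshold?
```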