Transcript Chapter 8

Statistical Inference:
Estimation for Single
Populations
Chapter 8
MSIS 111
Prof. Nick Dedeke
PowerPoint presentations prepared by Lloyd Jaisingh,
Morehead State University
Learning Objectives
• Know the difference between point
and interval estimation.
• Estimate a population mean from a
sample mean when σ is known.
• Estimate a population mean from a
sample mean when σ is unknown.
• Estimate the population variance from
a sample variance.
• Estimate the minimum sample size
necessary to achieve given statistical
goals.
Concept of Inferential
Statistics
• In inferential statistics, the
objective is to estimate
parameters of a population
using the statistics of a smaller
sample drawn from it.
Known
Statistics:
sample mean
sample variance
z-value
Unknown
Parameters:
Population mean
population variance
Example: Concept of
Estimation
• Three managers wanted to
investigate absenteeism in their
organization. Each of them took a
random sample of 2,000 employees.
Here are the results:
– Bill’s sample yielded an average of 4 days per
year.
– Chen’s sample yielded an average of 3.2
days per year.
– Ayo’s sample yielded an average of 3.7
days per year.
What should we accept as the average
absenteeism for all 10,000
employees of the firm?
Concept of Confidence Level
• After one has specified an interval,
the question becomes the following:
How confident is one that the
population parameter truly lies
in the range we define?
• This is an area where the central limit
theorem may help us.
• The central limit theorem states that, given a
sufficiently large sample size, the
distribution of the sample means is
approximately normal, whatever the shape
of the population distribution.
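The central limit theorem can be checked with a small simulation. The sketch below is only an illustration: the skewed population (exponential with mean 4), the sample size of 50, and the number of samples are all hypothetical choices, not values from the course.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples (with replacement) and return each mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]

# Hypothetical, strongly skewed population (exponential with mean 4).
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1 / 4) for _ in range(10_000)]

means = sample_means(population, sample_size=50, num_samples=2_000)

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)
std_error = pop_sd / 50 ** 0.5

# Share of sample means within 1.96 standard errors of the population mean.
# For a normal distribution this share would be about 0.95.
share = sum(abs(m - pop_mean) <= 1.96 * std_error for m in means) / len(means)

print("population mean:", round(pop_mean, 2))
print("mean of sample means:", round(statistics.fmean(means), 2))
print("share within 1.96 standard errors:", round(share, 3))
```

Even though the population is far from normal, the sample means cluster symmetrically around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.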
Confidence Level
[Figure: distribution of the means of all samples drawn from the population, centered on μ, with limits at -z_{α/2} and +z_{α/2} and the intervals around three different sample means X̄.]
If we picked three different samples and calculated the sample means
and intervals, we could have the intervals shown above.
We see that the three different intervals, all of the same width,
include the population mean.
Confidence Level
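The 95% confidence limits are defined so that the
area between the mean and each z limit is
0.95/2 = 0.475. The total grey area is 0.95.
[Figure: normal curve of sample means centered on μ, with the grey area between -z_{α/2} and +z_{α/2} and three sample means X̄ marked.]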
A 95% confidence level means that if one took many different samples
from the population and calculated the sample mean of each, about 95 out of 100
sample means would fall within the grey area between -z_{α/2} and +z_{α/2}.
Confidence Level and Interval
Estimates
[Figure: sampling distribution centered on μ, with 40%, 60%, and 95% confidence interval lines drawn around X̄ between -z_{α/2} and +z_{α/2}.]
We see that the three intervals presented have different
widths. Specifically, to have greater confidence, the interval estimate
must be wider. Narrower interval estimates reduce our confidence that
the population mean lies in the interval.
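The trade-off between confidence and width can be made concrete. The sketch below assumes the half-width z_{α/2}·σ/√n that the later slides use for a known population standard deviation; the values σ = 10 and n = 25 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level in (0, 1)."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

# sigma and n are hypothetical values, not taken from the slides.
sigma, n = 10.0, 25

half_widths = {}
for confidence in (0.40, 0.60, 0.95):
    z = z_critical(confidence)
    half_widths[confidence] = z * sigma / n ** 0.5
    print(f"{confidence:.0%}: z = {z:.3f}, half-width = {half_widths[confidence]:.2f}")
```

The half-width grows with the confidence level, which is the pattern the three interval lines illustrate.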
Known Population Standard
Deviation
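• The following presents two samples that were taken
from the same population. In the first case the sample mean
is higher than the population mean; in the other case
it is lower.
[Figure: two sample distributions from the same population, each showing its sample mean X̄, standard deviation s, and x_min and x_max, one above and one below the population mean μ.]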
Confidence Interval Estimates
• The interval estimate approach defines upper and
lower limits around the sample mean using a
confidence level. If the acceptable mean of the
population falls within the limits, it is
accepted; if not, it is rejected.
[Figure: the same two samples, with confidence interval #1 and confidence interval #2 drawn around their sample means X̄ relative to the population mean μ.]
Inferential Statistics
Assumptions
• For inferential statistics to be accurate, some
assumptions must be fulfilled:
– The process that the objects or entities passed
through is stable, i.e. the variations in attribute
observations are not due to special causes.
– The sample is statistically drawn from the
population.
– The sample is large enough to represent the
population.
– The distribution of values for the attribute of the
sample and population can be assumed to be
normal.
– Statistical estimates about a population
can reasonably be used as a basis for decision-making.
Statistical Estimation
• Point estimate -- the single value of a
statistic calculated from a sample which is
used to estimate a population parameter
• Interval Estimate -- a range of
values calculated from a sample
statistic(s) and standardized
statistics, such as the z
– Selection of the standardized statistic
is determined by the sampling
distribution.
– Selection of critical values of the
standardized statistic is determined
by the desired level of confidence.
Concept of Inferential
Statistics
The z statistic can be used if the
sample mean, the sample
standard deviation, and the
population standard deviation are
known.
Known
Statistics:
sample mean
sample variance
z-value
population standard deviation
Unknown
Parameters:
Population mean
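Confidence Interval
Estimate for μ when σ is
Known
• Point estimate
$\bar{x} = \frac{\sum x}{n}$
• Interval Estimate
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
or
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$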
Distribution of Sample
Means
for (1-α)% Confidence
[Figure: normal curve centered on μ (z = 0), with area α/2 in each tail beyond -z_{α/2} and +z_{α/2}.]
Areas Under Curve: (1-α)%
Confidence
[Figure: the same curve, with area α/2 in each tail and area .5 - α/2 between the mean (z = 0) and each of -z_{α/2} and +z_{α/2}.]
Distribution of Sample
Means
for (1-α)% Confidence
[Figure: the same curve, with area α/2 in each tail and area (1-α)/2 on each side of the mean, between -z_{α/2} and +z_{α/2}.]
Distribution of Sample
Means
for 95% Confidence
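[Figure: normal curve centered on μ (z = 0), with the central 95% split into areas of .4750 on each side of the mean, area .025 in each tail, and limits at z = -1.96 and z = 1.96.]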
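Example: 95% Confidence
Interval for μ (σ known)
$\bar{x} = 510,\ \sigma = 46,\ n = 85,\ z_{\alpha/2} = 1.96$
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$510 - 1.96 \frac{46}{\sqrt{85}} \le \mu \le 510 + 1.96 \frac{46}{\sqrt{85}}$
$510 - 9.78 \le \mu \le 510 + 9.78$
$500.22 \le \mu \le 519.78$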
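The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch, not the textbook's method: the function name `z_interval` is made up for illustration, and the critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Values from the example: x-bar = 510, sigma = 46, n = 85.
lower, upper = z_interval(x_bar=510, sigma=46, n=85, confidence=0.95)
print(f"{lower:.2f} <= mu <= {upper:.2f}")   # 500.22 <= mu <= 519.78
```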
95% Confidence Intervals for μ
[Figure: sampling distribution with the central 95% region around μ, and the intervals built around seven different sample means X̄.]
Is our interval, 500.22 ≤ μ ≤ 519.78, in the red?
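Whether any single interval contains μ cannot be known from one sample, but the long-run behaviour can be simulated. The sketch below assumes a hypothetical true mean of 510 and a normal population; σ = 46 and n = 85 follow the example, and everything else is an illustrative choice.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence, trials, seed=1):
    """Share of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # One simulated sample of size n and its sample mean.
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

# mu = 510 is a hypothetical "true" mean used only for the simulation.
share = coverage(mu=510, sigma=46, n=85, confidence=0.95, trials=2_000)
print(f"{share:.1%} of the simulated intervals contain mu")
```

The share should land close to 95%: most intervals capture μ, and a small fraction miss it.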
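Example: Interval
Estimates 90% confidence
(Text 8.1)
$\bar{x} = 10.455,\ \sigma = 7.7,\ n = 44$
90% confidence: $z_{\alpha/2} = 1.645$
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
$10.455 - 1.645 \frac{7.7}{\sqrt{44}} \le \mu \le 10.455 + 1.645 \frac{7.7}{\sqrt{44}}$
$10.455 - 1.910 \le \mu \le 10.455 + 1.910$
$8.545 \le \mu \le 12.365$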
Concept of Inferential
Statistics
The z statistic cannot be used directly if the
population standard deviation is
unknown. If the sample size exceeds 30,
even when the distribution is not normal,
we can still estimate the unknown parameter
by using the sample standard deviation in place of σ.
Known
Statistics:
sample mean
sample variance
z-value
Unknown
Parameters:
Population mean
Population
standard
deviation
Confidence Interval to
Estimate μ when σ is
Unknown
$\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
or
$\bar{x} - z_{\alpha/2} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{s}{\sqrt{n}}$
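Car Rental Firm Example
$\bar{x} = 85.5$, sample st. dev. $s = 19.3$, and $n = 110$.
99% confidence: $z = 2.575$
$\bar{x} - z \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z \frac{s}{\sqrt{n}}$
$85.5 - 2.575 \frac{19.3}{\sqrt{110}} \le \mu \le 85.5 + 2.575 \frac{19.3}{\sqrt{110}}$
$85.5 - 4.7 \le \mu \le 85.5 + 4.7$
$80.8 \le \mu \le 90.2$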
Exercise: Derive Z Values for
Common Levels of Confidence
Confidence Level    P(z_{α/2})    z_{α/2} Value
90%                               ??
95%                 0.475         1.96
98%                               ???
99%                               ???
For 95%: P(z_{α/2}) = 0.5 - (1 - 0.95)/2 = 0.5 - 0.025 = 0.475.
From page 788, Table A5: z_{α/2} = 1.96.
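The table lookup can also be done numerically. The sketch below follows the same two steps as the worked 95% case (find the area between the mean and z, then find the z that bounds it) using Python's standard normal distribution instead of Table A5. The function name is hypothetical, and computed values can differ from table-based ones in the third decimal place.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Return (area between the mean and z, z_{alpha/2}) for a confidence level."""
    alpha = 1 - confidence
    area = 0.5 - alpha / 2                  # area between the mean and z
    z = NormalDist().inv_cdf(0.5 + area)    # z with that much area below it, past the mean
    return area, z

for level in (0.90, 0.95, 0.98, 0.99):
    area, z = z_for_confidence(level)
    print(f"{level:.0%}: area = {area:.3f}, z = {z:.3f}")
```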
Estimating the Mean of a
Normal Population:
Unknown σ
• The population has a normal distribution.
• The value of the population standard
deviation is unknown.
• z distribution is not appropriate for these
conditions
• t distribution is appropriate
The t Distribution
• Developed by British statistician,
William Gosset
• A family of distributions -- a unique
distribution for each value of its
parameter, degrees of freedom
(d.f.)
• Symmetric, Unimodal, Mean = 0,
Flatter than a z
• t formula
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
Comparison of Selected t
Distributions
to the Standard Normal
[Figure: density curves for the standard normal and for t with d.f. = 25, d.f. = 5, and d.f. = 1, plotted from -3 to 3.]
Table of Critical Values of t
df     t0.100   t0.050   t0.025   t0.010   t0.005
1      3.078    6.314    12.706   31.821   63.656
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
40     1.303    1.684    2.021    2.423    2.704
60     1.296    1.671    2.000    2.390    2.660
120    1.289    1.658    1.980    2.358    2.617
∞      1.282    1.645    1.960    2.327    2.576
With df = 24 and α = 0.05,
t_α = 1.711.
Confidence Intervals for μ of a
Normal Population: Unknown σ
$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
or
$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
$df = n - 1$
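Solution for Demonstration
Problem 8.3
$\bar{x} = 2.14,\ s = 1.29,\ n = 14,\ df = n - 1 = 13$
$\frac{\alpha}{2} = \frac{1 - .99}{2} = 0.005$
$t_{.005,13} = 3.012$
$\bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
$2.14 - 3.012 \frac{1.29}{\sqrt{14}} \le \mu \le 2.14 + 3.012 \frac{1.29}{\sqrt{14}}$
$2.14 - 1.04 \le \mu \le 2.14 + 1.04$
$1.10 \le \mu \le 3.18$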
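The same steps can be checked in code. In this sketch the t critical value is passed in as a number read from the table (3.012 for α/2 = .005 and df = 13) rather than computed; the function name `t_interval` is hypothetical.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean using a supplied t critical value."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

# Values from the demonstration problem; t_crit = t_{.005,13} = 3.012.
lower, upper = t_interval(x_bar=2.14, s=1.29, n=14, t_crit=3.012)
print(f"{lower:.2f} <= mu <= {upper:.2f}")   # 1.10 <= mu <= 3.18
```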
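Determining Sample Size
when Estimating μ
• z formula
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• Error of Estimation
(tolerable error)
$E = \bar{x} - \mu$
• Estimated Sample
Size
$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{E^{2}} = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^{2}$
• Estimated σ
$\sigma \approx \frac{1}{4}\,\text{range}$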
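The sample-size formula is usually applied by rounding up to the next whole observation. The sketch below assumes that convention; the inputs (a range of 80 and a tolerable error of 2) are hypothetical and not taken from the slides.

```python
from math import ceil
from statistics import NormalDist

def sigma_from_range(value_range):
    """Rough estimate of sigma as one quarter of the range."""
    return value_range / 4

def sample_size(error, sigma, confidence=0.95):
    """Smallest whole n for which z_{alpha/2} * sigma / sqrt(n) <= error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)

# Hypothetical inputs: values spanning a range of 80, tolerable error of 2.
sigma = sigma_from_range(80)    # 20.0
n = sample_size(error=2, sigma=sigma, confidence=0.95)
print(n)                        # 385
```

Halving the tolerable error roughly quadruples the required sample size, because E appears squared in the denominator.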