Slide 1 - KAIST Department of Mathematical Sciences
Chapter 8. Inferences on a Population Mean
8.1 Confidence Intervals
8.2 Hypothesis Testing
8.3 Summary
NIPRL
8.1 Confidence Intervals
8.1.1 Confidence Interval Construction(1/8)
• Confidence Intervals
– A confidence interval for an unknown parameter θ is an interval that
contains a set of plausible values of the parameter.
– It is associated with a confidence level 1-α, which measures the
probability that the confidence interval actually contains the
unknown parameter value.
– Confidence levels of 90%, 95%, and 99% are typically used.
8.1.1 Confidence Interval Construction(2/8)
• Inferences on a Population Mean
– Inference methods on a population mean based upon the t-procedure are appropriate for large sample sizes n ≥ 30 and
also for small sample sizes as long as the data can
reasonably be taken to be approximately normally
distributed.
– Nonparametric techniques (Chapter 15) can be employed
for small sample sizes with data that are clearly not
normally distributed.
8.1.1 Confidence Interval Construction(3/8)
• Two-Sided t-Interval
– A confidence interval with confidence level 1-α for a population
mean μ based upon a sample of n continuous data observations
with a sample mean x̄ and a sample standard deviation s is
\[ \mu \in \left( \bar{x} - \frac{t_{\alpha/2,n-1}\, s}{\sqrt{n}},\ \bar{x} + \frac{t_{\alpha/2,n-1}\, s}{\sqrt{n}} \right) \]
– The interval is known as a two-sided t-interval or variance unknown
confidence interval.
8.1.1 Confidence Interval Construction(4/8)
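• A two-sided t-interval
[Figure: the interval is centered at μ̂ = x̄ and extends from x̄ - t_{α/2,n-1} s/√n to x̄ + t_{α/2,n-1} s/√n.]
\[ \bar{x} \pm \frac{t_{\alpha/2,n-1}\, s}{\sqrt{n}} \;=\; \hat{\mu} \pm \text{critical point} \times \text{s.e.}(\hat{\mu}) \]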
8.1.1 Confidence Interval Construction(5/8)
• The length of the two-sided t-interval is
\[ L = \frac{2\, t_{\alpha/2,n-1}\, s}{\sqrt{n}} = 2 \times \text{critical point} \times \text{s.e.}(\hat{\mu}) \]
• As the standard error of μ̂ decreases, the interval length L decreases, so that μ̂ = x̄ becomes a
more “accurate” estimate of μ.
• The length of a confidence interval also depends upon the
confidence level. As the confidence level increases, the length
of the confidence interval also increases.
8.1.1 Confidence Interval Construction(6/8)
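• We know that
\[ \frac{\sqrt{n}\,(\bar{X} - \mu)}{S} \sim t_{n-1} \]
and so the definition of the critical point of the t-distribution
ensures that
\[ P\left( -t_{\alpha/2,n-1} \le \frac{\sqrt{n}\,(\bar{X} - \mu)}{S} \le t_{\alpha/2,n-1} \right) = 1 - \alpha \]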
8.1.1 Confidence Interval Construction(7/8)
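• Rearranging the inequalities to isolate μ gives
\[ P\left( \bar{X} - \frac{t_{\alpha/2,n-1}\, S}{\sqrt{n}} \le \mu \le \bar{X} + \frac{t_{\alpha/2,n-1}\, S}{\sqrt{n}} \right) = 1 - \alpha \]
• This probability statement should be interpreted as saying that
there is a probability of 1-α that the random confidence interval
limits take values that “straddle” the fixed value μ.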
8.1.1 Confidence Interval Construction(8/8)
• Technically speaking, √n(X̄ - μ)/S has a t-distribution only
when the random variables X_i are normally distributed.
• Nevertheless, the central limit theorem ensures that the
distribution of X̄ is approximately normal for reasonably large
sample sizes, and in such cases it is sensible to construct t-intervals
regardless of the actual distribution of the data
observations.
Example 14 : Metal Cylinder Production (p.365)
• Data: 60 metal cylinder diameters (page 290, Figure 6.5).
• Summary statistics:
n = 60, x̄ = 49.999, s = 0.134
Median = 50.01, Lower quartile = 49.91, Upper quartile = 50.07
Min. = 49.74, Max. = 50.36
• Critical points (sample size n = 60):
Confidence level 90%: t_{0.05,59} = 1.671
Confidence level 95%: t_{0.025,59} = 2.001
Confidence level 99%: t_{0.005,59} = 2.662
Example 14 : Metal Cylinder Production(2/4)
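• Confidence interval with confidence level 90%:
\[ \left( 49.999 - \frac{1.671 \times 0.134}{\sqrt{60}},\ 49.999 + \frac{1.671 \times 0.134}{\sqrt{60}} \right) = (49.970,\ 50.028) \]
• Confidence interval with confidence level 95%:
\[ \left( 49.999 - \frac{2.001 \times 0.134}{\sqrt{60}},\ 49.999 + \frac{2.001 \times 0.134}{\sqrt{60}} \right) = (49.964,\ 50.033) \]
• Confidence interval with confidence level 99%:
\[ \left( 49.999 - \frac{2.662 \times 0.134}{\sqrt{60}},\ 49.999 + \frac{2.662 \times 0.134}{\sqrt{60}} \right) = (49.953,\ 50.045) \]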
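The three intervals can be reproduced with a few lines of Python. This is a minimal sketch: the function name `t_interval` is made up for illustration, and the critical points are simply the table values quoted in the example rather than being computed.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Two-sided t-interval: xbar -/+ t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Summary statistics and critical points quoted in Example 14.
xbar, s, n = 49.999, 0.134, 60
critical_points = {0.90: 1.671, 0.95: 2.001, 0.99: 2.662}

intervals = {level: t_interval(xbar, s, n, t) for level, t in critical_points.items()}
for level, (lo, hi) in intervals.items():
    print(f"{level:.0%}: ({lo:.3f}, {hi:.3f})")
```

Because the summary statistics are rounded, the last printed digit can differ slightly from the values on the slide.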
Example 14 : Metal Cylinder Production(3/4)
• Confidence intervals for mean metal cylinder diameter (each centered at x̄ = 49.999)
90%: (49.970, 50.028)
95%: (49.964, 50.033)
99%: (49.953, 50.045)
Example 14 : Metal Cylinder Production(4/4)
• Conclusion with confidence interval:
With over 99% certainty, the average cylinder diameter lies
within 0.05 mm of 50.00 mm, that is, within the interval
(49.95, 50.05).
• Comment:
It is important to remember that this confidence interval is for
the mean cylinder diameter, and not for the actual diameter
of a randomly selected cylinder.
8.1.2 Effect of the Sample Size on Confidence Intervals(1/4)
• Recall:
\[ L = \frac{2\, t_{\alpha/2,n-1}\, s}{\sqrt{n}} \]
• For a fixed critical point, a confidence interval length L is
inversely proportional to the square root of the sample size n.
• Notice that the critical point t_{α/2,n-1} also decreases as the sample
size n increases, which further serves to produce shorter confidence
intervals with large sample sizes.
8.1.2 Effect of the Sample Size on Confidence Intervals(2/4)
• If a confidence interval with a length no larger than L0 is required,
then a sample size
\[ n \ge 4 \left( \frac{t_{\alpha/2,n-1}\, s}{L_0} \right)^2 \]
must be used. This inequality can be used to find a suitable
sample size n if approximate values or upper bounds are used for
t_{α/2,n-1} and s.
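The inequality follows from requiring L ≤ L0 in the length formula. A short derivation, treating the critical point as a fixed number (which is why an approximate value or upper bound for it is needed in practice):

```latex
\begin{align*}
L = \frac{2\, t_{\alpha/2,n-1}\, s}{\sqrt{n}} \le L_0
\;&\Longleftrightarrow\; \sqrt{n} \ge \frac{2\, t_{\alpha/2,n-1}\, s}{L_0} \\
\;&\Longleftrightarrow\; n \ge 4 \left( \frac{t_{\alpha/2,n-1}\, s}{L_0} \right)^2 .
\end{align*}
```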
8.1.2 Effect of the Sample Size on Confidence Intervals(3/4)
• Example (p. 364): For the mean thickness of plastic sheets, an
experimenter wishes to construct a 95% confidence interval with
a length no larger than L0 = 2.0 mm.
– Known fact: the standard deviation s is taken to be 4.0 mm
– Assumption: t_{0.025,n-1} is taken to be 2.1, for a large enough sample size
– Then the required sample size follows from
\[ L_0 = 2.0 \ge \frac{2\, t_{\alpha/2,n-1}\, s}{\sqrt{n}} = \frac{2 \times 2.1 \times 4.0}{\sqrt{n}} \]
\[ n \ge 4 \left( \frac{2.1 \times 4.0}{2.0} \right)^2 = 70.56 \]
8.1.2 Effect of the Sample Size on Confidence Intervals(4/4)
• Additional sampling:
– For a sample size n1 and a sample standard deviation s, the length of
the confidence interval is
\[ L = \frac{2\, t_{\alpha/2,n_1-1}\, s}{\sqrt{n_1}} \]
– To reduce the confidence interval length to L0 < L, the size of the
additional sample required is n - n1, where
\[ n \ge 4 \left( \frac{t_{\alpha/2,n_1-1}\, s}{L_0} \right)^2 \]
Example 14 : Metal Cylinder Production (p. 365)
• n = 60, 99% confidence interval (49.953, 50.045) with interval
length 0.092 mm
• Question: How much additional sampling is required to provide the
increased precision of a confidence interval with a length of 0.08
mm at the same confidence level?
• Answer: The total sample size required is
\[ n \ge 4 \left( \frac{t_{0.005,59}\, s}{L_0} \right)^2 = 4 \left( \frac{2.662 \times 0.134}{0.08} \right)^2 = 79.53 \]
– Therefore, an additional sample of at least 80 - 60 = 20 cylinders
is needed.
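Both sample size calculations (plastic sheets and metal cylinders) use the same formula, so they can be checked with one small helper. This is a sketch only: `required_sample_size` is an illustrative name, and the critical points are the values assumed on the slides.

```python
import math

def required_sample_size(t_crit, s, max_length):
    """Smallest integer n with n >= 4 * (t_crit * s / max_length) ** 2."""
    return math.ceil(4 * (t_crit * s / max_length) ** 2)

# Plastic sheets: t taken as 2.1, s as 4.0 mm, target length 2.0 mm.
n_sheets = required_sample_size(2.1, 4.0, 2.0)

# Metal cylinders: t_{0.005,59} = 2.662, s = 0.134, target length 0.08 mm.
n_total = required_sample_size(2.662, 0.134, 0.08)
n_additional = n_total - 60  # 60 cylinders were already measured

print(n_sheets, n_total, n_additional)
```

Rounding up with `math.ceil` turns the bounds 70.56 and 79.53 into whole sample sizes.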
8.1.4 Simulation Experiment(1/2)
• Figure 8.8 (page 369) shows the 500 simulation results (a mean of μ =
10, a sample size of n = 30) and 95% confidence intervals.
• Notice that, in simulations 24 and 37, the confidence intervals do not
include the correct value μ = 10.
• Each simulation provides a 95% confidence interval which has a
probability of 0.05 of not containing the value μ = 10.
8.1.4 Simulation Experiment(2/2)
• Since the simulations are independent of each other, the
number of simulations out of 500 for which the confidence
interval does not contain μ = 10 has a binomial distribution with
n = 500 and p = 0.05.
• In practice, an experimenter observes just one data set, and it
has a probability of 0.95 of providing a 95% confidence interval
that does indeed straddle the true value μ.
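A simulation of this kind is easy to sketch. The version below is not the experiment behind Figure 8.8: the slides do not give the population standard deviation, so `SIGMA = 3.0` and the random seed are assumptions. The data are drawn from a normal distribution, and t_{0.025,29} = 2.045 is the standard table value.

```python
import math
import random
import statistics

def covers(sample, mu, t_crit):
    """True if the two-sided t-interval from `sample` contains mu."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width <= mu <= xbar + half_width

random.seed(0)                 # arbitrary seed, for reproducibility only
MU, SIGMA = 10.0, 3.0          # SIGMA is an assumed value, not from the slides
N, T_CRIT = 30, 2.045          # t_{0.025,29} from standard t-tables
SIMULATIONS = 500

misses = sum(
    not covers([random.gauss(MU, SIGMA) for _ in range(N)], MU, T_CRIT)
    for _ in range(SIMULATIONS)
)
print(f"{misses} of {SIMULATIONS} intervals miss mu = {MU}")
```

The count of misses behaves like a binomial(500, 0.05) draw, so values near 25 are typical.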
8.1.5 One-Sided Confidence Intervals(1/4)
• One-Sided t-Interval: One-sided confidence intervals with
confidence levels 1-α for a population mean μ based on a
sample of n continuous data observations with a sample mean x̄
and a sample standard deviation s are
\[ \mu \in \left( -\infty,\ \bar{x} + \frac{t_{\alpha,n-1}\, s}{\sqrt{n}} \right) \]
which provides an upper bound on the population mean μ, and
\[ \mu \in \left( \bar{x} - \frac{t_{\alpha,n-1}\, s}{\sqrt{n}},\ \infty \right) \]
which provides a lower bound on the population mean μ.
8.1.5 One-Sided Confidence Intervals(2/4)
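• c.f.
Since
\[ \frac{\sqrt{n}\,(\bar{X} - \mu)}{S} \sim t_{n-1} \]
the definition of the critical point t_{α,n-1} implies that
\[ P\left( -t_{\alpha,n-1} \le \frac{\sqrt{n}\,(\bar{X} - \mu)}{S} \right) = 1 - \alpha \]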
8.1.5 One-Sided Confidence Intervals(3/4)
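This may be rewritten
\[ P\left( \mu \le \bar{X} + \frac{t_{\alpha,n-1}\, S}{\sqrt{n}} \right) = 1 - \alpha \]
so that
\[ \mu \in \left( -\infty,\ \bar{x} + \frac{t_{\alpha,n-1}\, s}{\sqrt{n}} \right) \]
is a one-sided confidence interval for μ with a confidence level
of 1-α.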
8.1.5 One-Sided Confidence Intervals(4/4)
• Figure 8.10: Comparison of two-sided and one-sided
confidence intervals
[Figure: three intervals drawn around x̄.
One-sided (upper bound): (-∞, x̄ + t_{α,n-1} s/√n)
Two-sided: (x̄ - t_{α/2,n-1} s/√n, x̄ + t_{α/2,n-1} s/√n)
One-sided (lower bound): (x̄ - t_{α,n-1} s/√n, ∞)]
Example 45 : Hospital Worker Radiation Exposures
• n = 28, sample mean x̄ = 5.145,
• sample standard deviation s = 0.7524,
• critical point t_{0.01,27} = 2.473.
• The 99% one-sided confidence interval for μ is
\[ \left( -\infty,\ \bar{x} + \frac{t_{\alpha,n-1}\, s}{\sqrt{n}} \right) = \left( -\infty,\ 5.145 + \frac{2.473 \times 0.7524}{\sqrt{28}} \right) = (-\infty,\ 5.496) \]
• Consequently, with a confidence level of 0.99 the experimenter
can conclude that the average radiation level at a 50 cm
distance from a patient is no more than about 5.5.
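The one-sided bound can be checked numerically. In this sketch the two function names are illustrative, and the critical point is the table value quoted in the example.

```python
import math

def upper_bound_t_interval(xbar, s, n, t_crit):
    """Upper limit of the one-sided interval (-inf, xbar + t_crit * s / sqrt(n))."""
    return xbar + t_crit * s / math.sqrt(n)

def lower_bound_t_interval(xbar, s, n, t_crit):
    """Lower limit of the one-sided interval (xbar - t_crit * s / sqrt(n), inf)."""
    return xbar - t_crit * s / math.sqrt(n)

# Example 45: n = 28, xbar = 5.145, s = 0.7524, t_{0.01,27} = 2.473.
upper = upper_bound_t_interval(5.145, 0.7524, 28, 2.473)
print(f"mu <= {upper:.3f} with 99% confidence")
```

With the rounded summary statistics the result agrees with the slide to about two decimal places, which is enough for the conclusion "no more than about 5.5".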
8.1.6 z-Intervals(1/2)
• Two-Sided z-Interval
If an experimenter wishes to construct a confidence interval for
a population mean μ based on a sample of size n with a sample
mean x̄ and using an assumed known value for the population
standard deviation σ, then the appropriate confidence interval is
\[ \mu \in \left( \bar{x} - \frac{z_{\alpha/2}\, \sigma}{\sqrt{n}},\ \bar{x} + \frac{z_{\alpha/2}\, \sigma}{\sqrt{n}} \right) \]
which is known as a two-sided z-interval or variance known
confidence interval.
8.1.6 z-Intervals(2/2)
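• One-Sided z-Interval
One-sided 1-α level confidence intervals for a population
mean μ based on a sample of n observations with a sample
mean x̄ and using a known value of the population standard
deviation σ are
\[ \mu \in \left( -\infty,\ \bar{x} + \frac{z_{\alpha}\, \sigma}{\sqrt{n}} \right) \]
and
\[ \mu \in \left( \bar{x} - \frac{z_{\alpha}\, \sigma}{\sqrt{n}},\ \infty \right) \]
These confidence intervals are known as one-sided z-intervals.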
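Since the standard normal critical points are available in Python's standard library, z-intervals can be sketched without any tables. The function names and the input numbers below are hypothetical, chosen only to exercise the formulas.

```python
import math
from statistics import NormalDist

def z_critical(alpha):
    """Upper-alpha critical point z_alpha of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha)

def two_sided_z_interval(xbar, sigma, n, alpha):
    """(xbar - z_{alpha/2} sigma / sqrt(n), xbar + z_{alpha/2} sigma / sqrt(n))."""
    half_width = z_critical(alpha / 2) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

def one_sided_z_upper_bound(xbar, sigma, n, alpha):
    """Upper limit of (-inf, xbar + z_alpha sigma / sqrt(n))."""
    return xbar + z_critical(alpha) * sigma / math.sqrt(n)

# Hypothetical inputs, not taken from the slides.
lo, hi = two_sided_z_interval(xbar=20.0, sigma=2.0, n=25, alpha=0.05)
print(f"95% two-sided z-interval: ({lo:.3f}, {hi:.3f})")
```

At the same confidence level the one-sided upper bound sits below the two-sided upper limit, because z_α < z_{α/2}.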
8.2 Hypothesis Testing
8.2.1 Hypotheses(1/2)
• Hypothesis Tests of a Population Mean
– A null hypothesis H0 for a population mean μ is a statement that
designates possible values for the population mean.
– It is associated with an alternative hypothesis HA, which is the
“opposite” of the null hypothesis.
– A two-sided set of hypotheses is
H0 : μ = μ0 versus HA : μ ≠ μ0
for a specified value of μ0.
8.2.1 Hypotheses(2/2)
– A one-sided set of hypotheses is either
H0 : μ ≤ μ0 versus HA : μ > μ0
or
H0 : μ ≥ μ0 versus HA : μ < μ0
Example 14 : Metal Cylinder Production
• The machine that produces metal cylinders is set to make
cylinders with a diameter of 50 mm.
• The two-sided hypotheses of interest are
H0 : μ = 50 versus HA : μ ≠ 50
where the null hypothesis states that the machine is calibrated
correctly.
Example 47 : Car Fuel Efficiency
• The manufacturer's claim: its cars achieve an average of at least 35
miles per gallon in highway driving.
• The one-sided hypotheses of interest are
H0 : μ ≥ 35 versus HA : μ < 35
• The null hypothesis states that the manufacturer’s claim
regarding the fuel efficiency of its cars is correct.
8.2.2 Interpretation of p-values(1/4)
• Types of error
– Type I error: An error committed by rejecting the null
hypothesis when it is true.
– Type II error: An error committed by accepting the null
hypothesis when it is false.
• Significance level
– The significance level α is specified as the upper bound of the
probability of a type I error.
8.2.2 Interpretation of p-values(2/4)
• p-value of a test (or observed level of significance)
– Definition: The p-value of a test is the probability of
obtaining a given data set or worse when the null hypothesis
is true.
– A data set can be used to measure the plausibility of null
hypothesis H0 through the construction of a p-value.
– The smaller the p-value, the less plausible is the null
hypothesis. (why?)
8.2.2 Interpretation of p-values(3/4)
• Rejection of the Null Hypothesis
– If a p-value is smaller than the significance level, then the
hypothesis H0 is rejected in favor of the alternative
hypothesis HA.
• Acceptance of the Null Hypothesis
– A p-value larger than 0.10 is generally taken to indicate that the null
hypothesis H0 is a plausible statement. The null hypothesis H0 is
therefore accepted.
– However, this does not mean that the null hypothesis H0 has been
proven to be true.
8.2.2 Interpretation of p-values(4/4)
• Intermediate p-values
– A p-value in the range 1% ~ 10% is generally taken to
indicate that the data analysis is inconclusive. There is some
evidence that the null hypothesis is not plausible, but the
evidence is not overwhelming.
Example 14 : Metal Cylinder Production
• Q : whether the machine can be shown to be calibrated
incorrectly.
• H0 : μ = 50, HA : μ ≠ 50
• With a small p-value,
– The null hypothesis is rejected and the machine is
demonstrated to be miscalibrated.
• With a large p-value,
– The null hypothesis is accepted and the experimenter
concludes that there is no evidence that the machine is
calibrated incorrectly.
8.2.3 Calculation of p-values
• Two-sided t-test
– Consider testing H0 : μ = μ0 vs HA : μ ≠ μ0
– Then
\[ p\text{-value} = 2 \times P(X \ge |t|) \]
where X ~ t_{n-1} and
\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]
• One-sided t-test
– Consider testing H0 : μ ≤ μ0 vs HA : μ > μ0
– Then p-value = P(X ≥ t)
– Consider testing H0 : μ ≥ μ0 vs HA : μ < μ0
– Then p-value = P(X ≤ t)
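The p-value formulas above need tail probabilities of the t-distribution. Python's standard library has no t-distribution, so the sketch below integrates the t density numerically with Simpson's rule. This is an illustrative approach, not the method used to produce the numbers in the examples, and all function names are made up. A statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(X <= x) via Simpson's rule on [0, |x|] plus symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def t_statistic(xbar, s, n, mu0):
    """t = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

def p_value(t, df, alternative):
    """alternative: 'two-sided', 'greater' (HA: mu > mu0) or 'less' (HA: mu < mu0)."""
    if alternative == "two-sided":
        return 2 * (1 - t_cdf(abs(t), df))
    if alternative == "greater":
        return 1 - t_cdf(t, df)
    if alternative == "less":
        return t_cdf(t, df)
    raise ValueError(f"unknown alternative: {alternative}")

# Metal cylinder data used in this chapter: n = 60, xbar = 49.99856, s = 0.1334.
t = t_statistic(xbar=49.99856, s=0.1334, n=60, mu0=50.0)
print(round(t, 4), round(p_value(t, df=59, alternative="two-sided"), 3))
```

The same `p_value` function covers both one-sided cases by choosing `alternative="greater"` or `alternative="less"`.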
Example 14 : Metal Cylinder Production
H0: μ = μ0 versus HA: μ ≠ μ0
• The data set of metal cylinder diameters: n = 60, x̄ = 49.99856,
s = 0.1334
• μ0 = 50.0
\[ t = \frac{49.99856 - 50.0}{0.1334 / \sqrt{60}} = -0.0836 \]
• p-value = 2 × P(X ≥ 0.0836), where X has a t-distribution with n-1 = 59 d.f.
• p-value = 2 × 0.467 = 0.934
• With such a large p-value, the null hypothesis is accepted.
[Figure: t_59 density with the area 0.467 to the right of |t| = 0.0836 marked.]
Example 47 : Car Fuel Efficiency (1/2)
H0 : μ ≥ 35 versus HA : μ < 35
• n = 20
• the fuel efficiency with a sample mean x̄ = 34.271 miles/gallon
• sample standard deviation s = 2.915 miles/gallon
• μ0 = 35.0
\[ t = \frac{\sqrt{20}\,(34.271 - 35.0)}{2.915} = -1.119 \]
• The alternative hypothesis is HA: μ < 35, so that
\[ p\text{-value} = P(X \le -1.119) \]
where X has a t-distribution with n-1 = 19 d.f.
Example 47 : Car Fuel Efficiency (2/2)
• The value can be shown to be p-value = 0.1386.
• This p-value is larger than 0.10 and so the null hypothesis
should be accepted.
[Figure: t_19 density with the p-value shown as the area to the left of t = -1.119.]
8.2.4 Significance Levels(1/14)
• Significance Level of a Hypothesis Test
– A hypothesis test with a significance level or size α
rejects the null hypothesis H0 if a p-value smaller than α is obtained
and
accepts the null hypothesis H0 if a p-value larger than α is obtained.
• P-values are more informative than knowing whether a size α test
accepts or rejects the null hypothesis. (Why?)
8.2.4 Significance Levels(2/14)
• Two-Sided Problems
– Two-Sided Hypothesis Test for a Population Mean
• A size α test for the two-sided hypotheses
H0: μ = μ0 versus HA: μ ≠ μ0
rejects the null hypothesis H0 if the test statistic |t| falls in
the rejection region
\[ |t| > t_{\alpha/2,n-1} \]
and accepts the null hypothesis H0 if the test statistic |t| falls in
the acceptance region
\[ |t| \le t_{\alpha/2,n-1} \]
• What is the role of α here?
Example 14 : Metal Cylinder Production
H0: μ = μ0 versus HA: μ ≠ μ0
• The data set of metal cylinder diameters gives a test statistic of
|t| = 0.0836
• α = 0.10: t_{0.05,59} = 1.671
• α = 0.05: t_{0.025,59} = 2.001
• α = 0.01: t_{0.005,59} = 2.662
• The test statistic is smaller than each of these critical points,
and so the hypothesis tests all accept the null hypothesis.
• The p-value is therefore known to be larger than 0.10, and in fact
the previous analysis found the p-value to be 0.934.
8.2.4 Significance Levels(3/14)
• Relationship Between Confidence Intervals and Hypothesis Tests
– The value μ0 is contained within a 1-α level two-sided confidence
interval
\[ \left( \bar{x} - \frac{t_{\alpha/2,n-1}\, s}{\sqrt{n}},\ \bar{x} + \frac{t_{\alpha/2,n-1}\, s}{\sqrt{n}} \right) \]
if the p-value for the two-sided hypothesis test
H0: μ = μ0 versus HA: μ ≠ μ0
is larger than α.
8.2.4 Significance Levels(4/14)
• If μ0 is contained within the 1-α level confidence interval,
the hypothesis test with size α accepts the null hypothesis,
and
• if μ0 is not contained within the 1-α level confidence interval,
the hypothesis test with size α rejects the null hypothesis.
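The equivalence between the test decision and interval membership can be checked numerically. The sketch below uses the metal cylinder statistics from this chapter (n = 60, x̄ = 49.99856, s = 0.1334) with t_{0.05,59} = 1.671. The candidate values of μ0 and the function names are illustrative.

```python
import math

def two_sided_t_test(xbar, s, n, mu0, t_crit):
    """Return (|t|, reject) for H0: mu = mu0 versus HA: mu != mu0."""
    abs_t = abs(math.sqrt(n) * (xbar - mu0) / s)
    return abs_t, abs_t > t_crit

def in_t_interval(xbar, s, n, mu0, t_crit):
    """True if mu0 lies in the two-sided t-interval built with t_crit."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width <= mu0 <= xbar + half_width

# Size 0.10 test and 90% interval share the critical point t_{0.05,59} = 1.671.
xbar, s, n, t_crit = 49.99856, 0.1334, 60, 1.671
for mu0 in (49.95, 49.98, 50.00, 50.02, 50.05):
    _, reject = two_sided_t_test(xbar, s, n, mu0, t_crit)
    inside = in_t_interval(xbar, s, n, mu0, t_crit)
    print(f"mu0 = {mu0:.2f}: inside interval = {inside}, test rejects = {reject}")
```

For every candidate μ0 the two columns are opposites: the test rejects exactly when μ0 falls outside the interval.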
8.2.4 Significance Levels(5/14)
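• (FIGURE 8.36, Page 399) H0: μ = μ0 versus HA: μ ≠ μ0
[Figure: values of μ0 inside the 1-α level two-sided confidence interval
(x̄ - t_{α/2,n-1} s/√n, x̄ + t_{α/2,n-1} s/√n) give a p-value larger than α;
values of μ0 on either side of the interval give a p-value smaller than α.]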
Example 14 : Metal Cylinder Production (1/2)
• A 90% two-sided t-interval for the mean cylinder diameter was
found to be (49.970, 50.028).
• This contains the value μ0 = 50.0 and so is consistent with
the hypothesis testing problem
H0: μ = 50.0 versus HA: μ ≠ 50.0
having a p-value of 0.934, so that the null hypothesis is accepted at
size α = 0.10.
Example 14 : Metal Cylinder Production (2/2)
• The 90% confidence interval implies that the hypothesis testing
problem
H0: μ = μ0 versus HA: μ ≠ μ0
has a p-value larger than 0.10 for 49.970 ≤ μ0 ≤ 50.028 and a p-value
smaller than 0.10 otherwise. (Why?)
8.2.4 Significance Levels(6/14)
• One-Sided Inferences on a Population Mean (H0: μ ≤ μ0) (p.401)
– A size α test for the one-sided hypotheses
H0: μ ≤ μ0 versus HA: μ > μ0
rejects the null hypothesis when
\[ t > t_{\alpha,n-1} \]
and accepts the null hypothesis when
\[ t \le t_{\alpha,n-1} \]
8.2.4 Significance Levels(7/14)
• One-sided Inferences on a Population Mean (H0: μ ≤ μ0)
– The 1-α level one-sided upper confidence interval
\[ \left( \bar{x} - \frac{t_{\alpha,n-1}\, s}{\sqrt{n}},\ \infty \right) \]
consists of the values μ0 for which this hypothesis testing
problem has a p-value larger than α.
8.2.4 Significance Levels(8/14)
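• Relationship between hypothesis testing and confidence intervals for
one-sided problems (FIGURE 8.40, Page 403)
H0: μ ≤ μ0 versus HA: μ > μ0
[Figure: values of μ0 inside the 1-α level one-sided confidence interval
(x̄ - t_{α,n-1} s/√n, ∞) give a p-value larger than α;
values of μ0 below the interval give a p-value smaller than α.]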
8.2.4 Significance Levels(9/14)
• One-sided Inferences on a Population Mean (H0: μ ≥ μ0)
– A size α test for the one-sided hypotheses
H0: μ ≥ μ0 versus HA: μ < μ0
rejects the null hypothesis when
\[ t < -t_{\alpha,n-1} \]
and accepts the null hypothesis when
\[ t \ge -t_{\alpha,n-1} \]
8.2.4 Significance Levels(10/14)
• One-sided Inferences on a Population Mean (H0: μ ≥ μ0)
– The 1-α level one-sided lower confidence interval
\[ \left( -\infty,\ \bar{x} + \frac{t_{\alpha,n-1}\, s}{\sqrt{n}} \right) \]
consists of the values μ0 for which this hypothesis testing
problem has a p-value larger than α.
8.2.4 Significance Levels(11/14)
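• Relationship between hypothesis testing and confidence intervals
for one-sided problems (FIGURE 8.39, Page 402)
H0: μ ≥ μ0 versus HA: μ < μ0
[Figure: values of μ0 inside the 1-α level one-sided confidence interval
(-∞, x̄ + t_{α,n-1} s/√n) give a p-value larger than α;
values of μ0 above the interval give a p-value smaller than α.]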
Example 47 : Car Fuel Efficiency (1/2)
• The one-sided hypotheses of interest here are
– H0: μ ≥ 35.0 versus HA: μ < 35.0
• t = -1.119 > -t_{0.10,19} = -1.328.
• So a size α = 0.10 hypothesis test accepts the null hypothesis.
• This conclusion is consistent with the previous analysis where
the p-value was found to be 0.1386, which is larger than
α = 0.10.
Example 47 : Car Fuel Efficiency (2/2)
• Furthermore, the one-sided 90% lower t-interval
\[ \left( -\infty,\ \bar{x} + \frac{t_{\alpha,n-1}\, s}{\sqrt{n}} \right) = \left( -\infty,\ 34.271 + \frac{1.328 \times 2.915}{\sqrt{20}} \right) = (-\infty,\ 35.14) \]
contains the value μ0 = 35.0, as expected.
• In fact, this confidence interval indicates that the hypothesis
testing problem
– H0: μ ≥ μ0 versus HA: μ < μ0
has a p-value larger than 0.10 for any value of μ0 ≤ 35.14.
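The car fuel efficiency test and its matching one-sided interval can be reproduced together. This is a sketch under the slide's own numbers (n = 20, x̄ = 34.271, s = 2.915, t_{0.10,19} = 1.328); the function names are illustrative.

```python
import math

def lower_tail_t_test(xbar, s, n, mu0, t_crit):
    """Size-alpha test of H0: mu >= mu0 versus HA: mu < mu0; rejects when t < -t_crit."""
    t = math.sqrt(n) * (xbar - mu0) / s
    return t, t < -t_crit

def one_sided_upper_limit(xbar, s, n, t_crit):
    """Upper limit of the one-sided interval (-inf, xbar + t_crit * s / sqrt(n))."""
    return xbar + t_crit * s / math.sqrt(n)

# Example 47 with mu0 = 35.0 and the size 0.10 critical point.
t, reject = lower_tail_t_test(34.271, 2.915, 20, 35.0, 1.328)
limit = one_sided_upper_limit(34.271, 2.915, 20, 1.328)
print(f"t = {t:.3f}, reject H0: {reject}, interval: (-inf, {limit:.2f})")
```

The test accepts H0 precisely because μ0 = 35.0 lies below the upper limit of the interval.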
8.2.4 Significance Levels(12/14)
• Power of a hypothesis Test
– The power of a hypothesis test is defined to be
• power = 1 – (probability of Type II error | H A )
which is the probability that the null hypothesis is rejected
when it is false.
• Larger power levels and shorter confidence intervals are both
indications of an increase in the “precision” of an experiment
( why? ).
8.2.4 Significance Levels(13/14)
FIGURE 8.42 The specification of two quantities determines
the third quantity
Hypothesis Testing: sample size n, significance level α, power of hypothesis test
Confidence Intervals: sample size n, confidence level 1-α, length of confidence interval
8.2.4 Significance Levels(14/14)
• Relationship Between Power and Sample Size
– For a fixed significance level α, the power of a hypothesis
test increases as the sample size n increases. (Why?)
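One way to see the effect of sample size on power is a numerical sketch. To keep it within the standard library, the example below swaps in a one-sided z-test with known σ rather than the t-test discussed above, where the power has a closed form. All numbers and names are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha):
    """Power of the size-alpha z-test of H0: mu <= mu0 vs HA: mu > mu0 when mu = mu1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # Under mu = mu1 the z-statistic is normal with mean `shift` and variance 1.
    shift = math.sqrt(n) * (mu1 - mu0) / sigma
    return 1 - std_normal.cdf(z_alpha - shift)

# Hypothetical values: true mean 0.5 sigma above mu0, alpha = 0.05.
powers = [z_test_power(mu0=0.0, mu1=0.5, sigma=1.0, n=n, alpha=0.05)
          for n in (10, 20, 40, 80)]
print([round(p, 3) for p in powers])
```

The shift √n(μ1 - μ0)/σ grows with n, which pushes more of the statistic's distribution into the rejection region and so raises the power.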
8.2.5 z-Tests (1/7)
(t-test, s, t_{α,n-1}) ↔ (z-test, σ, z_α)
• Two-Sided z-test
– The p-value for the two-sided hypothesis testing problem
H0: μ = μ0 versus HA: μ ≠ μ0
based upon a data set of n observations with a sample
mean x̄ and an assumed “known” population standard
deviation σ, is
\[ p\text{-value} = 2 \times \Phi(-|z|) \]
where Φ(x) is the standard normal cumulative distribution
function.
8.2.5 z-Tests (2/7)
• Two-Sided z-test
– The test statistic is
\[ z = \frac{\sqrt{n}\,(\bar{x} - \mu_0)}{\sigma} \]
which is known as the z-statistic.
• A size α test rejects the null hypothesis H0 if the test statistic |z| falls in
the rejection region
\[ |z| > z_{\alpha/2} \]
and accepts the null hypothesis H0 if the test statistic |z| falls in the
acceptance region
\[ |z| \le z_{\alpha/2} \]
8.2.5 z-Tests (3/7)
– The 1-α level two-sided confidence interval
\[ \left( \bar{x} - \frac{z_{\alpha/2}\, \sigma}{\sqrt{n}},\ \bar{x} + \frac{z_{\alpha/2}\, \sigma}{\sqrt{n}} \right) \]
consists of the values μ0 for which this hypothesis testing
problem has a p-value larger than α.
8.2.5 z-Tests (4/7)
• One-Sided z-Test (H0: μ ≤ μ0)
– The p-value for the one-sided hypothesis testing problem
H0: μ ≤ μ0 versus HA: μ > μ0
based upon a data set of n observations with a sample mean
x̄ and an assumed known population standard deviation σ,
is
\[ p\text{-value} = 1 - \Phi(z) \]
This testing procedure is called a one-sided z-test.
8.2.5 z-Tests (5/7)
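• One-Sided z-Test (H0: μ ≤ μ0)
– A size α test rejects the null hypothesis when
\[ z > z_{\alpha} \]
and accepts the null hypothesis when
\[ z \le z_{\alpha} \]
– The 1-α level upper one-sided confidence interval is
\[ \left( \bar{x} - \frac{z_{\alpha}\, \sigma}{\sqrt{n}},\ \infty \right) \]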
8.2.5 z-Tests (6/7)
• One-Sided z-Test (H0: μ ≥ μ0)
– The p-value for the one-sided hypothesis testing problem
H0: μ ≥ μ0 versus HA: μ < μ0
based upon a data set of n observations with a sample mean
x̄ and an assumed known population standard deviation σ,
is
\[ p\text{-value} = \Phi(z) \]
This testing procedure is called a one-sided z-test.
8.2.5 z-Tests (7/7)
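• One-Sided z-Test (H0: μ ≥ μ0)
– A size α test rejects the null hypothesis when
\[ z < -z_{\alpha} \]
and accepts the null hypothesis when
\[ z \ge -z_{\alpha} \]
– The 1-α level lower one-sided confidence interval is
\[ \left( -\infty,\ \bar{x} + \frac{z_{\alpha}\, \sigma}{\sqrt{n}} \right) \]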
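The three z-test p-values can be computed directly with the standard normal distribution in Python's standard library. The sketch below uses made-up data (x̄ = 10.4, σ = 1.5, n = 36, μ0 = 10) and illustrative function names.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function

def z_statistic(xbar, sigma, n, mu0):
    """z = sqrt(n) * (xbar - mu0) / sigma."""
    return math.sqrt(n) * (xbar - mu0) / sigma

def z_test_p_value(z, alternative):
    """alternative: 'two-sided', 'greater' (HA: mu > mu0) or 'less' (HA: mu < mu0)."""
    if alternative == "two-sided":
        return 2 * PHI(-abs(z))
    if alternative == "greater":
        return 1 - PHI(z)
    if alternative == "less":
        return PHI(z)
    raise ValueError(f"unknown alternative: {alternative}")

# Hypothetical data, not taken from the slides.
z = z_statistic(xbar=10.4, sigma=1.5, n=36, mu0=10.0)
print(round(z, 3), round(z_test_p_value(z, "two-sided"), 4))
```

The two one-sided p-values always sum to 1, and the two-sided p-value is twice the smaller of them.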