Econ 3780: Business and Economics Statistics


Econ 3790: Business and
Economic Statistics
Instructor: Yogesh Uppal
Email: [email protected]
Chapter 11
Inferences About Population Variances

Inference about a Population Variance



Chi-Square Distribution
Interval Estimation of $\sigma^2$
Hypothesis Testing
Chi-Square Distribution

We will use the notation $\chi^2_\alpha$ to denote the value of
the chi-square distribution that provides an area of $\alpha$
to the right of the stated value.

For example, the chi-square value with 5 degrees of
freedom (df) at $\alpha$ = 0.05 is $\chi^2_{.05}$ = 11.07.
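The value quoted above can be checked numerically. The following is a minimal sketch, assuming only the Python standard library; the helper names `chi2_cdf` and `chi2_upper_critical` are illustrative and not part of the course material. It evaluates the chi-square CDF with a series for the regularized incomplete gamma function and inverts it by bisection.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_upper_critical(alpha, df):
    """Value with area alpha to its right, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > alpha:  # widen until the upper tail is below alpha
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi2_upper_critical(0.05, 5), 2))  # 5 d.f., upper-tail area .05
```

In practice a statistics library or a printed table would normally be used; the hand-rolled series here only keeps the sketch dependency-free.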
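Interval Estimation of $\sigma^2$
[Figure: chi-square distribution with 5 d.f. 95% of the possible $\chi^2$ values lie between 0 and $\chi^2_{.05}$ = 11.07, leaving an area of .05 in the upper tail.]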
Interval Estimation of $\sigma^2$

Interval Estimate of a Population Variance
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{(1-\alpha/2)}}$$
where the $\chi^2$ values are based on a chi-square
distribution with n - 1 degrees of freedom and
1 - $\alpha$ is the confidence coefficient.
Interval Estimation of $\sigma$

Interval Estimate of a Population Standard
Deviation
$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{(1-\alpha/2)}}}$$
Taking the square root of the upper and lower
limits of the variance interval provides the confidence
interval for the population standard deviation.
Interval Estimation of $\sigma^2$

Example: Buyer’s Digest (A): Buyer’s Digest rates
thermostats manufactured for home temperature
control. In a recent test, 10 thermostats
manufactured by ThermoRite were selected and
placed in a test room that was maintained at a
temperature of 68°F. The temperature readings of the
ten thermostats are shown on the next slide.
Interval Estimation of $\sigma^2$

Example: Buyer’s Digest (A)
We will use the 10 readings below to
develop a 95% confidence interval
estimate of the population variance.

Thermostat      1     2     3     4     5     6     7     8     9    10
Temperature  67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2
Interval Estimation of $\sigma^2$
For n - 1 = 10 - 1 = 9 d.f. and $\alpha$ = .05

Selected Values from the Chi-Square Distribution Table

Degrees of                       Area in Upper Tail
Freedom        .99    .975     .95     .90     .10     .05    .025     .01
5            0.554   0.831   1.145   1.610   9.236  11.070  12.832  15.086
6            0.872   1.237   1.635   2.204  10.645  12.592  14.449  16.812
7            1.239   1.690   2.167   2.833  12.017  14.067  16.013  18.475
8            1.647   2.180   2.733   3.490  13.362  15.507  17.535  20.090
9            2.088   2.700   3.325   4.168  14.684  16.919  19.023  21.666
10           2.558   3.247   3.940   4.865  15.987  18.307  20.483  23.209

Our $\chi^2_{.975}$ value (9 d.f.): 2.700
Interval Estimation of $\sigma^2$

The sample variance $s^2$ provides a point
estimate of $\sigma^2$.
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{6.3}{9} = .70$$

A 95% confidence interval for the
population variance is given by:
$$\frac{(10-1)(.70)}{19.02} \le \sigma^2 \le \frac{(10-1)(.70)}{2.70}$$
$$.33 \le \sigma^2 \le 2.33$$
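The interval arithmetic can be reproduced in a few lines. This is a minimal sketch, assuming the ten thermostat readings from the example and critical values read from the chi-square table for 9 d.f. (they are hard-coded, not computed); the variable names are illustrative.

```python
import math
import statistics

temps = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
n = len(temps)
s2 = statistics.variance(temps)  # sample variance, divides by n - 1

# Critical values for 9 d.f., taken from the chi-square table (assumed inputs).
chi2_upper = 19.023  # area .025 in the upper tail
chi2_lower = 2.700   # area .975 in the upper tail

# Variance interval: divide (n - 1) s^2 by the larger, then the smaller, value.
var_low = (n - 1) * s2 / chi2_upper
var_high = (n - 1) * s2 / chi2_lower

# Standard deviation interval: square roots of the variance limits.
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"s^2 = {s2:.2f}")
print(f"95% CI for the variance: ({var_low:.2f}, {var_high:.2f})")
print(f"95% CI for the standard deviation: ({sd_low:.2f}, {sd_high:.2f})")
```

Note that the larger critical value produces the lower limit, because it appears in the denominator.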
Hypothesis Testing about a
Population Variance

Left-Tailed Test
•Hypotheses
H0: $\sigma^2 \ge \sigma_0^2$
Ha: $\sigma^2 < \sigma_0^2$
where $\sigma_0^2$ is the hypothesized value
for the population variance
•Test Statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
Hypothesis Testing
About a Population Variance

Left-Tailed Test (continued)
•Rejection Rule
Critical value approach: Reject H0 if $\chi^2 < \chi^2_{(1-\alpha)}$
p-Value approach: Reject H0 if p-value < $\alpha$
where $\chi^2_{(1-\alpha)}$ is based on a chi-square
distribution with n - 1 d.f.
Hypothesis Testing
About a Population Variance

Right-Tailed Test
•Hypotheses
H0: $\sigma^2 \le \sigma_0^2$
Ha: $\sigma^2 > \sigma_0^2$
where $\sigma_0^2$ is the hypothesized value
for the population variance
•Test Statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
Hypothesis Testing
About a Population Variance

Right-Tailed Test (continued)
•Rejection Rule
Critical value approach: Reject H0 if $\chi^2 > \chi^2_\alpha$
p-Value approach: Reject H0 if p-value < $\alpha$
where $\chi^2_\alpha$ is based on a chi-square
distribution with n - 1 d.f.
Hypothesis Testing
About a Population Variance

Two-Tailed Test
•Hypotheses
H0: $\sigma^2 = \sigma_0^2$
Ha: $\sigma^2 \ne \sigma_0^2$
where $\sigma_0^2$ is the hypothesized value
for the population variance
•Test Statistic
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
Hypothesis Testing
About a Population Variance

Two-Tailed Test (continued)
•Rejection Rule
Critical value approach:
Reject H0 if $\chi^2 < \chi^2_{(1-\alpha/2)}$ or $\chi^2 > \chi^2_{\alpha/2}$
p-Value approach:
Reject H0 if p-value < $\alpha$
where $\chi^2_{(1-\alpha/2)}$ and $\chi^2_{\alpha/2}$ are based on a
chi-square distribution with n - 1 d.f.
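The three critical value rules can be collected in one small helper. This is a sketch only: the function name `reject_h0` and its arguments are hypothetical, the critical values must be supplied from a chi-square table with n - 1 d.f., and the statistic 20.5 in the first usage line is made up for illustration.

```python
def reject_h0(chi2_stat, tail, lower_crit=None, upper_crit=None):
    """Apply the critical value rejection rule for a test about a variance.

    tail is "left", "right", or "two". lower_crit is the value with area
    1 - alpha (or 1 - alpha/2) to its right; upper_crit is the value with
    area alpha (or alpha/2) to its right.
    """
    if tail == "left":
        return chi2_stat < lower_crit
    if tail == "right":
        return chi2_stat > upper_crit
    if tail == "two":
        return chi2_stat < lower_crit or chi2_stat > upper_crit
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Two-tailed test, alpha = .05, 9 d.f.: table values 2.700 and 19.023.
print(reject_h0(20.5, "two", lower_crit=2.700, upper_crit=19.023))
# Right-tailed test, alpha = .10, 9 d.f.: table value 14.684.
print(reject_h0(12.6, "right", upper_crit=14.684))
```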
Hypothesis Testing
About a Population Variance
Example: Buyer’s Digest (B): Recall that Buyer’s
Digest is rating ThermoRite thermostats. Buyer’s
Digest gives an “acceptable” rating to a
thermostat with a temperature variance of 0.5 or
less.
We will conduct a hypothesis test (with $\alpha$ = .10) to
determine whether the ThermoRite thermostat’s
temperature variance is “acceptable”.
Hypothesis Testing
About a Population Variance

Example: Buyer’s Digest (B)
Using the 10 readings, we will conduct a hypothesis
test (with $\alpha$ = .10) to determine whether the
ThermoRite thermostat’s temperature variance is
“acceptable”.

Thermostat      1     2     3     4     5     6     7     8     9    10
Temperature  67.4  67.8  68.2  69.3  69.5  67.0  68.1  68.6  67.9  67.2
Hypothesis Testing
About a Population Variance
Hypotheses


H0: $\sigma^2 \le 0.5$
Ha: $\sigma^2 > 0.5$
Rejection Rule
Reject H0 if $\chi^2$ > 14.684
Hypothesis Testing About a Population
Variance
For n - 1 = 10 - 1 = 9 d.f. and $\alpha$ = .10

Selected Values from the Chi-Square Distribution Table

Degrees of                       Area in Upper Tail
Freedom        .99    .975     .95     .90     .10     .05    .025     .01
5            0.554   0.831   1.145   1.610   9.236  11.070  12.832  15.086
6            0.872   1.237   1.635   2.204  10.645  12.592  14.449  16.812
7            1.239   1.690   2.167   2.833  12.017  14.067  16.013  18.475
8            1.647   2.180   2.733   3.490  13.362  15.507  17.535  20.090
9            2.088   2.700   3.325   4.168  14.684  16.919  19.023  21.666
10           2.558   3.247   3.940   4.865  15.987  18.307  20.483  23.209

Our $\chi^2_{.10}$ value (9 d.f.): 14.684
Hypothesis Testing
About a Population Variance

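Rejection Region
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9s^2}{.5}$$
[Figure: chi-square distribution with 9 d.f. The rejection region is the upper tail of area .10, to the right of 14.684; reject H0 if $\chi^2$ falls in that region.]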
Hypothesis Testing
About a Population Variance
The sample variance $s^2$ = 0.7

Test Statistic
$$\chi^2 = \frac{9(.7)}{.5} = 12.6$$

Conclusion
Because $\chi^2$ = 12.6 is less than 14.684, we cannot
reject H0. The sample variance $s^2$ = .7 is insufficient
evidence to conclude that the temperature variance
for ThermoRite thermostats is unacceptable.
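The same test can be run end to end in code. This is a minimal sketch using only the Python standard library: the critical value 14.684 is read from the table, while the p-value is an addition not shown on the slides, computed by an illustrative helper `chi2_upper_tail` (a series for the regularized incomplete gamma function).

```python
import math
import statistics


def chi2_upper_tail(x, df):
    """Area to the right of x under a chi-square curve with df d.f."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    cdf = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return 1.0 - cdf


temps = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
sigma0_sq = 0.5     # hypothesized variance ("acceptable" if 0.5 or less)
alpha = 0.10
critical = 14.684   # chi-square table, 9 d.f., upper-tail area .10

n = len(temps)
s2 = statistics.variance(temps)
chi2_stat = (n - 1) * s2 / sigma0_sq       # right-tailed test statistic
p_value = chi2_upper_tail(chi2_stat, n - 1)

reject = chi2_stat > critical              # critical value approach
print(f"chi-square = {chi2_stat:.1f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Both approaches agree here: the statistic is below the critical value and the p-value exceeds $\alpha$ = .10, so H0 is not rejected.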
Chapter 13, Part A: Analysis of Variance and
Experimental Design

Introduction to Analysis of Variance

Analysis of Variance: Testing for the
Equality of k Population Means
Introduction to Analysis of Variance
Analysis of Variance (ANOVA) can be used to test
for the equality of three or more population means.
We want to use the sample results to test the
following hypotheses:
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
Ha: Not all population means are equal
Introduction to Analysis of Variance
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
Ha: Not all population means are equal
If H0 is rejected, we cannot conclude that all
population means are different.
Rejecting H0 means that at least two population
means have different values.
Assumptions for Analysis of Variance
For each population, the response variable is
normally distributed.
The variance of the response variable, denoted $\sigma^2$,
is the same for all of the populations.
The observations must be independent.
Test for the Equality of k Population Means

Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
Ha: Not all population means are equal

Test Statistic
F = MSTR/MSE
Between-Treatments Estimate
of Population Variance

A between-treatment estimate of $\sigma^2$ is
called the mean square treatment and is
denoted MSTR.
$$MSTR = \frac{SSTR}{k-1}$$
The numerator is the sum of squares due to treatments and is denoted SSTR.
The denominator represents the degrees of freedom associated with SSTR.
Within-Samples Estimate
of Population Variance
The estimate of $\sigma^2$ based on the variation
of the sample observations within each
sample is called the mean square error
and is denoted by MSE.
$$MSE = \frac{SSE}{n_T - k}$$
The numerator is the sum of squares due to error and is denoted SSE.
The denominator represents the degrees of freedom associated with SSE.
Test for the Equality of k Population Means

k: the number of populations being compared.
nT: the total number of observations.

Rejection Rule

Reject H0 if F > $F_\alpha$
where the value of $F_\alpha$ is based on an
F distribution with k - 1 numerator d.f.
and nT - k denominator d.f.
Hypothesis Testing About the
Variances of Two Populations
Selected Values from the F Distribution Table

Denominator   Area in
Degrees of    Upper          Numerator Degrees of Freedom
Freedom       Tail         7       8       9      10      15
8             .10       2.62    2.59    2.56    2.54    2.46
              .05       3.50    3.44    3.39    3.35    3.22
              .025      4.53    4.43    4.36    4.30    4.10
              .01       6.18    6.03    5.91    5.81    5.52
9             .10       2.51    2.47    2.44    2.42    2.34
              .05       3.29    3.23    3.18    3.14    3.01
              .025      4.20    4.10    4.03    3.96    3.77
              .01       5.61    5.47    5.35    5.26    4.96
Comparing the Variance Estimates:
The F Test
• If the null hypothesis is true and the ANOVA
assumptions are valid, the sampling distribution of
MSTR/MSE is an F distribution with MSTR d.f.
equal to k - 1 and MSE d.f. equal to nT - k.
• If the means of the k populations are not equal, the
value of MSTR/MSE will be inflated because MSTR
overestimates $\sigma^2$.
• Hence, we will reject H0 if the resulting value of
MSTR/MSE appears to be too large to have been
selected at random from the appropriate F
distribution.
ANOVA Table

Source of    Sum of     Degrees of    Mean
Variation    Squares    Freedom       Squares    F
Treatment    SSTR       k - 1         MSTR       MSTR/MSE
Error        SSE        nT - k        MSE
Total        SST        nT - 1

SST is partitioned into SSTR and SSE.
SST’s degrees of freedom (d.f.) are partitioned into
SSTR’s d.f. and SSE’s d.f.
ANOVA Table
SST divided by its degrees of freedom nT – 1 is the
overall sample variance that would be obtained if we
treated the entire set of observations as one data set.
With the entire data set as one sample, the formula
for computing the total sum of squares, SST, is:
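$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = SSTR + SSE$$
$$SSTR = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2$$
$$SSE = \sum_{j=1}^{k} (n_j - 1) s_j^2$$
where $x_{ij}$ is observation i in sample j, $n_j$, $\bar{x}_j$, and $s_j^2$ are the size, mean, and variance of sample j, and $\bar{\bar{x}}$ is the overall mean of all nT observations.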
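The partition SST = SSTR + SSE can be verified on a small data set. The sketch below uses made-up numbers (three hypothetical samples of unequal size, not from the slides) and the standard library only; the variable names are illustrative.

```python
import statistics

# Hypothetical data: k = 3 samples of unequal size (illustrative values only).
samples = [[4.0, 6.0, 8.0], [10.0, 12.0], [1.0, 3.0, 5.0, 7.0]]

all_obs = [x for sample in samples for x in sample]
grand_mean = statistics.fmean(all_obs)  # overall mean of all nT observations

# Total sum of squares: every observation against the overall mean.
sst = sum((x - grand_mean) ** 2 for x in all_obs)

# Treatment sum of squares: each sample mean against the overall mean,
# weighted by the sample size n_j.
sstr = sum(len(s) * (statistics.fmean(s) - grand_mean) ** 2 for s in samples)

# Error sum of squares: (n_j - 1) times each sample variance.
sse = sum((len(s) - 1) * statistics.variance(s) for s in samples)

print(f"SST = {sst:.4f}, SSTR + SSE = {sstr + sse:.4f}")
```

With unequal sample sizes the overall mean is the mean of all observations pooled together, not the average of the sample means.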
ANOVA Table
ANOVA can be viewed as the process of partitioning
the total sum of squares and the degrees of freedom
into their corresponding sources: treatments and error.
Dividing the sum of squares by the appropriate
degrees of freedom provides the variance estimates
and the F value used to test the hypothesis of equal
population means.
Test for the Equality of k Population Means

Example: Reed Manufacturing
Janet Reed would like to know if
there is any significant difference in
the mean number of hours worked per
week for the department managers
at her three manufacturing plants
(in Buffalo, Pittsburgh, and Detroit).
Test for the Equality of k Population Means

Example: Reed Manufacturing
A simple random sample of five
managers from each of the three plants
was taken and the number of hours
worked by each manager for the
previous week is shown on the next
slide.
Conduct an F test using $\alpha$ = .05.
Test for the Equality of k Population Means

                  Plant 1    Plant 2       Plant 3
Observation       Buffalo    Pittsburgh    Detroit
1                 48         73            51
2                 54         63            63
3                 57         66            61
4                 54         64            54
5                 62         74            56
Sample Mean       55         68            57
Sample Variance   26.0       26.5          24.5
Test for the Equality of k Population Means
p-Value and Critical Value Approaches
1. Develop the hypotheses.
H0: $\mu_1 = \mu_2 = \mu_3$
Ha: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per
week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per
week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per
week by the managers at Plant 3
Test for the Equality of k Population Means
Compute the test statistic using the ANOVA table.

Source of    Sum of     Degrees of    Mean
Variation    Squares    Freedom       Squares    F
Treatment    490        2             245        9.5
Error        308        12            25.67
Total        798        14
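The entries of this ANOVA table can be recomputed from the raw Reed Manufacturing data. This is a minimal sketch with the standard library only; the dictionary layout and variable names are illustrative choices, not part of the slides.

```python
import statistics

# Hours worked by the five sampled managers at each plant.
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

k = len(hours)                                 # number of populations
n_t = sum(len(v) for v in hours.values())      # total number of observations
all_obs = [x for v in hours.values() for x in v]
grand_mean = statistics.fmean(all_obs)

sstr = sum(len(v) * (statistics.fmean(v) - grand_mean) ** 2 for v in hours.values())
sse = sum((len(v) - 1) * statistics.variance(v) for v in hours.values())
sst = sum((x - grand_mean) ** 2 for x in all_obs)

mstr = sstr / (k - 1)      # between-treatments estimate of the variance
mse = sse / (n_t - k)      # within-samples estimate of the variance
f_stat = mstr / mse

print(f"SSTR = {sstr:.0f}, SSE = {sse:.0f}, SST = {sst:.0f}")
print(f"MSTR = {mstr:.0f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
```

Carried to two decimals the statistic is slightly above the rounded 9.5 shown in the table; the conclusion of the test does not depend on that rounding.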
Test for the Equality of k Population Means
Critical Value Approach
4. Compute the critical value.
With 2 numerator d.f. and 12 denominator d.f.,
$F_\alpha$ = $F_{.05}$ = 3.89.
5. Determine whether to reject H0.
Because F = 9.5 > $F_\alpha$ = 3.89, we reject H0.
We have sufficient evidence to conclude that the
mean number of hours worked per week by
department managers is not the same at all 3 plants.
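A p-value gives the same decision. The sketch below is an addition to the slides and relies on a special case: when the numerator has exactly 2 d.f., the upper-tail area of the F distribution has the closed form (1 + 2F/d2)^(-d2/2), where d2 is the denominator d.f. It does not apply to other numerator d.f.; the variable names are illustrative.

```python
# Values taken from the Reed Manufacturing ANOVA table.
mstr = 245.0
mse = 308.0 / 12.0
f_stat = mstr / mse

alpha = 0.05
f_critical = 3.89   # F table value for 2 and 12 d.f. at alpha = .05
d2 = 12             # denominator d.f., nT - k

# Upper-tail area of an F distribution with 2 numerator d.f. (closed form
# valid only for this special case).
p_value = (1.0 + 2.0 * f_stat / d2) ** (-d2 / 2.0)

print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")
print("Reject H0 (critical value approach):", f_stat > f_critical)
print("Reject H0 (p-value approach):", p_value < alpha)
```

Plugging the critical value 3.89 into the same formula returns an area of about .05, which is a quick consistency check on the table entry.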