
ANOVA
ANOVA - What is it?
Analysis of variance: a method for splitting the total variation of a data set into meaningful components that measure different sources of variation.
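A minimal Python sketch of this splitting, using made-up numbers (the `groups` data are hypothetical and not from these slides): the total sum of squares about the grand mean equals the between-group part plus the within-group part.

```python
# Hypothetical data: two groups of three observations each (not from the slides).
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

values = [x for g in groups for x in g]
grand_mean = sum(values) / len(values)

# Total variation: squared deviations of every observation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in values)

# Between-group variation: group means around the grand mean, weighted by group size.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-group (error) variation: observations around their own group mean.
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(ss_total, ss_between, ss_within)  # prints 17.5 13.5 4.0
```

Here 13.5 + 4.0 = 17.5, so the two components account for all of the total variation.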
One-Way Classification (equal samples)
Assumptions
• Random samples of size n are selected from each of the k populations.
• The k populations are independent and normally distributed with means $\mu_1, \mu_2, \dots, \mu_k$ and common variance $\sigma^2$.
Ho and Ha
Ho: $\mu_1 = \mu_2 = \dots = \mu_k$
Ha: at least two of the means are not equal
Critical Region and ANOVA Table [equal samples]
$f > f_\alpha[k - 1,\; k(n - 1)]$
Computational Formulas
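For reference, a commonly used set of computational formulas for the equal-sample one-way design can be sketched as follows. The notation is an assumption chosen to match the rest of these notes: $x_{ij}$ is the j-th observation in sample i, $T_{i.}$ is the total of sample i, and $T_{..}$ is the grand total of all nk observations.

```latex
\begin{align*}
SST &= \sum_{i=1}^{k}\sum_{j=1}^{n} x_{ij}^2 - \frac{T_{..}^2}{nk} \\
SSC &= \frac{1}{n}\sum_{i=1}^{k} T_{i.}^2 - \frac{T_{..}^2}{nk} \\
SSE &= SST - SSC \\
f &= \frac{SSC/(k-1)}{SSE/[k(n-1)]}
\end{align*}
```

The degrees of freedom in the f ratio, k - 1 and k(n - 1), are the same ones that appear in the critical region.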
EXAMPLE
A company has three manufacturing plants, and company officials want to determine whether there is a difference in the average age of workers at the three locations. The following data are the ages of five randomly selected workers at each plant. Perform a one-way ANOVA to determine whether there is a significant difference in the mean ages of the workers at the three plants. Use a 0.01 level of significance.
Between Groups = Column Means
Within Groups = Error
Critical Region and ANOVA Table [unequal samples]
If the sample sizes for the k populations are $n_1, n_2, \dots, n_k$, then the critical region is given by
$f > f_\alpha[k - 1,\; N - k]$, where $N = \sum_{i=1}^{k} n_i$.
Computational Formulas
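With unequal sample sizes, a commonly used form of the computational formulas is sketched below. As an assumption about notation, $x_{ij}$ is the j-th observation in sample i, $T_{i.}$ is the total of sample i (which has $n_i$ observations), and $T_{..}$ is the grand total of all N observations.

```latex
\begin{align*}
SST &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}^2 - \frac{T_{..}^2}{N} \\
SSC &= \sum_{i=1}^{k} \frac{T_{i.}^2}{n_i} - \frac{T_{..}^2}{N} \\
SSE &= SST - SSC \\
f &= \frac{SSC/(k-1)}{SSE/(N-k)}
\end{align*}
```

When every $n_i$ equals n, N = nk and these reduce to the equal-sample case.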
Example
It is suspected that higher-priced automobiles are assembled with greater care than lower-priced automobiles. To investigate whether there is any basis for this feeling, a large luxury model A, a medium-size sedan B, and a subcompact hatchback C were compared for defects when they arrived at the dealer's showroom. All cars were manufactured by the same company. The number of defects was recorded for several cars of each of the three models. Test the hypothesis at the 0.05 level of significance that the average number of defects is the same for the three models.
                MODEL
          A     B     C
          4     5     8
          7     1     6
          6     3     8
          6     5     9
                3     5
                4
TOTAL    23    21    36    80
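As a hedged sketch of how this example can be worked, the Python below applies the usual unequal-sample sums of squares to the defect counts listed above. The variable names are illustrative, and the critical value is not computed here; it still has to be read from an F table.

```python
# Defect counts transcribed from the table above.
defects = {
    "A": [4, 7, 6, 6],
    "B": [5, 1, 3, 5, 3, 4],
    "C": [8, 6, 8, 9, 5],
}

k = len(defects)                                   # number of models
N = sum(len(v) for v in defects.values())          # total number of cars
grand_total = sum(sum(v) for v in defects.values())
correction = grand_total ** 2 / N                  # T..^2 / N

# Total, between-model (column) and error sums of squares.
sst = sum(x ** 2 for v in defects.values() for x in v) - correction
ssc = sum(sum(v) ** 2 / len(v) for v in defects.values()) - correction
sse = sst - ssc

msc = ssc / (k - 1)      # mean square for models, k - 1 degrees of freedom
mse = sse / (N - k)      # mean square error, N - k degrees of freedom
f_stat = msc / mse

print(f"SSC={ssc:.4f} SSE={sse:.4f} f={f_stat:.2f}")
# prints SSC=38.2833 SSE=27.0500 f=8.49
```

With 2 and 12 degrees of freedom, a standard F table gives a 0.05 critical value of about 3.89. The computed f of about 8.49 therefore falls in the critical region, which suggests rejecting the hypothesis that the three models have the same average number of defects.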
EXAMPLE
A milk company has four machines that fill gallon jugs with milk. The quality control manager is interested in determining whether the average fill for these machines is the same. The following data represent random samples of fill measures (in quarts) for 19 jugs of milk filled by the different machines. Use $\alpha = 0.01$ to test the hypotheses. Discuss the business implications of your findings.
MACHINE 1   MACHINE 2   MACHINE 3   MACHINE 4
4.05        3.99        3.97        4.00
4.01        4.02        3.98        4.02
4.02        4.01        3.97        3.99
4.04        3.99        3.95        4.01
            4.00        4.00
            4.00
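Tukey's Honestly Significant Difference (HSD) Test
Equal Samples
$HSD = q_{\alpha,\,k,\,k(n-1)} \sqrt{\dfrac{MSE}{n}}$
Unequal Samples
$HSD = q_{\alpha,\,k,\,N-k} \sqrt{\dfrac{MSE}{2}\left(\dfrac{1}{n_r} + \dfrac{1}{n_s}\right)}$
If $|\bar{x}_r - \bar{x}_s| > HSD$, then $\mu_r$ is significantly different from $\mu_s$.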
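The decision rule can be sketched in Python as follows. The function names `tukey_hsd` and `differs` are hypothetical, the inputs are made-up numbers, and the studentized range value q is assumed to come from a printed table rather than being computed.

```python
import math

def tukey_hsd(q, mse, n_r, n_s):
    """HSD for comparing samples r and s.

    q is the studentized range value looked up in a table for the chosen
    alpha, the number of samples k, and the error degrees of freedom.
    """
    return q * math.sqrt((mse / 2.0) * (1.0 / n_r + 1.0 / n_s))

def differs(mean_r, mean_s, hsd):
    """True when the two sample means differ by more than HSD."""
    return abs(mean_r - mean_s) > hsd

# Hypothetical inputs, for illustration only (not taken from the slides).
q_table = 4.0
mse = 2.0
hsd_equal = tukey_hsd(q_table, mse, 4, 4)
print(round(hsd_equal, 3))               # prints 2.828
print(differs(10.0, 6.0, hsd_equal))     # prints True
print(differs(10.0, 8.0, hsd_equal))     # prints False
```

When the two sample sizes are equal to n, the term (MSE/2)(1/n + 1/n) simplifies to MSE/n, so one function covers both the equal-sample and unequal-sample formulas.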
Example (Milk)
ROWS                    COLUMNS                     TOTAL   MEANS
         1      2      ⋯     j      ⋯     c
1        x11    x12    ⋯     x1j    ⋯     x1c       T1.     x̄1.
2        x21    x22    ⋯     x2j    ⋯     x2c       T2.     x̄2.
⋮        ⋮      ⋮            ⋮            ⋮         ⋮       ⋮
i        xi1    xi2    ⋯     xij    ⋯     xic       Ti.     x̄i.
⋮        ⋮      ⋮            ⋮            ⋮         ⋮       ⋮
r        xr1    xr2    ⋯     xrj    ⋯     xrc       Tr.     x̄r.
TOTAL    T.1    T.2    ⋯     T.j    ⋯     T.c       T..
MEAN     x̄.1    x̄.2    ⋯     x̄.j    ⋯     x̄.c               x̄..
Two-Way ANOVA (without replication)
We wish to test the following hypotheses:
• Ho: The row means are all equal
  H1: The row means are significantly different
• Ho: The column means are all equal
  H1: The column means are significantly different
Computational Formulas
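$SST = \sum_{i=1}^{r}\sum_{j=1}^{c} x_{ij}^2 - \dfrac{T_{..}^2}{rc}$
$SSR = \dfrac{1}{c}\sum_{i=1}^{r} T_{i.}^2 - \dfrac{T_{..}^2}{rc}$
$SSC = \dfrac{1}{r}\sum_{j=1}^{c} T_{.j}^2 - \dfrac{T_{..}^2}{rc}$
$SSE = SST - SSC - SSR$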
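A minimal Python sketch of these sums of squares follows. The function name `two_way_anova` and the 2 x 2 data are hypothetical. The f ratios use the usual degrees of freedom r - 1, c - 1 and (r - 1)(c - 1), which is an assumption because the formulas here stop at the sums of squares.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication.

    table[i][j] is the single observation in row i, column j.
    Assumes SSE > 0, otherwise the f ratios are undefined.
    """
    r, c = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (r * c)          # T..^2 / (rc)

    sst = sum(x ** 2 for row in table for x in row) - correction
    # Row totals T_i. squared, divided by the number of columns.
    ssr = sum(sum(row) ** 2 for row in table) / c - correction
    # Column totals T_.j squared, divided by the number of rows.
    ssc = sum(sum(row[j] for row in table) ** 2 for j in range(c)) / r - correction
    sse = sst - ssc - ssr

    mse = sse / ((r - 1) * (c - 1))
    return {
        "SST": sst, "SSR": ssr, "SSC": ssc, "SSE": sse,
        "f_rows": (ssr / (r - 1)) / mse,
        "f_cols": (ssc / (c - 1)) / mse,
    }

# Hypothetical 2 x 2 table, chosen so the arithmetic is easy to check by hand.
result = two_way_anova([[2, 4], [6, 12]])
print(result)
```

For this table the grand total is 24, so SST = 200 - 144 = 56, SSR = 180 - 144 = 36, SSC = 160 - 144 = 16 and SSE = 4, giving f ratios of 9 for rows and 4 for columns. Each f ratio is then compared with the tabled critical value for its own degrees of freedom.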
Example 4
• The yields of three types of wheat using four different kinds of fertilizer were recorded and are shown on the next page:
• Test the hypothesis at the 0.05 level of significance that there is no difference in the average yield of wheat when different kinds of fertilizer are used. Also, test the hypothesis that there is no difference in the average yield of the three varieties of wheat.
Two-Way ANOVA (with replication)
We wish to test the following hypotheses:
• Ho: The row means are all equal
  H1: The row means are significantly different
• Ho: The column means are all equal
  H1: The column means are significantly different
• Ho: There is no significant interaction effect.
  H1: There is a significant interaction effect.
Computational Formulas
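One way to sketch the computations in Python is shown below. It uses the standard sum-of-squares decomposition for a two-way layout with n replicates per cell, which is an assumption about what these formulas contain. The function name `two_way_anova_replicated` and the data are hypothetical.

```python
def two_way_anova_replicated(cells):
    """Two-way ANOVA with replication.

    cells[i][j] is the list of n replicate observations for row i, column j.
    Assumes every cell has the same n > 1 and that SSE > 0.
    """
    r, c, n = len(cells), len(cells[0]), len(cells[0][0])
    all_values = [x for row in cells for cell in row for x in cell]
    correction = sum(all_values) ** 2 / (r * c * n)

    # Squared row totals, column totals and cell totals, each divided by
    # the number of observations that make up one such total.
    row_term = sum(sum(sum(cell) for cell in row) ** 2 for row in cells) / (c * n)
    col_term = sum(
        sum(sum(cells[i][j]) for i in range(r)) ** 2 for j in range(c)
    ) / (r * n)
    cell_term = sum(sum(cell) ** 2 for row in cells for cell in row) / n

    sst = sum(x ** 2 for x in all_values) - correction
    ssr = row_term - correction
    ssc = col_term - correction
    ss_inter = cell_term - row_term - col_term + correction
    sse = sst - ssr - ssc - ss_inter

    mse = sse / (r * c * (n - 1))
    return {
        "SSR": ssr, "SSC": ssc, "SS_interaction": ss_inter, "SSE": sse,
        "f_rows": (ssr / (r - 1)) / mse,
        "f_cols": (ssc / (c - 1)) / mse,
        "f_interaction": (ss_inter / ((r - 1) * (c - 1))) / mse,
    }

# Hypothetical 2 x 2 layout with two replicates per cell.
data = [[[1, 3], [2, 4]], [[5, 7], [10, 12]]]
out = two_way_anova_replicated(data)
print(out)
```

For this made-up layout the sums of squares are 72 for rows, 18 for columns, 8 for interaction and 8 for error, so the three f ratios are 36, 9 and 4. Each is compared with the tabled critical value for its own numerator degrees of freedom and rc(n - 1) error degrees of freedom.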
Example 5
• Aside from testing the difference in the yields according to fertilizer and variety of wheat, determine whether there is a significant interaction effect between the two variables, given the following data set. Use a 0.05 level of significance.
ANOVA : PLBautista