Transcript Chapter 11

Chapter 11
Analysis of
Variance and
Chi-Square
Applications
© 2002 Thomson / South-Western
Slide 11-1
Learning Objectives
• Understand the differences between
various experimental designs and when
to use them.
• Compute and interpret the results of a
one-way ANOVA.
• Compute and interpret the results of a
random block design.
Slide 11-2
Learning Objectives, continued
• Compute and interpret the results of
a two-way ANOVA.
• Understand and interpret interaction.
• Understand the chi-square goodness-of-fit test and how to use it.
• Analyze data by using the chi-square
test of independence.
Slide 11-3
Introduction to Design
of Experiments
• An Experimental Design is a plan and a
structure to test hypotheses in which the
business analyst controls or manipulates
one or more variables. It contains
independent and dependent variables.
• Factors is another name for the
independent variables of an experimental
design.
Slide 11-4
Design of Experiments, continued
• Treatment variable is the independent
variable that the experimenter either
controls or modifies.
• Classification variable is the
independent variable that was present
prior to the experiment, and is not a
result of the experimenter’s
manipulations or control.
Slide 11-5
Design of Experiments, continued
• Levels or Classifications are the
subcategories of the independent
variable used by the business analyst in
the experimental design.
• The Dependent Variable is the
response to the different levels of the
independent variables.
Slide 11-6
Three Types
of Experimental Designs
• Completely Randomized Design
• Randomized Block Design
• Factorial Experiments
Slide 11-7
Completely Randomized Design
Independent variable: Machine Operator (levels 1, 2, 3)
Dependent variable: Valve Opening Measurements (one column of observations for each operator)
Slide 11-8
Example: Number of Foreign Freighters
Docking in each Port per Day
Long Beach   Houston   New York   New Orleans
     5          2          8           3
     7          3          4           5
     4          5          6           3
     2          4          7           4
                6          9           2
                           8
Slide 11-9
Analysis of Variance
(ANOVA): Assumptions
• Observations are drawn from normally
distributed populations.
• Observations represent random
samples from the populations.
• Variances of the populations are equal.
Slide 11-10
One-Way ANOVA:
Procedural Overview
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k \]
$H_a$: At least one of the means is different from the others
\[ F = \frac{MSC}{MSE} \]
If $F > F_c$, reject $H_0$.
If $F \le F_c$, do not reject $H_0$.
Slide 11-11
Partitioning Total Sum
of Squares of Variation
SST (Total Sum of Squares) is partitioned into:
SSC (Treatment Sum of Squares)
SSE (Error Sum of Squares)
Slide 11-12
One-Way ANOVA:
Sums of Squares Definitions
Total sum of squares = between (treatment) sum of squares + error sum of squares
SST = SSC + SSE
\[ \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 = \sum_{j=1}^{C} n_j (\bar{X}_j - \bar{X})^2 + \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \]
where:
i = a particular member of a treatment level
j = a treatment level
C = number of treatment levels
$n_j$ = number of observations in a given treatment level
$\bar{X}$ = grand mean
$\bar{X}_j$ = mean of a treatment group or level
$X_{ij}$ = individual value
Slide 11-13
One-Way ANOVA:
Computational Formulas
\[ SSC = \sum_{j=1}^{C} n_j (\bar{X}_j - \bar{X})^2, \qquad df_C = C - 1 \]
\[ SSE = \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2, \qquad df_E = N - C \]
\[ SST = \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2, \qquad df_T = N - 1 \]
\[ MSC = \frac{SSC}{df_C}, \qquad MSE = \frac{SSE}{df_E}, \qquad F = \frac{MSC}{MSE} \]
where:
i = a particular member of a treatment level
j = a treatment level
C = number of treatment levels
$n_j$ = number of observations in a given treatment level
$\bar{X}$ = grand mean
$\bar{X}_j$ = column mean
$X_{ij}$ = individual value
Slide 11-14
Freighter One-Way ANOVA:
Preliminary Calculations
         Long Beach   Houston   New York   New Orleans
              5          2          8           3
              7          3          4           5
              4          5          6           3
              2          4          7           4
                         6          9           2
                                    8
T_j          18         20         42          17        T = 97
n_j           4          5          6           5        N = 20
Slide 11-15
Freighter One-Way ANOVA:
Sum of Squares Calculations
                 Long Beach   Houston   New York   New Orleans   Overall
$T_j$                18          20        42          17        T = 97
$n_j$                 4           5         6           5        N = 20
$\bar{X}_j$          4.5         4.0       7.0         3.4       $\bar{X}$ = 4.85
Slide 11-16
Freighter One-Way ANOVA:
Sum of Squares Calculations, continued
\[ SSC = \sum_{j=1}^{C} n_j (\bar{X}_j - \bar{X})^2 = 4(4.5 - 4.85)^2 + 5(4.0 - 4.85)^2 + 6(7.0 - 4.85)^2 + 5(3.4 - 4.85)^2 = 42.35 \]
\[ SSE = \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 = (5 - 4.5)^2 + (7 - 4.5)^2 + (4 - 4.5)^2 + (2 - 4.5)^2 + (2 - 4.0)^2 + (3 - 4.0)^2 + \cdots + (4 - 3.4)^2 + (2 - 3.4)^2 = 44.20 \]
\[ SST = \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 = (5 - 4.85)^2 + (7 - 4.85)^2 + (4 - 4.85)^2 + \cdots + (4 - 4.85)^2 + (2 - 4.85)^2 = 86.55 \]
Slide 11-17
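The sums of squares above can be checked with a short script. This is a minimal sketch in plain Python, not part of the original slides; the function name one_way_sums_of_squares and the dictionary layout are assumptions made for illustration. The data are the freighter counts from the example.

```python
# Freighter counts per port, copied from the example slides.
ports = {
    "Long Beach": [5, 7, 4, 2],
    "Houston": [2, 3, 5, 4, 6],
    "New York": [8, 4, 6, 7, 9, 8],
    "New Orleans": [3, 5, 3, 4, 2],
}


def one_way_sums_of_squares(groups):
    """Return (SSC, SSE, SST) for a dict of treatment level -> observations."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)
    ssc = 0.0  # between-treatment (column) sum of squares
    sse = 0.0  # within-treatment (error) sum of squares
    for values in groups.values():
        mean_j = sum(values) / len(values)
        ssc += len(values) * (mean_j - grand_mean) ** 2
        sse += sum((x - mean_j) ** 2 for x in values)
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    return ssc, sse, sst


ssc, sse, sst = one_way_sums_of_squares(ports)
print(f"SSC = {ssc:.2f}, SSE = {sse:.2f}, SST = {sst:.2f}")
```

The three values should agree with the slide (42.35, 44.20, 86.55), and SSC + SSE should equal SST up to floating-point rounding.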
Freighter One-Way ANOVA:
Mean Square and F Calculations
\[ df_C = C - 1 = 4 - 1 = 3 \]
\[ df_E = N - C = 20 - 4 = 16 \]
\[ df_T = N - 1 = 20 - 1 = 19 \]
\[ MSC = \frac{SSC}{df_C} = \frac{42.35}{3} = 14.12 \]
\[ MSE = \frac{SSE}{df_E} = \frac{44.20}{16} = 2.76 \]
\[ F = \frac{MSC}{MSE} = \frac{14.12}{2.76} = 5.12 \]
Slide 11-18
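As a cross-check on the rounding in the calculation above, the following sketch recomputes the mean squares and F from the slide's sums of squares without rounding the intermediate values. The variable names are illustrative, and the critical value 3.24 is the table value F(.05, 3, 16) quoted in this chapter, not something the code derives.

```python
# Sums of squares and sample sizes taken from the freighter example.
ssc, sse = 42.35, 44.20
c_levels, n_total = 4, 20

df_c = c_levels - 1          # treatment degrees of freedom
df_e = n_total - c_levels    # error degrees of freedom

msc = ssc / df_c
mse = sse / df_e
f_stat = msc / mse

f_crit = 3.24  # F(.05, 3, 16), read from the F table (assumed, not computed)
reject = f_stat > f_crit

print(f"MSC = {msc:.4f}, MSE = {mse:.4f}, F = {f_stat:.4f}, reject H0: {reject}")
```

Carrying full precision gives F of about 5.11, which is the value in the Excel output; the slide's 5.12 comes from dividing the rounded mean squares 14.12 and 2.76.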
Freighter Example:
Analysis of Variance
Source of Variance    df      SS       MS       F
Between Factor         3    42.35    14.12    5.12
Error                 16    44.20     2.76
Total                 19    86.55
Slide 11-19
A Portion of the F Table for α = 0.05
Denominator                        Numerator Degrees of Freedom
Degrees of
Freedom          1       2       3       4       5       6       7       8       9
1             161.45  199.50  215.71  224.58  230.16  233.99  236.77  238.88  240.54
...             ...     ...     ...     ...     ...     ...     ...     ...     ...
15              4.54    3.68    3.29    3.06    2.90    2.79    2.71    2.64    2.59
16              4.49    3.63    3.24    3.01    2.85    2.74    2.66    2.59    2.54
17              4.45    3.59    3.20    2.96    2.81    2.70    2.61    2.55    2.49
$F_{.05,3,16} = 3.24$ (numerator df = 3, denominator df = 16)
Slide 11-20
Freighter One-Way ANOVA:
Procedural Summary
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 \]
$H_a$: At least one of the means is different from the others
\[ \nu_1 = 3, \qquad \nu_2 = 16 \]
If $F > F_c = 3.24$, reject $H_0$.
If $F \le F_c = 3.24$, do not reject $H_0$.
Since $F = 5.12 > F_c = 3.24$, reject $H_0$.
[Figure: F distribution with the non-rejection region to the left and the rejection region to the right of the critical value $F_{.05,3,16} = 3.24$.]
Slide 11-21
Excel Output
for the Freighter Example
Anova: Single Factor

SUMMARY
Groups         Count    Sum    Average    Variance
Long Beach       4       18      4.5       4.3333
Houston          5       20      4         2.5
New York         6       42      7         3.2
New Orleans      5       17      3.4       1.3

ANOVA
Source of Variation      SS      df      MS        F       P-value    F crit
Between Groups         42.35      3    14.117    5.1101    0.0114     3.2389
Within Groups          44.2      16     2.7625
Total                  86.55     19
Slide 11-22
Multiple Comparison Tests
• An analysis of variance (ANOVA) test
is an overall test of differences among
groups.
• Multiple Comparison techniques are
used to identify which pairs of means
are significantly different given that the
ANOVA test reveals overall significance.
Slide 11-23
Randomized Block Design
• An experimental design in which there
is one independent variable, and a
second variable known as a blocking
variable, that is used to control for
confounding or concomitant variables.
• Confounding or concomitant variables
are variables that the business analyst
does not control but that can affect the
outcome of the treatment being studied.
Slide 11-24
Randomized Block Design, continued
• Blocking variable is a variable that the
business analyst wants to control but is
not the treatment variable of interest.
• Repeated measures design is a
randomized block design in which
each block level is an individual item
or person, and that person or item is
measured across all treatments.
Slide 11-25
Partitioning the Total Sum of Squares
in the Randomized Block Design
SST (total sum of squares) is partitioned into:
SSC (treatment sum of squares)
SSE (error sum of squares), which is further partitioned into:
SSR (sum of squares blocks)
SSE' (sum of squares error)
Slide 11-26
A Randomized Block Design
Layout: the columns are the levels of the single independent variable, the rows are the levels of the blocking variable, and each cell holds an individual observation.
Slide 11-27
Randomized Block Design Treatment
Effects: Procedural Overview
\[ H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k \]
$H_a$: At least one of the means is different from the others
\[ F = \frac{MSC}{MSE} \]
If $F > F_c$, reject $H_0$.
If $F \le F_c$, do not reject $H_0$.
Slide 11-28
Randomized Block Design:
Computational Formulas
\[ SSC = n \sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2, \qquad df_C = C - 1 \]
\[ SSR = C \sum_{i=1}^{n} (\bar{X}_i - \bar{X})^2, \qquad df_R = n - 1 \]
\[ SSE = \sum_{j=1}^{C} \sum_{i=1}^{n} (X_{ij} - \bar{X}_j - \bar{X}_i + \bar{X})^2, \qquad df_E = (C - 1)(n - 1) = N - n - C + 1 \]
\[ SST = \sum_{j=1}^{C} \sum_{i=1}^{n} (X_{ij} - \bar{X})^2, \qquad df_T = N - 1 \]
\[ MSC = \frac{SSC}{C - 1}, \qquad MSR = \frac{SSR}{n - 1}, \qquad MSE = \frac{SSE}{N - n - C + 1} \]
\[ F_{treatments} = \frac{MSC}{MSE}, \qquad F_{blocks} = \frac{MSR}{MSE} \]
where:
i = block group (row)
j = a treatment level (column)
C = number of treatment levels (columns)
n = number of observations in each treatment level (number of blocks, i.e. rows)
$X_{ij}$ = individual observation
$\bar{X}_j$ = treatment (column) mean
$\bar{X}_i$ = block (row) mean
$\bar{X}$ = grand mean
N = total number of observations
SSC = sum of squares columns (treatment)
SSR = sum of squares rows (blocking)
SSE = sum of squares error
SST = sum of squares total
Slide 11-29
Tread-Wear Example:
Randomized Block Design
                                 Speed
Supplier                  Slow    Medium    Fast    Block Means ($\bar{X}_i$)
1                          3.7      4.5      3.1          3.77
2                          3.4      3.9      2.8          3.37
3                          3.5      4.1      3.0          3.53
4                          3.2      3.5      2.6          3.10
5                          3.9      4.8      3.4          4.03
Treatment
Means ($\bar{X}_j$)        3.54     4.16     2.98         $\bar{X}$ = 3.56
n = 5, C = 3, N = 15
Slide 11-30
Tread-wear Randomized Block Design:
Sum of Squares Calculations (Part 1)
\[ SSC = n \sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2 = 5[(3.54 - 3.56)^2 + (4.16 - 3.56)^2 + (2.98 - 3.56)^2] = 3.484 \]
\[ SSR = C \sum_{i=1}^{n} (\bar{X}_i - \bar{X})^2 = 3[(3.77 - 3.56)^2 + (3.37 - 3.56)^2 + (3.53 - 3.56)^2 + (3.10 - 3.56)^2 + (4.03 - 3.56)^2] = 1.549 \]
Slide 11-31
Tread-wear Randomized Block Design:
Sum of Squares Calculations (Part 2)
\[ SSE = \sum_{j=1}^{C} \sum_{i=1}^{n} (X_{ij} - \bar{X}_j - \bar{X}_i + \bar{X})^2 = (3.7 - 3.54 - 3.77 + 3.56)^2 + (3.4 - 3.54 - 3.37 + 3.56)^2 + \cdots + (2.6 - 2.98 - 3.10 + 3.56)^2 + (3.4 - 2.98 - 4.03 + 3.56)^2 = 0.143 \]
\[ SST = \sum_{j=1}^{C} \sum_{i=1}^{n} (X_{ij} - \bar{X})^2 = (3.7 - 3.56)^2 + (3.4 - 3.56)^2 + \cdots + (2.6 - 3.56)^2 + (3.4 - 3.56)^2 = 5.176 \]
Slide 11-32
Tread-wear Randomized Block Design:
Mean Square Calculations
\[ MSC = \frac{SSC}{C - 1} = \frac{3.484}{2} = 1.742 \]
\[ MSR = \frac{SSR}{n - 1} = \frac{1.549}{4} = 0.387 \]
\[ MSE = \frac{SSE}{N - n - C + 1} = \frac{0.143}{8} = 0.018 \]
\[ F = \frac{MSC}{MSE} = \frac{1.742}{0.018} = 96.78 \]
Slide 11-33
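The randomized block calculations above can be reproduced from the raw tread-wear table. The sketch below is an illustration in plain Python rather than the textbook's procedure; the variable names (tread, row_means, and so on) are assumptions. Rows are suppliers (blocks) and columns are speeds (treatments).

```python
# Tread-wear data: one row per supplier (block), columns = slow, medium, fast.
tread = [
    [3.7, 4.5, 3.1],
    [3.4, 3.9, 2.8],
    [3.5, 4.1, 3.0],
    [3.2, 3.5, 2.6],
    [3.9, 4.8, 3.4],
]

n = len(tread)        # number of blocks (rows)
C = len(tread[0])     # number of treatment levels (columns)
N = n * C

grand = sum(sum(row) for row in tread) / N
row_means = [sum(row) / C for row in tread]
col_means = [sum(row[j] for row in tread) / n for j in range(C)]

ssc = n * sum((m - grand) ** 2 for m in col_means)   # treatments
ssr = C * sum((m - grand) ** 2 for m in row_means)   # blocks
sse = sum(
    (tread[i][j] - col_means[j] - row_means[i] + grand) ** 2
    for i in range(n)
    for j in range(C)
)
sst = sum((x - grand) ** 2 for row in tread for x in row)

msc = ssc / (C - 1)
msr = ssr / (n - 1)
mse = sse / ((C - 1) * (n - 1))
f_treatments = msc / mse
f_blocks = msr / mse

print(f"SSC={ssc:.4f} SSR={ssr:.4f} SSE={sse:.4f} SST={sst:.4f}")
print(f"F treatments={f_treatments:.2f} F blocks={f_blocks:.2f}")
```

Because this version does not round the means or MSE, the treatment F comes out near 97.68, as in the Excel output, rather than the hand-rounded 96.78; the conclusion is the same either way.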
Analysis of Variance
for the Tread-Wear Example
Source of Variance     SS      df     MS        F
Treatment            3.484      2    1.742    96.78
Block                1.549      4    0.387
Error                0.143      8    0.018
Total                5.176     14
Slide 11-34
Tread-wear Randomized Block Design
Treatment Effects: Procedural Summary
\[ H_0: \mu_1 = \mu_2 = \mu_3 \]
$H_a$: At least one of the means is different from the others
\[ F = \frac{MSC}{MSE} = \frac{1.742}{0.018} = 96.78 \]
Since $F = 96.78 > F_{.01,2,8} = 8.65$, reject $H_0$.
Slide 11-35
Excel Output for Tread-Wear
Randomized Block Design
Anova: Two-Factor Without Replication

SUMMARY     Count    Sum     Average      Variance
1             3      11.3    3.7666667    0.4933333
2             3      10.1    3.3666667    0.3033333
3             3      10.6    3.5333333    0.3033333
4             3       9.3    3.1          0.21
5             3      12.1    4.0333333    0.5033333

Slow          5      17.7    3.54         0.073
Medium        5      20.8    4.16         0.258
Fast          5      14.9    2.98         0.092

ANOVA
Source of Variation    SS           df    MS           F            P-value      F crit
Rows                   1.5493333     4    0.3873333    21.719626    0.0002357    7.0060651
Columns                3.484         2    1.742        97.682243    2.395E-06    8.6490672
Error                  0.1426667     8    0.0178333
Total                  5.176        14
Slide 11-36
Two-Way Factorial Design
• An experimental design in which two or
more independent variables are studied
simultaneously and every level of each
treatment is studied under the
conditions of every level of all other
treatments.
• Also called a factorial experiment.
Slide 11-37
Two-Way Factorial Design
Layout: the columns are the levels of the column treatment, the rows are the levels of the row treatment, and each cell holds the observations for one row-column combination.
Slide 11-38
Two-Way ANOVA: Hypotheses
Row Effects:
Ho: Row Means are all equal.
Ha: At least one row mean is different from the others.
Column Effects:
Ho: Column Means are all equal.
Ha: At least one column mean is different from the others.
Interaction Effects:
Ho: The interaction effects are zero.
Ha: There is an interaction effect.
Slide 11-39
Formulas for Computing
a Two-Way ANOVA
\[ SSR = nC \sum_{i=1}^{R} (\bar{X}_i - \bar{X})^2, \qquad df_R = R - 1 \]
\[ SSC = nR \sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2, \qquad df_C = C - 1 \]
\[ SSI = n \sum_{i=1}^{R} \sum_{j=1}^{C} (\bar{X}_{ij} - \bar{X}_i - \bar{X}_j + \bar{X})^2, \qquad df_I = (R - 1)(C - 1) \]
\[ SSE = \sum_{i=1}^{R} \sum_{j=1}^{C} \sum_{k=1}^{n} (X_{ijk} - \bar{X}_{ij})^2, \qquad df_E = RC(n - 1) \]
\[ SST = \sum_{i=1}^{R} \sum_{j=1}^{C} \sum_{k=1}^{n} (X_{ijk} - \bar{X})^2, \qquad df_T = N - 1 \]
\[ MSR = \frac{SSR}{R - 1}, \qquad MSC = \frac{SSC}{C - 1}, \qquad MSI = \frac{SSI}{(R - 1)(C - 1)}, \qquad MSE = \frac{SSE}{RC(n - 1)} \]
\[ F_R = \frac{MSR}{MSE}, \qquad F_C = \frac{MSC}{MSE}, \qquad F_I = \frac{MSI}{MSE} \]
where:
n = number of observations per cell
C = number of column treatments
R = number of row treatments
i = row treatment level
j = column treatment level
k = cell member
$X_{ijk}$ = individual observation
$\bar{X}_{ij}$ = cell mean
$\bar{X}_i$ = row mean
$\bar{X}_j$ = column mean
$\bar{X}$ = grand mean
Slide 11-40
A 2 × 3 Factorial Design
with Interaction
[Figure: cell means (vertical axis) plotted against the column levels C1, C2, C3 (horizontal axis), with the row effects R1 and R2 shown separately.]
Slide 11-41
A 2 × 3 Factorial Design
with Some Interaction
[Figure: cell means (vertical axis) plotted against the column levels C1, C2, C3 (horizontal axis), with the row effects R1 and R2 shown separately.]
Slide 11-42
A 2 × 3 Factorial Design
with No Interaction
[Figure: cell means (vertical axis) plotted against the column levels C1, C2, C3 (horizontal axis), with the row effects R1 and R2 shown separately.]
Slide 11-43
CEO Dividend 2 × 3 Factorial Design:
Data and Measurements

                               Location Where Company Stock is Traded
How Stockholders are
Informed of Dividends          NYSE                AMEX                OTC                 $\bar{X}_i$
Annual/Quarterly Reports       2, 1, 2, 1          2, 3, 3, 2          4, 3, 4, 3          2.5
                               $\bar{X}_{11}$ = 1.5      $\bar{X}_{12}$ = 2.5      $\bar{X}_{13}$ = 3.5
Presentations to Analysts      2, 3, 1, 2          3, 3, 2, 4          4, 4, 3, 4          2.9167
                               $\bar{X}_{21}$ = 2.0      $\bar{X}_{22}$ = 3.0      $\bar{X}_{23}$ = 3.75
$\bar{X}_j$                    1.75                2.75                3.625               $\bar{X}$ = 2.7083
N = 24, n = 4
Slide 11-44
CEO Dividend 2 × 3 Factorial Design:
Calculations (Part 1)
\[ SSR = nC \sum_{i=1}^{R} (\bar{X}_i - \bar{X})^2 = (4)(3)[(2.5 - 2.7083)^2 + (2.9167 - 2.7083)^2] = 1.0418 \]
\[ SSC = nR \sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2 = (4)(2)[(1.75 - 2.7083)^2 + (2.75 - 2.7083)^2 + (3.625 - 2.7083)^2] = 14.0833 \]
\[ SSI = n \sum_{i=1}^{R} \sum_{j=1}^{C} (\bar{X}_{ij} - \bar{X}_i - \bar{X}_j + \bar{X})^2 \]
\[ = 4[(1.5 - 2.5 - 1.75 + 2.7083)^2 + (2.5 - 2.5 - 2.75 + 2.7083)^2 + (3.5 - 2.5 - 3.625 + 2.7083)^2 + (2.0 - 2.9167 - 1.75 + 2.7083)^2 + (3.0 - 2.9167 - 2.75 + 2.7083)^2 + (3.75 - 2.9167 - 3.625 + 2.7083)^2] = 0.0833 \]
Slide 11-45
CEO Dividend 2 × 3 Factorial Design:
Calculations (Part 2)
\[ SSE = \sum_{i=1}^{R} \sum_{j=1}^{C} \sum_{k=1}^{n} (X_{ijk} - \bar{X}_{ij})^2 = (2 - 1.5)^2 + (1 - 1.5)^2 + \cdots + (3 - 3.75)^2 + (4 - 3.75)^2 = 7.7500 \]
\[ SST = \sum_{i=1}^{R} \sum_{j=1}^{C} \sum_{k=1}^{n} (X_{ijk} - \bar{X})^2 = (2 - 2.7083)^2 + (1 - 2.7083)^2 + \cdots + (3 - 2.7083)^2 + (4 - 2.7083)^2 = 22.9583 \]
Slide 11-46
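The five sums of squares for the CEO dividend example can be recomputed from the cell data. This is a sketch under stated assumptions: the nested-list layout, the helper mean, and the variable names are illustrative choices, and the order of the four ratings within each cell is taken from the data slide as best it can be read (the order does not affect any sum of squares).

```python
# cells[i][j] holds the n = 4 ratings for row treatment i, column treatment j.
# Rows: reports, presentations. Columns: NYSE, AMEX, OTC.
cells = [
    [[2, 1, 2, 1], [2, 3, 3, 2], [4, 3, 4, 3]],
    [[2, 3, 1, 2], [3, 3, 2, 4], [4, 4, 3, 4]],
]
R, C, n = len(cells), len(cells[0]), len(cells[0][0])


def mean(values):
    return sum(values) / len(values)


cell_means = [[mean(cell) for cell in row] for row in cells]
row_means = [mean([x for cell in row for x in cell]) for row in cells]
col_means = [mean([x for row in cells for x in row[j]]) for j in range(C)]
grand = mean([x for row in cells for cell in row for x in cell])

ssr = n * C * sum((m - grand) ** 2 for m in row_means)
ssc = n * R * sum((m - grand) ** 2 for m in col_means)
ssi = n * sum(
    (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
    for i in range(R)
    for j in range(C)
)
sse = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(R)
    for j in range(C)
    for x in cells[i][j]
)
sst = sum((x - grand) ** 2 for row in cells for cell in row for x in cell)

print(f"SSR={ssr:.4f} SSC={ssc:.4f} SSI={ssi:.4f} SSE={sse:.4f} SST={sst:.4f}")
```

The four components should add up to SST, which is a quick way to catch an arithmetic slip in a hand calculation.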
CEO Dividend 2 × 3 Factorial Design:
Calculations (Part 3)
\[ MSR = \frac{SSR}{R - 1} = \frac{1.0418}{1} = 1.0418 \]
\[ MSC = \frac{SSC}{C - 1} = \frac{14.0833}{2} = 7.0417 \]
\[ MSI = \frac{SSI}{(R - 1)(C - 1)} = \frac{0.0833}{2} = 0.0417 \]
\[ MSE = \frac{SSE}{RC(n - 1)} = \frac{7.7500}{18} = 0.4306 \]
\[ F_R = \frac{MSR}{MSE} = \frac{1.0418}{0.4306} = 2.42 \]
\[ F_C = \frac{MSC}{MSE} = \frac{7.0417}{0.4306} = 16.35 \]
\[ F_I = \frac{MSI}{MSE} = \frac{0.0417}{0.4306} = 0.10 \]
Slide 11-47
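The mean squares and F ratios above follow mechanically from the sums of squares and the degrees of freedom. A minimal sketch, using the slide's sums of squares as inputs and illustrative variable names:

```python
# Sums of squares from the CEO dividend calculations.
ssr, ssc, ssi, sse = 1.0418, 14.0833, 0.0833, 7.7500
R, C, n = 2, 3, 4  # row levels, column levels, observations per cell

df_r = R - 1
df_c = C - 1
df_i = (R - 1) * (C - 1)
df_e = R * C * (n - 1)

msr, msc, msi, mse = ssr / df_r, ssc / df_c, ssi / df_i, sse / df_e

f_r = msr / mse   # row effect
f_c = msc / mse   # column effect
f_i = msi / mse   # interaction effect

print(f"F_R = {f_r:.2f}, F_C = {f_c:.2f}, F_I = {f_i:.2f}")
```

Each F ratio would then be compared with the critical F value for its own numerator degrees of freedom and the error degrees of freedom (18 here).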
CEO Dividend: Analysis of Variance
Source of Variance      SS       df      MS        F
Row                   1.0418      1    1.0418     2.42
Column               14.0833      2    7.0417    16.35*
Interaction           0.0833      2    0.0417     0.10
Error                 7.7500     18    0.4306
Total                22.9583     23
*Denotes significance at α = .01.
Slide 11-48
CEO Dividend Excel Output (Part 1)
Anova: Two-Factor With Replication

SUMMARY          NYSE      ASE       OTC       Total
Reports
Count              4         4         4        12
Sum                6        10        14        30
Average            1.5       2.5       3.5       2.5
Variance           0.3333    0.3333    0.3333    1

Presentation
Count              4         4         4        12
Sum                8        12        15        35
Average            2         3         3.75      2.9167
Variance           0.6667    0.6667    0.25      0.9924

Total
Count              8         8         8
Sum               14        22        29
Average            1.75      2.75      3.625
Variance           0.5       0.5       0.2679
Slide 11-49
CEO Dividend Excel Output (Part 2)
ANOVA
Source of Variation     SS        df     MS        F         P-value    F crit
Sample                  1.0417     1    1.0417    2.4194     0.1373     4.4139
Columns                14.083      2    7.0417   16.355      9E-05      3.5546
Interaction             0.0833     2    0.0417    0.0968     0.9082     3.5546
Within                  7.75      18    0.4306
Total                  22.958     23
Slide 11-50
χ² Goodness-of-Fit Test
The χ² goodness-of-fit test compares
expected (theoretical) frequencies
of categories from a population distribution
to the observed (actual) frequencies
from a distribution to determine whether
there is a difference between what was
expected and what was observed.
Slide 11-51
χ² Goodness-of-Fit Test
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
df = k - 1 - c
where:
$f_o$ = frequency of observed values
$f_e$ = frequency of expected values
k = number of categories
c = number of parameters estimated from the sample data
Slide 11-52
Milk Sales Data
for Demonstration
Problem 11.4
Month        Gallons
January        1,553
February       1,585
March          1,649
April          1,590
May            1,497
June           1,443
July           1,410
August         1,450
September      1,495
October        1,564
November       1,602
December       1,609
Total         18,447
Slide 11-53
Demonstration Problem 11.4:
Hypotheses and Decision Rules
$H_0$: The monthly figures for milk sales are uniformly distributed.
$H_a$: The monthly figures for milk sales are not uniformly distributed.
\[ \alpha = .01 \]
\[ df = k - 1 - c = 12 - 1 - 0 = 11 \]
\[ \chi^2_{.01,11} = 24.725 \]
If $\chi^2_{Cal} > 24.725$, reject $H_0$.
If $\chi^2_{Cal} \le 24.725$, do not reject $H_0$.
Slide 11-54
Demonstration Problem 11.4:
Calculations
Month          f_o        f_e         (f_o - f_e)^2 / f_e
January       1,553     1,537.25           0.16
February      1,585     1,537.25           1.48
March         1,649     1,537.25           8.12
April         1,590     1,537.25           1.81
May           1,497     1,537.25           1.05
June          1,443     1,537.25           5.78
July          1,410     1,537.25          10.53
August        1,450     1,537.25           4.95
September     1,495     1,537.25           1.16
October       1,564     1,537.25           0.47
November      1,602     1,537.25           2.73
December      1,609     1,537.25           3.35
Total        18,447    18,447.00          41.59
Observed Chi-square = 41.59
Slide 11-55
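The table above can be reproduced in a few lines. This sketch assumes, as the problem does, that a uniform distribution means every month has the same expected frequency (the total divided by 12); the variable names are illustrative, and the critical value 24.725 is the table value quoted in the decision rule, not something the code computes.

```python
# Monthly milk sales in gallons, January through December.
gallons = [1553, 1585, 1649, 1590, 1497, 1443,
           1410, 1450, 1495, 1564, 1602, 1609]

k = len(gallons)                 # number of categories
expected = sum(gallons) / k      # uniform expected frequency per month

# Chi-square statistic: sum of (fo - fe)^2 / fe over all categories.
chi_sq = sum((fo - expected) ** 2 / expected for fo in gallons)

c = 0                            # no parameters estimated from the sample
df = k - 1 - c
critical = 24.725                # chi-square(.01, 11) from the table (assumed)
reject = chi_sq > critical

print(f"fe = {expected:.2f}, chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject}")
```

Summing the unrounded terms gives roughly 41.60; the slide's 41.59 is the sum of the twelve contributions after each was rounded to two decimals.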
Demonstration Problem 11.4:
Conclusion
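[Figure: chi-square distribution with df = 11; the non-rejection region lies to the left of the critical value 24.725 and the rejection region (area 0.01) to the right.]
Since $\chi^2_{Cal} = 41.59 > 24.725$, reject $H_0$.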
Slide 11-56
Defects Example: Using a χ² Goodness-of-Fit Test to Test a Population Proportion
\[ H_0: P = .08 \]
\[ H_a: P \ne .08 \]
\[ \alpha = .05 \]
\[ df = k - 1 - c = 2 - 1 - 0 = 1 \]
\[ \chi^2_{.05,1} = 3.841 \]
If $\chi^2_{Cal} > 3.841$, reject $H_0$.
If $\chi^2_{Cal} \le 3.841$, do not reject $H_0$.
Slide 11-57
Defects Example: Calculations
               f_o     f_e
Defects         33      16
Nondefects     167     184
n =            200     200

Defects:     $f_e = nP = (200)(.08) = 16$
Nondefects:  $f_e = n(1 - P) = (200)(.92) = 184$

\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = \frac{(33 - 16)^2}{16} + \frac{(167 - 184)^2}{184} = 18.06 + 1.57 = 19.63 \]
Slide 11-58
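The same statistic for the defects example, written as a sketch with illustrative variable names. The expected frequencies are built from the hypothesized proportion P = .08 and the sample size n = 200, exactly as on the slide.

```python
n = 200          # sample size
p0 = 0.08        # hypothesized proportion of defects under H0

observed = [33, 167]                  # defects, nondefects
expected = [n * p0, n * (1 - p0)]     # 16 and 184 under H0

# Chi-square statistic: sum of (fo - fe)^2 / fe over both categories.
chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

df = len(observed) - 1 - 0            # k - 1 - c with no estimated parameters
critical = 3.841                      # chi-square(.05, 1) from the table (assumed)
reject = chi_sq > critical

print(f"chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject}")
```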
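Defects Example:
Conclusion
[Figure: chi-square distribution with df = 1; the non-rejection region lies to the left of the critical value 3.841 and the rejection region (area 0.05) to the right.]
Since $\chi^2_{Cal} = 19.63 > 3.841$, reject $H_0$.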
Slide 11-59
Contingency Analysis:
χ² Test of Independence
A statistical test used to analyze the
frequencies of two variables with
multiple categories to determine
whether the two variables are
independent.
Qualitative Variables
Nominal Data
Slide 11-60
Investment Example:
χ² Test of Independence
• In which region of the country do you
reside?
A. Northeast
B. Midwest
C. South
D. West
• Which type of financial investment are you
most likely to make today?
E. Stocks
F. Bonds
G. Treasury bills
Slide 11-61
Investment Example:
χ² Test of Independence
Contingency Table
                        Type of Financial Investment
Geographic Region       E         F         G          Total
A                                           O_13       n_A
B                                                      n_B
C                                                      n_C
D                                                      n_D
Total                   n_E       n_F       n_G        N
Slide 11-62
Investment Example:
χ² Test of Independence
If A and F are independent,
\[ P(A \cap F) = P(A) \cdot P(F) \]
\[ P(A) = \frac{n_A}{N}, \qquad P(F) = \frac{n_F}{N} \]
\[ P(A \cap F) = \frac{n_A}{N} \cdot \frac{n_F}{N} \]
\[ e_{AF} = N \cdot P(A \cap F) = N \left( \frac{n_A}{N} \right) \left( \frac{n_F}{N} \right) = \frac{n_A n_F}{N} \]
Contingency Table
                        Type of Financial Investment
Geographic Region       E         F         G          Total
A                                 e_12                 n_A
B                                                      n_B
C                                                      n_C
D                                                      n_D
Total                   n_E       n_F       n_G        N
Slide 11-63
χ² Test of Independence: Formulas
Expected Frequencies
\[ e_{ij} = \frac{n_i n_j}{N} \]
where:
i = the row
j = the column
$n_i$ = the total of row i
$n_j$ = the total of column j
N = the total of all frequencies
Calculated (Observed) χ²
\[ \chi^2 = \sum \sum \frac{(f_o - f_e)^2}{f_e} \]
where:
df = (r - 1)(c - 1)
r = the number of rows
c = the number of columns
Slide 11-64
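The two formulas above can be combined into a short routine. The slides do not give observed counts for the investment example, so the 4 × 3 table below is hypothetical data made up purely for illustration; the variable names are also assumptions.

```python
# HYPOTHETICAL observed counts: rows = regions A-D, columns = investments E-G.
observed = [
    [30, 20, 10],
    [25, 25, 10],
    [20, 30, 20],
    [25, 15, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
N = sum(row_totals)

# Expected frequency for each cell: e_ij = (n_i * n_j) / N.
expected = [[ni * nj / N for nj in col_totals] for ni in row_totals]

# Chi-square statistic summed over every cell of the table.
chi_sq = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

r, c = len(observed), len(observed[0])
df = (r - 1) * (c - 1)

print(f"chi-square = {chi_sq:.3f} with df = {df}")
```

The statistic would then be compared with the table value of χ² for the chosen α and (r - 1)(c - 1) degrees of freedom. Note that the expected frequencies always reproduce the observed row and column totals.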