L643: Evaluation of Information Systems
Social Statistics: ANOVA
Review
The problem with t-tests…
How do we compare differences among more than two groups
on one or more variables?
If there is only one variable, we could compare three groups
with multiple t-tests: M1 vs. M2, M1 vs. M3, M2 vs. M3
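The list of pairs above grows quickly with the number of groups: k groups need k(k-1)/2 separate t-tests. A minimal sketch that enumerates the pairs, where the labels M1 to M3 are taken from the slide and the function name is a hypothetical helper:

```python
from itertools import combinations

def pairwise_comparisons(labels):
    """Return every unordered pair of group labels (one t-test per pair)."""
    return list(combinations(labels, 2))

pairs = pairwise_comparisons(["M1", "M2", "M3"])
for a, b in pairs:
    print(f"{a} vs. {b}")

# Number of t-tests needed for k groups is k * (k - 1) / 2.
print(len(pairs), "t-tests for 3 groups")
```

With five groups the same helper would return ten pairs, which is one reason a single ANOVA is preferred over many t-tests.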
>2 variables?
For example, how two teaching methods differ
across three different class sizes.
ANOVA allows you to see if there is any difference
between groups on some variables.
What is ANOVA?
“Analysis of Variance”
A hypothesis-testing procedure used to evaluate
mean differences between two or more
treatments (or populations) on different variables.
ANOVA is available for both parametric (score
data) and non-parametric (ranking) data.
Advantages:
1) Can work with more than two samples.
2) Can work with more than one independent variable
One example
Assume that you have data on student
performance in non-assessed tutorial
exercises as well as their final grading. You are
interested in seeing if tutorial performance is
related to final grade. ANOVA allows you to
break up the group according to the grade and
then see if performance is different across
these grades.
Types of ANOVA?
One-way between groups
Differences between the groups
The groups are categorized in one way, for example
by age or by grade.
This is the simplest version of ANOVA.
It allows us to compare a variable between different
groups, for example, to compare the tutorial
performance of students grouped by
grade.
Types of ANOVA?
One-way repeated measures
A single group is measured on the same variable
several times
Example 1: one group of patients is tested at
different times: before taking a new drug and
after taking the drug
Example 2: student performance on the tutorial
over time.
Types of ANOVA?
Two-way between groups
For example: the grades by tutorial analysis could be extended
to see if overseas students performed differently to local
students. What you would have from this form of ANOVA is:
The effect of final grade
The effect of overseas versus local
The interaction between final grade and overseas/local
Each of the main effects is a one-way test. The
interaction effect is simply asking "is there any significant
difference in performance when you take final grade and
overseas/local acting together?"
Types of ANOVA?
Two-way repeated measures
Uses repeated measures
Includes an interaction effect
For example, we want to see how tutorial performance
depends on gender and time of testing. We have
the same two groups (male and female)
and test them at different times to compare the
difference.
What is ANOVA?
In ANOVA, an independent or quasi-independent variable is called a factor.
Factor = independent (or quasi-independent)
variable.
Levels = the individual values (conditions) of the
independent variable.
One factor → “single-factor design”
More than one factor → “factorial design”
What is ANOVA?
An example of a single-factor design
An example of a two-factor design
How does ANOVA work?
ANOVA calculates the mean for each of the final grading groups on the tutorial
exercise figure - the Group Means.
It calculates the mean for all the groups combined - the Overall Mean.
Then it calculates, within each group, the total deviation of each individual's score
from the Group Mean - Within Group Variation.
Next, it calculates the deviation of each Group Mean from the Overall Mean - Between Group Variation.
Finally, ANOVA produces the F statistic, which is the ratio of the Between Group
Variation to the Within Group Variation.
If the Between Group Variation is significantly greater than the Within Group
Variation, then it is likely that there is a statistically significant difference between
the groups.
The statistical package will tell you if the F ratio is significant or not.
All versions of ANOVA follow these basic principles but the sources of Variation
get more complex as the number of groups and the interaction effects increase.
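The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the statistical package's implementation: the function name and the tiny data set are hypothetical, and each variation is divided by its degrees of freedom (k-1 and N-k) before the ratio is taken, as in the F test steps later in these slides.

```python
from statistics import mean

def one_way_f(groups):
    """Compute the one-way ANOVA F statistic from a list of score lists."""
    group_means = [mean(g) for g in groups]            # Group Means
    all_scores = [x for g in groups for x in g]
    overall_mean = mean(all_scores)                    # Overall Mean

    # Within Group Variation: squared deviations of scores from their Group Mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    # Between Group Variation: squared deviations of Group Means from the
    # Overall Mean, weighted by group size
    ss_between = sum(len(g) * (m - overall_mean) ** 2
                     for g, m in zip(groups, group_means))

    k = len(groups)              # number of groups
    n_total = len(all_scores)    # total sample size
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical toy data, chosen only so the arithmetic is easy to follow.
f_value = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_value)
```

A large F means the Group Means differ by more than the spread inside the groups would suggest.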
F value
Variance between treatments can have two
interpretations:
Variance is due to differences between treatments.
Variance is due to chance alone. This may be due
to individual differences or experimental error.
Excel: ANOVA
Data Analysis → Analysis Tools
Three different ANOVA tools:
Anova: Single Factor (one-way between groups)
Anova: Two-Factor With Replication
Anova: Two-Factor Without Replication
Example (one-way ANOVA)
Three groups of preschoolers and their
language scores: do the groups differ
overall?
Group 1 Scores   Group 2 Scores   Group 3 Scores
87               87               89
86               85               91
76               99               96
56               85               87
78               79               89
98               81               90
77               82               89
66               78               96
75               85               96
67               91               93
F test steps
Step1: a statement of the null and research
hypothesis
One-tailed or two-tailed (there is no such thing in
ANOVA)
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: at least one group mean is different
F test steps
Step2: Setting the level of risk (or the level of
significance or Type I error) associated with
the null hypothesis
Level of significance = 0.05
F test steps
Step3: Selection of the appropriate test
statistics
ANOVA: Single factor
Groups   Count   Sum   Average   Variance
1        10      766   76.6      143.1556
2        10      852   85.2      38.4
3        10      916   91.6      11.6
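The summary figures (count, sum, average, variance) can be reproduced from the raw scores. A minimal sketch using only the Python standard library; the dictionary layout is an assumption made for illustration, and `statistics.variance` is the sample variance (divisor n-1), which is what the summary reports.

```python
from statistics import mean, variance

# Language scores for the three groups of preschoolers, from the example slide.
scores = {
    1: [87, 86, 76, 56, 78, 98, 77, 66, 75, 67],
    2: [87, 85, 99, 85, 79, 81, 82, 78, 85, 91],
    3: [89, 91, 96, 87, 89, 90, 89, 96, 96, 93],
}

summary = {
    group: {
        "count": len(xs),
        "sum": sum(xs),
        "average": mean(xs),
        "variance": variance(xs),  # sample variance, divisor n - 1
    }
    for group, xs in scores.items()
}

for group, row in summary.items():
    print(group, row["count"], row["sum"],
          round(row["average"], 1), round(row["variance"], 4))
```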
F test steps
             Group 1              Group 2              Group 3
             Scores   x square    Scores   x square    Scores   x square
             87       7569        87       7569        89       7921
             86       7396        85       7225        91       8281
             76       5776        99       9801        96       9216
             56       3136        85       7225        87       7569
             78       6084        79       6241        89       7921
             98       9604        81       6561        90       8100
             77       5929        82       6724        89       7921
             66       4356        78       6084        96       9216
             75       5625        85       7225        96       9216
             67       4489        91       8281        93       8649
n            10                   10                   10
∑X           766                  852                  916
Mean         76.6                 85.2                 91.6
∑(X²)        59964                72936                84010
(∑X)²/n      58675.6              72590.4              83905.6

Totals across all groups
N              30
∑∑X            2534
(∑∑X)²/N       214038.5333
∑∑(X²)         216910
∑[(∑X)²/n]     215171.6
F-test
Between sum of squares:
$\sum \frac{(\sum X)^2}{n} - \frac{(\sum\sum X)^2}{N} = 215171.6 - 214038.53 = 1133.07$
Within sum of squares:
$\sum\sum (X^2) - \sum \frac{(\sum X)^2}{n} = 216910 - 215171.60 = 1738.40$
Total sum of squares:
$\sum\sum (X^2) - \frac{(\sum\sum X)^2}{N} = 216910 - 214038.53 = 2871.47$
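The three sums of squares can be checked with the same computational formulas. A minimal sketch, assuming the scores from the example; the function and variable names are hypothetical.

```python
def sums_of_squares(groups):
    """Return (between, within, total) sums of squares for a list of groups."""
    n_total = sum(len(g) for g in groups)                  # N
    grand_sum = sum(sum(g) for g in groups)                # sum of all scores
    sum_of_squares = sum(x * x for g in groups for x in g) # sum of squared scores
    correction = grand_sum ** 2 / n_total                  # (grand sum)^2 / N
    group_term = sum(sum(g) ** 2 / len(g) for g in groups) # sum of (group sum)^2 / n

    ss_between = group_term - correction
    ss_within = sum_of_squares - group_term
    ss_total = sum_of_squares - correction
    return ss_between, ss_within, ss_total

groups = [
    [87, 86, 76, 56, 78, 98, 77, 66, 75, 67],
    [87, 85, 99, 85, 79, 81, 82, 78, 85, 91],
    [89, 91, 96, 87, 89, 90, 89, 96, 96, 93],
]
ss_between, ss_within, ss_total = sums_of_squares(groups)
print(round(ss_between, 2), round(ss_within, 2), round(ss_total, 2))
```

The between and within sums of squares add up to the total, which is a quick consistency check on the hand calculation.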
F test steps
Between-group degree of freedom=k-1
k: number of groups
Within-group degree of freedom=N-k
N: total sample size
Source           Sums of squares   df   Mean sums of squares   F
Between groups   1133.07           2    566.53                 8.799
Within groups    1738.40           27   64.39
Total            2871.47           29
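Each mean sum of squares is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. A minimal sketch that rebuilds the table from the sums of squares reported above; the function name and dictionary keys are hypothetical.

```python
def anova_table(ss_between, ss_within, k, n_total):
    """Build the one-way ANOVA source table from the sums of squares."""
    df_between = k - 1          # between-group degrees of freedom
    df_within = n_total - k     # within-group degrees of freedom
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n_total - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Sums of squares from the example: 3 groups, 30 scores in total.
table = anova_table(1133.07, 1738.40, k=3, n_total=30)
print(table["df_between"], table["df_within"], table["df_total"])
print(round(table["ms_between"], 2), round(table["ms_within"], 2),
      round(table["F"], 3))
```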
F test steps
Step4: (cont.)
df for the denominator = N-k=30-3=27
df for the numerator = k-1=3-1=2
F test steps
Step4: determination of the value needed for rejection of the null
hypothesis using the appropriate table of critical values for the
particular statistic
Table-Distribution of F (http://www.socr.ucla.edu/applets.dir/f_table.html)
F test steps
Step5: comparison of the obtained value and
the critical value
If obtained value > the critical value, reject the null
hypothesis
If obtained value < the critical value, fail to reject the
null hypothesis
Obtained value = 8.80, critical value = 3.36
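The comparison can also be done without a printed table. The sketch below relies on a closed form of the F distribution that holds only when the numerator df is 2, as in this example; it is not a general F routine, and the function names are hypothetical. It gives a critical value of about 3.35, which agrees with the 3.36 read from the table up to table rounding.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with numerator df = 2 only."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

def f_critical_df1_2(alpha, df2):
    """Critical value for numerator df = 2 (inverse of the tail formula)."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

alpha = 0.05
obtained = 8.799     # F from the ANOVA table
df_within = 27       # denominator df, N - k

critical = f_critical_df1_2(alpha, df_within)
p_value = f_upper_tail_df1_2(obtained, df_within)
reject_null = obtained > critical

print(round(critical, 3), p_value, reject_null)
```

Because the obtained value exceeds the critical value (equivalently, the p-value is below 0.05), the null hypothesis is rejected in this example.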
F test steps
Step6 and 7: decision time
What is your conclusion? Why?
How do you interpret F(2, 27)=8.80, p<0.05?