Transcript Quantitative Analysis - South Eastern University of Sri Lanka
Analysis of Variance – ANOVA
Faculty of Information Technology King Mongkut’s University of Technology North Bangkok
Content
Estimation
Hypothesis testing
  Forming hypothesis
  Testing population means
  Testing population variances
  Testing categorical data / proportion
  Hypothesis about many population means
    One-way ANOVA
    Two-way ANOVA
Analysis of Variance (ANOVA)
Tests whether any of several population means differ from each other
One-way ANOVA: one independent variable with 3 or more groups
  Dependent variable is assumed to be on an interval or ratio scale
  Also used with ordinal-scale data
  Can describe the effect of the independent variable on the dependent variable
Two-way ANOVA: two independent variables, one dependent variable
  Can describe the interaction between the two independent variables
MANOVA: two or more dependent variables
One-way ANOVA
Tests whether the mean of the dependent variable differs between groups defined by one categorical independent variable with 3 or more groups
  Occupation: Student, Lecturer, Doctor (1 variable, 3 groups)
  Salary: dependent variable
Assumptions
  The dependent variable is interval or ratio (continuous)
  The dependent variable is approximately normally distributed within each category of the independent variable
  The variances are equal across the independent groups (homogeneity of variances)
  Independence of cases
One-way ANOVA Concept
Total Variance = Between-Group Variance + Within-Group Variance (as sums of squares: SST = SSB + SSW)
Between-Group Variance
  Describes the differences between the group means, which is the effect of the variable of interest
Within-Group Variance
  Describes the spread of the observations around their own group mean, which is the effect of other factors, called Error
H0: μ1 = μ2 = μ3 = … = μk
H1: at least one pair of means is different (not all μj are equal)
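The decomposition above can be checked numerically. The following is a minimal sketch using only the Python standard library; the salary figures and the function name `sum_of_squares` are hypothetical and serve only to illustrate that the between-group and within-group sums of squares add up to the total.

```python
from statistics import mean

# Hypothetical salaries for three occupation groups (illustrative values only).
groups = {
    "Student": [10, 12, 11, 13],
    "Lecturer": [20, 22, 19, 23],
    "Doctor": [30, 28, 33, 29],
}

def sum_of_squares(groups):
    """Return (ss_between, ss_within, ss_total) for a dict of group name -> values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Between groups: distance of each group mean from the grand mean, weighted by group size.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within groups: distance of each observation from its own group mean (the "Error").
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    # Total: distance of each observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between, ss_within, ss_total

ssb, ssw, sst = sum_of_squares(groups)
print(f"SSB={ssb:.3f}  SSW={ssw:.3f}  SST={sst:.3f}")
```

A large between-group share relative to the within-group share is what leads the F-test to reject H0.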
One-way ANOVA Table
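Source of Variance | Degrees of Freedom (df) | Sum of Squares (SS) | Mean Square (MS) | F-ratio
Between Groups (Treatment) | k-1 | SSB | MSB = SSB / (k-1) | F = MSB / MSW
Within Groups (Error) | n-k | SSW | MSW = SSW / (n-k) |
Total | n-1 | SST | |

$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} X_{ij}^2 - \sum_{j=1}^{k} \frac{T_j^2}{n_j}$
$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} X_{ij}^2 - \frac{T^2}{n}$
$SSB = SST - SSW = \sum_{j=1}^{k} \frac{T_j^2}{n_j} - \frac{T^2}{n}$

SST = SSB + SSW
k: number of groups
n: number of samples
n_j: number of samples in group j
T_j: sum of the observations in group j; T: sum of all observations
df: degrees of freedom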
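As a rough illustration of how the table is filled in, the sketch below computes each entry from the group totals T_j and the grand total T. The function name `one_way_anova_table` and the scores are hypothetical; the group sizes are deliberately unequal to show the role of n_j.

```python
def one_way_anova_table(samples):
    """samples: list of lists, one inner list per group. Returns the ANOVA table entries."""
    k = len(samples)                    # number of groups
    n = sum(len(g) for g in samples)    # total number of samples
    sum_x2 = sum(x * x for g in samples for x in g)            # sum of X_ij^2
    sum_tj2_nj = sum(sum(g) ** 2 / len(g) for g in samples)    # sum of T_j^2 / n_j
    t2_n = sum(sum(g) for g in samples) ** 2 / n               # T^2 / n
    ssb = sum_tj2_nj - t2_n
    ssw = sum_x2 - sum_tj2_nj
    sst = sum_x2 - t2_n
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {
        "SSB": ssb, "SSW": ssw, "SST": sst,
        "df_between": k - 1, "df_within": n - k, "df_total": n - 1,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }

# Hypothetical scores for three groups of unequal size (illustrative values only).
table = one_way_anova_table([[4, 5, 6], [7, 8, 9], [10, 11, 12, 13]])
for name, value in table.items():
    print(name, round(value, 4))
```

The F value is then compared with the critical value of the F distribution with (k-1, n-k) degrees of freedom, which is what the "Sig." column in SPSS summarises.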
One-way ANOVA: SPSS
Analyze -> Compare Means -> One-way ANOVA
Option -> Tick…
  Homogeneity of variance test
  Descriptive (optional)
  Welch
Post Hoc: used when the result is significant (at least one of the means is different) to find the groups whose means differ
https://statistics.laerd.com/spss-tutorials/one-way-anova-using-spss-statistics.php
http://academic.udayton.edu/gregelvers/psy216/spss/1wayanova.htm
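For readers working outside SPSS, roughly the same two checks can be sketched in Python with SciPy. This is an analogue, not a reproduction of the SPSS output: the scores are hypothetical, and `center="mean"` is chosen on the assumption that the classic mean-based Levene test is wanted (SciPy's default is the median-based variant).

```python
from scipy import stats

# Hypothetical total scores for three sections (illustrative values only).
section_1 = [62, 65, 70, 68, 66]
section_2 = [71, 74, 69, 75, 73]
section_3 = [80, 78, 85, 83, 79]
alpha = 0.05

# Step 1: homogeneity of variance (Levene's test, mean-centred).
levene_stat, levene_p = stats.levene(section_1, section_2, section_3, center="mean")
equal_variances = levene_p >= alpha

# Step 2: one-way ANOVA F-test on the group means.
f_stat, p_value = stats.f_oneway(section_1, section_2, section_3)

print(f"Levene: W={levene_stat:.3f}, p={levene_p:.3f}, equal variances assumed: {equal_variances}")
print(f"ANOVA:  F={f_stat:.3f}, p={p_value:.5f}, reject H0: {p_value < alpha}")
```

If the Levene p-value falls below alpha, the plain F-test is not appropriate and a Welch-type test should be used instead, as with the Welch option in SPSS.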
Example
Determine whether the mean total score differs among the 5 sections
H0: μ1 = μ2 = μ3 = μ4 = μ5
H1: at least one pair of means is different
Result: Descriptives and Variances
Check Levene's test: “Sig.” >= 0.05, so the hypothesis of equal variances across the groups is not rejected
If “Sig.” < 0.05, refer to the Robust Tests of Equality of Means table (Welch) instead of the ANOVA table
Result: ANOVA Table
Sig. = 0.013 < α, thus at least one of the groups has a different mean
Use Post Hoc tests to find the pairs with different means
Result: Post Hoc Tests
Pairs with Sig. < α have different means:
  Section 1 and 4
  Section 2 and 4
  Section 2 and 5
  Section 3 and 4
  Section 4 and 5
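The idea of a post hoc comparison can be sketched as pairwise t-tests with a Bonferroni correction. This is a simplified stand-in, not the procedure SPSS runs: the function name `pairwise_bonferroni` and the scores are hypothetical, and each pair is tested on its own two samples rather than with the pooled within-group variance.

```python
from itertools import combinations
from scipy import stats

def pairwise_bonferroni(groups, alpha=0.05):
    """Compare every pair of groups; return (name_a, name_b, adjusted_p, significant)."""
    pairs = list(combinations(groups, 2))   # all unordered pairs of group names
    results = []
    for name_a, name_b in pairs:
        _, p = stats.ttest_ind(groups[name_a], groups[name_b])
        # Bonferroni: multiply each p-value by the number of comparisons, capped at 1.
        adjusted_p = min(1.0, p * len(pairs))
        results.append((name_a, name_b, adjusted_p, adjusted_p < alpha))
    return results

# Hypothetical total scores per section (illustrative values only).
scores = {
    "Section 1": [62, 65, 70, 68, 66],
    "Section 2": [71, 74, 69, 75, 73],
    "Section 3": [80, 78, 85, 83, 79],
}
results = pairwise_bonferroni(scores)
for name_a, name_b, adjusted_p, significant in results:
    print(f"{name_a} vs {name_b}: adjusted p={adjusted_p:.4f}, different: {significant}")
```

The correction keeps the overall chance of a false positive near α even though several pairs are tested.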
Two-way ANOVA
Used to determine the effect of two factors (independent variables) on one dependent variable
  Occupation: Student, Lecturer, Doctor
  Age: less than 20, 20-30, 31-40, 41 or older
  Salary: dependent variable
Assumptions
  The dependent variable is interval or ratio (continuous)
  The dependent variable is approximately normally distributed for each combination of levels of the two independent variables
  Homogeneity of variances across the groups formed by the different combinations of levels of the two independent variables
  Independence of cases
Two-way ANOVA Concept
Two-way ANOVA compares
  Means between columns
  Means between rows
  Means from the interaction of factors
Sum Square Row (SSR): variation due to the 1st factor
Sum Square Column (SSC): variation due to the 2nd factor
Sum Square Row Column (SSRC): variation due to the interaction of the two factors
Sum Square Error (SSE): error caused by external factors
Sum Square Total (SST) = SSR + SSC + SSRC + SSE
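The four components can be computed directly from the row, column, and cell means. The sketch below assumes a balanced design (the same number of observations in every cell); the data, the factor levels, and the function name `two_way_sums_of_squares` are hypothetical.

```python
from statistics import mean

# Hypothetical total scores keyed by (row level, column level), 3 observations per cell.
cells = {
    ("F", "IT"): [70, 72, 74],
    ("F", "CS"): [65, 67, 69],
    ("M", "IT"): [60, 62, 64],
    ("M", "CS"): [61, 63, 65],
}

def two_way_sums_of_squares(cells):
    """Return (SSR, SSC, SSRC, SSE, SST) for a balanced two-factor layout."""
    rows = sorted({r for r, _ in cells})
    cols = sorted({c for _, c in cells})
    m = len(next(iter(cells.values())))   # observations per cell (balanced design assumed)
    all_values = [x for v in cells.values() for x in v]
    grand = mean(all_values)
    row_mean = {r: mean([x for c in cols for x in cells[(r, c)]]) for r in rows}
    col_mean = {c: mean([x for r in rows for x in cells[(r, c)]]) for c in cols}
    cell_mean = {key: mean(v) for key, v in cells.items()}
    ssr = m * len(cols) * sum((row_mean[r] - grand) ** 2 for r in rows)
    ssc = m * len(rows) * sum((col_mean[c] - grand) ** 2 for c in cols)
    # Interaction: what is left of each cell mean after removing row and column effects.
    ssrc = m * sum((cell_mean[(r, c)] - row_mean[r] - col_mean[c] + grand) ** 2
                   for r in rows for c in cols)
    sse = sum((x - cell_mean[key]) ** 2 for key, v in cells.items() for x in v)
    sst = sum((x - grand) ** 2 for x in all_values)
    return ssr, ssc, ssrc, sse, sst

ssr, ssc, ssrc, sse, sst = two_way_sums_of_squares(cells)
print(f"SSR={ssr:.1f} SSC={ssc:.1f} SSRC={ssrc:.1f} SSE={sse:.1f} SST={sst:.1f}")
```

With unequal cell sizes the components no longer add up this simply, which is one reason statistical packages use a general linear model for two-way ANOVA.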
Two-way ANOVA Table
r: number of rows
c: number of columns
n: number of samples
df: degrees of freedom
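For reference, the standard layout of a two-way ANOVA table with interaction is sketched below in LaTeX, using the symbols defined above. It assumes a fixed-effects design with more than one observation per cell; the labels MSR, MSC, MSRC, and MSE are introduced here for the mean squares and are not taken from the original slide.

```latex
\begin{array}{l|c|c|c|c}
\text{Source of Variance} & \text{df} & \text{SS} & \text{MS} & F \\ \hline
\text{Rows (1st factor)}    & r-1        & SSR  & MSR = \dfrac{SSR}{r-1}          & \dfrac{MSR}{MSE} \\
\text{Columns (2nd factor)} & c-1        & SSC  & MSC = \dfrac{SSC}{c-1}          & \dfrac{MSC}{MSE} \\
\text{Interaction}          & (r-1)(c-1) & SSRC & MSRC = \dfrac{SSRC}{(r-1)(c-1)} & \dfrac{MSRC}{MSE} \\
\text{Error}                & n-rc       & SSE  & MSE = \dfrac{SSE}{n-rc}         & \\
\text{Total}                & n-1        & SST  &                                 &
\end{array}
```

The degrees of freedom add up in the same way as the sums of squares: (r-1) + (c-1) + (r-1)(c-1) + (n-rc) = n-1.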
Two-way ANOVA: SPSS
Analyze -> General Linear Model -> Univariate
  (Multivariate is MANOVA)
Add the dependent variable and two or more factors (independent variables)
Option -> tick “Homogeneity tests” (optional: “Descriptive”)
Plot -> add one factor (the one containing more groups) to “Horizontal Axis” and the other to “Separate Lines”, then click “Add” to obtain the profile plot
Post Hoc to find the pairs that have different means (similar to One-way ANOVA, optional)
https://statistics.laerd.com/spss-tutorials/two-way-anova-using-spss-statistics.php
Example
Determine the effect of major and gender on the total score
H0: μ1 = μ2 = μ3 = μ4
H1: at least one mean is different
Result
Compare Error to Corrected Total
  Error should be less than 20% of Corrected Total
  Here Error is very large compared to Corrected Total, so total score is affected by other external factors
Gender row: Sig. = 0.024 < α, gender has a significant effect on total score
Major row: Sig. = 0.575 > α, major has no significant effect on total score
Major*Gender row: Sig. = 0.298 > α, the interaction between the two factors has no significant effect on total score
Result: Profile Plot
Example
Determine the effect of section and gender on the total score