Transcript Slide 1
The greatest achievement in life is to be able to get up again from failure.

Slide 2
Categorical Data Analysis
Chapter 5 II: Logistic Regression for Qualitative/Mixed Factors

Slide 3
Anova Type Representation of Factors
• Binary response variable: Y ~ Bernoulli(p)
• Qualitative factors: A, B, …
SAS textbook Sec 8.4

Slide 4
Example: Berkeley Admissions Data (Table 2.10)

         Men                          Women
Major    # of applicants  % admitted  # of applicants  % admitted
A        825              62          108              82
B        560              63          25               68
C        325              37          593              34
D        417              33          375              35
E        191              28          393              24
F        373              6           341              7

Slide 5
Anova-Type Logistic Regression
• Only one factor (e.g. Department)
  \[ \log\frac{p_i}{1-p_i} = \alpha + \beta^A_i \]
• Only main effects of two factors
  \[ \log\frac{p_{ij}}{1-p_{ij}} = \alpha + \beta^A_i + \beta^B_j \]
• Full model
  \[ \log\frac{p_{ij}}{1-p_{ij}} = \alpha + \beta^A_i + \beta^B_j + \beta^{AB}_{ij} \]

Slide 6
Anova-Type Logistic Regression
• Parameterization (in SAS): The effect at the last level of each factor is set to 0
• (Regular) logistic regression expression by dummy variables (one factor example)
  \[ \log\frac{p}{1-p} = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{I-1} x_{I-1} \]

Slide 7
Mixed-type Logistic Regression
• Binary response variable: Y ~ Bernoulli(p)
• Qualitative factors: A, B, …
• Quantitative factors: X
SAS textbook Sec 8.5

Slide 8
Example: Horseshoe Crab
• Dataset is given in Table 4.3, textbook
• Each female crab had a male crab attached to her in her nest; other males residing near her are called satellites
• Y = # of satellites
• X = female crab’s color (C), spine condition (S), weight (Wt), and carapace width (W)
  – C = 1 to 4 (light to dark);
  – S = 1 to 3 (good to worst)

Slide 9
Mixed-Type Logistic Regression
Numerical factors Wt, W and:
• Only one factor (e.g. color)
  \[ \log\frac{p_i}{1-p_i} = \alpha + \beta^C_i + \beta_1 \mathrm{Wt} + \beta_2 W \]
• Only main effects of two factors
  \[ \log\frac{p_{ij}}{1-p_{ij}} = \alpha + \beta^C_i + \beta^S_j + \beta_1 \mathrm{Wt} + \beta_2 W \]
• With interaction effects (not the saturated model)
  \[ \log\frac{p_{ij}}{1-p_{ij}} = \alpha + \beta^C_i + \beta^S_j + \beta^{CS}_{ij} + \beta_1 \mathrm{Wt} + \beta_2 W \]

Slide 10
Mixed-Type Logistic Regression
• Parameterization (PROC GENMOD in SAS): The effect at the last level of each factor is set to 0
• (Regular) logistic regression expression by dummy variables (C + W example)
  \[ \log\frac{p}{1-p} = \alpha + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{I-1} x_{I-1} + \beta W \]

Slide 11
Quantitative Treatment of Ordinal Factors
• Assign scores to the categories of each ordinal factor
• Treat the ordinal factors as quantitative factors to fit a GLM (e.g. color)

Slide 12
Goodness of Fit
• Deviance or comparison to the full model
• Residuals
• Model comparisons (L-R tests)
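
The last-level-zero parameterization and its dummy-variable form can be illustrated with a small Python sketch for the one-factor (Department) model. It relies on the fact that a model with a single categorical factor reproduces each level's observed proportion, so the intercept and effects follow directly from the logits. The rates are the men's rounded percentages from the Berkeley table, so the numbers are approximate; the function names (`reference_coded_effects`, `dummy_row`, `fitted_prob`) are hypothetical helpers written for this illustration, not SAS output.

```python
import math

# Men's admission rates by major, from the Berkeley admissions table.
# The table reports rounded percentages, so these are approximate.
men_admit_rate = {"A": 0.62, "B": 0.63, "C": 0.37, "D": 0.33, "E": 0.28, "F": 0.06}


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def reference_coded_effects(rates):
    """Return (alpha, effects) with the last level's effect fixed at 0.

    alpha is the logit at the last (reference) level; each effect is the
    difference between that level's logit and the reference logit.
    """
    levels = list(rates)
    alpha = logit(rates[levels[-1]])
    effects = {lv: logit(rates[lv]) - alpha for lv in levels}
    return alpha, effects


def dummy_row(level, levels):
    """Indicators x_1..x_{I-1}; the last level maps to all zeros."""
    return [1 if level == lv else 0 for lv in levels[:-1]]


levels = list(men_admit_rate)
alpha, effects = reference_coded_effects(men_admit_rate)
betas = [effects[lv] for lv in levels[:-1]]


def fitted_prob(level):
    """Invert the logit of alpha + sum_k beta_k * x_k for one level."""
    x = dummy_row(level, levels)
    eta = alpha + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    for lv in levels:
        print(lv, dummy_row(lv, levels), round(effects[lv], 3), round(fitted_prob(lv), 3))
```

Each printed row shows a major, its dummy coding, its effect relative to major F, and the fitted probability, which matches the input rate. Adding a second factor or a quantitative covariate such as W removes this closed form, and the coefficients then have to be estimated iteratively (for example by PROC GENMOD).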