
Multiple Regression

Adv. Experimental Methods & Statistics PSYC 4310 / COGS 6310 Department of Cognitive Science

Michael J. Kalsher

PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2012, Michael Kalsher

Multiple Regression:

Basic Characteristics

• 1 Continuous DV (Outcome Variable)
• 2 or more Continuous IVs (Predictors)
• General Form of the Equation:
– outcome_i = (model) + error_i
– Y_pred = (b_0 + b_1X_1 + b_2X_2 + … + b_nX_n) + ε_i
– record sales_pred = b_0 + b_1 ad budget_i + b_2 airplay_i + ε_i

PSYC 4310/6310 Advanced Experimental Methods and Statistics © 2011, Michael Kalsher and James Watt
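The general equation above is just a weighted sum; a minimal sketch, where the coefficient values are made up for illustration and are not the estimates from Record2.sav:

```python
# Hypothetical illustration of Y_pred = b0 + b1*X1 + ... + bn*Xn;
# the coefficients below are invented, not estimated from Record2.sav.
def predict(b0, coefs, xs):
    """Return b0 + b1*x1 + ... + bn*xn for one case."""
    return b0 + sum(b * x for b, x in zip(coefs, xs))

# e.g. record_sales_pred = 50 + 0.1 * ad_budget + 3.0 * airplay
sales = predict(50.0, [0.1, 3.0], [500.0, 20.0])
print(sales)  # 50 + 0.1*500 + 3.0*20 = 160.0
```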

[Figure: scatterplot of the relationship between record sales, advertising budget, and radio play, showing the slope of b_Advert and the slope of b_Airplay]

Partitioning the Variance:

Sums of Squares, R, and R²

SS_T

Represents the total amount of differences between the observed values and the mean value of the outcome variable.

SS_R

Represents the degree of inaccuracy when the best model is fitted to the data. SS_R uses the differences between the observed data and the regression line.

SS_M

Shows the reduction in inaccuracy resulting from fitting the regression model to the data. SS_M uses the differences between the values of Y predicted by the model (the regression line) and the mean. A large SS_M implies the regression model predicts the outcome variable better than the mean.

Multiple R

The correlation between the observed values of Y (outcome variable) and the values of Y predicted by the multiple regression model. It is a gauge of how well the model predicts the observed data. R² is the amount of variation in the outcome variable accounted for by the model.
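The partition SS_T = SS_M + SS_R, and the identity R² = SS_M / SS_T = (multiple R)², can be checked numerically. A small sketch on made-up data (not Record2.sav):

```python
import numpy as np

# Made-up data: two predictors plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=30)

Xd = np.column_stack([np.ones(30), X])      # intercept + two predictors
b, *_ = np.linalg.lstsq(Xd, y, rcond=None)  # least-squares coefficients
y_hat = Xd @ b

ss_t = np.sum((y - y.mean()) ** 2)      # total: observed vs. mean
ss_r = np.sum((y - y_hat) ** 2)         # residual: observed vs. regression line
ss_m = np.sum((y_hat - y.mean()) ** 2)  # model: predicted vs. mean

multiple_r = np.corrcoef(y, y_hat)[0, 1]  # correlation of observed and predicted
print(round(ss_m / ss_t, 3), round(multiple_r ** 2, 3))  # the two match
```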

Variance Partitioning

Variance in the outcome variable (DV) is due to the action of all IVs plus some error:

[Figure: path diagram in which Y (newspaper readership) is predicted from X_1 (income), X_2 (gender), and X_3 (age) via coefficients B_1, B_2, and B_3]

Covariation

[Figure: Venn diagram partitioning Var Y (newspaper readership) into error variance, the unique covariation of each predictor with Y (Cov X1Y for income, Cov X2Y for gender, Cov X3Y for age), and the shared regions Cov X1X2Y and Cov X1X3Y]

Partial Statistics

• Partial Correlations and Regression Coefficients

– Effects of all other IVs are held constant when estimating the effect of each target IV.

• Covariation of other IVs with the DV is subtracted out
• Partial correlation for X_2: r_p2 = Cov X2Y / (Var Y − Cov X1X2Y − Cov X1Y − Cov X1X3Y − Cov X3Y)

• Partial correlations describe the independent effect of the IV on the DV, controlling for the effects of all other IVs.


Part (semi-Partial) Statistics

• Part (semi-partial) r
– Effects of other IVs are NOT held constant.
– Semi-partial r's indicate the marginal (additional) effect of a particular IV on the DV, allowing all other IVs to operate normally.
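The contrast between the two statistics can be made concrete. The helper names below are mine, not from the slides: partial r purges the controls from both the DV and the predictor, while semi-partial r purges only the predictor and correlates the result with the raw DV.

```python
import numpy as np

def _residuals(v, controls):
    """Residuals of v after regressing it on the control variables."""
    C = np.column_stack([np.ones(len(v))] + list(controls))
    b, *_ = np.linalg.lstsq(C, v, rcond=None)
    return v - C @ b

def partial_r(y, x, controls):
    # Both y and x purged of the controls: the "independent effect".
    return np.corrcoef(_residuals(y, controls), _residuals(x, controls))[0, 1]

def semipartial_r(y, x, controls):
    # Only x purged: the "marginal (additional) effect" on the raw DV.
    return np.corrcoef(y, _residuals(x, controls))[0, 1]

# Made-up data with correlated predictors.
rng = np.random.default_rng(4)
x1 = rng.normal(size=300)
x2 = 0.6 * x1 + rng.normal(size=300)
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=300)

pr = partial_r(y, x1, [x2])
sr = semipartial_r(y, x1, [x2])
print(round(pr, 3), round(sr, 3))  # |semi-partial| never exceeds |partial|
```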


Methods of Regression:

Predictor Selection and Model Entry Rules

• Selecting Predictors

– More is not better! Select the most important ones based on past research findings.

• Entering variables into the Model

– When predictors are uncorrelated, order makes no difference.

– Rare to have completely uncorrelated variables, so method of entry becomes crucial.


Methods of Regression

• Hierarchical (blockwise entry)
– Predictors selected and entered by the researcher based on knowledge of their relative importance in predicting the outcome.

• Forced entry (Enter)
– All predictors forced into the model simultaneously.

• Stepwise (mathematically determined entry)
– Forward method
– Backward method
– Stepwise method

Hierarchical / Blockwise Entry

• Researcher decides order.

• Known predictors usually entered first, in order of their importance in predicting the outcome.

• Additional predictors can be added all at once, stepwise, or hierarchically (i.e., most important first).


Forced Entry (Enter)

• All predictors forced into the model simultaneously.
• Default option in SPSS.
• Method most appropriate for testing theory (Studenmund & Cassidy, 1987).

Stepwise Entry:

Forward Method Procedure

1. Initial model contains only the intercept (b_0).

2. SPSS next selects the predictor that best predicts the outcome variable: the one with the highest simple correlation with the outcome.

3. Subsequent predictors selected on the basis of the size of their semi-partial correlation with the outcome variable.

Semi-partial correlation measures how much of the remaining unexplained variance in the outcome is explained by each additional predictor.

4. Process repeated until all predictors that contribute significant unique variance to the model have been included in the model.
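The steps above can be sketched as a greedy loop. SPSS decides entry with F-to-enter p-values; here an R²-gain threshold stands in as a simplification, which is closely related since each step's R² gain equals the squared semi-partial correlation of the candidate predictor.

```python
import numpy as np

def r_squared(cols, y):
    """R^2 of the OLS fit of y on the given predictor columns."""
    Xd = np.column_stack([np.ones(len(y))] + list(cols))
    b, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ b
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def forward_select(cols, y, min_gain=0.01):
    """Greedy forward entry: add the predictor giving the largest R^2 gain,
    stop when no candidate adds at least min_gain (a stand-in for SPSS's
    significance-based entry criterion)."""
    chosen, remaining, best_r2 = [], list(range(len(cols))), 0.0
    while remaining:
        best = max(remaining,
                   key=lambda j: r_squared([cols[k] for k in chosen + [j]], y))
        gain = r_squared([cols[k] for k in chosen + [best]], y) - best_r2
        if gain < min_gain:
            break
        chosen.append(best)
        remaining.remove(best)
        best_r2 += gain
    return chosen

# Made-up data: x0 dominates, x1 adds a little, x2 is pure noise.
rng = np.random.default_rng(5)
x0, x1, x2 = rng.normal(size=(3, 200))
y = 3.0 * x0 + 0.8 * x1 + rng.normal(size=200)
order = forward_select([x0, x1, x2], y)
print(order)
```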


Stepwise Entry:

Backward Method Procedure

1. SPSS places all predictors in the model and then computes the contribution of each one by evaluating the t-test for each predictor.

2. Significance values are compared against a removal criterion. Predictors not meeting the criterion are removed. (In SPSS the default removal probability, called "pout" for probability out, is p ≥ 0.10.)

3. SPSS re-estimates the regression equation with the remaining predictor variables.

4. Process repeats until all the predictors in the equation are statistically significant, and all outside the equation are not.

Preferable to the Forward method because of suppressor effects (which occur when a predictor has a significant effect, but only when another variable is held constant).


Suppressor Variables: Defined

Suppressor variables increase the size of regression coefficients associated with other IVs or sets of variables (Conger, 1974). Suppressor variables could be termed enhancers (McFatter, 1979) when they correlate with other IVs and account for (or suppress) outcome-irrelevant variation (unexplained variance) in one or more other predictors, thereby improving the overall predictive power of the model.

A variable may act as a suppressor or enhancer (even when the suppressor has a significant zero-order correlation with an outcome variable) by improving the relationship of other independent variables with an outcome variable.


Stepwise Entry: Stepwise Method

Procedure
1. Same as the Forward method, except that each time a predictor is added to the equation, a removal test is made of the least useful predictor.
2. The regression equation is constantly reassessed to see whether any redundant predictors can be removed.

Assessing the Model I:

Does the model fit the observed data? Outliers & Influential Cases

The mayor of London at the turn of the 20th century is interested in how drinking affects mortality. London is divided into eight regions termed "boroughs," and so he measures the number of pubs and the number of deaths over a period of time in each one.


Statistical Oddity?

Regression Diagnostics: Outliers and Residuals

If a model fits the sample data well, residuals (error) should be small. Cases with large residuals could be outliers.

Unstandardized residuals: Measured in the same units as the outcome variable, so they aren't comparable across different models. Useful in terms of their relative size.

Standardized residuals: Created by transforming unstandardized residuals into standard deviation units.
– In a normally distributed sample:
• 95% of z-scores should lie between -1.96 and +1.96 (no more than 5% should fall outside)
• 99% of z-scores should lie between -2.58 and +2.58 (no more than 1% should fall outside)
• 99.9% of z-scores should lie between -3.29 and +3.29 (always a problem if exceeded)

Studentized residuals: The unstandardized residual divided by an estimate of its standard deviation that varies point by point. A more precise estimate of the error variance of a specific case.
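The standardized-residual rules of thumb can be checked on simulated data (made up, not Record2.sav): when the model is correct, only about 5% of cases should exceed |1.96| and about 1% should exceed |2.58|.

```python
import numpy as np

# Simulate a well-specified model and standardize its residuals.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(400), rng.normal(size=(400, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=400)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
z = resid / resid.std(ddof=X.shape[1])  # standardized residuals

frac_196 = np.mean(np.abs(z) > 1.96)  # should be near 5%
frac_258 = np.mean(np.abs(z) > 2.58)  # should be near 1%
print(frac_196, frac_258)
```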


Regression Diagnostics: Influential Cases

Several residual statistics are used to assess the influence of a particular case.

Adjusted predicted value: If a specific case doesn't exert a large influence on the model, and the model is calculated WITHOUT that case, we would expect the adjusted predicted value of the outcome variable to be very similar to the original predicted value.

DFFit: The difference between the adjusted predicted value and the original predicted value.

Mahalanobis distances: Measure the distance of cases from the means of the predictor variables (values above 25 are problematic, even with large samples and more than 5 predictors).

Cook's Distance: A measure of the overall influence of a case on the model. Values greater than 1 may be problematic (Cook & Weisberg, 1982).

Leverage: Measures the influence of the observed value of the outcome variable over the predicted values. Values range from 0 (no influence) to 1 (complete influence over prediction).
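Leverage values are the diagonal of the hat matrix H = X(X'X)⁻¹X'. A minimal sketch on a made-up design matrix, illustrating the stated 0-to-1 range (and the standard fact that the leverages sum to the number of estimated parameters):

```python
import numpy as np

# Made-up design matrix: intercept + 2 predictors, 50 cases.
rng = np.random.default_rng(2)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])

# Hat matrix H = X (X'X)^{-1} X'; leverages are its diagonal.
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

# Each leverage lies in (0, 1); their sum equals the parameter count (3).
print(round(h.min(), 3), round(h.max(), 3), round(h.sum(), 3))
```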


Assessing the Model II:

Checking Assumptions

Drawing conclusions about the population

• Variable types: IVs must be quantitative or categorical; the DV must be quantitative, continuous, and unbounded.

• Non-zero variance: Predictors must have some variation.

• No perfect collinearity: Predictors should not correlate too highly. Can be tested with the VIF (variance inflation factor), which indicates whether a predictor has a strong relationship with the other predictors. Values over 10 are worrisome.

• Homoscedasticity: Residuals at each level of the predictor(s) should have the same variance.

• Independent errors: The residual terms for any two observations should be independent (uncorrelated). Tested with the Durbin-Watson test, which ranges from 0 to 4. A value of 2 means the residuals are uncorrelated; values greater than 2 indicate a negative correlation between adjacent residuals, and values below 2 indicate a positive correlation.

• Normally distributed errors: Residuals are assumed to be random, normally distributed variables with a mean of 0.

• Independence: All values of the DV are assumed to be independent.

• Linearity: Assumes the relationship being modeled is linear.
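The collinearity check above can be sketched numerically. The `vif` helper below is mine (not an SPSS function): it computes 1 / (1 − R²) from regressing each predictor on all the others.

```python
import numpy as np

def vif(X):
    """VIF for each column of X: regress that predictor on the others
    (with intercept) and return 1 / (1 - R^2)."""
    n, p = X.shape
    out = []
    for j in range(p):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        b, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ b
        r2 = 1.0 - resid.var() / X[:, j].var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Made-up predictors: one uncorrelated pair, one near-collinear pair.
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=200), rng.normal(size=200)
vif_indep = vif(np.column_stack([x1, x2]))               # near 1
vif_collin = vif(np.column_stack([x1, x1 + 0.05 * x2]))  # far above 10
print(vif_indep, vif_collin)
```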


Multiple Regression Using SPSS

Record2.sav


Estimates: Provides estimated coefficients of the regression model, test statistics and their significance.

Confidence intervals: Useful tool for assessing the likely value of the regression coefficients in the population.

Model fit: Omnibus test of the model's ability to predict the DV.

R-squared change: The change in R² resulting from the inclusion of a new predictor.

Descriptives: Table of means, standard deviations, number of observations, and the correlation matrix.

Part and partial correlations: Produces zero-order correlations, partial correlations, and part correlations between each predictor and the DV.

Collinearity diagnostics: VIF (variance inflation factor), tolerance, eigenvalues of the scaled, uncentred cross-products matrix, condition indexes, and variance proportions.

Durbin-Watson: Tests the assumption of independent errors.

Casewise diagnostics: Lists the observed value of the outcome, the predicted value of the outcome, the difference between these values, and this difference standardized.


Interpreting Multiple Regression

What can we learn from examining the correlations between the predictors?


Multiple Regression: Model Summary

The Durbin-Watson statistic should be close to 2; a value less than 1 or greater than 3 poses a problem.
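The Durbin-Watson statistic reported in the model summary can be computed directly from the residuals; a minimal sketch on simulated errors:

```python
import numpy as np

def durbin_watson(resid):
    """d = sum (e_t - e_{t-1})^2 / sum e_t^2; ranges 0 to 4,
    with 2 meaning uncorrelated adjacent residuals."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(3)
e = rng.normal(size=500)
dw_indep = durbin_watson(e)           # independent errors: close to 2
dw_pos = durbin_watson(np.cumsum(e))  # strong positive autocorrelation: near 0
print(round(dw_indep, 2), round(dw_pos, 2))
```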


Multiple Regression: Model Parameters

Multiple Regression: Casewise Diagnostics

Allows us to examine the residual statistics for extreme cases. We changed the default criterion from 3 to 2. Given a sample of 200, we would expect fewer than 5% of cases to have standardized residuals greater than approximately +/- 2 standard deviations.


Multiple Regression: ChildAgression.sav

A study was carried out to explore the relationship between Aggression and several potential predictor variables in 666 children who had an older sibling. The potential predictor variables measured were:

Parenting_Style (high score = bad parenting)
Computer_Games (high score = more time playing computer games)
Television (high score = more time watching television)
Diet (high score = the child has a good diet)
Sibling_Aggression (high score = more aggression in older siblings)

Past research indicated that parenting style and sibling aggression were good predictors of levels of aggression in younger children. All other variables were treated in an exploratory fashion. How will you analyze these data?


Past research indicated that parenting style and sibling aggression were good predictors of aggression, so these should be entered in Block 1.


How did you decide to add the three remaining variables? Hierarchically or simultaneously? Did the word problem provide you with any hints?

Multiple Regression: Syntax

Be sure to check the syntax to make sure you selected the desired analysis options.


Multiple Regression: Descriptive Statistics

Multiple Regression: Correlation Results

Is multicollinearity a problem? How can you tell?


Multiple Regression: Summary of Model

Multiple Regression: Regression Coefficients

Collinearity diagnostics: VIF (variance inflation factor) indicates whether a predictor has a strong linear relationship with the other predictors. It should be no larger than 10 for any predictor, and the average VIF should be close to 1.

Tolerance: The reciprocal of VIF; values below 0.1 indicate serious problems.

Partial correlations: Relationships between each predictor and the outcome variable, controlling for the effects of the other predictors.

Part correlations: Relationship between each predictor and the outcome, controlling for the effect that the other predictors have on the outcome. In other words, the unique relationship that each predictor has with the outcome.


Multiple Regression: Casewise Diagnostics

"Extreme" cases: cases with standardized residuals less than -2 or greater than +2. We would expect 95% of cases to have standardized residuals within about +/- 2.

In our sample, 36 of 666 cases are extreme, for a rate of 5.4%.

Multiple Regression: Reporting the Results

The ANOVA for the full model was significant, F(5, 660) = 11.88, p < .01. As illustrated in the model summary, the linear combination of the complete set of predictors (i.e., sibling aggression, parenting style, use of computer games, good diet, time spent watching television) accounted for a moderate portion of the variance in aggression, R² = .08. The significant R² change following the addition of use of computer games, good diet, and time spent watching television, F(3, 660) = 7.03, p < .01, indicates these predictors explained an additional 3% of the variance in aggression beyond that explained by sibling aggression and parenting style.
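The R²-change F statistic can be reproduced from the two models' R² values with the standard formula. The inputs below are the slide's rounded figures (Block 1 R² ≈ .05 is inferred from ".08 total minus an additional 3%"), so the result only approximates the reported F(3, 660) = 7.03:

```python
# F-change from R^2 values:
# F = ((R2_new - R2_old) / k_added) / ((1 - R2_new) / (n - p_new - 1))
# Inputs are the slide's rounded values, so this only approximates
# the reported F(3, 660) = 7.03.
def f_change(r2_old, r2_new, k_added, n, p_new):
    num = (r2_new - r2_old) / k_added
    den = (1.0 - r2_new) / (n - p_new - 1)
    return num / den

f = f_change(r2_old=0.05, r2_new=0.08, k_added=3, n=666, p_new=5)
print(round(f, 2))  # 7.17, close to the reported 7.03
```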


Multiple Regression: Reporting the Results

                           B     SE B    β        t     Sig.
Block 1
  Constant               -.01    .01            -0.48   .63
  Parenting Style         .06    .01    .19**    5.06   .00
  Sibling Aggression      .09    .04    .10*     2.49   .02
Block 2
  Constant               -.01    .01            -0.42   .68
  Parenting Style         .06    .02    .18**    3.89   .00
  Sibling Aggression      .08    .04    .08*     2.11   .04
  Time Watching TV        .03    .05    .03      0.72   .48
  Use of Computer Games   .14    .04    .15**    3.85   .00
  Good Diet              -.11    .04   -.12**   -2.87   .00

Note: * p < .05, ** p < .01.

An analysis of the regression coefficients for the full model showed that all predictors except for time watching TV contributed significantly to the model (p's < .05). As shown in the table above, parenting style, use of computer games, and sibling aggression were positively related to aggression, whereas good diet was negatively related to aggression.
