Lecture 4: Correlation and Regression

Laura McAvinue, School of Psychology, Trinity College Dublin

Correlation

• Relationship between two variables
  – Do two variables co-vary / co-relate?
    • Is mathematical ability related to IQ?
    • Are depression and anxiety related?
  – Does variable Y vary as a function of variable X?
    • Does error awareness vary as a function of ability to sustain attention?
    • Does accuracy of memory decline with age?

Correlation

• Direction
  – Do both variables move in the same direction?
  – Do they move in opposite directions?
• Degree
  – What is the degree or strength of the relationship?
• Analysis
  – Scatterplot
  – Correlation Coefficient
    • Statistical significance

Scatterplot

• Describe the relationship between the two variables using a scatterplot
  – Visual representation of the relationship between the variables
  – Plot each observation in the study, displaying its value on variable X and variable Y
• Place the predictor variable on the X axis
  – The independent variable, which is making the prediction
• Place the criterion variable on the Y axis
  – The dependent variable, which is being predicted

Participant   1   2   3
Anxiety       1   3   6
Depression    1   3   6

[Scatterplot: Anxiety (Anx) on the X axis and Depression (Dep) on the Y axis, both scaled 1 to 6]

[Figure: three scatterplots]
• No Relationship: random scatter
• Positive Relationship: direction in scatter
• Negative Relationship: direction in scatter

Sometimes, the direction of the relationship might not be as obvious…

What is the relationship between verbal coherence and the number of pints of beer consumed?

[Scatterplot: No. of Pints on the X axis (0 to 10); Y axis scaled 0 to 70]

Regression Line

• Useful to add a regression line
  – Model of the relationship
  – Straight line that best represents the relationship between the two variables
    • ‘The line of best fit’
  – Helps us to understand the direction of the relationship

Adding the regression line helps us see the direction of the relationship

[Scatterplot with regression line: No. of Pints on the X axis (0 to 10); Y axis scaled 0 to 70]

Direction of Relationship

• Positive
  – Two variables tend to move in the same direction
    • As X increases, Y also increases
    • As X decreases, Y also decreases
• Negative
  – Two variables tend to move in opposite directions
    • As X increases, Y decreases
    • As X decreases, Y increases

A Positive Relationship

A Negative Relationship

Degree of Relationship

• Degree or strength of relationship
  – Calculate a correlation coefficient
    • Pearson Product-Moment Correlation Coefficient (r)
  – Statistic that varies between -1 and 1
    • r = 0, no relationship between the variables
      – Change in X is not associated with systematic change in Y
    • r = 1, perfect positive correlation
      – Increase in X associated with systematic increase in Y
    • r = -1, perfect negative correlation
      – Increase in X associated with systematic decrease in Y

Interpretation of r

• -1: perfect negative relationship
• 0: absolutely no relationship
• +1: perfect positive relationship

The closer Pearson r is to one of the extremes, the stronger the relationship between the variables.

Calculation of Pearson r

• Based on the covariance
  – A statistic representing the degree to which two variables vary together
  – Based on how an observation deviates from the mean on each variable
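The formula itself did not survive in this transcript. As a hedged reconstruction, the standard sample covariance, which matches the description above (how each observation deviates from the mean on each variable), can be written as follows. The n - 1 denominator is an assumption; it is the usual choice for sample data.

```latex
\[
\operatorname{cov}_{XY} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}
\]
```

For the three participants in the anxiety and depression table (scores 1, 3, 6 on both variables), each mean is 3.33, the cross-products of the deviations sum to 12.67, and the covariance is 12.67 / 2 = 6.33.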

Calculation of Pearson r

• Covariance is not suitable as a measure of the degree of relationship
  – Its absolute value is a function of the standard deviations
  – Scale the covariance by the standard deviations
    • Pearson r
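A minimal sketch of why the covariance is scaled, assuming the usual sample formulas (n - 1 denominators), which this transcript does not show. The scores are hypothetical and invented for illustration. Rescaling one variable multiplies the covariance by the same factor, while r, the covariance divided by the product of the two standard deviations, stays the same.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance: summed cross-products of deviations, divided by n - 1."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)


def std_dev(values):
    """Sample standard deviation (the covariance of a variable with itself, rooted)."""
    return sqrt(covariance(values, values))


def pearson_r(x, y):
    """Covariance scaled by the two standard deviations."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))


# Hypothetical scores, invented for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_rescaled = [100 * v for v in y]  # same scores expressed in a unit 100 times smaller

cov_original = covariance(x, y)
cov_rescaled = covariance(x, y_rescaled)
r_original = pearson_r(x, y)
r_rescaled = pearson_r(x, y_rescaled)

print("covariance:", cov_original, "vs", cov_rescaled)
print("Pearson r: ", round(r_original, 3), "vs", round(r_rescaled, 3))
```

The covariance changes with the unit of measurement, so its size alone says little about the strength of the relationship; r does not change.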

Assessing Magnitude of r

• Cohen’s (1988) standards
  – Small: .1 - .29
  – Medium: .3 - .49
  – Large: .5 - 1
• Statistical Significance
  – Test the null hypothesis that the true correlation in the population (rho, ρ) is zero
    • H0: ρ = 0
  – Calculate the probability of obtaining a correlation of this size if the true correlation is zero
  – If p < .05, reject H0 and conclude that the result is unlikely to be due to chance alone; the obtained correlation is taken to reflect a true correlation in the population

Summary

• Interested in the relationship between two variables
• Direction and degree of relationship
  – Scatterplot & regression line
    • Direction
  – Correlation Coefficient
    • Magnitude
    • Statistical significance

As temperature increases, ice-cream consumption increases

r = .73 (large), n = 12, p = .007

As temperature increases, hot whiskey consumption decreases

r = -.908 (large), n = 12, p < .001
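The two temperature examples can be checked with a short sketch. The slides do not say which test produced the p values; the code assumes the usual t test for a correlation, t = r√(n - 2) / √(1 - r²) with n - 2 degrees of freedom, and compares it with the two-tailed critical value from standard t tables. The function names and the "below small" label are illustrative choices, not part of the lecture.

```python
from math import sqrt


def cohen_label(r):
    """Classify |r| using the Cohen (1988) bands quoted in the slides."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "below small"  # label is an assumption; the slides give no band under .1


def t_statistic(r, n):
    """t for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Two-tailed critical t for alpha = .05 and df = 10 (standard t tables)
CRITICAL_T_DF10 = 2.228

# r and n as reported in the slides
examples = {"ice-cream": (0.73, 12), "hot whiskey": (-0.908, 12)}

results = {}
for name, (r, n) in examples.items():
    t = t_statistic(r, n)
    significant = abs(t) > CRITICAL_T_DF10
    results[name] = (cohen_label(r), t, significant)
    print(f"{name}: r = {r}, {cohen_label(r)}, t({n - 2}) = {t:.2f}, "
          f"significant at .05: {significant}")
```

Both t values exceed the critical value in absolute size, which agrees with the reported p values being below .05.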

Issues to consider

• Assumption of linearity
  – Pearson correlation assumes there is a linear relationship between the two variables
  – Assumes the relationship can be represented by a straight line
  – It is possible that the relationship might be better represented by a curved line
    • Examine scatterplot
  – Curve-fitting procedures

[Scatterplot, "Linear?": axis label VAR00003; axes scaled 0 to 14 and 0 to 160]

[Scatterplot, "Non-linear?": axis label VAR00003; axes scaled 0 to 14 and 0 to 160]

[Scatterplot, "Non-linear": axis label STRESS; axes scaled 0 to 12 and 0 to 14]
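The linearity assumption can be illustrated with a small sketch on invented data: a perfectly predictable but U-shaped relationship gives a Pearson r of zero, while a straight-line relationship gives r = 1. The data and function name are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products of deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Invented data for illustration only
x = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [xi ** 2 for xi in x]     # Y is fully determined by X, but along a curve
y_linear = [2 * xi + 1 for xi in x]  # Y is fully determined by X along a straight line

r_curved = pearson_r(x, y_curved)
r_linear = pearson_r(x, y_linear)

print("curved relationship:  r =", round(r_curved, 3))
print("straight relationship: r =", round(r_linear, 3))
```

A near-zero r therefore does not rule out a strong curved relationship, which is why the scatterplot should be examined first.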

Issues to consider

• Correlation can be affected by
  – Range restrictions
  – Heterogeneous subsamples
  – Extreme observations
• Correlation does not mean causation
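As a rough illustration of the effect of extreme observations, the sketch below uses invented data in which five observations show no linear relationship, and then adds a single extreme point. All values and names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r from sums of squares and cross-products of deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Invented data: no linear relationship among these five observations
x_base = [1, 2, 3, 4, 5]
y_base = [2, 3, 1, 3, 2]

# The same data plus one extreme observation at (15, 15)
x_outlier = x_base + [15]
y_outlier = y_base + [15]

r_without = pearson_r(x_base, y_base)
r_with = pearson_r(x_outlier, y_outlier)

print("without the extreme observation: r =", round(r_without, 3))
print("with the extreme observation:    r =", round(r_with, 3))
```

One point is enough here to turn a correlation of about zero into a large positive one, so extreme observations should be looked for in the scatterplot before r is interpreted.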

Regression

• The regression line
  – A straight line that represents the relationship between two variables
  – Useful to add to the scatterplot to help us see the direction of the relationship
  – But it’s much more than this…
• Prediction
  – Regression line enables us to predict Variable Y on the basis of Variable X

Regression

• If you have an equation of the line that represents the relationship between Variables X & Y, you can use it to predict a value of Y given a certain value of X.

[Figure: regression line used for prediction; X = 63 gives a predicted value of Y’ = 45]

Regression Equation

$\hat{Y} = bX + a$

• $\hat{Y}$: predicted value of Y
• X: predicting value of X
• a, b: regression coefficients
• The basic equation of a line

Regression Equation

$\hat{Y} = bX + a$

• b
  – The slope of the regression line
  – The amount of change in Y associated with a one-unit change in X
• a
  – The intercept
  – The point where the regression line crosses the Y axis
  – The predicted value of Y when X = 0
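The slides define a and b but do not show how they are obtained. A minimal sketch, assuming the standard least-squares estimates (b is the sum of cross-products of deviations divided by the sum of squared deviations of X, and a is the mean of Y minus b times the mean of X), is given below. The data and the names fit_line and predict are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line Y-hat = bX + a."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx    # slope: change in predicted Y per one-unit change in X
    a = my - b * mx  # intercept: predicted Y when X = 0
    return a, b


def predict(a, b, x_value):
    """Predicted value of Y for a given value of X."""
    return b * x_value + a


# Invented data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
y_hat = predict(a, b, 6)

print(f"slope b = {b:.2f}, intercept a = {a:.2f}")
print(f"predicted Y for X = 6: {y_hat:.2f}")
```

For these data the slope is 0.6 and the intercept is 2.2, so each one-unit increase in X raises the predicted Y by 0.6, and the line crosses the Y axis at 2.2.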

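Regression Equation

$\hat{Y} = bX + a$

[Figure: regression line plotted with X on the horizontal axis and Y’ on the vertical axis, with the intercept a and the slope b marked]

[Figure: lines with the same intercept but different slopes, and lines with the same slope but different intercepts]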

Summary

• The relationship between two variables, X & Y
• Correlation
  – Degree and direction of relationship
• Regression
  – Predict Y, given X
  – More on regression next lecture…