
Statistics and Research Methods
Mathematics for HMI (Wiskunde voor HMI), Session 2

Correlation

Association between scores on two variables
– e.g., age and coordination skills in children, or price and quality

Scatter Diagram

A Scatter Diagram (or scatterplot) is a visual display of the relationship between two variables.

Example: a company is interested in whether there is a relationship between the number of employees supervised by a manager and the amount of stress reported by that manager.

[Figure: scatter diagram "Stress and Employees Supervised" – Stress Level (0–10) on the y axis vs. # of Employees Supervised (0–12) on the x axis]

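A scatter diagram like this takes only a few lines to draw. The sketch below uses matplotlib; the supervision and stress values are made up for illustration, not the data behind the original figure.

    # A minimal sketch of a scatter diagram with matplotlib.
    # The data points are made up for illustration.
    import matplotlib.pyplot as plt

    employees = [1, 2, 4, 5, 7, 8, 10, 11]  # predictor: # of employees supervised
    stress = [2, 3, 3, 5, 6, 6, 8, 9]       # criterion: reported stress level

    plt.scatter(employees, stress)
    plt.xlabel("# of Employees Supervised")  # predictor on the horizontal (x) axis
    plt.ylabel("Stress Level")               # criterion on the vertical (y) axis
    plt.title("Stress and Employees Supervised")
    plt.show()
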
Cause and Effect

An important type of relationship between two variables: cause and effect

– Independent variable = cause
– Dependent variable = effect

Correlation and Causality

Three possible directions of causality:
1. X → Y (X causes Y)
2. X ← Y (Y causes X)
3. X ← Z → Y (a third variable Z causes both X and Y)

Correlation and Causality

In situations where variables cannot be manipulated experimentally, it is difficult to know whether one is actually causing the other.

Example in a newspaper: “drinking coffee causes cancer”
– However, a third variable may cause both high coffee consumption and cancer
– Such third variables are called ‘confounds’

However, we can still try to predict one variable on the basis of a second variable, even if the causal relationship has not been determined
– Predictor variable
– Criterion variable

Scatter Diagrams

The independent (or predictor) variable goes on the horizontal (x) axis; the dependent (or criterion) variable goes on the vertical (y) axis.

[Figure: scatter diagram "Hours of Overtime Worked and Spouse’s Marital Satisfaction" – Marital Satisfaction (0–10) on the y axis vs. Hours of Overtime (0–25) on the x axis]

Patterns of Correlation

– Linear correlation
– Curvilinear correlation
– No correlation
– Positive correlation
– Negative correlation

Degree of Linear Correlation: The Correlation Coefficient

Figuring correlation using Z scores
– Cross-product of Z scores: multiply the Z score on one variable by the Z score on the other variable
– Correlation coefficient: the average of the cross-products of the Z scores

Degree of Linear Correlation: The Correlation Coefficient

Formula for the correlation coefficient:

r = Σ(ZX ZY) / N

– Positive perfect correlation: r = +1
– No correlation: r = 0
– Negative perfect correlation: r = –1
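As a concrete sketch of this formula: standardize both variables, multiply the Z scores pairwise, and average the cross-products. The numbers below are made up for illustration.

    # Sketch: r as the average of the cross-products of Z scores.
    # x and y are illustrative, made-up scores.
    import statistics

    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.0, 4.0, 5.0, 4.0, 6.0]
    n = len(x)

    def z_scores(scores):
        # Population SD (divide by N), matching the averaging in the formula.
        m = statistics.fmean(scores)
        sd = statistics.pstdev(scores)
        return [(s - m) / sd for s in scores]

    zx, zy = z_scores(x), z_scores(y)
    r = sum(a * b for a, b in zip(zx, zy)) / n  # r = sum(ZX * ZY) / N
    print(r)  # always falls between -1 and +1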


Correlation and Causality

Correlational research design
– Correlation as a statistical procedure
– Correlation as a kind of research design

Issues in Interpreting the Correlation Coefficient

– Statistical significance, e.g., p < .05
– Proportionate reduction in error = proportion of variance accounted for
  – r²
  – used to compare correlations

Issues in Interpreting the Correlation Coefficient (continued)

– Restriction in range
– Unreliability of measurement

Correlation in Research Articles

– Scatter diagrams occasionally shown
– Correlation matrix

Regression

Making predictions
– Does knowing a person’s score on one variable allow us to say what their score on a second variable is likely to be?

The method we use to make predictions is called regression.
– When scores on one variable are used to predict scores on another variable, it is called bivariate regression (two variables)
– When scores on two or more variables are used to predict scores on another variable, it is called multiple regression

Naming (two variables)

                    Variable Predicted From       Variable Predicted To
Name                Independent Variable          Dependent Variable
Alternative Name    Predictor Variable            Criterion Variable
Symbol              X                             Y
Example             Number of hours slept         Happy mood that day
                    the night before

[Figure: scatter diagram of Coffee Consumption (1–5, x axis) vs. Happiness (0–5, y axis), with a fitted line]

• These two variables correlate positively
• People who drink a lot of coffee tend to be happy, and people who do not tend to be unhappy
• Preview: the line is called a regression line, and represents the estimated linear relationship between the two variables. Notice that the slope of the line is positive in this example.

The Regression Line

– Relation between the predictor variable and predicted values of the criterion variable
– Formula: Y = a + (b)(X)
– Slope of the regression line: equals b, the raw-score regression coefficient
– Intercept of the regression line: equals a, the regression constant
– Method of least squares to derive a and b

Method of least squares

a and b are derived by:
– the least squares method (drawing)
– noting that the regression line passes through the point (MX, MY)

The Regression Line

Y = a + (b)(X)
where
b = (r)(SDY / SDX)
a = MY – (b)(MX)
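A minimal sketch of these formulas, reusing the same illustrative data as above: compute b and a from r, the standard deviations, and the means, then check the result against numpy’s least-squares fit.

    # Sketch: the raw-score regression coefficients from r, SDs, and means.
    # x and y are the same illustrative data as in the correlation sketch.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

    r = np.corrcoef(x, y)[0, 1]
    b = r * (y.std() / x.std())   # slope: b = (r)(SDY / SDX)
    a = y.mean() - b * x.mean()   # intercept: a = MY - (b)(MX)

    # The method of least squares yields the same line
    # (np.polyfit returns [slope, intercept] for degree 1).
    slope, intercept = np.polyfit(x, y, 1)
    print(b, a)
    print(slope, intercept)  # matches (b, a)
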
Bivariate Raw Score Prediction

Direct raw-score prediction model
– Predicted raw score (on the criterion variable) = regression constant plus the result of multiplying the raw-score regression coefficient by the raw score on the predictor variable

Formula:
Ŷ = a + (b)(X)
– The “hat” over Y means “predicted”

Bivariate prediction with Z scores

Given the Z score for X, what is the Z score for Y?
We use the prediction model:

ẐY = (β)(ZX)

– where β (beta) is the “standardized regression coefficient”
– it is also called the “beta weight”, because it tells us how much “weight” to give to ZX when making a prediction for ZY
– the “hat” over ZY means “predicted”

What is β?

It turns out that the best value to use for β in the prediction model is r, the (Pearson) correlation coefficient.
Thus, the bivariate regression model is

ẐY = (r)(ZX)

– When r = 1, ẐY = ZX; when r = –1, ẐY = –ZX
– When r = 0, there is no relation: ẐY = 0, and the “best guess” for Y is the mean score
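The Z-score model and the raw-score model make the same predictions. The sketch below (same illustrative data as before; x_new is an arbitrary new score) standardizes X, applies ẐY = (r)(ZX), and converts back to a raw score.

    # Sketch: Z-score prediction agrees with raw-score prediction.
    # Same illustrative data as above; x_new is an arbitrary new score.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])
    r = np.corrcoef(x, y)[0, 1]

    x_new = 3.5
    zx = (x_new - x.mean()) / x.std()    # standardize the predictor
    zy_hat = r * zx                      # predicted ZY = (r)(ZX)
    y_hat = zy_hat * y.std() + y.mean()  # convert back to a raw score

    # The raw-score model gives the same prediction:
    b = r * (y.std() / x.std())
    a = y.mean() - b * x.mean()
    print(y_hat, a + b * x_new)  # identical
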
Proportionate Reduction in Error

We want a measure of how accurately our regression model (the raw-score prediction formula) predicts the data.

We can compare the error we make when predicting with our regression model, SSError, to the error we would make if we didn’t have the model, SSTotal.

Proportionate Reduction in Error

Error
– Actual score minus the predicted score
– Error² = (Y – Ŷ)²

SSError = sum of squared errors using the prediction model = Σ(Y – Ŷ)²

SSTotal = sum of squared errors when predicting from the mean = Σ(Y – M)²

Error and Proportionate Reduction in Error

Formula for the proportionate reduction in error:

Proportionate reduction in error = (SSTotal – SSError) / SSTotal

– Proportionate reduction in error = r²
– This is the proportion of variance accounted for
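As a check on this identity, the sketch below (same illustrative data as in the earlier sketches) computes SSError from the regression predictions and SSTotal from the mean, and confirms that the proportionate reduction in error equals r².

    # Sketch: proportionate reduction in error equals r^2.
    # Same illustrative data as in the earlier sketches.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

    r = np.corrcoef(x, y)[0, 1]
    b = r * (y.std() / x.std())
    a = y.mean() - b * x.mean()
    y_hat = a + b * x                        # predictions from the model

    ss_error = np.sum((y - y_hat) ** 2)      # error using the prediction model
    ss_total = np.sum((y - y.mean()) ** 2)   # error when predicting from the mean
    pre = (ss_total - ss_error) / ss_total

    print(pre, r ** 2)  # the two values agree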