Reliability and validity
Reliability & Validity
Reliability and validity limit all inferences that
can be drawn from later tests
If the scale is reliable and valid, you can have
confidence in the findings
If the scale is unreliable or invalid, you need to
be very cautious
[Figure: a CONSTRUCT measured by Item 1, Item 2, and Item 3, shown against related measures & outcomes and unrelated measures & outcomes]
A correlation captures how the value of one
variable changes when the value of the other changes
Ranges from -1 to +1
A Pearson correlation is based on continuous
variables
Important to remember this is a relationship
for a group, not each person/item
The squared correlation (r²) reflects the proportion
of variability shared by two variables
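As a worked illustration of shared variability, take r = .555, the value this deck's correlation table reports for test 1 and test 2. Squaring it gives a rough estimate of the proportion of variance the two tests share:

$$r^2 = (.555)^2 \approx .308$$

On that reading, the two tests share roughly 31% of their variability, and about 69% is not shared.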
Correlations (Pearson; N = 105 for every pair)

          test 1    test 2    test 3
test 1    1.000     .555**    .364**
test 2    .555**    1.000     .613**
test 3    .364**    .613**    1.000

Sig. (2-tailed) = .000 for each off-diagonal correlation.
**. Correlation is significant at the 0.01 level (2-tailed).
$$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
rxy = correlation coefficient between x & y
n = size of sample
X = score on X variable
Y = score on Y variable
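The raw-score formula for the correlation coefficient can be sketched in a few lines of Python. This is a minimal illustration, not the software used to produce the slides. The function name `pearson_r` and the five pairs of scores are made up for the example, and the denominator uses the square root that belongs to the standard Pearson formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation computed with the raw-score formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2 scores)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return numerator / denominator


# Hypothetical scores for five people on two tests (illustrative only).
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]
r = pearson_r(test_a, test_b)
print(round(r, 3))
```

Note that the result describes the group of five people as a whole; it says nothing about any single person's pair of scores.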
.80 to 1.0    very strong
.60 to .80    strong
.40 to .60    moderate
.20 to .40    weak
.00 to .20    weak/none
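The strength bands can be turned into a small lookup, sketched below. Two choices here are assumptions rather than something the slides state: a value that sits exactly on a boundary (such as .80) is placed in the higher band, and the sign of the correlation is ignored so that only its size is labeled. The function name `describe_strength` is hypothetical.

```python
def describe_strength(r):
    """Label the size of a correlation using the bands from the slide."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")
    size = abs(r)  # assumption: direction is ignored, only magnitude is labeled
    if size >= 0.80:  # assumption: boundary values go to the higher band
        return "very strong"
    if size >= 0.60:
        return "strong"
    if size >= 0.40:
        return "moderate"
    if size >= 0.20:
        return "weak"
    return "weak/none"


for value in (0.555, 0.364, 0.613):
    print(value, describe_strength(value))
```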
Relationships of .70 or stronger are generally
considered acceptable in reliability analyses
Reliability: the extent to which a scale measures
a construct consistently
Any measurement is an observed score
(observed score = true score + error)
Reliability = true score variance /
(true score variance + error variance)
Less error = observed score is closer to true
score (more reliable)
We never know the “true score”
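In classical test theory the reliability ratio is usually stated in terms of variances: true-score variance divided by the sum of true-score and error variance. The sketch below only illustrates how that ratio behaves. The variance figures are invented, since in practice the true score is never observed, and the function name `reliability` is hypothetical.

```python
def reliability(true_variance, error_variance):
    """Proportion of observed-score variance that is true-score variance."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("reliability is undefined when there is no variance")
    return true_variance / observed_variance


# Invented values: same true-score variance, increasing error.
low_error = reliability(8.0, 2.0)
high_error = reliability(8.0, 8.0)
print(low_error, high_error)
```

With less error the ratio moves toward 1, which is the sense in which the observed score is "closer to" the true score.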
Test-retest reliability: extent to which a test is
reliable over time
Calculate the correlation between each person's
scores at two time points
◦ Scores should relate positively
*Sometimes you expect the scores to be
different
Parallel forms reliability: extent to which two
forms of a test are equivalent
Calculate the correlation between the two
forms of the test
Internal consistency: extent to which items are
consistent with one another and represent one dimension
Correlation between individual scores and the
total score
Also estimate correlations among the items
Important that all items use the same scale
and be in the same direction
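Putting items "in the same direction" usually means reverse-scoring negatively worded items before computing totals or correlations. The sketch below assumes a 1-to-5 response scale and hypothetical item names; the slides do not state the response scale or whether any items were reverse-scored.

```python
def reverse_score(score, scale_min=1, scale_max=5):
    """Flip a response so that high values mean the same thing on every item."""
    if not scale_min <= score <= scale_max:
        raise ValueError("score is outside the response scale")
    return scale_min + scale_max - score


# Hypothetical responses from one person; "go_wrong" stands for a negatively
# worded item, so a low raw score reflects an optimistic answer.
responses = {"expect_best": 4, "optimistic": 5, "go_wrong": 2}
aligned = dict(responses)
aligned["go_wrong"] = reverse_score(responses["go_wrong"])
print(aligned)
```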
Cronbach’s alpha (α)
$$\alpha = \frac{k}{k-1}\left(\frac{s_y^2 - \sum s_i^2}{s_y^2}\right)$$
k = number of items
$s_y^2$ = variance associated with the observed (total) score
$\sum s_i^2$ = sum of the variances of the individual items
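The alpha formula can be sketched directly from its three ingredients. The data below are invented (three items, five respondents), the function name `cronbach_alpha` is hypothetical, and sample variances (n - 1 in the denominator) are assumed for both the item scores and the total score.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("every item needs the same respondents (at least 2)")
    # Total score for each respondent across all items.
    totals = [sum(person) for person in zip(*item_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (total_variance - item_variance_sum) / total_variance


# Invented responses: three items answered by five people.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 3],
    [4, 2, 5, 3, 3],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Items that move together inflate the variance of the total score relative to the sum of the item variances, which is what pushes alpha upward.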
Reliability Statistics
Cronbach's Alpha: .582
Cronbach's Alpha Based on Standardized Items: .576
N of Items: 3
Inter-Item Correlation Matrix
Item 1 = "In uncertain times, I usually expect the best."
Item 2 = "I’m always optimistic about my future."
Item 3 = "If something can go wrong for me, it will."

          Item 1    Item 2    Item 3
Item 1    1.000     .474      .262
Item 2    .474      1.000     .200
Item 3    .262      .200      1.000
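The standardized-items alpha can be recovered from the inter-item correlations alone, using the standard formula k·r̄ / (1 + (k − 1)·r̄), where r̄ is the mean inter-item correlation. The sketch below applies it to the three correlations in the matrix (.474, .262, .200); the function name `standardized_alpha` is hypothetical.

```python
def standardized_alpha(correlations, k):
    """Alpha based on standardized items, from the unique inter-item correlations."""
    expected_pairs = k * (k - 1) // 2
    if k < 2 or len(correlations) != expected_pairs:
        raise ValueError("supply one correlation for each unique pair of items")
    mean_r = sum(correlations) / len(correlations)
    return k * mean_r / (1 + (k - 1) * mean_r)


# Unique off-diagonal values from the inter-item correlation matrix.
r_values = [0.474, 0.262, 0.200]
alpha_std = standardized_alpha(r_values, k=3)
print(round(alpha_std, 3))
```

The result rounds to .576, matching the value reported for standardized items, and it sits well below the .70 level the deck describes as generally acceptable.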
Inter-rater reliability: agreement between two raters
ir = (# of agreements) / (# of possible agreements)
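The agreement ratio is simple enough to compute by hand, but a short sketch makes the counting explicit. The ratings below are invented, and the function name `percent_agreement` is hypothetical. The sketch treats each rated case as one possible agreement.

```python
def percent_agreement(rater_a, rater_b):
    """Share of cases on which two raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return agreements / len(rater_a)


# Invented ratings of five cases by two raters.
rater_a = ["yes", "no", "yes", "yes", "no"]
rater_b = ["yes", "no", "no", "yes", "no"]
agreement = percent_agreement(rater_a, rater_b)
print(agreement)
```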
Validity: the extent to which the scale measures
what it is intended to measure
A scale can be reliable without being valid
Content validity: items sample the universe of
items for a construct
Can ask an expert (or several) whether items
seem representative
Criterion validity: scale relates to other measures
or behaviors in ways that would be expected
Concurrent
◦ Criterion measured at the same time
Predictive
◦ Scale predicts later scores
Construct validity: scale measures the underlying
construct as intended
Relation to the behaviors that the construct
represents