AN OVERVIEW OF
THE FAMILY OF
RASCH MODELS
Elena Kardanova
University of Ostrava
Czech Republic
26-31 March 2012

The family of Rasch measurement
models is a way to make sense of the
world.
Benjamin D. Wright
Advantages of Rasch Models

- They are the simplest models that provide parameter invariance.
- They include a minimal number of parameters.
- The parameters have a simple interpretation and can be easily estimated (on an interval scale, with an estimate of precision).
- They can be applied to all item types used in educational and psychological tests.
- The theory of item and examinee analysis is well developed.
- All specific testing problems can be easily solved.
Family of Rasch models:
- Dichotomous Rasch Model
- Partial Credit Model
- Rating Scale Model
- Binomial and Poisson Models
- Many-Facet Rasch Model
- Multidimensional Rasch Model

Criteria for model choice:

- The number of response categories: two vs. more than two
- The structure of response alternatives in polytomous items: common vs. individual
- The number of attempts at an item: one attempt vs. more than one
- The number of examinee parameters: one ability vs. more than one
- The number of factors influencing examinee performance: only item difficulty vs. item difficulty plus additional factors
Relationship between basic
Rasch models
Software
- Winsteps (Dichotomous Model, PCM, RSM, Binomial and Poisson Models)
- ConQuest (all models except Binomial and Poisson Models)
- Facets (Many-Facet Model)
- Other IRT software (depends on the software)

Dichotomous Rasch Model
$$P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}$$

- P(Xni = 1 | θn, δi) is the probability that examinee n (n = 1, …, N) with ability θn answers item i (i = 1, …, I) with difficulty δi correctly.
- The model is called one-parameter because each item is described by a single parameter, its difficulty, so the probability Pni is a function of the difference (θn - δi) only.
- The model is also called logistic because this function is the logistic function.
Item Characteristic Curve
in Rasch Dichotomous model
δi – the point on the ability scale where the probability of a correct response is
0.5. The greater the value of this parameter, the greater the ability that is
required for an examinee to have a 50% chance of getting the item correct;
hence, the harder the item.
In theory the parameter δi can vary from -∞ to +∞, but in practice its values typically lie between about -3 and +3.
ICCs of three items in the Rasch model
with difficulties δ1 = -1, δ2 = 0 and δ3 = +1
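The three curves named in the caption above can be tabulated with a minimal sketch like the following. The function name `rasch_prob` and the coarse ability grid are illustrative choices, not part of the original slides.

```python
import math

def rasch_prob(theta: float, delta: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Algebraically equal to exp(theta - delta) / (1 + exp(theta - delta)).
    """
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

# Difficulties of the three items from the caption: -1, 0 and +1 logits.
difficulties = [-1.0, 0.0, 1.0]

# Tabulate the three ICCs on a coarse ability grid from -3 to +3 logits.
for theta in range(-3, 4):
    row = [rasch_prob(theta, d) for d in difficulties]
    print(f"{theta:+d}", " ".join(f"{p:.3f}" for p in row))
```

Each column equals 0.5 exactly where the ability matches the item difficulty, and at any fixed ability the easier item always has the higher probability, which is why the curves never cross.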
Assumptions of the Rasch model

- ICCs differ only in their location along the ability scale; they are parallel and do not cross.
- Item difficulty is the only item characteristic that influences examinee performance.
- All items are equally discriminating.
- The lower asymptote of the ICC is zero: for examinees of very low ability the probability of correctly answering the item approaches zero (no guessing).
Parameter Interpretation in
Dichotomous Rasch Model

The ability level of an examinee is defined as the log odds that this examinee correctly answers an item of difficulty 0 (denoted here as item 1):

$$\theta_n = \ln \frac{P_{n1}}{1 - P_{n1}}$$

The difficulty level of an item is defined as the log odds that an examinee of ability 0 (denoted here as examinee 1) answers this item incorrectly:

$$\delta_i = \ln \frac{1 - P_{1i}}{P_{1i}}$$
Parameter Separation in Rasch
Model

$$\ln \frac{P_{ni}}{1 - P_{ni}} = \theta_n - \delta_i$$

- The log odds that a person passes an item is simply the difference between the examinee's ability level and the item difficulty.
- Item and examinee parameters are completely separated, making it possible to estimate examinee ability independently of item difficulty, and to estimate item difficulty independently of examinee ability.
- Item and examinee parameters lie on the same linear scale.
- The unit of measurement on this scale is the logit (short for log-odds unit).
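To make the logit scale concrete, the following sketch checks numerically that the log odds of success equal θn - δi, and shows what a given logit gap means as a probability. The helper names and the example values are hypothetical.

```python
import math

def rasch_prob(theta: float, delta: float) -> float:
    """Dichotomous Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def log_odds(p: float) -> float:
    """Natural-log odds of an event with probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

# Illustrative values: an examinee one logit above the item.
theta, delta = 1.5, 0.5
p = rasch_prob(theta, delta)
print(log_odds(p), theta - delta)  # the two agree up to rounding error

# Success probability for several ability-minus-difficulty gaps (in logits).
for gap in (-2, -1, 0, 1, 2):
    print(f"{gap:+d} logits -> P = {rasch_prob(gap, 0.0):.3f}")
```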
Concept of “Specific Objectivity”
in Rasch Model

Comparisons between objects must be invariant over the specific
conditions under which they were observed:
- comparisons between persons must be invariant over the specific
items used to measure them,
- comparisons between items must be invariant over the specific
persons used to calibrate them.

Only Rasch models guarantee this property.
Invariant-Person Comparisons: the same
differences are observed regardless of the
items
Consider the Rasch model predictions for the log odds of two persons with abilities θ1 and θ2 on an item with difficulty δi:

$$\ln \frac{P_{1i}}{1 - P_{1i}} = \theta_1 - \delta_i, \qquad \ln \frac{P_{2i}}{1 - P_{2i}} = \theta_2 - \delta_i$$

Subtracting the second equation from the first yields:

$$\ln \frac{P_{1i}}{1 - P_{1i}} - \ln \frac{P_{2i}}{1 - P_{2i}} = (\theta_1 - \delta_i) - (\theta_2 - \delta_i) = \theta_1 - \theta_2$$

Thus, the difference in log odds for any item is simply the difference between the two abilities: the item difficulty δi drops out of the equation. So the same difference in performance between the two persons is expected, regardless of item difficulty.
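The cancellation of δi can also be checked numerically. In this sketch the two abilities and the three item difficulties are arbitrary illustrative values; the log odds are computed from the model probabilities rather than from the closed form, so the check is not circular.

```python
import math

def rasch_log_odds(theta: float, delta: float) -> float:
    """Log odds of success, computed from the Rasch probability itself."""
    p = 1.0 / (1.0 + math.exp(-(theta - delta)))
    return math.log(p / (1.0 - p))

theta_1, theta_2 = 1.2, -0.3  # two hypothetical examinees

# Compare the two persons on an easy, a medium and a hard item.
differences = []
for delta in (-2.0, 0.0, 2.0):
    diff = rasch_log_odds(theta_1, delta) - rasch_log_odds(theta_2, delta)
    differences.append(diff)
    print(f"delta = {delta:+.1f}: difference in log odds = {diff:.6f}")
```

Every row shows the same difference, θ1 - θ2 = 1.5 logits, whichever item is used for the comparison.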
Invariant-Item Comparisons: differences
between items don’t depend on the
particular persons used to compare them
Consider two items with difficulties δ1 and δ2 and the following two equations for the log odds of the two items for any person n:

$$\ln \frac{P_{n1}}{1 - P_{n1}} = \theta_n - \delta_1, \qquad \ln \frac{P_{n2}}{1 - P_{n2}} = \theta_n - \delta_2$$

Subtracting the second equation from the first yields:

$$\ln \frac{P_{n1}}{1 - P_{n1}} - \ln \frac{P_{n2}}{1 - P_{n2}} = (\theta_n - \delta_1) - (\theta_n - \delta_2) = \delta_2 - \delta_1$$

The ability level drops out of the equation. So the expected difference in performance for any examinee is the difference between the item difficulties.
Other IRT Models (2PL and 3PL) fail to meet
“specific objectivity”:
For example, comparing two persons in the framework of the 2PL model yields the following:

$$\ln \frac{P_{1i}}{1 - P_{1i}} = a_i(\theta_1 - \delta_i), \qquad \ln \frac{P_{2i}}{1 - P_{2i}} = a_i(\theta_2 - \delta_i)$$

And further:

$$\ln \frac{P_{1i}}{1 - P_{1i}} - \ln \frac{P_{2i}}{1 - P_{2i}} = a_i(\theta_1 - \delta_i) - a_i(\theta_2 - \delta_i) = a_i(\theta_1 - \theta_2)$$

The right-hand side of this equation contains the discrimination parameter ai of the item. So, unlike in the Rasch model, the expected difference in performance does not depend only on the abilities: it is proportional to their difference, with a proportionality factor ai that depends on the particular item.
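A small numerical contrast may help here. The sketch below compares the same two examinees on two 2PL items that differ only in discrimination; the function name and all parameter values are made up for illustration.

```python
import math

def log_odds_2pl(theta: float, delta: float, a: float) -> float:
    """Log odds of success under the 2PL model, computed via the probability."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - delta)))
    return math.log(p / (1.0 - p))

theta_1, theta_2 = 1.2, -0.3  # two hypothetical examinees

# Two hypothetical items with equal difficulty but different discrimination.
items = {"low discrimination": (0.0, 0.5), "high discrimination": (0.0, 2.0)}

differences = {}
for label, (delta, a) in items.items():
    diff = log_odds_2pl(theta_1, delta, a) - log_odds_2pl(theta_2, delta, a)
    differences[label] = diff
    print(f"{label}: difference in log odds = {diff:.6f}")
```

The two items give differences of a(θ1 - θ2), that is 0.75 and 3.0 logits, so the comparison between the same two persons now depends on which item is used.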
Parameter Estimation in Rasch
Model

- The total number of parameters to be estimated in the dichotomous Rasch model is N + I, where N is the number of examinees and I is the number of items.
- Methods of mathematical statistics are used for point estimation of the parameters. Most estimation methods employ some form of maximum likelihood, either with or without distributional assumptions about the parameters.
- Under the Rasch model, raw scores are sufficient statistics for both item and person measures. This means that all examinees with the same raw score (on the same set of items) get the same ability estimate, and similarly all items with the same raw score get the same difficulty estimate. Owing to this property, all measures can be estimated simultaneously.
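The sufficiency of the raw score can be illustrated with a simplified sketch. It covers only the person side and assumes the item difficulties are already known, which is not the joint estimation of all N + I parameters described above. The Newton-Raphson routine `estimate_ability` and the five difficulties are hypothetical.

```python
import math

def rasch_prob(theta: float, delta: float) -> float:
    """Dichotomous Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def estimate_ability(responses, difficulties, tol=1e-10, max_iter=100):
    """ML ability estimate for known item difficulties (Newton-Raphson)."""
    raw_score = sum(responses)
    if raw_score == 0 or raw_score == len(responses):
        raise ValueError("zero and perfect scores have no finite ML estimate")
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_prob(theta, d) for d in difficulties]
        gradient = raw_score - sum(probs)             # d logL / d theta
        information = sum(p * (1 - p) for p in probs)  # -d2 logL / d theta2
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta

difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]  # assumed known, in logits

# Two different response patterns with the same raw score of 3.
theta_a = estimate_ability([1, 1, 1, 0, 0], difficulties)
theta_b = estimate_ability([0, 1, 0, 1, 1], difficulties)
print(theta_a, theta_b)
```

The response pattern enters the likelihood equation only through the raw score, so both patterns lead to the same estimate.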
Probability Curves for Rasch
Dichotomous Model
πni0 and πni1 are the probabilities that examinee n scores 0 and 1, respectively, on item i.
In the dichotomous case πni1 = Pni and πni0 = 1 - πni1 = 1 - Pni.
Partial Credit Model

- A simple extension of the dichotomous Rasch model: one or more intermediate levels of performance are allowed.
- Different levels of performance are labelled 0 (no steps taken), 1, 2, …, m (the highest level of performance possible).
- In order to reach the highest category m, an examinee must complete m steps consecutively, getting 1 point for each of them. Each step can be taken only if the previous step has been completed.
- The difficulty of each step doesn't depend on the difficulties of other steps.
Two-step item (m=2)

- Performance levels: 0 (incorrect, poor quality), 1 (partially correct, good quality) and 2 (completely correct, superior quality).
- The item has an intermediate scoring level, which makes it possible to award a point for a partially completed item.
- Such an item has three possible categories and two steps.
The probability of completing each step can be described by a Rasch model:

$$P_{ni1} = \frac{\exp(\theta_n - \delta_{i1})}{1 + \exp(\theta_n - \delta_{i1})}, \qquad P_{ni2} = \frac{\exp(\theta_n - \delta_{i2})}{1 + \exp(\theta_n - \delta_{i2})}$$

Pni1 - probability of person n scoring 1 rather than 0 on item i
Pni2 - probability of person n scoring 2 rather than 1 on item i
θn - ability level of examinee n
δi1 and δi2 - step difficulties in item i
Item Operating Curves for two-step
item (Step Characteristics Curves)
Partial Credit Model

$$\pi_{nik} = \frac{\exp \sum_{j=0}^{k} (\theta_n - \delta_{ij})}{\sum_{l=0}^{m_i} \exp \sum_{j=0}^{l} (\theta_n - \delta_{ij})}$$

where, by the usual convention, the j = 0 term of each inner sum is taken to be zero.

πnik is the probability that examinee n with ability θn gets score k on item i.
k is the count of completed item steps: k = 0, 1, …, mi, where mi is the number of steps in item i.
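The category probabilities of the Partial Credit Model can be computed directly from the formula. The following is a minimal sketch that assumes the usual convention that the j = 0 term of the sum is zero; the function name `pcm_probs` and the example step difficulties are illustrative.

```python
import math

def pcm_probs(theta, step_difficulties):
    """Category probabilities pi_0, ..., pi_m for one item under the PCM.

    step_difficulties holds delta_i1, ..., delta_im for the item.
    """
    # Running sums of (theta - delta_ij); the sum for k = 0 is 0 by convention.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    numerators = [math.exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]

# A hypothetical two-step item with step difficulties -0.5 and 0.8 logits.
theta, steps = 0.3, [-0.5, 0.8]
probs = pcm_probs(theta, steps)
print([round(p, 3) for p in probs])

# Log odds of adjacent categories reduce to theta - delta_ik.
for k in (1, 2):
    print(k, math.log(probs[k] / probs[k - 1]), theta - steps[k - 1])
```

The last loop prints matching pairs of values: the log odds of scoring k rather than k-1 equal θn - δik.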
Category Probability Curves for
Two-Step Item
Category Probability Curves for Two-Step Item
for the case δi1 > δi2
When the second step is easier
than the first, the probability curve
for the middle response category
doesn’t dominate on any part of the
ability scale.
Even though the second step is
easier than the first, the defined
order of the response categories
requires that this easier second
step be undertaken only after the
harder first step has been
successfully completed.
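The claim about the middle category can be checked numerically. The sketch below scans an ability grid and records which category is most probable at each point, once for reversed steps (δi1 > δi2) and once for ordered steps. The step difficulties, the grid and the helper names are illustrative assumptions.

```python
import math

def pcm_probs(theta, step_difficulties):
    """Category probabilities for one item under the Partial Credit Model."""
    cumulative = [0.0]  # the sum for category 0 is 0 by convention
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    numerators = [math.exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]

def modal_categories(step_difficulties, grid):
    """Set of categories that are the most probable somewhere on the grid."""
    modal = set()
    for theta in grid:
        probs = pcm_probs(theta, step_difficulties)
        modal.add(probs.index(max(probs)))
    return modal

grid = [x / 10 for x in range(-50, 51)]  # abilities from -5 to +5 logits

reversed_steps = modal_categories([1.0, -1.0], grid)  # delta_i1 > delta_i2
ordered_steps = modal_categories([-1.0, 1.0], grid)   # delta_i1 < delta_i2
print(reversed_steps, ordered_steps)
```

With reversed steps only categories 0 and 2 are ever the most probable. With ordered steps the middle category is the most probable for abilities between the two step difficulties.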
PCM can be written as:

$$\ln \frac{\pi_{nik}}{\pi_{ni(k-1)}} = \theta_n - \delta_{ik}$$

For any step k, the log odds that a given examinee scores k rather than k-1 are determined only by the difficulty of that step, δik.
Step Characteristics Curves in PCM
(Operating Curves)
These operating curves have the same slope (so they don't cross) and differ only in their location on the ability continuum.
Item Characteristic Curve for two-step item
The ICC of a polytomous item represents the expected score on the item as a function of examinee ability level.
ICCs of several two-step items
Unlike ICCs in the dichotomous Rasch model, ICCs of different polytomous items are not parallel; they can cross.
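A sketch of the expected-score curve shows how two such ICCs can cross. The two items and their step difficulties are hypothetical values chosen so that the crossing falls at θ = 0.

```python
import math

def pcm_probs(theta, step_difficulties):
    """Category probabilities for one item under the Partial Credit Model."""
    cumulative = [0.0]  # the sum for category 0 is 0 by convention
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    numerators = [math.exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]

def expected_score(theta, step_difficulties):
    """Expected item score: the sum of k * pi_k over all categories."""
    probs = pcm_probs(theta, step_difficulties)
    return sum(k * p for k, p in enumerate(probs))

item_a = [-1.0, 1.0]  # steps far apart: flatter expected-score curve
item_b = [0.0, 0.0]   # steps together: steeper expected-score curve

for theta in (-1.0, 0.0, 1.0):
    print(theta,
          round(expected_score(theta, item_a), 3),
          round(expected_score(theta, item_b), 3))
```

Below θ = 0 item A has the higher expected score and above θ = 0 item B does, so the two curves cross at θ = 0.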
Rating Scale Model

- Can be considered a particular case of the PCM in which all items have a common response format (for example, a Likert scale).
- It is usually used to collect attitude data.
- Each item is provided with a stem (or statement of attitude) and a few response alternatives from which the respondent is required to choose one, indicating the extent to which the statement in the stem is endorsed.
- Thus, all items have m response alternatives and they are the same for all items. Completing the k-th step can be considered as choosing the k-th alternative over the (k-1)-th in response to the item.
Example: Likert Scale

- Has 4 or 5 categories: Strongly Disagree (SD), Disagree (D), Undecided or Neutral (N, may be omitted), Agree (A), Strongly Agree (SA):

SD   D   N   A   SA

- Response alternatives are ordered to represent a respondent's increasing inclination towards the concept questioned.
- A person who chooses to Agree with a statement on an attitude questionnaire can be considered to have chosen Disagree over Strongly Disagree (1st step taken), and also Neutral over Disagree (2nd step taken), and also Agree over Neutral (3rd step taken), but to have failed to choose Strongly Agree over Agree (4th step not taken).
- All responses are coded as 1, 2, 3, 4, 5, where a higher number indicates a higher degree of agreement with the statement.
Concept of item difficulty

Consider two statements from a test of computer anxiety:

I am so afraid of computers I avoid using them               SD D N A SA
I am afraid that I'll make mistakes when I use my computer   SD D N A SA

It is more than likely that the first stem indicates a much higher level of computer anxiety than does the second stem. Indeed, children who respond SA on the "mistakes" stem might endorse N on the "avoid using" stem. So we should use:

I am so afraid of computers I avoid using them                   SD D N A SA
I am afraid that I'll make mistakes when I use my computer   SD D N A SA

The first item can be considered as more difficult than the second
item. So each item can be accorded a difficulty estimate (location of
the item on the variable axis)
Concept of threshold parameter

- As the same set of rating points is used with every item, it is usually thought that the relative difficulties of the steps in each item should not vary from item to item.
- The pattern of item steps around an item location is supposed to be determined by a fixed set of threshold parameters, that is, by the fixed set of rating points used with all items.
- These threshold parameters are estimated once for the entire item set.
The difficulty of any step can be resolved into two components:

$$\delta_{ik} = \delta_i + \tau_k$$

δik - difficulty of completing the k-th step, or choosing the k-th alternative, in the response to item i
δi - the location of item i (item difficulty)
τk - the location of the k-th step in each item relative to that item's location (threshold parameter for the k-th step)

The only difference between items is the difference in their location on the variable (that is, the difference in their difficulty). The pattern of item steps around this location is described by the threshold parameters τk, k = 1, …, m, which correspond to the fixed set of rating points used with all items.
Probabilities of passing each threshold can be described by a Rasch model (for two two-step items i and j):

$$P_{ni1} = \frac{\exp(\theta_n - (\delta_i + \tau_1))}{1 + \exp(\theta_n - (\delta_i + \tau_1))}, \qquad P_{ni2} = \frac{\exp(\theta_n - (\delta_i + \tau_2))}{1 + \exp(\theta_n - (\delta_i + \tau_2))}$$

$$P_{nj1} = \frac{\exp(\theta_n - (\delta_j + \tau_1))}{1 + \exp(\theta_n - (\delta_j + \tau_1))}, \qquad P_{nj2} = \frac{\exp(\theta_n - (\delta_j + \tau_2))}{1 + \exp(\theta_n - (\delta_j + \tau_2))}$$

Pnik - probability of person n scoring k rather than k-1 (choosing the k-th alternative over the (k-1)-th) in response to item i; k = 1, 2 (Pnjk is defined in the same way for item j)
θn - ability level of examinee n
δi, δj - the locations of items i and j on the variable axis (item difficulties)
τ1 and τ2 - threshold parameters, common to both items
Item Operating Curves (Step Characteristics
Curves) for two Rating Scale Items with
Three Response Categories
Rating Scale Model

$$\pi_{nik} = \frac{\exp \sum_{j=0}^{k} (\theta_n - (\delta_i + \tau_j))}{\sum_{l=0}^{m} \exp \sum_{j=0}^{l} (\theta_n - (\delta_i + \tau_j))}$$

where, by the usual convention, the j = 0 term of each inner sum is taken to be zero.

πnik is the probability that examinee n with ability θn gets score k on item i (chooses the k-th alternative).
k = 0, 1, …, m, where m is the number of steps in every item.
RSM can be written as:

$$\ln \frac{\pi_{nik}}{\pi_{ni(k-1)}} = \theta_n - (\delta_i + \tau_k)$$

For any step k (or the k-th response category), the log odds that a given examinee chooses that category over the adjacent previous one are determined only by the difficulty of the item, δi, and the difficulty of the k-th step, τk.
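The Rating Scale Model can be sketched in the same way as the PCM, with each step difficulty built as δi + τk. The function name `rsm_probs`, the four thresholds (a five-category Likert format) and the item locations below are hypothetical, and the j = 0 term of the sum is again assumed to be zero.

```python
import math

def rsm_probs(theta, item_location, thresholds):
    """Category probabilities pi_0, ..., pi_m under the Rating Scale Model.

    thresholds holds tau_1, ..., tau_m, shared by all items.
    """
    cumulative = [0.0]  # the sum for category 0 is 0 by convention
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - (item_location + tau)))
    numerators = [math.exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]

thresholds = [-1.2, -0.3, 0.4, 1.1]  # hypothetical tau_1..tau_4
theta = 0.5

easy_item = rsm_probs(theta, 0.0, thresholds)
print([round(p, 3) for p in easy_item])

# Log odds of adjacent categories: theta - (delta_i + tau_k), here delta_i = 0.
adjacent = [math.log(easy_item[k] / easy_item[k - 1]) for k in range(1, 5)]
print([round(v, 3) for v in adjacent])

# An item located one logit higher shows the same pattern one logit further up.
hard_item_shifted = rsm_probs(theta + 1.0, 1.0, thresholds)
print([round(p, 3) for p in hard_item_shifted])
```

The first and last printed lists are identical: items differ only in their location δi, while the pattern of steps around that location is fixed by the shared thresholds.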