Workshop Neural Test Theory

Neural test theory model for
graded response data
SHOJIMA Kojiro
The National Center for University Entrance Examinations, Japan
1
Accuracy of tests
• Weighing machine
– A1 weighs 73 kg
– fW(A1)=73
• fW(A1)≠74
• fW(A1)≠72
• Academic test
– B1 scores 73 points
– fT(B1)=73
• fT(B1)≠74 ?
• fT(B1)≠72 ?
2
Discriminating ability of tests
• Weighing machine
– A1 weighs 73 kg
– A2 weighs 75 kg
• fW(A1)<fW(A2)
• Academic test
– B1 scores 73 points
– B2 scores 75 points
• fT(B1)<fT(B2) ?
3
Resolving ability of tests
• Weighing machine
– A1 weighs 73 kg
– A2 weighs 75 kg
– A3 weighs ...
• Academic test
– B1 scores 73 points
– B2 scores 75 points
– B3 scores ...
4
Neural Test Theory (NTT)
• Academic tests are an important public tool
• Precise measurements are difficult
– 10% measurement error
• Tests are at best capable of classifying academic
ability into 5–20 levels
• Neural test theory (NTT)
– Shojima, K. (2009) Neural test theory. K. Shigemasu et al. (Eds.) New
Trends in Psychometrics, Universal Academy Press, Inc., pp. 417-426.
– Test theory that uses the mechanism of a self-organizing map (SOM;
Kohonen, 1995)
– Latent scale is ordinal
5
For Qualifying Tests
• Continuous academic ability evaluation scale based on IRT or CTT
– Because individual abilities also change continuously, it is difficult to explain the relationship between scores and abilities.
• Ordinal academic ability evaluation scale based on Neural Test Theory
– Because individual abilities also change in stages, it is easy to explain the relationship between scores and abilities. This increases the test's accountability.
• Graded evaluation → Accountability → Qualification test
6
Statistical Learning Procedure in NTT
・For (t=1; t ≤ T; t = t + 1)
  ・U(t)←Randomly sort row vectors of U
  ・For (h=1; h ≤ N; h = h + 1)
    ・Obtain zh(t) from uh(t)
    ・Select winner rank for uh(t) (Point 1)
    ・Obtain V(t,h) by updating V(t,h−1) (Point 2)
  ・V(t+1,0)←V(t,N)
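The control flow of the procedure above can be sketched in Python. This is only a minimal sketch under stated assumptions: `ntt_learn`, `select_winner`, and `update` are hypothetical names, the two callbacks stand in for Point 1 and Point 2, and zh is assumed to flag which entries of uh are observed (a missing response is written as `None` here).

```python
import random

def ntt_learn(U, V0, T, select_winner, update, seed=0):
    """Run T passes over the N response vectors (rows) of U.

    select_winner(u, z, V) -> w and update(V, u, z, w, t) -> V are
    hypothetical callbacks standing in for Point 1 and Point 2.
    """
    rng = random.Random(seed)
    V = V0
    for t in range(1, T + 1):
        rows = list(U)                # copy, so U itself is not reordered
        rng.shuffle(rows)             # U(t): randomly sorted rows of U
        for u in rows:                # h = 1, ..., N
            # z_h: 1 = observed, 0 = missing (assumed meaning of z)
            z = [0 if x is None else 1 for x in u]
            w = select_winner(u, z, V)     # Point 1: winner rank
            V = update(V, u, z, w, t)      # Point 2: V(t,h) from V(t,h-1)
        # V(t+1,0) <- V(t,N): V simply carries over to the next pass
    return V
```

Passing the two steps in as functions keeps the loop independent of whether the winner is chosen by ML, MAP, or least squares.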
7
NTT Mechanism
[Figure: schematic of the NTT mechanism. Binary response vectors (Response: 0/1; axis: Number of items) are mapped onto the latent rank scale; the steps are marked as Point 1 and Point 2.]
8
Point 1: Winner Rank Selection
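Likelihood
$$\ln p(u_h^{(t)} \mid v_q^{(t,h-1)}) = \sum_{j=1}^{n} z_{hj}^{(t)} \left\{ u_{hj}^{(t)} \ln v_{qj}^{(t,h-1)} + \left(1 - u_{hj}^{(t)}\right) \ln\left(1 - v_{qj}^{(t,h-1)}\right) \right\}$$
where $v_q^{(t,h-1)}$ is the reference vector of rank $q$ in $V^{(t,h-1)}$.
ML
$$R_w^{(ML)}: \; w = \arg\max_{q \in Q} \ln p(u_h^{(t)} \mid v_q^{(t,h-1)})$$
Bayes
$$R_w^{(MAP)}: \; w = \arg\max_{q \in Q} \left\{ \ln p(u_h^{(t)} \mid v_q^{(t,h-1)}) + \ln p(f_q) \right\}$$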
The least squares method can also be used.
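As an illustration of the ML and MAP rules, the following sketch evaluates the log-likelihood of one response vector against each rank and returns the winner. The function names are hypothetical, ranks are 0-based, `V_cols[q]` is assumed to hold the reference vector of rank q, and the `eps` clipping is an added numerical safeguard that is not part of the slide.

```python
import math

def log_likelihood(u, z, v_q, eps=1e-9):
    """ln p(u | v_q) for binary responses u with observation flags z."""
    total = 0.0
    for u_j, z_j, v in zip(u, z, v_q):
        if z_j:                                  # items with z_j = 0 are skipped
            v = min(max(v, eps), 1.0 - eps)      # guard against log(0)
            total += u_j * math.log(v) + (1 - u_j) * math.log(1.0 - v)
    return total

def winner_rank(u, z, V_cols, log_prior=None):
    """Index w of the rank with the largest score.

    Without log_prior this is the ML rule; with log_prior[q] = ln p(f_q)
    it is the MAP (Bayes) rule.
    """
    scores = []
    for q, v_q in enumerate(V_cols):
        score = log_likelihood(u, z, v_q)
        if log_prior is not None:
            score += log_prior[q]
        scores.append(score)
    return max(range(len(scores)), key=scores.__getitem__)
```

A call such as `winner_rank([1, 0, 1], [1, 1, 1], V_cols)` gives the ML winner; supplying `log_prior` switches to the MAP rule, which can pull the winner toward ranks with a larger prior.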
9
Point 2: Update the rank reference matrix
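$$V^{(t,h)} = V^{(t,h-1)} + \left(1_n h^{(t)\prime}\right) \odot \left(z_h^{(t)} 1_Q'\right) \odot \left(u_h^{(t)} 1_Q' - V^{(t,h-1)}\right)$$
($\odot$: element-wise product)
$$h^{(t)} = \{h_{qw}^{(t)}\} \quad (Q \times 1)$$
$$h_{qw}^{(t)} = \frac{\alpha_t Q}{N} \exp\left\{ -\frac{(q - w)^2}{2 Q^2 \sigma_t^2} \right\}$$
$$\alpha_t = \frac{(T - t)\,\alpha_1 + (t - 1)\,\alpha_T}{T - 1}$$
$$\sigma_t = \frac{(T - t)\,\sigma_1 + (t - 1)\,\sigma_T}{T - 1}$$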
• The nodes of the ranks
nearer to the winner are
updated to become closer to
the input data
• h: tension
• α: size of tension
• σ: region size of learning
propagation
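The update step can be sketched as follows, assuming V is stored as an n x Q list of lists (items by ranks), ranks are 0-based, and a missing response has z_j = 0. The function names and the numeric values in the usage note are illustrative only; the slides do not give values for α or σ.

```python
import math

def schedule(t, T, first, last):
    """Linear change from `first` at t = 1 to `last` at t = T (used for alpha_t, sigma_t)."""
    return ((T - t) * first + (t - 1) * last) / (T - 1)

def tension(w, Q, N, alpha_t, sigma_t):
    """h_qw for every rank q, given the winner rank w."""
    return [alpha_t * Q / N * math.exp(-((q - w) ** 2) / (2 * Q**2 * sigma_t**2))
            for q in range(Q)]

def update(V, u, z, w, N, alpha_t, sigma_t):
    """Return a new n x Q matrix moved toward u; rows with z_j = 0 are unchanged."""
    Q = len(V[0])
    h = tension(w, Q, N, alpha_t, sigma_t)
    return [[V[j][q] + h[q] * z[j] * ((u[j] if z[j] else 0) - V[j][q])
             for q in range(Q)]
            for j in range(len(V))]
```

For example, with `V = [[0.5, 0.5, 0.5]]`, `u = [1]`, `z = [1]`, winner `w = 0`, `N = 10`, and `alpha_t = sigma_t = 1`, the entry at the winner rank moves furthest toward 1 and the shift shrinks with distance from the winner.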
10
Analysis Example
• A geography test
– N = 5000
– n = 35
– Median = 17
– Max = 35
– Min = 2
– Range = 33
– Mean = 16.911
– Sd = 4.976
– Skew = 0.313
– Kurt = -0.074
– Alpha = 0.704
[Figure: histogram of test scores; x-axis SCORE (0 to 35), y-axis FREQUENCY (0 to 500)]
11
Fit Indices
[Panels: ML, Q=10 and ML, Q=5]
• Useful for determining the number of latent
ranks
12
Item Reference Profiles
A monotonically increasing constraint can be imposed
13
Test Reference Profile (TRP)
• Weighted sum of IRPs
• Expected value of each
latent rank
• Weakly ordinal alignment condition
– TRP increases monotonically, but not all IRPs increase monotonically
• Strongly ordinal alignment condition
– All IRPs increase monotonically → TRP also increases monotonically
• For the latent scale to be an ordinal scale, it must at least satisfy the weakly
ordinal alignment condition (WOAC).
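The TRP and the two alignment conditions can be checked with a short sketch. The names are hypothetical, `irps[j][q]` is assumed to be the IRP value of item j at rank q, weights are assumed non-negative (default 1 per item), and "increases monotonically" is read here as non-decreasing, which is an assumption.

```python
def compute_trp(irps, weights=None):
    """Weighted sum of the IRPs: one expected value per latent rank."""
    if weights is None:
        weights = [1.0] * len(irps)
    n_ranks = len(irps[0])
    return [sum(w * irp[q] for w, irp in zip(weights, irps))
            for q in range(n_ranks)]

def is_nondecreasing(profile):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(profile, profile[1:]))

def ordinal_alignment(irps, weights=None):
    """Return 'strong', 'weak', or 'none' for the alignment condition met."""
    if all(is_nondecreasing(irp) for irp in irps):
        return "strong"        # every IRP is monotone, so the TRP is too
    if is_nondecreasing(compute_trp(irps, weights)):
        return "weak"          # TRP is monotone although some IRP is not
    return "none"
```

For instance, `ordinal_alignment([[0.2, 0.5, 0.9], [0.3, 0.2, 0.8]])` returns `"weak"`: the second IRP dips at the middle rank, but the summed profile still rises.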
14
Rank Membership Profile (RMP)
• Posterior distribution of the latent rank to
which each examinee belongs
RMP
$$p_{iq} = \frac{p(u_i \mid v_q)\, p(f_q)}{\sum_{q'=1}^{Q} p(u_i \mid v_{q'})\, p(f_{q'})}$$
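A minimal sketch of the RMP computation for one examinee follows. It assumes binary responses, reference values strictly between 0 and 1, a uniform prior when none is given, and `None` for a missing response; the function name is hypothetical. Working in log space before normalizing is an implementation choice to avoid underflow with many items.

```python
import math

def rank_membership_profile(u, V_cols, prior=None):
    """Posterior probability of each latent rank given responses u.

    V_cols[q] is the reference vector of rank q; prior[q] = p(f_q).
    """
    Q = len(V_cols)
    if prior is None:
        prior = [1.0 / Q] * Q                    # uniform prior (assumption)
    log_post = []
    for v_q, p_f in zip(V_cols, prior):
        ll = sum(math.log(v if u_j == 1 else 1.0 - v)
                 for u_j, v in zip(u, v_q) if u_j is not None)
        log_post.append(ll + math.log(p_f))      # ln p(u | v_q) + ln p(f_q)
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]   # subtract max for stability
    total = sum(unnorm)
    return [x / total for x in unnorm]
```

The returned list sums to 1, so it can be plotted directly as one panel of probability against latent rank.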
15
Examples of RMP
[Figure: RMPs of Examinee 1 to Examinee 15, one panel per examinee; x-axis LATENT RANK, y-axis PROBABILITY (0 to 1)]
16
Extended Models
• Graded Neural Test Model (RN07-03)
– NTT model for ordinal polytomous data
• Nominal Neural Test Model (RN07-21)
– NTT model for nominal polytomous data
• Continuous Neural Test Model
• Multidimensional Neural Test Model
17
Graded NTT Model
Boundary Category Reference Profiles
[Figure: one panel per item, with curves marked by category number; x-axis LATENT RANK, y-axis PROBABILITY (0.0 to 1.0)]
Graded NTT Model
Item Category Reference Profile
[Figure: one panel per item, with curves marked by category number; x-axis LATENT RANK, y-axis PROBABILITY (0.0 to 1.0)]
Nominal NTT Model
Item Category Reference Profile
* Correct selection, x Combined categories selected less than 10% of the time
[Figure: one panel per item, with curves marked by category; x-axis LATENT RANK, y-axis PROBABILITY (0.0 to 1.0)]
• Website
http://www.rd.dnc.ac.jp/~shojima/ntt/index.htm
• Software
– EasyNTT
• By Prof. Kumagai (Niigata Univ.)
– Neutet
• By Prof. Hashimoto (NCUEE)
– Exametrika
• By Shojima (NCUEE)
21
Demonstration of Exametrika
22
Can-Do Chart (Example)
23