Estimation of Item Difficulty Index
Based on Item Response Theory for
Computerized Adaptive Testing
Authors: Shu-Chen Cheng, Guan-Yu Chen
Southern Taiwan University of Science and Technology
Intelligent System Lab. (iLab)
Outline
1. Introduction
2. Literature Reviews
3. Methods
4. Experiments and Results
5. Conclusions
1. Introduction (1/2)
• Computerized Adaptive Testing
– Item Response Theory
Advantages: Personalized tests, shorter test length.
Shortcoming: Requires a large number of pre-test samples.
• IRT-1PL: 20 items, 200 testees (Wright & Stone, 1979)
• IRT-2PL: 30 items, 500 testees (Hulin et al., 1982)
• IRT-3PL: 60 items, 1000 testees (Hulin et al., 1982)
(There are 1,513 items in our item bank!)
1. Introduction (2/2)
• Test System = Item Bank + Item Selection
– Item Difficulty Index
– Answers Abnormal Rate
– Dynamic Item Selection Strategy
– Particle Swarm Optimization
2. Literature Reviews
2.1 Computerized Adaptive Testing
2.2 Item Difficulty Index
2.3 Item Response Theory
2.1 Computerized Adaptive Testing (1/2)
• Select the item whose difficulty best matches the testee's ability.
• Assess the testee's ability immediately.
• The difficulty of the next item depends on the previous answer.
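A minimal sketch of the "closest difficulty" idea in the bullets above. The function name, the dictionary layout of the item bank, and the `administered` argument are illustrative assumptions. The system in this talk selects items with a PSO-based strategy, which this sketch does not reproduce.

```python
def select_next_item(ability, item_difficulties, administered=()):
    """Return the id of the unused item whose difficulty is closest to `ability`.

    ability           -- current ability estimate, on the same scale as the difficulties
    item_difficulties -- dict mapping item id -> difficulty index (hypothetical layout)
    administered      -- ids of items the testee has already answered
    """
    used = set(administered)
    candidates = {k: d for k, d in item_difficulties.items() if k not in used}
    if not candidates:
        raise ValueError("no items left to administer")
    # Pick the item that minimizes |difficulty - ability|.
    return min(candidates, key=lambda k: abs(candidates[k] - ability))


bank = {"q1": 0.2, "q2": 0.5, "q3": 0.8}
first = select_next_item(0.6, bank)
second = select_next_item(0.6, bank, administered=[first])
```

After each answer, the ability estimate would be updated and the function called again, so the next item's difficulty follows the previous answer.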
2.1 Computerized Adaptive Testing (2/2)
• Test for different abilities through a dynamic item selection strategy.
– High-ability testee → No overly easy items.
– Low-ability testee → No overly difficult items.
• A personalized test.
2.2 Item Difficulty Index (1/2)
• Method 1:
\[ P = \frac{R}{N} \times 100\% \]
P: Item difficulty.
R: The number of correct answers.
N: The total number of testees.
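As a worked example of Method 1, the sketch below computes P as a percentage. The function name and the sample counts are illustrative, not taken from the study.

```python
def difficulty_method1(correct, total):
    """Item difficulty P = R / N * 100 (percentage of testees answering correctly)."""
    if total <= 0:
        raise ValueError("total number of testees must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct answers must be between 0 and total")
    return correct / total * 100.0


# Hypothetical counts: 30 of 40 testees answered the item correctly.
p = difficulty_method1(30, 40)
```

Under this definition a higher P means an easier item, since more testees answered it correctly.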
2.2 Item Difficulty Index (2/2)
• Method 2:
\[ P = \frac{P_H + P_L}{2} \]
P: Item difficulty.
P_H: Correct rate of the high-score group.
P_L: Correct rate of the low-score group.
(The high and low groups are generally the top and bottom 25%, 27%, or 33% of testees by score.)
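A minimal sketch of Method 2, assuming each record is a `(total_score, answered_item_correctly)` pair and that the group size is the given fraction of all testees, truncated and at least 1. The record layout, function name, and rounding rule are assumptions for illustration.

```python
def difficulty_method2(records, fraction=0.27):
    """Item difficulty P = (P_H + P_L) / 2 from high- and low-score groups.

    records  -- list of (total_score, item_correct) pairs, one per testee
    fraction -- share of testees in each group (e.g. 0.25, 0.27, 0.33)
    """
    if not records:
        raise ValueError("no records")
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))  # group size (assumed rounding rule)
    high, low = ranked[:k], ranked[-k:]
    p_high = sum(1 for _, ok in high if ok) / k
    p_low = sum(1 for _, ok in low if ok) / k
    return (p_high + p_low) / 2


# Hypothetical data: 8 testees, groups of 2 at fraction=0.25.
data = [(95, True), (90, True), (80, True), (70, False),
        (60, True), (50, False), (40, True), (30, False)]
p = difficulty_method2(data, fraction=0.25)
```

Here the high group (scores 95 and 90) has a correct rate of 1.0 and the low group (scores 40 and 30) has 0.5, so P is 0.75.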
2.3 Item Response Theory (1/2)
• Item Response Theory (Lord, 1980)
– Estimates a testee's ability, aptitude, or location on another
continuous psychological scale from the information in
their item responses.
– Ability location → Item response (Psychometric theory)
– No information other than the IRT model is needed
to describe the item responses.
2.3 Item Response Theory (2/2)
• Three-Parameter Logistic Model (Birnbaum, 1968)
\[ P_i(\theta) = c_i + (1 - c_i) \cdot \frac{e^{a_i(\theta - b_i)}}{1 + e^{a_i(\theta - b_i)}} \]
P_i(θ): Probability of a correct answer to item i at ability θ.
a_i: Discrimination parameter of item i.
b_i: Difficulty parameter of item i.
c_i: Guessing parameter of item i.
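The three-parameter logistic model can be evaluated directly, as in the sketch below. It uses the algebraically equivalent form 1 / (1 + e^(-a(θ - b))) for the logistic term. The function name and parameter values are illustrative only.

```python
import math


def p_correct_3pl(theta, a, b, c):
    """Probability of a correct answer under the 3PL model.

    theta -- testee ability
    a     -- discrimination parameter of the item
    b     -- difficulty parameter of the item
    c     -- guessing parameter of the item (lower asymptote)
    """
    z = a * (theta - b)
    # e^z / (1 + e^z) rewritten as 1 / (1 + e^-z)
    return c + (1.0 - c) / (1.0 + math.exp(-z))


# Hypothetical item: a = 1.0, b = 0.0, c = 0.2, evaluated at theta = b.
p_at_b = p_correct_3pl(0.0, 1.0, 0.0, 0.2)
```

When θ equals b the logistic term is 0.5, so the probability is c + (1 - c) / 2. With c = 0 this reduces to exactly 0.5.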
3. Methods (1/4)
• Answers
1) Testees' ability > Item difficulty index
→ Most testees are expected to answer correctly.
2) Testees' ability < Item difficulty index
→ Most testees are expected to answer incorrectly.
3) Testees' ability = Item difficulty index
→ The correct answer rate is expected to be 50%.
3. Methods (2/4)
• Answers Abnormal
– Any answer that violates one of the three assumptions
above is counted as abnormal:
1st group with wrong answers.
(Testee's ability > Item difficulty)
2nd group with correct answers.
(Testee's ability < Item difficulty)
3rd group, correct answer rate ≠ 0.5.
(Testee's ability = Item difficulty)
3. Methods (3/4)
• Answers Abnormal Rate
AAR_ij: Answers abnormal rate of item i with difficulty j.
h: 1st group (Testee's ability > Item difficulty).
l: 2nd group (Testee's ability < Item difficulty).
e: 3rd group (Testee's ability = Item difficulty).
T: The number of correct answers.
F: The number of wrong answers.
N: The total number of testees.
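The AAR formula itself is not present in this transcript. The sketch below is therefore only one plausible reading of the symbols above, not necessarily the authors' definition. It assumes the abnormal count is F_h + T_l + |T_e - F_e| (wrong answers in the 1st group, correct answers in the 2nd group, and the imbalance between correct and wrong answers in the 3rd group), divided by N. Abilities and difficulties are given as integer levels 1 to 9 (standing for 0.1 to 0.9) to avoid floating-point equality checks. All names are hypothetical.

```python
def answers_abnormal_rate(responses, difficulty):
    """Assumed AAR for one item at candidate difficulty level `difficulty`.

    responses  -- list of (ability_level, is_correct) pairs, one per testee
    difficulty -- candidate difficulty level j (integer, same scale as ability)

    ASSUMPTION: AAR = (F_h + T_l + |T_e - F_e|) / N; the original formula
    is missing from the source.
    """
    if not responses:
        raise ValueError("no responses")
    f_h = t_l = t_e = f_e = 0
    for ability, correct in responses:
        if ability > difficulty:      # 1st group: expected to answer correctly
            f_h += 0 if correct else 1
        elif ability < difficulty:    # 2nd group: expected to answer wrong
            t_l += 1 if correct else 0
        elif correct:                 # 3rd group: expected 50% correct
            t_e += 1
        else:
            f_e += 1
    return (f_h + t_l + abs(t_e - f_e)) / len(responses)


# Hypothetical responses to one item, evaluated at difficulty level 5.
sample = [(7, True), (7, False), (3, True), (3, False), (5, True), (5, False)]
aar = answers_abnormal_rate(sample, 5)
```

In this sample one high-ability testee answered wrong and one low-ability testee answered correctly, while the equal-ability group split evenly, so 2 of 6 answers count as abnormal.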
3. Methods (4/4)
• Item Difficulty
\[ P_i = \arg\min_j \mathit{AAR}_{ij} \]
That is, P_i is the difficulty j for which AAR_ij is the smallest.
P_i: Item difficulty index of item i.
AAR_ij: Answers abnormal rate of item i with difficulty j.
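The selection rule above amounts to an argmin over the candidate difficulty levels. The sketch below assumes the AAR values for one item have already been computed for each level and stored in a dictionary. The function name, the data, and the tie-breaking rule (lowest level wins) are assumptions; the source does not say how ties are resolved.

```python
def estimate_item_difficulty(aar_by_level):
    """Return the difficulty level j with the smallest AAR for one item.

    aar_by_level -- dict mapping candidate difficulty level -> AAR at that level
    Ties are broken in favor of the lowest level (an assumption).
    """
    if not aar_by_level:
        raise ValueError("no candidate levels")
    # sorted() fixes the iteration order; min() returns the first minimum it sees.
    return min(sorted(aar_by_level), key=lambda level: aar_by_level[level])


# Hypothetical AAR values for one item at three candidate levels.
estimated = estimate_item_difficulty({0.3: 0.40, 0.5: 0.10, 0.7: 0.25})
```

Because each item is evaluated on its own responses, the same function can be applied to a newly added item without re-estimating the rest of the bank.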
4. Experiments and Results
4.1 System Descriptions
4.2 Experiment Descriptions
4.3 Results and Discussions
4.1 System Descriptions (1/3)
http://ilearning.csie.stust.edu.tw/EST/Dedault.aspx
4.1 System Descriptions (2/3)
4.1 System Descriptions (3/3)
PSO Dynamic Item Selection Strategy
• Item Difficulty
• Knowledge Weights
• Item Exposure Rate
4.2 Experiment Descriptions
• Method: Online test
• Item Bank:
– Items: 1,513
– Initial Difficulty: 0.5 (9 levels, 0.1~0.9)
• Participants:
– Students: 51
– Initial Ability: 0.2 (9 levels, 0.1~0.9)
• Period: 6 weeks
4.3 Results and Discussions (1/3)
[Figure: Number of Items (0 to 200) versus Item Difficulty Index (0.1 to 0.9).]
4.3 Results and Discussions (2/3)
[Figure: Number of Adjusted Items (0 to 800) versus Weeks (1 to 6). Data labels shown: 688, 76, 51, 29, 20, 16.]
4.3 Results and Discussions (3/3)
[Figure: Average Adjusted Levels (0 to 0.25) versus Weeks (1 to 6).]
5. Conclusions
• Each test item is treated as independent, and the item
difficulty can be estimated individually. Therefore,
the item bank can be expanded easily at any time.
• The estimation based on the answers abnormal rate
proposed in this study can estimate the item difficulty
index quickly and reasonably without too many pretest samples.
The End ~
Thanks for your attention!