Item writing and analysis


Alternatives in Assessment
Assessment Options
Popham, W. J. (1995). Classroom assessment: What teachers need to know. Boston: Allyn and Bacon.
Item Types
• Selected-response
• Constructed-response
• Personal response
Selected-response formats
• Conventional multiple-choice
• Matching
• Alternate choice
• True-False
• Complex multiple-choice
• Multiple True-False
• Context-dependent item sets (reading comprehension)
Conventional MC item
Stem:
“I’d like to buy some more coffee, please.”
“I’m sorry, but there doesn’t seem to be ____.”
Options (* = key; the other options are distractors):
*A. any left
B. left any
C. leaving any
D. some left
(C.E.L.T. Test of English Structure)
Matching
Words: original, private, royal, slow, sorry, total
Meanings to match: complete, first, not public
• Keep items similar
• Unequal number of choices
• + Efficient
• - Can lead to trivial lists
Nation, I. S. P. (1990). Teaching and learning vocabulary. Boston: Heinle & Heinle.
Alternate choice
The team needs new (shirts / shorts).
(Oxford Placement Test – Listening)
• Comparison between choices, not a true/false judgment
• Many tests have only two plausible distractors anyway
• + Efficient
• Score range 50% - 100% (blind guessing alone earns about 50%)
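The 50% floor comes from blind guessing: with k equally attractive options, the expected chance score is 1/k. The following is a minimal Python sketch under that assumption. The function names are illustrative, and the classical correction-for-guessing formula (right minus wrong divided by k - 1) is a standard technique that the slides themselves do not mention.

```python
def chance_score(n_options: int) -> float:
    """Expected proportion correct from blind guessing among n_options choices."""
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return 1.0 / n_options


def corrected_score(n_right: int, n_wrong: int, n_options: int) -> float:
    """Classical correction for guessing: right - wrong / (k - 1).

    Omitted items count as neither right nor wrong.
    Assumes every wrong answer was a blind guess, which is rarely exactly true.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return n_right - n_wrong / (n_options - 1)


if __name__ == "__main__":
    print(chance_score(2))             # alternate choice or true-false
    print(chance_score(4))             # four-option multiple choice
    print(corrected_score(30, 10, 2))  # 40 two-option items: 30 right, 10 wrong
```

On two-option formats the correction is severe: every wrong answer cancels one right answer, which is one way of seeing why only the upper half of the score scale carries information.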
True - False
A person who is lethargic is full of energy.
TRUE / FALSE
• Popular in many textbook exercises
• + Efficient
• + Realistic judgment
• - Guessing affected by test-takers’ personality
• Score range 50% - 100%
Complex multiple choice
The fluid imbalance known as edema is commonly associated
with:
1. Allergic reactions.
2. Congestive heart failure.
3. Extensive burns.
4. Protein deficiency.
The correct answer is:
a. 1, 2, and 3.
b. 1 and 3.
c. 2 and 4.
d. 4 only.
*e. 1, 2, 3, and 4.
• More difficult to process than regular MCQ
• Does not discriminate well
BYU Guidebook for Developing Multiple-Choice Tests
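Whether an item "discriminates well" is usually checked during item analysis. The sketch below computes the upper-lower discrimination index (proportion correct in the top-scoring group minus proportion correct in the bottom-scoring group). This statistic is a common convention, not something the slides specify; the function name and the 27% group size are assumptions made for illustration.

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index D for one item.

    item_correct: list of 0/1 flags, one per test taker, for this item.
    total_scores: list of total test scores, in the same order.
    fraction:     share of test takers in each extreme group (assumed 27%).
    """
    n = len(total_scores)
    if n == 0 or n != len(item_correct):
        raise ValueError("inputs must be non-empty and of equal length")
    group = max(1, round(n * fraction))
    # Indices of test takers ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:group], order[-group:]
    p_upper = sum(item_correct[i] for i in upper) / group
    p_lower = sum(item_correct[i] for i in lower) / group
    return p_upper - p_lower


if __name__ == "__main__":
    totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
    item = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
    print(discrimination_index(item, totals))
```

Values near 1 mean that high scorers get the item right and low scorers do not; values near 0 mean the item separates the two groups poorly.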
Multiple True-False
The fluid imbalance known as edema is commonly associated with:
1. Allergic reactions. (T/F)
2. Congestive heart failure. (T/F)
3. Extensive burns. (T/F)
4. Protein deficiency. (T/F)
• Also often used in textbooks
• Fewer problems than complex multiple-choice
Context-dependent item sets
• Sometimes known as ‘testlets’, ‘item
bundles’, ‘scenarios’
• Reading comprehension passage with
multiple questions
• Listening comprehension passage with
multiple questions
• Items are usually not independent
• Should ideally be treated as a group
Interlinear items
Rathmell, G. (1985). Test of English proficiency level. Hayward, CA: The Alemany Press.
Items using visual support
Reading proficiency test by Siwon Park, piloted in HELP program
Advantages of selected response
• easy to administer
• easy to score
• scoring is “objective”
Disadvantages of selected-response
• relatively difficult to create
• requires no language production
• tends to measure knowledge rather than skill
Constructed-response formats
• Fill-in
• Cloze
• C-test
• Short answer
• Essay
• Interview
• Performance (weak / strong)
Fill-in item
Tom loves coffee. He drinks it ______ day.
• Requires production but is constrained
• Can target specific knowledge
• Potential for ambiguity
Cloze
The science of automatic control depends on certain common
principles by which an organism, machine, or system regulates itself.
Many historical developments up to the present day have helped to
identify these principles. For hundreds of years there ___(1)____ many
examples of automatic control systems, but no connections were
recognized among ___(2)____. A very early example was a device on
windmills designed ___(3)___ keep their sails facing into the wind ….
(Heisei University)
• Rational versus mechanical deletion
+ (Relatively) easy to create and score
- Unclear what cloze actually measures
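Mechanical deletion means blanking every nth word regardless of what it is, whereas rational deletion means the test writer chooses which words to remove. A minimal Python sketch of the mechanical variant follows. The function name, the default of every 7th word, and the blank format are assumptions for illustration, not the procedure behind the sample passage above.

```python
import string


def mechanical_cloze(text, nth=7, lead_in=0):
    """Blank every nth word, leaving the first `lead_in` words intact.

    Returns (cloze_text, answer_key). Trailing punctuation is kept
    outside the blank so sentence boundaries stay visible.
    """
    if nth < 2:
        raise ValueError("nth must be at least 2")
    out, key = [], []
    for i, token in enumerate(text.split()):
        position = i - lead_in + 1  # 1-based count after the lead-in
        core = token.rstrip(string.punctuation)
        tail = token[len(core):]
        if position > 0 and position % nth == 0 and core:
            key.append(core)
            out.append(f"___({len(key)})___{tail}")
        else:
            out.append(token)
    return " ".join(out), key


if __name__ == "__main__":
    passage = "one two three four five six, seven eight nine"
    cloze, key = mechanical_cloze(passage, nth=3)
    print(cloze)
    print(key)
```

A rational-deletion cloze would replace the `position % nth == 0` test with a hand-picked set of target words, for example only prepositions or only content words.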
C-Test
An American friend of ours hired a car in London although he was
inexperienced in driving on the left-hand side of the road. Soon
h_(1)_ found him__(2)_ going i__(3)_ the wr__(4)_ direction ro__(5)_
a roundabout. H__(6)_ braked sha__(7)_, slid side__(8)_, and
en__(9)_ up wi__(10)_ both fr__(11)_ wheels o__(12)_ the
pave__(13)_. …
http://www.uni-duisburg.de/FB3/ANGLING/FORSCHUNG/CTENGS.htm
+Objective scoring possible
+Easy to create
-Format may confuse test takers
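The sample above appears to follow the conventional C-test rule: the first sentence is left intact, then the second half of every second word is deleted, with one-letter words skipped. The Python sketch below implements that rule as inferred from the example. The function name and blank format are illustrative, and the handling of hyphenated words and numbers (simply left intact) is a simplifying assumption.

```python
import string


def make_c_test(text):
    """Delete the second half of every second word of two or more letters.

    Pass only the portion of the passage that follows the intact lead-in
    sentence. Returns (mutilated_text, answer_key), where the key holds
    the deleted word endings in order.
    """
    out, key = [], []
    eligible = 0  # running count of words that are candidates for deletion
    for token in text.split():
        core = token.rstrip(string.punctuation)
        tail = token[len(core):]
        # Skip one-letter words and anything that is not purely alphabetic.
        if len(core) < 2 or not core.isalpha():
            out.append(token)
            continue
        eligible += 1
        if eligible % 2 == 0:
            keep = len(core) // 2  # odd-length words lose the larger half
            key.append(core[keep:])
            out.append(f"{core[:keep]}__({len(key)})_{tail}")
        else:
            out.append(token)
    return " ".join(out), key


if __name__ == "__main__":
    body = ("Soon he found himself going in the wrong direction round a "
            "roundabout. He braked sharply, slid sideways, and ended up "
            "with both front wheels on the pavement.")
    mutilated, key = make_c_test(body)
    print(mutilated)
    print(key)
```

Because each blank has exactly one restoration that fits, scoring can be done by comparing responses against the answer key, which is what makes objective scoring possible.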
Short answer / Essay
ESSAY: Recently, incidents depicting violence have been increasingly
common in primetime TV programs. Some people argue that watching
violent acts encourages people to become more violent. Other people
claim that watching violence fulfills a natural desire for aggression and
actually reduces violent behavior. Which position do you agree with?
Write an essay defending your position. Be sure to support your ideas.
+Easy to create
+Extended sample of language
-Scoring can be subjective
-Interaction between language and other abilities
Interview
• Very popular format for assessing oral
language
+direct assessment of speaking ability
-scoring can be subjective
-influence of interviewer
Strengths of performance
assessments
• Positive washback for programs
• Focus on real world ability rather than
trivial knowledge
• Closer match to course objectives
• Can document creativity, critical thinking
skills
• Provide useful diagnostic information
about students’ abilities
Weaknesses of performance
assessments
• Logistically difficult to create and
administer on a large scale
• Use of rater judgments may lead to low
reliability
• Construct underrepresentation
• Construct-irrelevant variance
• Generalizability
Advantages of constructed-response
• no effect for guessing
• productive language use
• interaction of receptive and productive
skills
Disadvantages of constructed-response
• difficult and time consuming to score
• scoring is subjective
• “bluffing” is possible
Personal response
• Conferences
• Portfolios
• Student profiles
• Self-assessment
Advantages of personal response
• student-centered
• appropriate for assessing learning
processes
• useful for learners with unique profiles
Disadvantages of personal
response
• difficult to create and structure
• scoring is subjective
Test delivery formats
• paper and pencil
• telephone
• computer
– computer-delivered
– web delivered
– computer adaptive
Adaptive testing
Sands, W. A., & Waters, B. K. (1997). Introduction to ASVAB and CAT. In W. A. Sands, B. K. Waters, & J. R. McBride (Eds.), Computerized adaptive testing (pp. 3-10). Washington: American Psychological Association.