Paper and Pencil Test Items

Developing Selected-Response
Items
Two General Types of
Paper and Pencil Tests
Selected Response Items:
– Binary response items (e.g., true-false).
– Multiple-choice items.
– Matching items.
Constructed Response Items:
– Fill-in-the-blank items.
– Completion and short answer items.
– Essay items.
General Item Writing Rules
(These rules are available on the website for this course)
• Provide clear and understandable
directions to students about how to
respond.
• Be sure the items themselves are clear
(unambiguous) to students.
• Do not provide unintentional cues regarding
the correct response.
• Use grammar and vocabulary consistent
with the source of instruction.
General Item Writing Rules
Keep reading level below students’
ability.
Format the item for efficient scoring.
Be sure content experts would agree on
the correct answer.
WRITE THE ITEM SO THAT IT
MEASURES THE SPECIFIED
LEARNING TARGET.
Additional Rules for Binary-Choice Items
Binary-choice (or alternate-choice) items
present a proposition for which one of
two opposing options represents the
correct answer.
Several variants exist:
True-false.
Fact-opinion.
Right-wrong.
Yes-no.
Binary-Choice Items
(Variations: Embedded true-false items)
Indicate whether each underlined word is used as
a verb (V) or as something other (O) than a verb.
Sailing has many advantages as a recreational sport.
You can sail by yourself or with others. While basic
techniques can be learned quickly, you can spend a
life-time developing your sailing skills.
Answers
V O has
V O with
V O spend
V O as
V O While
V O sailing
V O can
V O techniques
V O sail
V O learned
Binary-Choice Items
(Variations: Multiple true-false items)
Read each option and indicate which are
correct.
In comparison with multiple-choice items, an
advantage of true-false items is …
1. more items can be administered within a given time.
2. higher reliability is obtained from a given number of
items.
3. each test item can be developed in less time.
4. students will select the correct answer only when they
have achieved the skill being assessed.
Binary-Choice Items
Advantages and Limitations
Advantages:
• Allows adequate
sampling of (usually,
knowledge-level)
content.
• Relatively easy to
construct.
• Objectively and
efficiently scored.
Limitations:
• Highly susceptible to
guessing.
• Can be used only when
dichotomous answers
represent sufficient
response options.
• Usually, only indirectly
assess intellectual skills.
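To make the guessing limitation concrete, here is a minimal sketch (not part of the original slides) of how often blind guessing alone reaches a given score. It assumes every guess is independent and equally likely to land on any option; the function name `prob_at_least` and the 10-item test with a cutoff of 7 correct are hypothetical choices made only for illustration.

```python
from math import comb


def prob_at_least(n_items: int, min_correct: int, p_guess: float = 0.5) -> float:
    """Probability of getting at least `min_correct` of `n_items` right by blind guessing.

    Assumes independent guesses, each correct with probability `p_guess`
    (0.5 for a two-option item). This is the upper tail of a binomial distribution.
    """
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# Hypothetical example: a 10-item test with a cutoff of 7 correct.
true_false = prob_at_least(10, 7, p_guess=0.5)    # two options per item
four_option = prob_at_least(10, 7, p_guess=0.25)  # four-option multiple choice
print(f"true-false: {true_false:.3f}, four-option: {four_option:.4f}")
```

Under these assumptions, a pure guesser reaches the cutoff about 17% of the time on the true-false version and well under 1% of the time on the four-option version, which is one way to read "highly susceptible to guessing."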
Binary-Choice Items
Qualities of Good Binary-choice Items
• Good binary-choice items should….
– Measure the specified skill (learning target).
• This requires some serious thinking
– Require appropriate level of reading skill.
– Emphasize adjectives or adverbs when they alter
or reverse the meaning of the item.
– Have one (of two) response options that is
unequivocally correct.
– Continued on next slide
Binary-Choice Items
Qualities of Good Binary-choice Items
– Exclude adjectives and adverbs that imply an indefinite
degree.
– Avoid adjectives and adverbs that imply absolute
meaning
– Be stated as simply as possible (e.g., should exclude
“window dressing”).
– Should be written so that the incorrect response is
plausible.
– Should present a single proposition (not a double-barreled proposition).
Binary-Choice Items
Tip for improving the quality: Use contrasts
Without contrast: The reliability of short-answer tests is unaffected by guessing.
With contrast: The reliability of short-answer
tests is less affected by guessing than is the
reliability of multiple-choice tests.
Binary-Choice Items
Examples of “double-barreled” propositions
Although essay tests require less time to
construct than do multiple-choice tests,
they require more time to score.
Classroom tests should be reliable and
yield consistent scores across time.
Binary-Choice Items
Evaluation
Learning target: Information. Identify qualities
desired in multiple-choice items.
Poor Item
TRUE or FALSE: It is not important for a multiple-choice item to contain five options.
Improved Item
TRUE or FALSE: A multiple-choice item can contain as
few as two options.
Binary-Choice Items
Evaluation
Learning target: Information. Identify qualities
desired in multiple-choice items.
Poor Item
T or F: Sometimes multiple choice items are superior
to true-false items.
Improved Item
T or F: A 10-item multiple-choice test typically will be
more reliable than a 10-item true-false test.
Binary-Choice Items
Evaluation
Learning target: Information. Identify qualities
desired in multiple-choice items.
Poor Item
T F Good multiple-choice items measure
important skills.
Improved Item
T F If plausible distracters are easy to develop, a
table of specifications is of little value when
constructing multiple-choice items.
Multiple-Choice Items
Anatomy of
Multiple-Choice Items
• MC items consist of ….
– A stem,
• Either a direct question, or
• An incomplete statement to be completed.
– A correct answer, and
– Two or more distracters or foils.
Multiple-Choice Items:
Advantages and Limitations
Advantages:
• Provide for a wide sampling of content.
• Effectively structure the problem to be addressed.
• Can be quickly and objectively scored.
Limitations:
• Somewhat susceptible to guessing.
• Indirectly measure targeted behaviors.
• Time-consuming to construct.
Multiple-Choice Items
Example: Due to lack of parallel content this
item may have more than one correct
answer:
Which of the following represents the
warmest temperature?
A. 100 degrees Celsius
B. 200 degrees Fahrenheit
C. 300 degrees Kelvin
D. an oven set at medium
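The lack of parallel content in this item can be checked by putting the options on one scale. The sketch below (an illustration, not part of the original slides) converts options A through C to degrees Celsius using the standard conversion formulas; the helper names are hypothetical. Option D is left out because "an oven set at medium" names no definite temperature, which is exactly why the item can be argued to have more than one correct answer.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def kelvin_to_celsius(kelvin: float) -> float:
    """Convert kelvins to degrees Celsius."""
    return kelvin - 273.15


# Options A-C from the item, expressed on a common scale (degrees Celsius).
# Option D is omitted: it does not specify a numeric temperature.
options_in_celsius = {
    "A": 100.0,
    "B": fahrenheit_to_celsius(200.0),
    "C": kelvin_to_celsius(300.0),
}

warmest = max(options_in_celsius, key=options_in_celsius.get)
for label, value in options_in_celsius.items():
    print(f"{label}: {value:.2f} degrees Celsius")
print("Warmest of the numeric options:", warmest)
```

Among the three numeric options, A is the warmest; whether D beats it depends entirely on what a given reader takes "medium" to mean.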
Multiple-Choice Items
Qualities Continued
Options avoid repetitive words.
Example:
Criterion-referenced…
A. refers to how a test is constructed.
B. refers to how a test is interpreted.
C. refers to how a test is scored.
D. refers to how a passing score is
established.
Multiple-Choice Items
Qualities Continued
• Extraneous content (“window dressing”) is
excluded (example on next slide).
• Adjectives or adverbs are highlighted when
they reverse or alter the meaning of a stem.
• Words like not and except should be
emphasized.
• These can be used, but only when it is
important to do so.
Multiple-Choice Items
Which item stem contains window dressing?
A. What is the highest numerical value of a
reliability coefficient?
B. Although usually not obtainable, the
maximum value of a reliability coefficient 
is 1.0.
Multiple-Choice Items
Examples of uses of not and except
1. Which of the following qualities least affects
the reliability of a test?
2. All of the following represent types of validity
EXCEPT…
3. The quality that is not an advantage of
multiple-choice items is…
4. ALL BUT WHICH ONE of the following is...
Multiple-Choice Items
(Continued)
Sample item with equally plausible distracters:
Which item format requires students to spend the
greatest portion of examination time actually solving
problems presented by the items?
A. Essay
B. Short-answer
C. True-false
D. Multiple-Choice
Multiple-Choice Items
(Continued)
• Qualities desired in M-C items, continued
– Options contain grammar consistent with the
item stem.
– “All of the above” or “none of the above” is
used only when necessary.
– Options are arranged in “natural” or logical
order.
Evaluating M-C Items
Poor item:
Internal consistency is high…
A. when students who scored high on the first half of
the test score high on the second half of the test.
B. when students who scored high on the first half of
the test score low on the second half of the test.
C. when students who scored high on the first half of
the test score in an unpredictable manner on the
second half of the test.
D. when all of the above are true.
Evaluating M-C Items
Improved item:
If the internal consistency of a test is good,
how will a group of students score on the
second half of the test if they got the highest
scores on the first half of the test?
A. Highest scores.
B. Lowest scores.
C. Unpredictable scores.
Evaluating M-C Items
Learning Target: Identify characteristics of
formal and informal assessments.
Which of the following is an example of informal
assessment?
A. Allowing students a choice of which questions
they will answer.
B. Not allowing students a choice of which
questions they will answer.
C. Observing which students are paying attention.
Evaluating M-C Items
Poor item:
Various item formats have specific advantages and
limitations. An advantage the essay format has
over the multiple-choice format is:
A. the essay item can assess more skills in a given
amount of time.
B. the essay item can assess students’ ability to
evaluate ideas.
C. the essay item can be reliably scored.
D. the essay item requires students to
communicate ideas in writing.
Evaluating M-C Items
Improved item:
Which is an advantage of essay over multiple
choice items?
A. Assess more skills in a given amount of time.
B. Assess students’ ability to evaluate ideas.
C. Evaluate students’ ability to communicate
ideas.
D. Facilitate reliable scoring of answers.
Multiple-Choice Items:
Item-writing Guidelines
1. Does the stem present a clearly stated
problem or question?
2. Is extraneous content (“window dressing”)
excluded from the stem?
3. Are adjectives or adverbs emphasized
when they reverse or significantly alter
the meaning of a stem or option?
4. Are negatives avoided wherever possible
or highlighted where necessary?
5. Are the “correct” answers equally
distributed across all choice categories?
Multiple-Choice Items:
Item-writing Guidelines
6. Are options parallel in form and content?
7. Do the options avoid repetitive words?
8. Is each distracter plausible?
9. Is the grammar in each option consistent
with the stem?
10. Does the item exclude options equivalent
to “all of the above” and “none of the
above”?
11. Unless another order is more logical, are
options arranged alphabetically?
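Guideline 5 above (correct answers equally distributed across choice categories) is easy to check mechanically once a test is assembled. The following is a minimal sketch, not a procedure from the original slides; the function name `key_distribution` and the 12-item answer key are made up for illustration.

```python
from collections import Counter


def key_distribution(answer_key: list[str], options: str = "ABCD") -> dict[str, int]:
    """Count how often each option letter is the keyed (correct) answer."""
    counts = Counter(answer_key)
    # Include options that are never keyed so that gaps are visible.
    return {opt: counts.get(opt, 0) for opt in options}


# Hypothetical answer key for a 12-item, four-option test.
sample_key = ["A", "D", "C", "A", "B", "A", "D", "C", "A", "A", "B", "A"]
distribution = key_distribution(sample_key)
print(distribution)
```

In this made-up key, option A is correct for half of the items, so a student who favors A when unsure would be rewarded; moving some keyed answers to other positions would even out the distribution.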
Matching Items
• Anatomy of a matching item:
– Consist of
• Premises (or stimuli) and
• Responses.
– Advantages
• Provides for wide sampling of knowledge
targets.
• Relatively easy to construct.
• Can be scored objectively and efficiently.
Specific Item-Writing Guidelines for
Matching Items
1. Include homogeneous premises and responses.
2. Use more responses than premises.
3. Make sure directions are clear to students.
4. Keep responses short and logically ordered.
5. Use four to ten premises (and restrict to one
page).
6. Avoid grammatical clues to correct answers.
Developing Constructed-Response
Items
Paper & Pencil Constructed-response
items, that is.
Developing Constructed-Response
Items
• Major advantage of constructed-response items:
– They elicit responses that more closely
resemble real-life behavior.
• In general, however, if a selected-response item can provide the same
evaluative information as a
constructed-response item, use the
selected-response item.
Short-Answer Items:
Advantages and disadvantages
Advantages:
1. Easy to construct.
2. Require the
student to supply
an answer.
3. Many such items
can be included in
a test.
Disadvantages:
1. Generally limited
to knowledge-level
skills.
2. More likely scored
erroneously than
are selected-response items.
Short-Answer Items:
Item-writing rules
1. Use direct questions rather than incomplete
statements.
2. Write items so that the correct response is concise
(a few words or a short phrase).
3. Write items so that they can be scored efficiently.
4. Be sure there is a highly limited set of correct
responses.
5. Think of the correct response, then write the item.
Completion Items:
Item-writing rules
Same advantages/disadvantages as short-answer
items.
Same rules applicable to completion items, plus these
additional four:
1. Be sure the blank represents a key word or phrase.
2. Position blank at or near the end of the item.
3. Keep blanks the same length.
4. Use no more than two or three blanks.
Essay Items: Advantages
Unique advantage: Can assess ability to
communicate in writing (synthesize,
evaluate, compose).
Other advantages:
1. Provide more direct measures of behaviors
specified in performance objectives.
2. Require the student to produce a response.
Essay Items: Limitations
Scoring is less reliable (more subjective).
1. Inconsistent within teachers across multiple scorings of
the same responses.
2. Inconsistent within teachers across students.
3. Inconsistent among teachers on the same responses.
Provides less adequate sampling of content domain.
More time-consuming to score.
Essay Item-Writing Rules
1. Convey a clear idea of how extensive a response is expected:
– Ten minutes or less (typical for restricted-response essays).
– Specify a range for the number of words or the amount of time to be spent on
the response.
– Make the distribution of points obvious.
2. Develop a suitable scoring plan (rubric):
– Would different readers assign the same score?
– Describe what constitutes a correct and complete response.
– The rubric should be obvious to knowledgeable students.
– You do not have an essay item unless you have a rubric!
Essay Item-Writing Rules
(Continued)
3. Do not allow a choice of which items to answer.
4. Evaluate all responses one item at a time.
5. Vary the student order when reading responses.
6. Decide on the weight grammar and vocabulary
will carry beforehand.
7. Conceal the identity of students, if possible.
8. Use multiple scorers, when possible.
Multiple-Choice Item Flaws
Examples
M-C Item Flaws
Which best describes what happens when
work is done?
A. A force operates through a distance.
B. A force is exerted.
C. Energy is destroyed.
D. Potential energy is changed to kinetic energy.
[A] Flaw: Using stereotyped phrases. Item
can be answered correctly based on recall
of verbal information as well as through
understanding of the principle involved.
M-C Item Flaws
Which of the following has helped most to
increase the length of human life?
A. Fast driving.
B. Avoidance of overeating.
C. Wider use of vitamins.
D. Wider use of inoculation.
[D] Flaw: highly implausible distracter.
Choice “A” is unreasonable, reducing the
item to a three-choice item.
M-C Item Flaws
Horace Greeley is known for his
A. advice to young men not to go west.
B. discovery of anesthetics.
C. editorship of the New York Tribune.
D. humorous anecdotes.
[C] Flaw: Verbal trick in distracter: choice “A”
inserts the word not into a phrase
otherwise attributable to Horace Greeley.
M-C Item Flaws
Slavery was first started
A. at Jamestown settlement.
B. at Plymouth settlement.
C. at a settlement in Massachusetts.
D. a decade before the Civil War.
[A] Flaw: Non-parallel distracters. Choices
“A” and “B” give specific places, “C”
designates a more general area, “D”
specifies a time. This ambiguity makes
more than one choice correct.
M-C Item Flaws
In purifying water for a city water supply, one
process is to have the impure water seep through
layers of sand and fine and coarse gravel. Here
many impurities are left behind. Below are four
terms, one of which will describe this process
better than the others. Select the correct one.
A. Sedimentation
B. Filtration
C. Chlorination
D. Aeration
[B] Flaw: Stem includes an “instructional
aside.”
M-C Item Flaws
While ironing her formal, Jane burned her
hand accidentally on the hot iron. This
was due to a transfer of heat by
A. conduction.
B. radiation.
C. conversion.
D. absorption.
[A] Flaw: Stem includes “window dressing.”
The introduction implies a practical
problem when the item only involves
knowledge of technical terms.
M-C Item Flaws
In the definition of a mineral, which of the
following is incorrect?
A. It was produced by geologic processes.
B. It has distinctive physical properties.
C. It contains one or more elements.
D. It has a variable chemical composition.
[D] Flaw: Uses a negative in the stem; tends
to be confusing. These types of items are
rarely found outside the classroom.
M-C Item Flaws
Which event is more important in
American history?
A. Braddock’s defeat.
B. Burr’s conspiracy.
C. Hayes-Tilden contest.
D. Webster-Hayne debate.
Flaw: No best answer. Who’s to say which
is more important. Even experts would
not agree.
M-C Item Flaws
The population of Denmark is about
A. 2 million.
B. 15 million.
C. 4 million.
D. 7 million.
Flaw: Unnatural sequence of responses. It
would be better to order from 2 million to
15 million.
M-C Item Flaws
The balance sheet report for the Ajax
Canning Company would reveal (A) the
company’s profit for the previous fiscal
year, (B) the amount of money owed to
its creditors, (C) the amount of income
tax paid, or (D) the amount of sales for
the previous fiscal period.
[A] Flaw: Placing distracters in tandem with
the item stem.
M-C Item Flaws
Which is the best definition of a vein?
A. A blood vessel carrying blood going to the
heart.
B. A blood vessel carrying blue blood.
C. A blood vessel carrying impure blood.
D. A blood vessel carrying blood away from the
heart.
[A] Flaw: Needless repetition in the
distracters.
End