EXAMS:
WHAT ARE THEY REALLY GOOD FOR, ANYWAY?
What professors say about exams.
What students hear about exams.
The standard assumption in nearly all education research is that
learning occurs while students study and encode material.
It is also generally assumed that testing is a relatively neutral event
that only measures the learning that occurred during study but is
not, in and of itself, a learning experience.
Thus, if learning happens exclusively during study periods, and if
tests are neutral assessments, then additional study trials should
have a strong positive effect on learning, whereas additional test
trials should produce little effect.
If repeated study and/or test trials do benefit learning, this would
contradict the conventional wisdom that students should drop material
that they have already learned from further study or testing in
order to focus their efforts on material they have not yet learned.
This drop-what-you-know strategy is implicitly endorsed by many
contemporary theories of study-time allocation and underlies many
popular study methods (e.g., flash cards).
Some numbers for your consideration:
The number of hours the Coursemaster spends preparing, setting up,
proctoring, and grading each Anatomy exam:
Making up exam: ~2 hours
Setting up exam: ~2 hours
Proctoring exam: ~1 hour
Grading exam: ~5 hours
TOTAL/EXAM: ~10 hours
TOTAL FOR 3 EXAMS: ~30 HOURS
The total number of hours the Coursemaster devotes to teaching for the entire course
is ~140 hours. Thus, the Coursemaster spends ~20% as much time on duties
related solely to testing as he does teaching the entire course.
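The arithmetic behind these figures can be checked with a short sketch. The variable names are illustrative only, and the hour values are the approximate ones quoted above, so the result is a rough estimate rather than an exact accounting.

```python
# Approximate hours spent on each exam, as quoted above.
hours_per_exam = {
    "making up": 2,
    "setting up": 2,
    "proctoring": 1,
    "grading": 5,
}
exams_per_course = 3
teaching_hours = 140  # approximate teaching time for the entire course

per_exam = sum(hours_per_exam.values())      # hours for one exam
exam_hours = per_exam * exams_per_course     # hours for all three exams
ratio = exam_hours / teaching_hours          # exam time relative to teaching time

print(f"{per_exam} h per exam, {exam_hours} h for {exams_per_course} exams")
print(f"exam duties take about {ratio:.0%} as much time as teaching")
```

The ratio comes out slightly above one fifth, consistent with the "~20%" quoted above.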
In a Pass/Fail grading system, this effort is made solely to identify those
students who do not meet a minimal level of competency in the subject (usually a
grade of at least 60-65%). In over 25 years of teaching at WUMS, the proportion of
such students has normally been <1%.
Questions to ponder: Are exams a useful way for both students and faculty to be
spending their time? Are exams simply anxiety-producing, neutral events that only
measure learning that occurred during study, or do they contribute to learning in
some fundamental way?
In a recent experiment, 40 Washington University undergraduates were asked to learn
a list of 40 Swahili-English word pairs. Students learned the list across a total of 8
alternating study (S) and test (T) periods. The first study period consisted of 40 study
trials (5 secs/pair) followed by 40 test trials (8 secs/pair). After that, the number of
study and test trials varied according to the condition.
1. In the first condition, subjects repeatedly studied, and were tested on, the entire
list of 40 word pairs in each study and test period (denoted ST).
2. In the second condition, once a word pair was “learned” (i.e., recalled) it was
dropped from further study but still tested in each subsequent test period
(denoted SnT).
3. In the third condition, “learned” pairs were dropped from further testing but still
studied in each subsequent study period (denoted STn).
4. In the fourth condition, “learned” pairs were dropped from both study and test
periods (SnTn). This condition represents what conventional wisdom and many
educators instruct students to do: Study something until it is learned (i.e., can be
recalled) and then drop it from further study and testing (“non-cumulative”
exams?).
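The four schedules can be sketched as a trial counter. This is a minimal illustration, not the study's procedure or data: the function name `total_trials` and the `first_recall` input (the test period in which each pair was first recalled, or `None` if never) are hypothetical, and the sketch assumes a pair counts as "learned" from its first successful recall onward.

```python
def total_trials(first_recall, condition, periods=4):
    """Count study + test trials for one subject under one condition.

    first_recall: for each word pair, the 1-based test period in which it
        was first recalled, or None if it was never recalled (hypothetical input).
    condition: "ST", "SnT", "STn", or "SnTn".
    periods: number of study periods (and of test periods); 4 + 4 = 8 total.
    """
    drop_from_study = condition in ("SnT", "SnTn")
    drop_from_test = condition in ("STn", "SnTn")

    study = test = 0
    for k in first_recall:
        # A pair first recalled in test period k has been presented in
        # periods 1..k; a pair never recalled is presented in every period.
        presented = periods if k is None else min(k, periods)
        study += presented if drop_from_study else periods
        test += presented if drop_from_test else periods
    return study + test


# Hypothetical subject who recalls all 40 pairs on the very first test.
quick_learner = [1] * 40
for cond in ("ST", "SnT", "STn", "SnTn"):
    print(cond, total_trials(quick_learner, cond))
```

Under these assumptions the ST condition always uses 8 x 40 = 320 trials regardless of performance, while the trial counts in the other three conditions shrink as pairs are recalled earlier.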
Which learning strategy do you think led to a steeper learning curve of the study material?
Jeffrey D. Karpicke and Henry L. Roediger III, “The Critical Importance of Retrieval for Learning,”
Science 319:966-968; 15 Feb. 2008
IT DIDN’T MAKE THE SLIGHTEST BIT OF DIFFERENCE!
Cumulative performance during the learning phase. There were no differences in the
learning curves of the four learning strategies.
The students in all four conditions also predicted they would recall about 50% of the word pairs
when they were retested in 1 week's time.
THIS IS NOT WHAT HAPPENED!
Proportion of word pairs recalled on the retest taken 1 week after “learning”.
Total number of trials (one value per condition): 320, 237, 243, 155
BEARING THESE RESULTS IN MIND,
WHERE DO WE GO FROM HERE?
Because we have such a select medical student body, we can afford to be
more creative in our “testing” of students. Our exams should NOT
simply test the basic “competency level” of our students. Our students ARE
competent! One possibility: let the NBME filter out the fewer than 1% who
aren’t competent for one reason or another.
Medicine, like all science today, is a collaborative enterprise. Therefore, we
need to find creative ways to make our testing reflect this new paradigm. For
example, can we find ways to “team test” and still be confident that each
individual student is doing their part? Peer pressure may be a powerful
motivator in this regard.
Testing should be frequent, non-threatening (self-testing?), and collaborative.
KEEPING IT FRESH: THE QUESTION OF THE WEEK
(our attempt to combine competency skills, testing benefits, and benign peer pressure)
WEEK 1
• Can these two x-rays be from the same patient?
• If so, where is the injury?
• Is the spinal cord damaged? If so, which part?
WEEKS 2-3
• Would this patient need to be on a respirator? Why or why not?
• Would any part of the autonomic nervous system be damaged?
• Where would you insert a needle to obtain a CSF sample from this patient?
• What anatomical landmarks would you use for this procedure?
WEEKS 6-7
• Would you expect any loss of bladder control and/or sexual function? Why or why not?
WEEKS 9-11
• Would you expect any motor and/or sensory loss in this patient’s upper limbs? Thorax? Abdomen? Lower limbs?
WEEKS 13-14
• What might be a cause of elevated CSF pressure? In such a condition, would you still consider collecting a CSF sample the same way you suggested in WEEK 2?
SO WHAT’S THE TAKE HOME MESSAGE FOR WUMS STUDENTS?
STUDY LESS
AND TEST MORE!
THE END
Keeping It Fresh:
The Challenge of an Old Subject (Anatomy), Middle-Aged Professors, and Young Students