Transcript Slide 1

Statistics of EBO 2009 Examination
Symposium EBO “Training for the Trainers”
116ème Congrès de la Société Française d’Ophtalmologie (SFO)
Danny G.P. Mathysen
MSc. Biomedical Sciences
EBOD Assessment and Executive Officer
Antwerp University Hospital, Department of Ophthalmology
Wilrijkstraat 10, B-2650 Edegem, Belgium
E-mail: [email protected]
Processing examination results …
• SpeedWell
– SpeedWell is specialised in organising medical examinations
– Optical reader system
– Provided software tools
• Design of the MCQ answer sheet
• Design of the Viva Voce mark sheets
• Statistical analysis output (MultiQuest®)
→ verification of examination results on-site
Yearly increase of candidates
• continuous and yearly increase of applications / interest in EBOD
[Figure: bar chart of the Number of Candidates per year]
Year    Number of Candidates
2005    74
2006    159
2007    224
2008    284
2009    308
Statistical approaches for EBOD
• Part I. Written examination (MCQ paper)
– representing 40 percent of the total candidate score
– 52 questions, each with 5 true-false items
– 10 pre-defined topics
– Available in English (master), French and German (translations)
• Part II. Oral examination (Viva Voce)
– representing 60 percent of the total candidate score
– 4 topics
– Available in English, French, German (basic languages) and
(whenever possible) in native language of the candidate
Descriptive Statistics for EBOD 2009
Country            2008   2009   Δ
Austria               2      5   ↑
Belgium              23     25   ↑
Bulgaria              –      4   ↑
Czech Republic        2      2   =
Denmark               4      6   ↑
Estonia               3      2   ↓
Finland               7      2   ↓
France               92     96   ↑
Germany              44     59   ↑
Greece               10     19   ↑
Hungary               1      2   ↑
Ireland               5      5   =
Italy                 4      6   ↑
Latvia                2      1   ↓
Lithuania             1      1   =
Norway                –      1   ↑
Poland                1      2   ↑
Slovakia              1      1   =
Slovenia              6      5   ↓
Spain                14     17   ↑
Sweden                6      5   ↓
Switzerland          32     29   ↓
The Netherlands       7      7   =
Turkey               11      5   ↓
United Kingdom        2      1   ↓
Total               284    308   ↑
Many EU countries apply
Descriptive Statistics for EBOD 2009
• MCQ total scores
– Range of total scores: 154 – 230
– Mean ± SD total score: 204.11 ± 13.04
[Figure: EBOD 2009 MCQ Scores with 95 % Confidence Intervals, Specialists (88) and Residents (220); axis: Total Score]
No significant difference!
Group          n      Total Score
Residents      220    205.40 ± 12.18
Specialists    88     200.91 ± 14.41
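The 95 % confidence intervals shown in the figure can be approximated from the reported group sizes, means and standard deviations. The sketch below assumes a normal approximation with z = 1.96; the slides do not state how the intervals were actually computed, and the function name `ci95` is hypothetical. Overlap of the two intervals is only an informal visual check, not a formal significance test.

```python
from math import sqrt

def ci95(mean, sd, n, z=1.96):
    """Approximate 95 % confidence interval of a mean (normal approximation, assumed)."""
    half_width = z * sd / sqrt(n)
    return (mean - half_width, mean + half_width)

# Summary statistics reported on the slide (MCQ total scores, EBOD 2009)
residents = ci95(205.40, 12.18, 220)
specialists = ci95(200.91, 14.41, 88)

# Two intervals overlap when each one starts before the other one ends
intervals_overlap = residents[0] <= specialists[1] and specialists[0] <= residents[1]

print("Residents:   %.2f to %.2f" % residents)
print("Specialists: %.2f to %.2f" % specialists)
print("Intervals overlap:", intervals_overlap)
```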
Descriptive Statistics for EBOD 2009
No significant differences!
               Belgium                  Switzerland              Germany                  France
Residents      n = 21, 207.71 ± 10.96   n = 29, 207.97 ± 12.22   n = 39, 209.67 ± 10.46   n = 84, 201.52 ± 11.22
Specialists    n = 4, 181.25 ± 20.22    –                        n = 20, 206.10 ± 15.57   n = 12, 200.58 ± 15.20
Total          n = 25, 203.48 ± 16.14   n = 29, 207.97 ± 12.22   n = 59, 208.46 ± 12.54   n = 96, 201.41 ± 11.80
Residents tend to have higher total MCQ scores with lower
standard deviations when compared to specialists.
In general there are no statistically significant differences
between countries.
Descriptive Statistics for EBOD 2009
• Careful selection (and modification) of master MCQs
• Translation of master MCQs (English) to German/French
by native-speaking experts in Ophthalmology
• Verification of correctness of translations by
independent EBO Examination Committee members
Descriptive Statistics for EBOD 2009
• Pre-selecting and controlling of the MCQ paper
– guarantee that EBOD remains a test in ophthalmology and not
a test in language
EBOD is not a language test!
               English                   German                   French
Residents      n = 58, 205.98 ± 12.54    n = 61, 209.46 ± 11.51   n = 101, 202.60 ± 11.62
Specialists    n = 53, 200.08 ± 12.71    n = 21, 205.67 ± 15.27   n = 14, 196.93 ± 17.06
Total          n = 111, 203.16 ± 12.96   n = 82, 208.46 ± 12.54   n = 115, 201.91 ± 12.55
Statistical analysis of MCQ paper
• Cronbach’s coefficient alpha (r) = 0.78
– Estimator of the lower bound of the internal consistency (the degree to
which all 260 MCQ items measure the same thing, i.e. the knowledge of
candidates) of EBOD 2009 (95% CI: 0.75 – 0.81)
The internal consistency of the EBOD MCQ test is good
$$ r = \frac{260}{260 - 1} \left( 1 - \frac{\sum_{i=1}^{260} \sigma_i^2}{\left( \sum_{i=1}^{260} R_{it,i} \, \sigma_i \right)^2} \right) \approx 0.78 $$
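As an illustration of the coefficient above, the sketch below computes Cronbach's alpha from a small, made-up matrix of item scores (one row per candidate, one column per item). The function name and the toy data are assumptions for illustration, not EBOD data. The code uses the variance of the total scores in the denominator, which is algebraically equal to the squared sum of R_it × σ_i in the slide's formula.

```python
from statistics import pvariance

def cronbach_alpha(score_matrix):
    """Cronbach's alpha for a list of candidates, each a list of 0/1 item scores."""
    n_items = len(score_matrix[0])
    # Variance of each item across candidates
    item_variances = [
        pvariance([row[i] for row in score_matrix]) for i in range(n_items)
    ]
    # Variance of the candidates' total scores
    total_variance = pvariance([sum(row) for row in score_matrix])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical toy data: 4 candidates, 3 true/false items
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2))
```

For the real paper the matrix would have 308 rows (candidates) and 260 columns (items).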
Statistical analysis of MCQ paper
• Point biserial correlation coefficient (Rit) = 0.14
– Estimator of the correlation between the individual item scores Xi
(either 0 or 1) and the total MCQ scores Yi (ranging from 154 to 230) of
the candidates
$$ R_{it} = \frac{1}{n - 1} \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{s_X} \right) \left( \frac{Y_i - \bar{Y}}{s_Y} \right) $$
[Figure: scale of the correlation between item and total MCQ score, from -1 through 0 to +1]
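The formula for Rit can be followed term by term in a few lines of code. This is a minimal sketch using sample standard deviations (n − 1 in the denominator), as in the formula; the function name and the toy data are hypothetical.

```python
from statistics import mean, stdev

def point_biserial(item_scores, total_scores):
    """Correlation between one item's 0/1 scores (X) and the total scores (Y)."""
    n = len(item_scores)
    x_bar, y_bar = mean(item_scores), mean(total_scores)
    s_x, s_y = stdev(item_scores), stdev(total_scores)  # sample SDs
    # Sum of the products of the standardised deviations
    summed = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(item_scores, total_scores)
    )
    return summed / (n - 1)

# Hypothetical toy data: the two strongest candidates answered the item correctly
item = [1, 1, 0, 0]
totals = [4, 3, 2, 1]
rit = point_biserial(item, totals)
print(round(rit, 3))
```

A value close to +1 means that candidates with high total scores tend to answer the item correctly; a value near 0 means the item does not discriminate.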
Statistical analysis of MCQ paper
• Assessment of the degree of difficulty
– Average P-value ≈ 0.79
• Estimation of items answered incorrectly by guessing ≈ 0.21
• Estimation of items answered correctly by guessing ≈ 0.21
• Estimation of proportion of items answered by guessing ≈ 0.42
OR Estimation of proportion of items answered by knowledge ≈ 0.58
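[Figure: bar from 0 to 100 % split into items answered by knowledge (0 – 58) and items answered by guessing (58 – 100), with the average P-value marked at 79]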
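The arithmetic behind these estimates can be written out as follows. The sketch assumes, as the slide does implicitly, that every incorrect true/false answer is a wrong guess and that guesses are right half of the time, so that an equal share of items is guessed correctly. The function name is hypothetical.

```python
def guessing_estimate(p_value):
    """Split the answers to true/false items into guessed and known (assumed model)."""
    guessed_wrong = 1 - p_value          # all incorrect answers are wrong guesses
    guessed_right = guessed_wrong        # 50 % chance: as many right as wrong guesses
    guessing = guessed_wrong + guessed_right
    knowing = 1 - guessing
    return {
        "guessed_wrong": guessed_wrong,
        "guessed_right": guessed_right,
        "guessing": guessing,
        "knowing": knowing,
    }

estimate = guessing_estimate(0.79)  # average P-value of EBOD 2009
for name, value in estimate.items():
    print(name, round(value, 2))
```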
Statistical analysis of MCQ paper
• Classical Analysis Methods
– Cronbach Alpha (internal consistency)
– Point Biserial Correlation
– Degree of Difficulty
– Comparison of item test scores
• Item-Response analysis
– Rasch analysis (1-parameter analysis) → items differ only in difficulty
– 3-Parameter analysis → items differ in difficulty, discriminative
power and guess factor
Statistical analysis of MCQ paper
• Advantages for EBO candidates of T/F items
– Reliable in case of translation (English, French, German)
→ choice of language will not result in being (dis)advantaged
– Accessibility (e.g. dyslexia)
→ not too complicated for candidates
– Duration of the examination
→ stress level of candidates can be kept to a minimum
– Relatively easy to process
→ results can be presented on-site
• Disadvantage for EBO candidates of T/F items
– Probability of guessing right = 50 %
→ level of weakest candidates is overestimated (→ oral examination)
Statistical analysis of MCQ paper
• How to overcome the disadvantages of T/F items?
– Introduction of negative marking
• Increase of discriminative power of examination
• Reduction of guess factor
– wild guesses will be punished (weakest candidates)
– guesses by reasoning (partial knowledge) will be rewarded
NEGATIVE MARKING AT EBOD 2010
[Figure: spread of total test scores with and without negative marking, on an axis from -130 through 0 to 260]
Statistical analysis of MCQ paper
• Does negative marking influence the pass rate?
– Score of +1 in case (only) the correct answer is indicated
– Score of -0.5 in case the incorrect answer or nothing is indicated
– Score of 0 in case the “D”-option (don’t know) is indicated
– Score conversion (pass mark = 6): $6 \equiv \overline{\text{MCQ score}} - SD_{\text{MCQ score}}$
– Other marks are derived
NEGATIVE MARKING DOES NOT INFLUENCE THE PASS RATE!
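The marking scheme listed above can be sketched as a small scoring function. The answer encoding ("T", "F", "D" for don't know, `None` for a blank) and the function names are assumptions made for this illustration; the slides do not describe how answers are stored. The conversion to the 10-point scale is not included.

```python
def score_item(answer, key):
    """Negative marking for one true/false item."""
    if answer == "D":        # "D"-option (don't know)
        return 0.0
    if answer == key:        # correct answer indicated
        return 1.0
    return -0.5              # incorrect answer or nothing indicated

def score_paper(answers, keys):
    """Raw total score of a candidate over all items."""
    return sum(score_item(a, k) for a, k in zip(answers, keys))

# Hypothetical answer key for a paper of 260 items (52 questions x 5 items)
keys = ["T", "F"] * 130
all_correct = score_paper(keys, keys)
all_wrong = score_paper(["F", "T"] * 130, keys)
mixed = score_paper(["T", "T", "D", None], ["T", "F", "T", "F"])

print(all_correct, all_wrong, mixed)
```

With 260 items the raw score therefore ranges from -130 to 260.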
Definition of pass mark
• Synonyms: standard, cutpoint
• A pass mark is a special score that serves as boundary
between those who perform well enough and those who
do not
• How to set pass marks? Reaching a consensus rather
than obtaining a scientifically correct solution
[Figure: score scale from 0 to 100 with the pass mark as boundary]
Importance of pass marks
• The purpose of an examination is to select the group of
candidates that perform well enough (pass) and to
eliminate the group of candidates that do not perform
well enough (fail)
• In order to achieve this goal, a (limited) number of
questions are presented to the candidates
• The discriminative power of the examination will
depend on the validity of the questions used
Validity of questions
• Degree of difficulty of questions
– Can be assessed by calculating the P-value (i.e. proportion of
candidates answering correctly)
– Rule of thumb: avoid questions with P-value above 0.90 or below 0.10
• Degree of discriminative power of questions
– Objective measurement of the degree to which the question is able to
discriminate strong from weak candidates
– Can be assessed by calculating the Rit/Rir value (correlation of
question score to total examination score)
– Rule of thumb: avoid questions with Rit-value below 0.20
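The two rules of thumb can be combined into a simple screening step. The sketch below is only an illustration: the item names and statistics are made up, and the function name `flag_items` is hypothetical.

```python
def flag_items(items, p_low=0.10, p_high=0.90, rit_min=0.20):
    """Return the items that violate a rule of thumb, with the reasons."""
    flagged = []
    for name, p_value, rit in items:
        reasons = []
        if p_value > p_high:
            reasons.append("too easy")
        elif p_value < p_low:
            reasons.append("too difficult")
        if rit < rit_min:
            reasons.append("low discriminative power")
        if reasons:
            flagged.append((name, reasons))
    return flagged

# Hypothetical item statistics: (item name, P-value, Rit)
item_stats = [
    ("Q1a", 0.95, 0.25),
    ("Q1b", 0.60, 0.35),
    ("Q1c", 0.05, 0.10),
    ("Q1d", 0.75, 0.12),
]
flagged_items = flag_items(item_stats)
for name, reasons in flagged_items:
    print(name, ", ".join(reasons))
```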
Types of pass marks
Validation of questions
• “absolute” pass mark (criterion-reference)
– expressed as a number (e.g. 70 correct responses) of test questions
– expressed as a percentage (e.g. 70 % correct responses) of test
questions
–
• how to determine reasonable criteria for candidates?
• flexibility in case you are not familiar with the technique
• “relative” pass mark (norm-reference)
– expressed as a number (e.g. 50 best performers) of examinees
– expressed as a percentage (e.g. top 20 % performers) of examinees
• number of candidates ≥ 40
• candidates have to take the test on an individual basis
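The difference between the two types of pass mark can be shown on the same set of scores. This is a minimal sketch with made-up scores; the function names are hypothetical, and candidates tied with the last passing score are all passed, which is an assumption.

```python
def pass_absolute(scores, pass_mark):
    """Criterion-referenced: pass everyone who reaches a fixed score."""
    return [score >= pass_mark for score in scores]

def pass_relative(scores, top_n):
    """Norm-referenced: pass the top_n best performers (ties at the cut-off pass)."""
    cutoff = sorted(scores, reverse=True)[top_n - 1]
    return [score >= cutoff for score in scores]

# Hypothetical scores (number of correct responses out of 100)
scores = [50, 60, 70, 80, 90, 65, 75, 85, 95, 55]

absolute_result = pass_absolute(scores, pass_mark=70)  # at least 70 correct responses
relative_result = pass_relative(scores, top_n=2)       # the 2 best performers

print(sum(absolute_result), "pass with the absolute pass mark")
print(sum(relative_result), "pass with the relative pass mark")
```

With an absolute pass mark the number of passing candidates depends on the level of the group; with a relative pass mark it is fixed in advance.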
Statistical approaches for EBOD
• Examiners
– Careful pre-selection
– Clear instructions before examination
– Different questions
– Different languages
• Nevertheless…
– MCQ and Viva Voce scores are well correlated
– No (dis)advantage for candidates to be assigned to any specific jury
Statistical approaches for EBOD
[Figure: EBOD 2009 Viva Voce Scores with 95 % Confidence Intervals, Specialists (88) and Residents (220); axis: Total Score]
Topic                                                        Score
A. Optics, Refraction, Strabismus and Neuro-ophthalmology    7.62 ± 1.32
B. Cornea, External diseases and Ocular adnexa               7.59 ± 1.29
C. Glaucoma, Cataract and Refractive surgery                 7.45 ± 1.24
D. Posterior segment, Ocular inflammation and Uveitis        7.83 ± 1.28
EBOD scores are high!
Statistical approaches for EBOD
[Figure: EBOD 2009 Total Scores with 95 % Confidence Intervals, Specialists (88) and Residents (220); axis: Total Score]
EBOD 2009                                Score
Written examination (MCQ paper)          7.42 ± 2.01
Oral examination (Viva Voce)             7.62 ± 0.90
EBOD 2009 (MCQ + Viva Voce)              7.54 ± 1.18
EBOD scores are comparable for MCQ and Viva Voce!
Residents have higher MCQ and Viva Voce scores with lower
standard deviations when compared to specialists.
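The combined score follows from the 40 / 60 weighting of the written and oral parts. As a quick check, applying the weights to the reported mean scores reproduces the reported overall mean; the function name below is hypothetical.

```python
def total_score(mcq_score, viva_score, mcq_weight=0.40, viva_weight=0.60):
    """Weighted EBOD score: 40 % written examination, 60 % oral examination."""
    return mcq_weight * mcq_score + viva_weight * viva_score

# Mean scores reported for EBOD 2009
combined_mean = total_score(7.42, 7.62)
print(round(combined_mean, 2))
```

The same function applies to the scores of an individual candidate.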
Statistical approaches for EBOD
Success rate of EBOD is much higher than that of
examinations in other medical specialties (60-70 %)
Year    Success Rate
2005    87.6 %
2006    88.1 %
2007    89.2 %
2008    90.8 %
2009    89.6 %
EBOD success rate is quite stable over the years and quite high
as the level of candidates usually tends to be good.
18 Residents (out of 220: 8.2%) and 14 specialists (out of 88:
15.9 %) failed at EBOD 2009. As there were 308 candidates the
general failure rate was 10.4 %.
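The failure rates quoted above can be recomputed from the counts. This is a small sketch; the variable and function names are chosen for illustration only.

```python
def failure_rate(failed, candidates):
    """Failure rate in percent, rounded to one decimal."""
    return round(100 * failed / candidates, 1)

# Counts reported for EBOD 2009
residents_rate = failure_rate(18, 220)
specialists_rate = failure_rate(14, 88)
overall_rate = failure_rate(18 + 14, 220 + 88)
success_rate = round(100 - 100 * (18 + 14) / (220 + 88), 1)

print(residents_rate, specialists_rate, overall_rate, success_rate)
```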
General Conclusions
• Careful monitoring of EBOD
• Careful validation of EBOD
• Reliable results of EBOD
• Stable results of EBOD over the years
• Result of careful pre- and post-assessment of EBOD
• Publication on EBO website
• Presentation at scientific meetings
– Société française d’Ophtalmologie (SFO 2009-2010)
– European Society of Ophthalmology (SOE 2009)
– Council for European Specialist Medical Examinations
(CESMA 2009-2010)
– International Association for Medical Education (AMEE 2010)
• Publication in peer-reviewed journal
– Manuscripts in preparation