Development of Exercises for Basic Surgical Skills Assessment
Niyant Patel, James Robbins, Mario Villalba Jr., Daryl Reid, and Charles Shanley
Department of Surgery, William Beaumont Hospital, Royal Oak, Michigan
Changes in Operative Experience
The 80-hour workweek
Resident autonomy
Specialized centers
Minimally invasive surgery
Uniformly Used Methods of Assessment
Operative logs
Faculty evaluations
In-training examination scores
Goals and Objectives
To develop low fidelity exercises for basic, open surgical skills
To demonstrate construct validity
To establish interrater reliability
To show internal consistency of the test
Definitions
Construct validity: extent to which a test discriminates between various levels of expertise
Interrater reliability: extent of agreement between two or more independent raters
Internal consistency: correlation of parts of a test with each other
Model Development
Low fidelity
Reproducible
Portable
Focused on components of basic skills
Model Development
The five exercises included in this study had face validity*
All exercises were limited by time, to promote efficiency and accentuate differences
* Face validity: resemblance to real life situations
Exercises 1 & 2 Needle Driving
30 targets
4 x 2 inch label
Exercise 1 Needle Driving
The needle was placed directly through the target and out the sides
Exercise 2 Needle Driving (blind)
The needle was placed through the sides and out the target
Exercises 1 & 2 Needle Driving
Metrics recorded
Accuracy of each target
Time (limit 300 s)
Score = (Red x 7.46) + (White x 1.95) + Blue - Miss + ((300 - time) x (total completed / 30))
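The needle driving score above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, the parameter names, and the example counts are assumptions, and "total completed" is passed in explicitly because the slides do not define how it is counted.

```python
def needle_driving_score(red, white, blue, miss, time_s, completed,
                         total_targets=30, time_limit_s=300):
    """Score = (Red x 7.46) + (White x 1.95) + Blue - Miss
               + ((300 - time) x (total completed / 30))."""
    # Accuracy part: weighted count of hits by colour, minus misses.
    accuracy = red * 7.46 + white * 1.95 + blue - miss
    # Time part: unused seconds, scaled by the fraction of targets completed.
    time_bonus = (time_limit_s - time_s) * (completed / total_targets)
    return accuracy + time_bonus


# Hypothetical participant: 10 red, 12 white, 6 blue, 2 misses,
# all 30 targets completed in 240 seconds.
example = needle_driving_score(10, 12, 6, 2, time_s=240, completed=30)
print(round(example, 2))
```

Because the time bonus is multiplied by the fraction completed, a participant who runs out the full 300 seconds earns no bonus regardless of how many targets were finished.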
Accuracy Scoring
Red
Accuracy Scoring
White
Exercise 3 Needle Transferring
30 needles, 3 different sizes
Pick up with forceps, transfer to needle driver, place into sponge
Metrics recorded
Number transferred
Number dropped
Time (limit 150 s)
Score = (transferred x 2) - dropped + ((150 - time) x (needles attempted / 30))
Exercise 4 Fine Forceps use
Threading of beads onto monofilament with forceps
Metrics recorded
Number threaded
Time (limit 150 s)
Score = Beads threaded
Exercise 5 Knot Tying
4 knots
Any type or technique
Metrics recorded
Secure knots in appropriate place
Time (limit 150 s)
Score = (knots x 10) + ((150 - time) x (total completed / 4))
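The scoring formulas for Exercises 3 to 5 follow the same pattern and can be sketched together. The function and parameter names below are hypothetical, as are the example values; the formulas themselves are taken from the slides.

```python
def needle_transfer_score(transferred, dropped, time_s, attempted,
                          total_needles=30, time_limit_s=150):
    """Exercise 3: (transferred x 2) - dropped
       + ((150 - time) x (needles attempted / 30))."""
    time_bonus = (time_limit_s - time_s) * (attempted / total_needles)
    return transferred * 2 - dropped + time_bonus


def forceps_score(beads_threaded):
    """Exercise 4: the score is simply the number of beads threaded."""
    return beads_threaded


def knot_tying_score(secure_knots, time_s, completed,
                     total_knots=4, time_limit_s=150):
    """Exercise 5: (knots x 10) + ((150 - time) x (total completed / 4))."""
    time_bonus = (time_limit_s - time_s) * (completed / total_knots)
    return secure_knots * 10 + time_bonus


# Hypothetical results for one participant.
print(needle_transfer_score(28, 2, time_s=120, attempted=30))
print(forceps_score(17))
print(knot_tying_score(4, time_s=90, completed=4))
```

Only Exercise 4 has no time term: its score depends solely on how many beads are threaded within the 150 second limit.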
Testing and Scoring
Forty volunteers: general surgical residents and attending surgeons
All participants were scored by an evaluator and independently scored themselves
Normalization of scores to the highest score for that exercise: score / high score x 100
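The normalization step can be sketched as follows. The function name and the raw scores are illustrative assumptions; the rule itself (score / high score x 100, per exercise) is the one stated above.

```python
def normalize_scores(raw_scores):
    """Rescale one exercise's raw scores so the highest score becomes 100."""
    high_score = max(raw_scores)
    if high_score <= 0:
        # Assumption: normalization is only meaningful for a positive high score.
        raise ValueError("highest score must be positive")
    return [score / high_score * 100 for score in raw_scores]


# Hypothetical raw scores for three participants on one exercise.
print(normalize_scores([50, 100, 75]))
```

Normalizing each exercise separately puts all five exercises on a common 0 to 100 scale, which is what allows their scores to be compared and correlated.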
Construct Validity
Discrimination between 2 levels of expertise: novice and proficient
Evaluator Scoring
Exercise                      Novice (24)   Proficient (16)   p-value
1 - Needle driving            35 (14)       59 (15)           <0.01
2 - Needle driving (blind)    44 (22)       62 (20)           0.01
3 - Needle transferring       42 (8)        67 (16)           <0.01
4 - Fine Forceps use          50 (22)       59 (14)           0.14
5 - Knot tying                45 (26)       87 (9)            <0.01
Values are means (standard deviation). Analysis by Mann-Whitney U test.
Novice - Junior residents (Postgraduate year level 1-3)
Proficient - Senior residents and attendings (Postgraduate year level 4 and above)
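For readers unfamiliar with the Mann-Whitney U test named above, here is a minimal sketch of how the U statistic is computed from two groups of scores. It uses only the standard library and stops at the statistic; a p-value would normally come from a statistics package (for example `scipy.stats.mannwhitneyu`). The helper names and the sample scores are hypothetical and are not the study data.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent groups of scores."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical normalized scores for a few novice and proficient participants.
novice = [30, 42, 45, 51]
proficient = [45, 60, 72, 88]
print(mann_whitney_u(novice, proficient))
```

A U value near zero for one group means almost every score in that group falls below the scores in the other group, which is the pattern a test with construct validity is expected to show between novices and proficient surgeons.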
Interrater Reliability
Extent of agreement between self-scoring and scoring by evaluators
Exercise                      Self scoring   Evaluator scoring   Difference   p-value
1 - Needle driving            51 (18)        45 (19)             6.8 (12)     <0.01
2 - Needle driving (blind)    47 (24)        51 (22)             -3.9 (13)    0.07
3 - Needle transferring       52 (17)        52 (17)             0            1
4 - Fine Forceps use          54 (19)        54 (19)             0            1
5 - Knot tying                62 (30)        61 (29)             0.6 (4)      0.32
Values are means (standard deviation). Analysis by paired t-test.
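The paired comparison above rests on the per-participant difference between the self score and the evaluator score. A minimal sketch of the paired t statistic, assuming hypothetical function names and made-up scores (not the study data), is:

```python
import math
import statistics


def paired_t_statistic(self_scores, evaluator_scores):
    """Return (mean difference, SD of differences, t) for paired scores.

    t is None when every difference is identical, because the statistic
    is undefined when the standard deviation of the differences is zero.
    """
    if len(self_scores) != len(evaluator_scores):
        raise ValueError("paired scores must have the same length")
    diffs = [s - e for s, e in zip(self_scores, evaluator_scores)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        return mean_diff, sd_diff, None
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, sd_diff, t


# Hypothetical self and evaluator scores for four participants.
mean_diff, sd_diff, t = paired_t_statistic([52, 54, 56, 58], [50, 50, 50, 50])
print(mean_diff, round(sd_diff, 3), round(t, 3))
```

When self scores and evaluator scores are identical for every participant, as the zero differences for Exercises 3 and 4 suggest, there is no variation to test and the sketch returns None for t.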
Internal Consistency
Correlation of parts of the test with each other
[Figure: bar chart of the alpha coefficient for self-scoring and evaluator scoring, shown overall and with the fine forceps use exercise excluded, with reference lines for the highly reliable value and the adequate value. Bar labels: 0.83, 0.85, 0.75, 0.78.]
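The alpha coefficient reported for internal consistency can be illustrated with the standard Cronbach's alpha formula, alpha = k / (k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of exercises. The sketch below is an assumption about how such a value could be computed, not the authors' analysis code, and the scores are invented.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for k items.

    item_scores is a list of k lists, one per exercise, each holding
    one score per participant in the same participant order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    item_variances = [statistics.variance(item) for item in item_scores]
    # Total score per participant, summed across all exercises.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = statistics.variance(totals)
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: two exercises, four participants.
exercises = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(cronbach_alpha(exercises))
```

Recomputing alpha on the list with one exercise left out is how the effect of excluding that exercise on internal consistency would be checked.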
Limitations
Lack of a significant difference in scores for the forceps use exercise may be the result of a type II error
Despite trying to focus on specific components, our exercises likely test multiple skills
Only 5 exercises were formally evaluated
Summary
Develop low fidelity exercises for the assessment of basic, open surgical skills
Discriminate between two levels of expertise, establishing construct validity
Agreement between raters, demonstrating interrater reliability and the ability to self-evaluate
Correlation between the 5 exercises, demonstrating internal consistency, improved with the exclusion of the forceps use exercise
Future Directions
Establishment of other forms of validity and reliability
Development of other exercises to make a comprehensive set
Demonstrate evidence of improvement with practice
Use of sophisticated technology
Conclusion
These data provide evidence of validity, reliability and consistency for a series of low fidelity exercises with self-evaluation metrics
Thank you for your time
Current Methods of Assessment
Operative logs (1, 2)
Faculty evaluations (2)
In-training examination scores (3)
1. Cuschieri, A., et al., What do master surgeons think of surgical competence and revalidation? Am J Surg, 2001. 182(2): p. 110-6.
2. Reznick, R.K., Teaching and testing technical skills. Am J Surg, 1993. 165(3): p. 358-61.
3. Scott, D.J., et al., Evaluating surgical competency with the American Board of Surgery In-Training Examination, skill testing, and intraoperative assessment. Surgery, 2000. 128(4): p. 613-22.
Definitions
Face validity: resemblance to real life situations
Content validity: the domain that is being measured is actually being measured
Concurrent validity: correlation of results with the gold standard for that domain
Definitions
Predictive validity: ability to predict future performance
Test-retest reliability: consistency of trainee performance on different occasions
Construct Validity
Discrimination between 2 levels of expertise: novice and proficient
                              Self-scoring                                Evaluator Scoring
Exercise                      Novice (24)   Proficient (16)   p-value     Novice (24)   Proficient (16)   p-value
1 - Needle driving            46 ± 17       60 ± 17           0.02        39 ± 16       66 ± 17           <0.01
2 - Needle driving (blind)    35 ± 15       65 ± 23           <0.01       45 ± 22       63 ± 20           0.01
3 - Needle transferring       42 ± 8        67 ± 16           <0.01       42 ± 8        67 ± 16           <0.01
4 - Fine Forceps use          50 ± 22       59 ± 14           0.14        50 ± 22       59 ± 14           0.14
5 - Knot tying                45 ± 26       88 ± 9            <0.01       45 ± 26       87 ± 9            <0.01
Values are means ± standard deviation. Analysis by Mann-Whitney U test.
Internal consistency
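[Figure: Cronbach's alpha coefficient (y-axis, 0.5 to 0.9) overall and with each of exercises 1 to 5 removed (x-axis: exercise removed), for self-scoring and evaluator scoring, with reference lines for the highly reliable value and the adequate value.]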