Standards-based Teacher
Evaluation: What Have We
Learned?
Steven Kimball, Tony Milanowski &
Herb Heneman
Consortium for Policy Research in Education
Wisconsin Center for Education Research
University of Wisconsin-Madison
Why Study Teacher Evaluation?
Traditional teacher evaluation weak on
accountability and instructional improvement:
– Narrow conceptions of teaching
– Undifferentiated criteria, i.e.,
satisfactory/unsatisfactory
– Limited evidence, e.g., single class observation
– One-way communication in evaluation
– Variability/inconsistency
Why Study Teacher Evaluation?
“To this day, almost all educational personnel
decisions are based on judgments which,
according to the research, are only slightly more
accurate than they would be if they were based
on pure chance.”
(Medley & Coker, 1987, p.243)
Classroom-based teacher evaluation increasingly a
part of new compensation reforms (e.g., Denver
ProComp; Teacher Incentive Fund program)
Performance Evaluation A Key Part of
Knowledge & Skill-based Pay
Base pay increase or bonus (typically $300 - $3,000) for competency
demonstration
- skill blocks – technology, student assessment, curriculum unit
design, etc.
- portfolio completion
- dual certification
- graduate degree in subject taught
Base pay increase or bonus for NBPTS certification ($1,000 - $15,000)
Base pay increase or bonus for classroom performance mastery
(typically $1,000 - $3,000), as measured by standards-based
teacher evaluation
May involve changes to single salary schedule
- fewer steps
- fewer or redefined lanes
- performance-linked career ladder progression
Why Standards-based Teacher
Evaluation?
Standards for teaching and related performance-based assessments exist for NBPTS, PRAXIS III,
Connecticut BEST program, Danielson Framework
Is it better than typical evaluation methods?
Can it serve as a foundation for knowledge- and
skills-based pay?
What is Standards-based Teacher
Evaluation?
Developed from effective teaching literature and
newer conceptions of teaching and learning (e.g.,
pedagogical content knowledge)
Comprehensive
Clear expectations with performance differentiation
Typically reflection and growth oriented
Multiple sources of evidence
Framework for Teaching (Danielson, 1996)
4 Domains, 22 components, 66 elements
Domains: Planning and Preparation, Classroom
Environment, Instruction, Professional
Responsibilities
Components from Instruction Domain:
– Communicating clearly and accurately
– Using question & discussion techniques
– Engaging students in learning
– Providing feedback to students
– Flexibility & responsiveness
Example from Instruction Domain,
Communicating Clearly and Accurately
Component
Element: Directions and Procedures
– Unsatisfactory: Teacher directions and procedures are confusing to students.
– Basic: Teacher directions and procedures are clarified after initial student confusion or are excessively detailed.
– Proficient: Teacher directions and procedures are clear to students and contain an appropriate level of detail.
– Distinguished: Teacher directions and procedures are clear to students and anticipate possible student misunderstanding.
Source: Danielson (1996), p. 91.
Let’s Try It!
Using rubric to evaluate practice
CPRE Research Standards-based Teacher
Evaluation
Primary Sites:
- Cincinnati Public Schools
- Vaughn Next Century Learning Center (LA charter school)
- Washoe County (NV) School District
Secondary Sites: Anoka-Hennepin & La Crescent, MN; Coventry,
RI; Newport News, VA
Each site adopted standards-based teacher performance
evaluation systems based largely or in part on Framework for
Teaching (Danielson, 1996)
Vaughn’s Standards
Instructional Planning & Classroom Management
comparable to Framework
3-10 Content-specific Dimensions (Departs from
Framework)
– Literacy
– Language development
– Mathematics
– Special Education, Science, Social Studies, Arts,
Technology, Physical Education, Teaming
Vaughn’s Standards, cont.
Elements of Literacy:
– Teaches phonemic awareness & systematic explicit phonics
– Develops spelling skills
– Strengthens grade-level vocabulary development
– Guides students through the reading process
– Focus on comprehension skills, aiming at higher-order thinking skills
– Guides students through the writing process
– Integrates literature
– Uses appropriate materials & teaching strategies
– Implements appropriate student activities
Evaluation Procedures - Cincinnati
Experienced Teachers
– Peer ‘Teacher Evaluator’ makes 4 observations, administrator makes 2;
Teacher Evaluator rates management & instruction
using a 4-level rubric
– Administrator uses portfolio to rate planning &
professionalism using 4 level rubric
First Year Teachers, Struggling Teachers
– Peer ‘Consulting Teacher’ makes 6 observations to
rate on management & instruction, uses portfolio to
rate on planning & professionalism
Evaluation Procedures
Vaughn
– Teacher, peer, and administrator rate twice per year based
on varied # of observations during a fixed period, using 4
level rubric
Washoe
– Administrator rates using a 4-level rubric closely modeled on
Framework. Evidence drawn from 1-6 formal classroom
observations, ‘walk-throughs’, artifacts, and discussions with
teachers
‘Theory of Action’ Linking SBTE to
Student Learning
[Diagram; connecting arrows not preserved in this transcript]
KSBP System: Model of Quality Teaching, Assessment Process, Feedback, PD Opportunities, Incentives
Mediating factors: Shared Conception of Good Teaching; Teacher Skill Development; Attraction & Retention
Teaching According to the Model
Student Learning (also influenced by Other Causes)
Key Links We Focused On
Is teaching according to the model associated with
higher levels of student learning?
What are the impacts of assessing teacher skills on
the postulated mediating factors of teacher quality?
- Shared conception of teaching
- Skill development
- Attraction/retention
Statistical Methods
Multilevel random intercept models with controls for prior
year achievement and various student characteristics
(e.g., free/reduced lunch) at Level 1
Two versions at Level 2:
1. No predictors of classroom intercepts: EB (empirical Bayes) intercept
residuals obtained and correlated with evaluation scores for
evaluated teachers
2. Evaluation score as predictor of classroom intercepts:
calculate effect of change in evaluation score on student test
scores (in S.D. units)
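The two model versions can be written out as a rough sketch. The notation below is an assumption for illustration and does not come from the slides: y_ij is the test score of student i in classroom j, x_ij is the vector of student characteristics, TE_j is the teacher's evaluation score, and normally distributed errors are the usual (assumed) specification.

```latex
% Level 1 (student i in classroom j); symbols are illustrative assumptions
y_{ij} = \beta_{0j} + \beta_{1}\, y^{\mathrm{prior}}_{ij}
       + \boldsymbol{\beta}_{2}^{\top} \mathbf{x}_{ij} + e_{ij},
\qquad e_{ij} \sim N(0, \sigma^{2})

% Level 2, version 1: no predictors of the classroom intercept
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau^{2})

% Level 2, version 2: evaluation score predicts the classroom intercept
\beta_{0j} = \gamma_{00} + \gamma_{01}\, \mathrm{TE}_{j} + u_{0j}
```

Under this reading, version 1 yields an empirical Bayes estimate of u_0j for each classroom, which is then correlated with TE_j; in version 2, gamma_01 is the change in test score associated with a one-unit change in evaluation score, which is in S.D. units if scores are standardized.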
Evaluation Score – Student
Achievement Correlations*
Site (grades)         Year    Reading   Math    Other
Cincinnati (Gr 3-8)   01-02   .32       .43     .27 (Science)
                      02-03   .28       .34     -.02 (Science)
                      03-04   .29       .22     .29 (Science)
Vaughn (Gr 2-5)       00-01   .50       .21     .18 (Lang. Arts)
                      01-02   .61       .45     .38 (Lang. Arts)
                      02-03   .10       .13     .33 (Lang. Arts)
Washoe (Gr 3-5)       01-02   .21       .19     -
Washoe (Gr 4-6)       02-03   .25       .24     -
Washoe (Gr 3-6)       03-04   .19       .21     -
* Combined across grades
Red italic: C.I. includes 0 (highlighting not preserved in this transcript)
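The footnote flags correlations whose confidence interval includes 0. The slides do not say how those intervals were computed; one common approach, shown here only as a sketch, is the Fisher z-transformation. The function name `fisher_ci` and the sample size used in the usage note are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation r
    computed from n paired observations, via the Fisher z-transform.
    (Illustrative helper; not the method documented in the slides.)"""
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def includes_zero(r, n, level=0.95):
    """True when the interval for r spans zero."""
    lo, hi = fisher_ci(r, n, level)
    return lo <= 0.0 <= hi
```

For example, with a hypothetical 40 classrooms, `includes_zero(0.10, 40)` is true while `includes_zero(0.50, 40)` is false, which is the kind of distinction the footnote draws.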
Effect of a One-Level Change in Teacher Evaluation (TE) Score on
Student Test Scores*
Site         Year    Reading   Math    Other
Cincinnati   01-02   .26       .23     .09 (Science)
             02-03   .14       .18     -.01 (Science)
             03-04   .14       .13     .13 (Science)
Vaughn       00-01   (not reported)
             01-02   .24       .36     .27 (Lang. Arts)
             02-03   .03       .12     .20 (Lang. Arts)
Washoe       01-02   .13       .14     -
             02-03   .14       .19     -
             03-04   .10       .16     -
* Combined across grades
Reliability/Inter-rater Agreement
Cincinnati:
- 60% to 80% absolute agreement at domain level; higher for
classroom mgmt, lower in new teacher samples
- Estimated reliability of 6 classroom observations over the
year: .73-.89 at domain level; lower for new teachers
Vaughn:
- self, peer, administrator agreement .74-.86 (Coef. Alpha
for 5 common domains)
- “Cross-semester” domain score correlations: .74-.93
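The two agreement statistics named above can be illustrated with a short sketch. The function names and the toy ratings in the usage note are hypothetical; in particular, treating each rater as an "item" when computing coefficient alpha is an assumption about how the Vaughn figures were obtained, since the slides do not spell it out.

```python
from statistics import variance


def absolute_agreement(ratings_a, ratings_b):
    """Share of teachers given exactly the same rubric level
    by two raters (illustrative helper)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def coefficient_alpha(rows):
    """Cronbach's coefficient alpha.
    rows: one list per teacher, holding one score per rater
    (raters are treated as the 'items'; this is an assumption)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 raters and 2 teachers")
    # Variance of each rater's scores across teachers
    rater_vars = [variance(col) for col in zip(*rows)]
    # Variance of each teacher's summed score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(rater_vars) / total_var)
```

As a usage example with made-up data, `absolute_agreement([3, 2, 4, 3], [3, 3, 4, 3])` gives 0.75 because three of four ratings match exactly, and `coefficient_alpha` returns 1.0 when self, peer, and administrator give identical scores to every teacher.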
Decision-Making Studies
Evaluator Interviews & Review of Evaluation Reports
(Cincinnati, Washoe)
- Cincinnati: well-defined rating process & close
adherence to it; outside evaluators reduce leniency
- Vaughn: less structure, more ‘gut’; strong culture
and agreement on instruction counteract leniency
- Washoe: more variation in process; more rigor for
new teachers, leniency for tenured teachers
Is Experience Better at Predicting
Student Achievement?
Correlation of student achievement with experience
lower than with evaluation score and varies by site:
- Cincinnati: -.20 to .15
- Vaughn: .00 to .32
- Washoe: .07 to .16
Student achievement tends to rise with experience over
first 3-5 years, then levels off
Relationship of Experience to Student
Achievement, Washoe County 03-04
[Figure: standardized EB intercept residual (math) on the y-axis, plotted against step on the x-axis]
Impact – Shared Conception
Cincinnati – helped make district expectations clear,
especially to new teachers; reinforced emphasis on
student standards.
Vaughn – reinforcement & reflection of Vaughn
culture (“little school that could”)
Washoe – limited impact, competition with other
initiatives and school priorities
Impact – Attraction & Retention
Cincinnati:
- Initial implementation appears to have encouraged senior teachers
with concerns about skill evaluation to retire
- Rigorous process may encourage turnover of new teachers
Vaughn
- Strong communication of Vaughn expectations discourages low
efficacy teachers from joining and lower performers from staying
Washoe
- No impact on recruitment
- Marginal improvement in weeding out marginal teachers
Impact – Skill Improvement
Commonalities:
- Impact wide but not deep
- Promotes reflection
- Classroom management, planning
- Attention to student academic standards
- ‘Tips’, materials, solutions to specific problems
While evaluation process affects teacher practice, to maximize
impact, systematic feedback, coaching, & aligned professional
development are needed
Impact – Skill Improvement
Cincinnati:
- Initially encouraged participation in study groups
- Minimal impact on experienced teachers
Vaughn
- Influences development of new & junior teachers
Washoe
- Varies widely depending on principal & teacher
- Developmental potential limited by leniency
Other Findings about KSBP &
Evaluation
Classroom-level value-added estimates vary considerably from year
to year
At Vaughn, by the end of year 3 there was minimal
between-classroom variation in student test scores
Teachers accepted the teaching standards used to
evaluate performance, but had mixed reactions on the
fairness and validity of evaluation ratings
Other Findings, cont.
Administrators accepted the teaching standards, reported
increased workload in implementing the new system, & had
difficulties providing feedback
Implementation glitches were frustrating to teachers and
administrators in systems with high stakes
Lack of alignment of human resource systems (recruitment,
selection, induction, mentoring, professional development,
compensation, performance management, instructional
leadership) to the teaching standards
Guidelines for Design and
Implementation
Specify that performance improvement is a strategic
imperative
Develop teaching standards and scoring rubrics (i.e.,
competency model)
Prepare for added teacher and administrator workload
Thorough communication
Guidelines, continued
Video-taping classroom practice and/or use of
multiple evaluators
Train and re-train evaluators
Support teachers through feedback and professional
development
Align human resource management systems
Strategic HR Alignment
Student Achievement Goals
Performance Improvement Strategy
(Programs, Plans)
Performance Competencies
(What Teachers & Administrators Need to Know & Be
Able to Do)
Human Resource Programs
Recruitment - Selection - Induction - Mentoring
Prof. Development - Compensation - Performance Management Leaders
Guidelines, continued
Pilot system and monitor implementation
Examine validity and inter-rater agreement
Examples of CPRE Work on Standards-based Teacher Evaluation (SBTE)
1997: Odden & Kelley, Paying Teachers for What They Know
and Can Do (2nd ed. 2002, Corwin Press)
2005: Milanowski, Kimball, & Odden. Teacher
Accountability Measures and Links to Learning. In Rubenstein,
R., Schwartz, A.E., Stiefel, L., and Zabel, J. (Eds.), Measuring
School Performance & Efficiency: Implications for Practice
and Research, 2005 Yearbook of the American Education
Finance Association. Larchmont, NY: Eye on Education.
2006: Heneman, Milanowski, Kimball & Odden, Standards-based
teacher evaluation as a foundation for knowledge- and
skill-based pay, available at:
http://www.cpre.org/Publications/RB45.pdf
For more information: www.wcer.wisc.edu/cpre