Transcript Outline - University of Ottawa

Scoring Provincial Large-Scale
Assessments
María Elena Oliveri,
University of British Columbia
Britta Gundersen-Bryden,
British Columbia Ministry of Education
Kadriye Ercikan,
University of British Columbia
Objectives
Describe and Discuss
– Five steps used to score provincial large-scale assessments (LSAs)
– Advantages and challenges associated with diverse scoring models (e.g.,
centralized versus decentralized)
– Lessons learned in British Columbia when switching from a centralized to
a decentralized scoring model
Scoring Provincial Large-Scale Assessments
LSAs are administered to
• collect data to evaluate the efficacy of school systems,
• guide policy-making, and
• make decisions about improving student learning
An accurate scoring process, examined in relation to the purposes of
the test and the decisions the assessment data are intended to
inform, is key to obtaining useful data from these assessments
Accuracy in Scoring
• Essential to having accurate & meaningful scores is the degree to
which scoring rubrics:
– (1) appropriately and accurately identify relevant aspects of responses as
evidence of student performance,
– (2) are accurately implemented, and
– (3) are consistently applied across examinees
• Uniformity in scoring LSAs is central to achieving comparability
of students’ responses: it ensures that differences in results are
attributable to differences in examinees’ performance rather than to
biases introduced by the use of differing scoring procedures
• A five-step process is typically used
Step One: “Test Design Stage”
• Design of test specifications
– That match the learning outcomes or construct(s) assessed
– Include particular weights & number of items needed to assess
each intended construct
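As a rough sketch, a test specification of this kind can be expressed as a small table of construct weights and item counts. The construct names, weights, item counts, and scores below are invented for illustration, not taken from any actual provincial blueprint:

```python
# Hypothetical test blueprint: each assessed construct gets a weight
# toward the total score and a number of items. All values are invented.
blueprint = {
    # construct: (weight toward total score, number of items)
    "reading_comprehension": (0.40, 20),
    "writing":               (0.35, 2),
    "numeracy":              (0.25, 15),
}

# The weights across constructs should sum to 1
total_weight = sum(w for w, _ in blueprint.values())
assert abs(total_weight - 1.0) < 1e-9

def weighted_total(raw_scores, max_scores):
    """Combine per-construct raw scores into one weighted percentage."""
    return sum(
        weight * raw_scores[c] / max_scores[c]
        for c, (weight, _) in blueprint.items()
    ) * 100

# A hypothetical student's raw scores against each section's maximum
raw = {"reading_comprehension": 16, "writing": 9, "numeracy": 12}
maximum = {"reading_comprehension": 20, "writing": 12, "numeracy": 15}
print(f"Weighted total: {weighted_total(raw, maximum):.1f}%")
```

Making the weights and item counts explicit at the design stage is what later allows scores on different constructs to be combined defensibly.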
Step Two: “Scoring Open-Response Items”
Decide which model to use to score open-response items:
• Centralized models are directly supervised by provincial
Ministries or Departments of Education in a central location
• Decentralized models often take place across several locations and
are performed by a considerably greater number of teachers; they are
typically used for scoring medium- to low-stakes LSAs
Step Three: “Preparing Training Materials”
• Identify common tools to train scorers, including:
– Exemplars of students’ work demonstrating each of the scale
points in the scoring rubric
– Materials illustrating potential biases that can arise in the scoring
process (e.g., differences in scores given to hand- vs. type-written
essays)
Step Four: “Training of Scorers”
• Training occurs prior to scoring and can recur during the session
itself, especially if the session spans more than one day
• A “train the trainer” approach is often used
– a small cadre of more experienced team leaders is trained first; they
then train the other scorers who will actually score the responses
• Team leaders often make the final judgement call on the assignment
of scores to responses that differ from the exemplars
• This structure reinforces common standards and consistency in the
assignment of scores, and leads to fair and accurate scores
Step Five: “Monitoring Scores”
• Includes checks for inter-marker reliability, wherein a sample of
papers is re-scored to check consistency in scoring across raters
• May serve as re-training or “re-calibration” activity, with raters
discussing scores and rationales for their scoring procedures
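Inter-marker reliability on a double-scored sample is often summarized as simple percent agreement, or as a chance-corrected statistic such as Cohen's kappa. A minimal sketch, using invented scores on a hypothetical 1–4 rubric:

```python
# Illustrative reliability check on a sample of double-scored papers:
# percent agreement and Cohen's kappa. All scores are invented.
from collections import Counter

def percent_agreement(a, b):
    """Fraction of papers given the same score by both raters."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Expected agreement if raters scored independently at these rates
    expected = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical rubric scores (1-4 scale) for ten re-scored papers
rater_1 = [3, 2, 4, 3, 1, 2, 3, 4, 2, 3]
rater_2 = [3, 2, 4, 2, 1, 2, 3, 4, 3, 3]

print(f"Agreement: {percent_agreement(rater_1, rater_2):.0%}")
print(f"Kappa: {cohens_kappa(rater_1, rater_2):.2f}")
```

A low kappa on the re-scored sample would be one trigger for the re-calibration discussion described above.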
The Foundation Skills Assessment
• The Foundation Skills Assessment (FSA) will be used as a case
study to illustrate the advantages and challenges associated with
switching from a centralized to a decentralized scoring model
• The FSA assesses Grade 4 and 7 students’ skills in reading, writing
and numeracy
• Several changes were made to the FSA in 2008 in response to
stakeholders’ demands for more meaningful LSAs that inform
classroom practice
Changes to the FSA
• Earlier administration
– from May to February
• Online administration of closed-response sections
• Parents or guardians received their child’s open-response test portions
and a summary statement of reading, writing and numeracy skills
• Scoring model changed from a centralized to a decentralized
model
• Ministry held “train the trainer” workshops to prepare school
district personnel to organize and conduct local scoring sessions
• School districts could decide how to conduct scoring sessions
– score individually, in pairs or in groups
– double-score only a few, some or all the responses
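Where a district chooses to double-score only a portion of the responses, the sample might be drawn at random so that the reliability check is not biased toward particular classrooms. A minimal sketch; the 20% rate, seed, and paper IDs are invented:

```python
# Sketch of one way a district might sample responses for
# double-scoring. The rate, seed, and IDs are all hypothetical.
import random

def sample_for_double_scoring(paper_ids, rate=0.20, seed=42):
    """Pick a reproducible random subset of papers to re-score."""
    rng = random.Random(seed)  # fixed seed so the draw can be audited
    k = max(1, round(len(paper_ids) * rate))
    return sorted(rng.sample(paper_ids, k))

papers = list(range(1001, 1051))  # 50 hypothetical paper IDs
print(sample_for_double_scoring(papers))
```

Fixing the seed keeps the sample reproducible, which matters if the selection ever has to be re-checked after the scoring session.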
Advantages of a Decentralized Model
• Professional Development
– A decentralized model allowed four times as many teachers to
work with scoring rubrics and exemplars
– Educators were able to develop a deeper understanding of
provincial standards and expectations for student achievement
– If scorers are educators, they may later apply knowledge of
rubrics and exemplars in their classroom practice and school
environments and consider the performance of their own
students in a broader provincial context
Advantages of a Decentralized Model
• Earlier return of test results & earlier provision of feedback to
teachers, students and the school
– More immediate feedback may improve learning and guide
teaching
– Data inform teachers about students’ strengths and areas for
improvement in relation to provincial standards
– May be helpful in writing school plans and targeting the areas
upon which particular schools may focus
Challenges of a Decentralized Scoring Model
• Increased difficulty associated with
– Less time allocated to implementing cross-check procedures
– Decreased standardization of scoring instructions given to
raters
– Increased costs (higher number of teachers scoring)
– Reduced training time
Potential Solutions
• Provide teachers with adequate training time
– e.g., one to two days of training prior to scoring the assessments
• Increase discussion among teachers, which may involve reviewing
exemplars falling in between scale points in the rubric
• Have table leaders
– e.g., teachers with prior scoring experience
• Re-group teachers to resolve difficulties or uncertainties related to
the scoring process
Final Note
• Closer collaboration among educators and Ministries and
Departments of Education may lead to improved tests as educators
bring their professional experience of how students learn in the
classroom to bear on test design itself
• Strong alignment between the overall purposes of the test, the test
design and the scoring model used may add value to score
interpretation and subsequent use of assessment results
Thank you