the impact of using evaluation criteria on writing

THE IMPACT OF EVALUATION CRITERIA
ON WRITING PERFORMANCE:
A STUDY OF PRE-SERVICE ENGLISH TEACHERS
Lina Mukhopadhyay
&
Geetha Durairajan
20 February: TEC14
The objective of this presentation is to address the
following points:
A. Role and types of evaluation criteria (theoretical issues)
B. Impact of evaluation criteria on writing performance
(empirical evidence)
C. Creation of evaluation criteria to assess writing
(practice for teachers)
Overview
PART ONE:
a. What are issues in assessing writing?
b. What is fairness?
c. What is positive washback?
PART TWO:
A classroom based study on effect of evaluation criteria on writing
performance
PART THREE:
A practice session on designing task-specific evaluation criteria
PART ONE
What are issues in assessing writing?
1. What to evaluate?
Content and language
2. How to evaluate?
Scales: holistic/analytical
3. How to ensure
• inter-rater reliability?
• fairness?
• positive washback?
What is fairness?
• Choose/create evaluation criteria according to level of
learners, task requirements and test purpose (construct validity)
• Share evaluation criteria with learners
(justice: inter-learner equity)
• Train learners to use evaluation criteria
(access: educational opportunity to learn)
How do I choose my
evaluation criteria?
Evaluation criteria: types
Holistic
• Only one description of each sub-feature (content, language & organization)
• Used in large-scale assessments
Analytical
• Level-specific descriptions for each sub-feature (content, language & organization)
• Mostly used to provide feedback
Role of Evaluation Criteria
1. Assessment of learning
2. Assessment as learning: Feedback
• To check understanding
• To identify strengths and weaknesses
• To check proficiency levels
• To track growth through formative and summative assessment
3. Assessment for learning: Development
• To make the criteria transparent to learners
• To document how this affects their writing
Evaluation Criteria: research
1. Each task needs a different scale and the criterion
should reflect the writing construct (Hamp-Lyons 1991,
1995)
2. What guides raters’ ratings: educational background,
interpretations of the construct of language proficiency,
and task requirements (Cumming, Kantor & Powers 2001; Eckes
2005; Fahim & Baijani 2011)
3. Correlations between raters’ judgments: how to ensure
inter-rater reliability (Wang 2009)
What is positive washback?
ESL/ EFL learners will be able to
a. use evaluation criteria as a checklist to fulfill task
requirements
b. understand assessor’s expectations from tasks in a
transparent manner and work to fulfill those
c. do self and peer assessments using criteria (thereby learn
to maintain inter-rater reliability and provide feedback to each
other)
d. generalize from task-specific criteria and use this
knowledge in other writing assessments
PART TWO
The Study
Aim:
To examine the role of evaluation criteria in writing
performance
To show this, we look at
(a) participants’ perception of criteria and benefit (while and post task)
(b) their awareness and use of criteria in writing
Context: The study was conducted in a PhD-level course titled ‘Language Testing
and Assessment’. The course had a formative assessment model: each assessment had
task-specific evaluation criteria, which were shared with the learners before they
did the tasks (fairness). An in-depth study was done to get evidence of learning
through writing assessment (positive washback: assessment for and as learning).
Research questions
If task specific criteria are provided to adult ESL learners,
(i) will they benefit from this knowledge?
(ii) what kinds of benefit will they experience?
Method of data collection
13 adult learners (8 female), 24 to 45 years of age, participated in
the study. 8 participants had prior teaching experience and 2 of
them reported having used criteria in assessment.
Stage 1: making task-specific criteria available
Stage 2: perception of criteria
Stage 3: using criteria (implicit training)
Stage 4: talking about benefit(s)
Example of a writing assessment
Task prompt:
Look at the proficiency test. This was used as an entrance test for BA
English programme at EFL-U. Does this test pass all the five principles
of assessment (authenticity, reliability, validity, practicality and washback)?
Justify your stance with relevant examples. Write a critical response in
about 500 words.
Evaluation criteria:
1. Does the response contain an overall thesis statement and
comments on all the five principles? Is each principle justified with
at least one example?
(content)
2. Is the response written in academic language (e.g., passivization,
linkers, voice) and does it include referencing details?
(language)
3. Is the response presented in three parts (intro-body-conclusion) with
adequate links between them? Are ideas linked at intra and inter
sentential levels?
(organization)
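The three criteria above can also be treated as a revision checklist. The following is a minimal sketch, not part of the study: the `Criterion` class, the `unmet` helper and the condensed question wordings are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    feature: str   # content, language, or organization
    question: str  # wording condensed from the criteria above (an assumption)


CRITERIA = [
    Criterion("content", "Overall thesis statement and comments on all five principles?"),
    Criterion("content", "Each principle justified with at least one example?"),
    Criterion("language", "Academic language and referencing details?"),
    Criterion("organization", "Intro-body-conclusion with adequate links between parts?"),
    Criterion("organization", "Ideas linked within and across sentences?"),
]


def unmet(answers):
    """Return the criteria a writer has not yet ticked off.

    `answers` maps a question string to True/False; a missing entry
    counts as not yet met.
    """
    return [c for c in CRITERIA if not answers.get(c.question, False)]


# Hypothetical self-check before submission: everything is ticked
# except the "one example per principle" question.
answers = {c.question: True for c in CRITERIA}
answers[CRITERIA[1].question] = False
for c in unmet(answers):
    print(f"[{c.feature}] {c.question}")
```

Keeping the feature label on each question lets a writer see at a glance whether the remaining gaps lie in content, language or organization.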
Method of analysis
Qualitative analysis of perceptions from two sources:
A. participants
B. tutor as evaluator
to capture instances of learning (positive washback).
Measurement of learning (positive washback)
Do the participants
a. experience an ease in planning and performing on tasks?
b. understand assessor’s expectations for each task?
c. use criteria for self and peer assessment meaningfully?
d. reflect on strengths and weaknesses post performance?
e. generalize planning and writing techniques to write critical
responses in the course and outside of it?
Findings
Were benefits experienced?
Overall usefulness of evaluation criteria: 100%
Usefulness of task-specific criteria to complete tasks: 94.5%
Use of criteria as a checklist to revise before submission: 98.1%
Usefulness of weight allotted to each feature: 96.3%
Use of criteria to understand assessor's expectations: 96.3%
Usefulness of analytical criteria with level-specific description: 94.5%
Participants reported benefit at the level of planning and post-task reflection,
at 96%. This benefit was experienced because evaluation criteria were available
for completing the writing assessments.
Benefits experienced:
(positive washback)
1. Participants’ responses
2. One instance of peer evaluation
3. Tutor’s assessment
1a. Examples: participants’ responses
I liked the idea of writing with the prompt
and evaluation criteria as it helped me to
produce responses that were clear and to a
greater extent, up to the assessor’s
expectations.
By the end of the course my response to using
the evaluation criteria to plan and write my
assignments improved. I think that it is a very
significant and necessary aspect of writing an
assignment. For the other courses, where we
did not receive any evaluation criteria I tried
to speculate the expectations of the assessor
and create the criteria and then write the
assignment.
(S:VI)
• ease in planning
• understand assessor’s expectations
• generalize techniques to other pieces of writing (post-course application)
1b. Examples: participants’ responses
I could not follow the evaluation criteria that
much meaningfully for the first time. The
problem was not obviously with the criteria,
but with my understanding of the nature of
assignment… But later on, day by day I had
been trying to build a sort of familiarity or
say rapport with the evaluation criteria, and
started adjusting my writing into the criteria.
My later assignments would manifest how
much labor I devoted to follow those criteria.
And the result was satisfactory. I was happy,
indeed.
(S:RU)
• ease in using criteria
• positive reflection post performance
Summary of benefits
During task
1. crucial to finish tasks/assignments on the course in an organized manner
2. understand different levels of performance and check before
submission which level their response has met
3. understand assessor’s expectation(s) and features
(content-organization-language) that were part of different
levels of performance
Post task
1. useful to complete peer assessment and provide feedback
2. understand strengths and weaknesses in one’s own work,
especially in content development (gaps in providing evidence
to support claims)
Source: Participants’ responses
2. Peer evaluation
In the course there was one assessment task where the
participants had to critique a test for its degree of usefulness.
Evaluation criteria to complete the task were given to the
participants before they attempted the task. They reported that
they had used the criteria while working on the task.
Later, they used the same criteria to do peer assessment
on the same task. The correlation between the peer
assessment and the tutor’s assessment was r = .79, a
high positive correlation indicating a high degree of inter-rater
reliability.
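As a hedged illustration of how such an agreement figure can be computed, the sketch below calculates a Pearson correlation between two sets of scores. It assumes that r here is the Pearson coefficient, which the slides do not state, and the score lists are invented for illustration; they are not the study's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one rater gives identical scores")
    return cov / sqrt(var_x * var_y)


# Illustrative scores out of 10 only; NOT the study's data.
peer_scores = [6, 7, 5, 8, 9, 4]
tutor_scores = [5, 7, 6, 8, 8, 5]
print(round(pearson_r(peer_scores, tutor_scores), 2))
```

A value close to 1 means the peer and the tutor rank the responses in much the same way, which is the sense in which r = .79 is read above as high inter-rater reliability.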
In a one-to-one discussion (through an online discussion board),
the participants said that they found peer evaluation
methodical because of the use of task-specific criteria. They could
understand the direction in which the writing task had to be
attempted and could give appropriate scores and feedback to
their peers.
2a. Example
1. Did the criteria help you in assessing the response of your peer? If yes,
then why?
Yes, the criteria helped me assessing my peer because it allowed one to look
for specifics in the answer and score against that.
2. If you were not given the criteria but only the prompt then would your
assessment have differed? If yes, then in what way? Would you have been
able to justify the scores that you would have given as a holistic score or
analytical score? Which score type would you be likely to give in the
absence of a criteria?
Yes, if the criteria was not given then scoring would not have easy and it
would not have been based on the specifics. Also, the justification of the
scores would have been difficult. The scoring without criteria would have
been a holistic one.
3. When you were given back your response as evaluated by your peer, did
you agree on the scoring or disagree? Explain why you agreed or disagreed.
I agreed with the scoring because it was objectively scored against the
criteria given.
(S:SH)
positive washback: inter-rater reliability, feedback
3. Tutor’s assessment
Content
1. Some attempts at forming an opinion and justifying it through
elaboration and examples.
2. Most of the key ideas present.
3. Argumentation is weak.
Organization
1. Macro coherence attempted (all the key ideas were presented in
their proper order).
2. Signaling of ideas present (organizational details of the paper
presented).
3. Micro coherence not well developed (links between paragraphs
and sentences not well developed).
Source: Tutor as evaluator
Why were the benefits experienced?
1. Cognitively it made tasks easier as it broke them down into
manageable bits (e.g., key ideas, text structure).
2. It drew learners’ attention to structuring content coherently and
presenting the ideas in an academic manner.
(comprehensible output, Schmidt 2001, 2010)
3. It provided learners with a checklist to edit and revise work prior to
submission. The criteria were thus made available to the participants, and this
yielded positive washback.
(Hughes 2003)
Why were benefits experienced?
4. Noticing specific details of tasks to do peer assessment
helped learners process ideas at a deeper level.
Consequently, they could give each other meaningful
feedback on responses.
(Robinson 2009)
5. Learners felt responsible for what they had written and
evaluated: they learned to focus closely on content
development. For instance, learners noticed the use of appropriate
examples to substantiate a claim because of the
evaluation criteria. This created an atmosphere of
democratic assessment that led to further
instances of learning (positive washback). (Shohamy 2002)
twin
Approaches to assessment
1. Assessment of learning (summative)
2. Assessment as learning (formative)
3. Assessment for learning (formative +…)
Nitko 1983, 1989; Earl 2003; English language arts curriculum, British Columbia
2006; Ontario Report 2010
Assessment as and for learning
Fairness (Kunnan 2000)
Washback (Hughes 2003)
Content & language development
Pedagogical implications
• Assessment can and should be used to support learning.
• Free response items should have task-specific evaluation
criteria.
• Criteria can be shared to raise awareness, notice task requirements,
revise documents and track growth (positive washback).
PART THREE
We need to design and share evaluation criteria with our
learners because doing so can:
a) ensure fairness
b) give rise to instances of learning (positive washback)
Evaluation criteria: examples
TASK: You wish to subscribe to the magazine READER’S DIGEST. Write a letter in
100-150 words to the editor requesting him/ her to give you the subscription details.
In your letter, you can ask about the subscription rate, mode of payment, delivery
and any other query that you may have.
Option 1:
General criteria
You will be graded on content, language and organization.
Option 2:
Task-specific criteria
Enquires about subscription details, mode(s) of payment, details of delivery, time to
be taken, whom to contact in case of problems
(Content)
Uses vocabulary appropriate to express each language function and a variety of
sentence structures accurately.
(Language)
Begins with a formal address to the editor and expresses interest in the
magazine; presents all enquiries about the subscription; concludes by
thanking the editor and asking to receive the information at the earliest
(Organization)
TASK 1
Being and looking fair is important. Do you agree? Discuss
with reference to the following pictures. Write your answer
in 100-150 words.
Picture A
Picture B
Evaluation Criteria: Template 1
Feature: Description
CONTENT (5):
ORGANIZATION (2):
LANGUAGE (3):
Task 2
Being and looking fair is important. Do you agree? Discuss with reference
to the following advertisement. Write your answer in 250-300 words.
Evaluation Criteria: Template 2
Feature: Description
CONTENT (5):
ORGANIZATION (2):
LANGUAGE (3):
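The bracketed numbers in the two templates can be read as the maximum marks for each feature, giving a total of 10. That reading is an assumption, since the slides do not label the numbers. Under it, a score sheet could be sketched as follows; `MAX_MARKS`, `total_score` and the sample marks are hypothetical names and values introduced for illustration.

```python
# Maximum marks per feature as read off the template:
# CONTENT (5), ORGANIZATION (2), LANGUAGE (3). Treating the bracketed
# numbers as maximum marks is an assumption.
MAX_MARKS = {"content": 5, "organization": 2, "language": 3}


def total_score(awarded):
    """Sum the marks awarded per feature after checking each against its maximum."""
    if set(awarded) != set(MAX_MARKS):
        raise ValueError(f"expected marks for exactly: {sorted(MAX_MARKS)}")
    for feature, mark in awarded.items():
        if not 0 <= mark <= MAX_MARKS[feature]:
            raise ValueError(f"{feature}: {mark} is outside 0..{MAX_MARKS[feature]}")
    return sum(awarded.values())


# Hypothetical marks for one response.
marks = {"content": 4, "organization": 1.5, "language": 2}
print(total_score(marks), "out of", sum(MAX_MARKS.values()))  # prints "7.5 out of 10"
```

Reporting the three marks separately as well as the total keeps the scoring analytical, so the writer can see which feature lost marks.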
Acknowledgements
Anand, Ayesha, Barka, Clementine, Jayant, Kezo, Manish, Remya,
Rukan, Shehla, Sunitha, Suraj and Vrishali.
Thank you for your participation and timely responses,
without which this project would have remained unfulfilled.
References
Brown, J. D., & Abeywickrama, P. (2011). Language assessment: Principles and classroom
practices (2nd ed.). Pearson Education.
Earl, L. (2003). Assessment as learning: Using classroom assessment to maximise student learning.
Thousand Oaks, CA: Corwin Press.
Hughes, A. (2003). Testing for language teachers. Cambridge: Cambridge University Press.
Kunnan, A. J. (2000). Fairness and validation in language assessment. Studies in Language
Testing 9. Cambridge: Cambridge University Press.
Reid, J. M. (1993). Teaching ESL writing. New Jersey: Prentice-Hall.
Schmidt, R. (2010). Attention, awareness, and individual differences in language learning. In
W. M. Chan, S. Chi, K. N. Cin, J. Istanto, M. Nagami, J. W. Sew, T. Suthiwan, & I. Walker,
Proceedings of CLaSIC 2010, Singapore, December 2-4 (pp. 721-737). Singapore: National
University of Singapore, Centre for Language Studies.
Shohamy, E. (2001). The power of tests: A critical perspective on the use of language tests. Pearson
Education.
Upshur, J. A., & Turner, C. E. (1995). Constructing rating scales for second language tests. ELT
Journal, 49(1), 3–12.
THANK YOU FOR YOUR
ATTENTION!!