As slippery as an eel? Assessing speaking and writing Part One


Comparing the incomparable?
Assessing speaking and writing
Part Two
Ülle Türk
University of Tartu
Estonian Defence Forces
23rd CSW, Tampere, 27-29 March 2009
Questions

Which aspects should we focus on when we assess students’ productions?
Which is more effective: holistic or analytic assessment?
How can we standardize our assessments?
How can we involve students in the process of assessing their speaking and writing skills?
2
Remarks on CEFR

Recommendation of the Committee of Ministers to member states on the use of the Council of Europe’s CEFR and the promotion of plurilingualism (CM/Rec (2008) 7)
Explanatory note
  The CEFR is purely descriptive – not prescriptive, nor normative;
  The CEFR is language neutral – it needs to be applied and interpreted appropriately with regard to specific languages;
  The CEFR is context neutral – it needs to be applied and interpreted with regard to each specific educational context in accordance with the needs and priorities specific to that context
  ......
3
Brian North, Eurocentres

NOT a harmonisation tool
  “We have NOT set out to tell practitioners what to do or how to do it. We are raising questions not answering them.”
NOT a theory of language or skills development
  Scales describe learning outcomes, learner behaviours, not the invisible processes involved.
NOT a test specification
  Scales and lists can be consulted when drawing up a task specification (Ch4) or defining assessment criteria (Ch5) but need reference to detailed specs for language & context
4
Development of descriptors

Intuitive Phase:
  Creating a pool of classified, edited descriptors
Qualitative Phase:
  Analysis of teachers discussing proficiency
  32 teacher workshops sorting descriptors
Quantitative Phase:
  Teacher assessment of 2800 learners on descriptor checklists (500 classes, 300 teachers)
  Teacher assessment of videos of some learners
Interpretation Phase:
  Setting “cut-points” for common reference levels
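
As a rough illustration of what the interpretation phase amounts to: once descriptors (or learners) sit on a single numeric scale, cut-points divide that scale into reference levels. The sketch below assumes an arbitrary scale; the cut-point values are invented for the example and are not those of the CEFR/Swiss project, and `level_for` is a hypothetical helper name.

```python
from bisect import bisect_right

# Six common reference levels, lowest to highest.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Hypothetical cut-points on an arbitrary scale (invented for illustration,
# NOT the calibrated CEFR values). Each value is the lower boundary of the
# next level up.
CUT_POINTS = [-2.0, -1.0, 0.0, 1.0, 2.0]


def level_for(scale_value):
    """Return the reference level whose interval contains scale_value.

    A value exactly on a cut-point is placed in the higher level.
    """
    return LEVELS[bisect_right(CUT_POINTS, scale_value)]


for value in (-3.0, -2.0, 0.5, 2.5):
    print(value, level_for(value))
```

Moving a single cut-point changes which learners fall into the two neighbouring levels, which is why the placement of the boundaries is itself a matter of interpretation.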
5
“Life beyond C2”
E
Ambilingual Proficiency
Comprehensive Operational Proficiency
Adequate / Effective Operational Proficiency
Limited Operational Proficiency
Basic Operational Proficiency
Survival Proficiency
Formulaic Proficiency
WENS: Well-educated Native Speaker
D2 Genuine bilinguals (+ Beckett etc.)
D1 Language professionals: interpreters, translators, some university professors
C2 Highly successful learners
C1
B2
B1
A2
A1
6
Salient Characteristics D?

Apparent ambilingualism:
  Convey, elaborate or translate to explicit expression the nuances and subtleties of their own and of others’ meaning by exploiting a comprehensive knowledge of the language to do so
  Function in all situations to all intents and purposes exactly as the mother tongue; use the language in a sophisticated, natural, accurate manner apparently indistinguishable from the performance of a native speaker
7
CEFR Levels: Key Problems

Danger of differing interpretations for different languages
Under-definition of C2, + some reversals of C1/C2 descriptors (ALTE, DIALANG, Catalonia)
Weak definition of socio-linguistic competence (and some contradiction to Cambridge qualitative research)
Unrealistic expectations in relation to receptive skills
8
Under-definition of C2

Mostly uncalibrated as very few C2 descriptors calibrated in CEFR/Swiss project
  Integrate suitable descriptors from ALTE, DIALANG, Catalonia, Portfolio bank
Occasional C1/C2 reversals
  Investigate cases; incorporate insights from qualitative analysis of samples (e.g. Cambridge)
C1 descriptors tend to be more concrete, C2 descriptors less so – but try to avoid “native speaker” attributes
  Define Level D, at least in outline, to give upper boundary; consult curriculum descriptors
9
Reliability of assessment

Do all markers agree on the mark I got?
  = inter-marker reliability
If the same marker marks my test paper again tomorrow, will I get the same result?
  = intra-marker reliability
Objectively marked tasks
Subjectively marked tasks
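
One way to put a number on the two questions above is to compare two sets of marks for the same scripts. The sketch below is a minimal illustration, not a prescribed procedure: it computes simple exact agreement and Cohen's kappa (agreement corrected for chance). The marks are invented, and the function names are hypothetical.

```python
from collections import Counter


def exact_agreement(marks_a, marks_b):
    """Share of scripts that received the same mark in both sets."""
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("need two equally long, non-empty lists of marks")
    same = sum(a == b for a, b in zip(marks_a, marks_b))
    return same / len(marks_a)


def cohen_kappa(marks_a, marks_b):
    """Agreement between two sets of marks, corrected for chance agreement."""
    n = len(marks_a)
    observed = exact_agreement(marks_a, marks_b)
    freq_a, freq_b = Counter(marks_a), Counter(marks_b)
    # Chance agreement: probability that both sets pick the same mark
    # if each followed only its own overall distribution of marks.
    expected = sum(freq_a[m] * freq_b[m] for m in freq_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented marks for six scripts.
marker_1 = ["B1", "B2", "B2", "C1", "B1", "A2"]
marker_2 = ["B1", "B2", "C1", "C1", "B1", "B1"]        # a second marker
marker_1_again = ["B1", "B2", "B2", "C1", "B2", "A2"]  # same marker, next day

# Inter-marker reliability: two different markers.
print(round(exact_agreement(marker_1, marker_2), 2))  # 0.67
print(round(cohen_kappa(marker_1, marker_2), 2))      # 0.54

# Intra-marker reliability: the same marker on two occasions.
print(round(exact_agreement(marker_1, marker_1_again), 2))  # 0.83
```

Kappa treats the marks as unordered categories, so a B2/C1 disagreement counts the same as an A2/C1 one; for ordered bands a weighted variant or a correlation may be more informative.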
10
Methods of marking

Analytical marking
  according to detailed criteria
  The final score is a composite of all the subscores or a profile.
Holistic or impression marking
  There is no breakdown into separate marks for separate aspects of writing skill, but some criteria still have to be kept in mind.
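
To make the difference between a composite and a profile concrete, here is a small sketch under stated assumptions: the four criteria, the 0-5 range and the weights are all invented for the example, and `composite_score` is a hypothetical helper, not part of any marking scheme cited here.

```python
def composite_score(subscores, weights=None):
    """Combine analytic subscores into one weighted mean on the same scale.

    subscores: dict mapping criterion name -> mark
    weights:   optional dict mapping criterion name -> weight (default: equal)
    """
    if weights is None:
        weights = {criterion: 1.0 for criterion in subscores}
    total_weight = sum(weights[c] for c in subscores)
    return sum(subscores[c] * weights[c] for c in subscores) / total_weight


# Invented analytic marks for one script, each criterion marked 0-5.
profile = {
    "task achievement": 4,
    "coherence": 3,
    "vocabulary": 4,
    "grammar": 2,
}

# Reported as a profile: the subscores themselves, weak grammar stays visible.
print(profile)

# Reported as a composite: one number, equal weights.
print(composite_score(profile))  # 3.25

# Composite with task achievement counted double (an assumed weighting).
double_task = {"task achievement": 2, "coherence": 1, "vocabulary": 1, "grammar": 1}
print(composite_score(profile, double_task))
```

The composite is easier to report, but the profile keeps the diagnostic information that the single figure averages away.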
11
Holistic marking scheme

18-20 Task successfully carried out with a wide range of expressions and minimal, if any, errors.
16-17 An ability to produce more than a collection of simple sentences with only occasional lapses; task successfully carried out.
11-15 Simple but correct realisation of task with some errors which do not distract from the content; appropriate credit for range.
8-10 Message communicated, but errors noticeable; attempt at task not entirely successful.
5-7 Lack of language control shown by frequent basic errors; task only partly realised/rubric neglected.
0-4 Language breakdown; content irrelevant or too little for assessment.
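
A scheme like this is a lookup from a single mark to a band descriptor. The sketch below encodes the six bands from the slide; the function name `band_for` and the assumption that marks are whole numbers from 0 to 20 are mine.

```python
# (lowest mark, highest mark, descriptor) for each band of the scheme above.
HOLISTIC_BANDS = [
    (18, 20, "Task successfully carried out with a wide range of expressions "
             "and minimal, if any, errors."),
    (16, 17, "An ability to produce more than a collection of simple sentences "
             "with only occasional lapses; task successfully carried out."),
    (11, 15, "Simple but correct realisation of task with some errors which do "
             "not distract from the content; appropriate credit for range."),
    (8, 10, "Message communicated, but errors noticeable; attempt at task not "
            "entirely successful."),
    (5, 7, "Lack of language control shown by frequent basic errors; task only "
           "partly realised/rubric neglected."),
    (0, 4, "Language breakdown; content irrelevant or too little for assessment."),
]


def band_for(mark):
    """Return (band label, descriptor) for a whole-number mark from 0 to 20."""
    for low, high, descriptor in HOLISTIC_BANDS:
        if low <= mark <= high:
            return f"{low}-{high}", descriptor
    raise ValueError(f"mark {mark} is outside the 0-20 scale")


label, descriptor = band_for(13)
print(label, descriptor)
```

The band widths are unequal (five marks for 11-15, two for 16-17), so the marker still has to decide where inside a band a script falls; the descriptor alone does not settle that.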
12
Sara Cushing Weigle, 2002: 121
Quality: Holistic Scale vs Analytic Scale

Validity
  Holistic: assumes that all relevant aspects of writing ability develop at the same rate and can thus be captured in a single score; holistic scores correlate with superficial aspects such as length and handwriting
  Analytic: more appropriate for L2 writers as different aspects of writing ability develop at different rates
Reliability
  Holistic: lower than analytic but still acceptable
  Analytic: higher than holistic
Practicality
  Holistic: relatively fast and easy
  Analytic: time-consuming; expensive
Impact
  Holistic: single score may mask an uneven writing profile and may be misleading for placement
  Analytic: provide useful diagnostic information for placement and/or instruction; more useful for rater training
Authenticity
  Holistic: White (1995) argues that reading holistically is a more natural process than reading analytically
  Analytic: raters may read holistically and adjust analytic scores to match holistic impression
13
Scale functions (CEFR, pp. 37-38)

user-oriented scales
  report typical or likely behaviours of learners at any given level
  what the learner can do
  tend to be positively worded, even at low levels
  often holistic, offering one descriptor per level
assessor-oriented scales
  guide the rating process
  how well the learner performs
  often negatively worded even at high levels
  Some are holistic scales, others are analytic scales.
14
Task specific vs general scales

Task-specific scales are easier to use, but more time-consuming and expensive
  Designing different scales
  Training assessors to use them
A combination of the two:
  Linguistic competence the same (Ch 5)
  Task-specific aspects different (Ch 4)
15
Addressing audiences
C1 Can give a clear, well-structured presentation of a complex
subject, expanding and supporting points of view at some length
with subsidiary points, reasons and relevant examples.

Can handle interjections well, responding spontaneously and
almost effortlessly.
B2+ Can give a clear, systematically developed presentation, with
highlighting of significant points, and relevant supporting detail.

Can depart spontaneously from a prepared text and follow up
interesting points raised by members of the audience, often
showing remarkable fluency and ease of expression.
B2 Can give a clear, prepared presentation, giving reasons in support
of or against a particular point of view and giving the advantages
and disadvantages of various options.

Can take a series of follow up questions with a degree of fluency
and spontaneity which poses no strain for either him/herself or
the audience.
16
Reading for information and argument
C1 Can understand in detail a wide range of lengthy,
complex texts likely to be encountered in social,
professional or academic life, identifying finer points
of detail including attitudes and implied as well as
stated opinions.
B2+ Can obtain information, ideas and opinions from
highly specialised sources within his/her field.

Can understand specialised articles outside his/her
field, provided he/she can use a dictionary
occasionally to confirm his/her interpretation of
terminology.
B2 Can understand articles and reports concerned with
contemporary problems in which the writers adopt
particular stances or viewpoints.
17
How many levels?



Proficiency levels too broad to trace achievement
The higher the level the longer it takes to reach it
Split into narrower levels


Finnish National Curriculum levels
NB! There is a level between B2 and C1!
18
Examiner training





NB! The less analytic and mechanical the
method of marking, the more highly skilled and
trained examiners need to be.
Standard setting
Standardisation
Monitoring examiners
Evaluation of examiners
19
Questions to ask

What competences should my students have in
  Spoken interaction
  Spoken production
  Written interaction
  Written production
  Mediation?
What tasks/activities should they be able to perform to demonstrate their mastery of the competences?
How well should they be able to perform them?
How are we going to assess their written/spoken performance reliably?
20
Sources

Council of Europe: Language Policy Division: http://www.coe.int/T/DG4/Linguistic/Default_en.asp
European Language Portfolio: www.coe.int/portfolio/
Cushing Weigle, Sara (2002) Assessing Writing, Cambridge University Press, Cambridge.
21