Writing assessment descriptors
Ensuring validity and reliability in assessment
Olwyn Alexander
February 2015
Outline
• Principles of test/assessment usefulness
• Focus on validity: assessment construct
• Focus on reliability: assessment criteria
Principles of test usefulness
Validity
Reliability
Authenticity
Interactiveness
Impact/Washback
Practicality
Bachman & Palmer, 1996
Principles of test usefulness
• Validity: the test actually measures the performance it claims to measure, and this is appropriate for the test purpose
• Reliability: measurement is consistent between different assessors and different test takers
• Authenticity: the tasks in the test are representative of tasks in the target situation
Bachman & Palmer, 1996; Alexander et al., 2008
Principles of test usefulness
• Interactiveness: the tasks in the test engage the test takers’ communicative language ability, background knowledge and strategic competence
• Impact: the test will affect classroom teaching and learning (and wider institutional policy, e.g. entry levels)
• Practicality: resources will be required to develop the test and train assessors to administer it
Bachman & Palmer, 1996; Alexander et al., 2008
Test specification
Who
individual/one class/all classes?
Why
achievement/proficiency?
What
construct/knowledge/skills?
How
format/rating scales?
Bachman & Palmer, 1996
Test specification
Who
students in 10-week pre-sessional classes
at or near CEFR B2
Why
to measure achievement of target
performance in subject-specific context
What
construct/knowledge/skills
How
2,000 word literature review exploring a
subject-specific research question
Bachman & Palmer, 1996
Test specification – construct
What is a literature review?
1. Audience/Purpose/Structure
2. Level of target discipline knowledge
3. Level of critical engagement with sources
Bachman & Palmer, 1996
Literature review construct
1) Involves more than linguistic knowledge and skills
2) Requires students to engage with texts in the
disciplines & struggle to understand them
3) Requires evaluation and management of information
4) Raises awareness of graduate attributes
5) Target performance described at SCQF level 10: exit
level of UG and entry level of PG degrees
Graduate attributes
• a sense of ‘research-mindedness’ enabling a wider more analytical perspective on individual practice
• the ability to identify problems, formulate research questions & interpret complex data to seek answers
• the ability to derive meaning from complexity and make informed judgments on the basis of evidence
• an openness to learning and positive orientation to new opportunities, ideas and ways of thinking
• a tolerance for ambiguity and unfamiliarity.
SCQF level 10 knowledge
• A critical understanding of the principal theories, concepts and principles in a subject/discipline.
• Detailed knowledge and understanding in one or more specialisms, some of which is informed by or at the forefront of a subject/discipline.
• Knowledge and understanding of the ways in which the subject/discipline is developed, including a range of established techniques of enquiry or research methodologies.
SCQF level 10 application
• Execute a defined project of research, development or investigation and identify and implement relevant outcomes.
• Practise in a range of professional level contexts which include a degree of unpredictability and/or specialism.
SCQF level 10 cognitive skills
• Critically identify, define, conceptualise, and analyse complex/professional level problems and issues.
• Critically review and consolidate knowledge, skills, practices and thinking in a subject/discipline.
• Make judgements where data/information is limited or comes from a range of sources.
Test specification – construct
What is a literature review?
1. Audience/Purpose
2. Structure overall & within sections/paragraphs
3. Level of target discipline knowledge
4. Level of critical engagement with sources
Task: Choose one of the areas above and
brainstorm ideas for this aspect of the construct
Audience
• peers, novices (non-specialists) in the community of practice
• lecturers (specialists) who assess research projects, dissertations, theses
• educated non-specialists, e.g. in funding councils, NGOs or government departments, who award grants or use the findings
Purpose
• in-depth exploration of a specific aspect of a research area
• to indicate the current state of the art and suggest new research directions
• OR to define and limit the scope of a piece of research and provide a context and framework for it.
• In-depth = reference to a wide range of sources
• Context = current state of the art
• Framework = theoretical basis for research design
Structure
• Thematic comparison and classification of sources, based on clear criteria usually specified in advance.
• Relates key papers & ideas to each other, e.g. shows which make similar/opposite/more developed claims and which ideas were stimulated by which earlier contributions.
• Paragraphs develop from general to specific, with claims supported by evidence from relevant sources.
• Review (& sections/paragraphs) structured from familiar to new; structure made explicit for the reader by summarising at the end of each section what has been discussed and how this links to what follows.
Subject knowledge
• Responsibility of student (or subject expert if joint marking opportunity).
• Includes recognition of key figures/papers which have moved the field on and therefore should be cited.
• EAP tutors do not have this knowledge.
• Nevertheless, they have to be able to read a text they do not fully understand to assess whether it achieves its purpose for the stated audience through its structure and the level of critical engagement it demonstrates.
Critical engagement
• Not simply description/summarising
• (SCQF, 2012) identify, define, conceptualise, & analyse complex/professional level problems & issues.
• (Bruce, 2014) evaluative judgement made within any field of human activity about some aspect, object or behaviour of that field.
• (Dodd, 2014) the different usefulness of knowledge…, appreciating how knowledge in its various forms holds potentially different value for different people in different places at different times.
Critical engagement
• Not simply description/summarising
• (Argent, 2013) showing awareness of different perspectives/stances
• (Argent, 2014) arguing cogently why some work makes a more relevant, useful or powerful contribution to the field generally or in the context of specific research.
• (Spencer & Alexander, 2014) relates not to finding fault (criticising) but to criteria; evaluation must be in terms of/based on specific criteria, stated explicitly.
Test specification
How
2,000 word literature review with
subject-specific research question
Submitted to Turnitin
Assessed using customised holistic
assessment descriptors
Bachman & Palmer, 1996
Assessment descriptors
Analytic (e.g. IELTS) or holistic (e.g. TOEFL)
Holistic – quicker/easier to give an impression mark
• A response at this level largely achieves all of the following
• A response at this level is marked by one or more of the following
• A response at this level may reveal one or more of the following weaknesses
Assessment descriptors
Task: develop a set of statements which capture
the aspects outlined in the construct
A response at this level largely achieves all of
the following
1. Content & task achievement (audience/purpose)
2. Structure overall & use of sources & stance
3. Paragraph structure & use of language
Assessment descriptors
Task: develop sets of statements which capture
varying levels of achievement
A response at this level largely achieves all of the
following
A response at this level is marked by one or more of
the following
A response at this level may reveal one or more of
the following weaknesses
Assessment descriptors
Example: task achievement and content
Fully addresses the specific area (concept, problem, method) to be explored, showing achievements/gaps.
Addresses the specific area in sufficient depth to cover the main points, showing some achievements/gaps.
The main points are covered but with some redundant ideas. Achievements/gaps are not shown clearly.
Not all aspects of the area are covered and achievements/gaps in research are not shown.
Discussion
Any questions & comments?