Ecologically valid uses for assessment at the nexus between language, content, and task
John Norris & Barry O’Sullivan
Copyright © 2007, John M. Norris

Inevitable changes in L2 education
Sources >Implications
•Philosophy of education >Learning is experiential
•Cognitive psychology >Expertise = K+S+D (holistic development)
•SLA theory and research >Learner+context interact
•The language ‘crisis’ >Insufficient scope and achievement
•The value of L2 learning >Language enables, empowers
•The realities of education >Learning is local, needs differ

The ecologies of language education are changing…
How will language educators evolve?

Integration
Innovation
Goal: Learning to … use language to communicate particular meanings for particular purposes
[Diagram labels: LANGUAGE, CONTENT, TASK]
Innovation: TBLT – CBI – LSP – LAC – etc.

The role of assessment…?
Possible contributions of assessment:
•Objective measurement of language proficiency constructs
•Quality assurance/accountability mechanism
•Tool for policy implementation, gate-keeping, etc.
•Understanding: Heuristic for awareness-raising and program illumination
•Improving: Integral feedback component in curriculum-instruction-learning

How can assessment support innovative teaching and learning of integrated content-language-task objectives?
What are the characteristics of assessment systems that respond to the needs of language educators in this new ecology?

The language – content – task ‘problem’ in assessment
Language testers’ point of view

Content
•Meaningful stuff about which humans tend to want to talk
•Background, subject, situational, experiential knowledge
•Problematic because it comes in variable amounts and types, and it interacts with L2 use
•So, we need to control for its effects by careful selection, maximizing the generic, and through other observation/analysis of bias

Language
Communicative Language Ability
•L2 knowledge, skills
•Grammatical, textual, functional, sociolinguistic
•Enables/disables talking about stuff for purposes
•Mediated by strategic competence

Task
•Devices for eliciting samples of L2 knowledge, skills
•Constellation of purposes, features, conditions, interlocutors in use domain
•Problematic: the more they resemble what we actually do, the less they can be trusted to tell us about L2 knowledge
•So, we have to sample a bunch of them to get at reliable estimates, and we have to control them from becoming too ‘tasky’

Technical problem: How do we get at a maximally accurate measure of the language ability construct despite content and task (or learning context factors)?

Characteristics of assessment
Emphasize… >But what about…
•Test ‘tasks’ doable in a single sitting of an exam >Complex tasks not authentically completed in exam sitting/setting?
•Rating criteria based on models of CLA >Indigenous rating criteria used by actual interlocutors?
•Generalization and extrapolation about abilities beyond test context >Task-specific abilities as realized in context?
•Maximal control over nature of performance elicited >Open-ended, creative, unanticipated performance?
•Norm-referenced (between learner) comparisons >Criterion- or individual-referenced comparisons?
•Interpretations about language ability constructs >Interpretations about language-content-task abilities?

Language educators’ realities
Innovative language educators ask…
How do we place learners into integrated language+content+task curricula?
What do we provide feedback on (language form, content coverage, task completion), and how can we do it effectively on different target tasks?
How can we track learner development in language, content, and task, and what is the relationship between the three?
How should we determine weighting of language, content, and task abilities or knowledge in assigning grades?
What do we target as outcomes of our programs (what learners really can do), and what are the best ways of demonstrating them?
How do we encourage student learning of what we target to the levels they need?
How do we improve our classes and programs?

Functional problem: How do we gather maximally useful information (with respect to language and content and task) for doing specific things in our classes and programs?

A fundamental tension
“… assessments can have many different functions. What is appropriate for one assessment purpose may be inappropriate for another…”
Council of Europe (2001), p. 180
“…the inferences we want to make are about underlying ‘language ability’ or ‘capacity for language use’ or ‘ability for use’…”
Bachman (2002), p. 454

Reconceptualizing the ‘problem’
Sum 1: If ‘language’ education is going to be about more than ‘communicative language ability’, then shouldn’t assessment follow suit?
Sum 2: Monolithic and prescriptive practices of language assessment belie the realities of multiple assessment uses in education.
Sum 3: Treating assessment as a primarily technical problem of ‘good measures’ probably won’t resolve the primarily functional needs of language educators for useful assessments.

Developing useful assessments that do educational good

Assessments that do good…
What is the starting point in assessment development?

Measurement approach
•What’s the construct to be measured?
•How do we create reliable measures of that construct?
•How do we eliminate threats to the validity of interpretations about that construct?
•Even though we might assess different L2 constructs, there is one right way to do it.

Ecological approach
•Why are we assessing?
•How will assessment processes and outcomes be used by whom?
•How do we maximize the utility of assessments for specific users in specific educational settings?
•To meet the actual uses for assessment, there are lots of right ways to do it.

Intended uses for assessment
INTENDED ASSESSMENT USE
WHO? Test Users
WHAT? Test Information
WHY? Test Purposes
IMPACT? Test Consequences

Intended uses for assessment
INTENDED ASSESSMENT USE
WHO? Test Users (Who are the assessment users?): Learners; Teachers; Program administrators; Policy makers; Parents; Other programs
WHAT? Test Information (What do they need to know?): Task success? Content coverage? L2 knowledge? Accuracy/complexity/fluency development? Declarative knowledge? Performance ability? Capacity to learn?
WHY? Test Purposes (What will they do with it?): Placement; Feedback; Motivation; Curriculum development; Program evaluation & change; Articulation; Certification
IMPACT? Test Consequences (To what end?): Effective learning; Improved teaching; Well-articulated courses; Valued learning outcomes; Evolving programs; Enlightened education; Satisfied learners

Developing intentional assessments
Primary Intended Users; Assessment Consultants; Stakeholders & Audiences
Negotiate & specify:
•priority uses
•methods
•analyses
•reporting
•constraints
ENABLING USE
EMPOWERING USERS

Why bother?
•Prioritizes needs of assessment users in the specific ecology of an educational setting
•Shifts ownership and responsibility into the hands of the users
•Rules out irrelevant concerns before they bog down development
•Forces users to be clear about the interpretations they want to make
•Illuminates gaps in educational planning and implementation; demands content-language-task expertise from educators
•Puts specific information into the hands of specific users in ways that enable them to take action

Educationally relevant and useful assessments

Resolving the ‘problem’ in practice
What happens if we ignore use?
Common European Framework: Language+Pluriculturalism+Anti-racism
Online Advanced French Course: Advanced intercultural evaluation
Targeted learning outcomes:
•Lang: Language/Writing development
•Content: Socio-cultural awareness
•Task: Intercultural evaluation
Assessment realities (writing feedback):
•Language development
•Socio-cultural awareness
•Intercultural evaluation
From: Starkey & Osler (2001)

Using Intended Use in situ
Context
•German at Georgetown University
•New curriculum: Developing Multiple Literacies
•Fully integrated Language & Content courses
•Task- & genre-based instruction
•Advanced L2 literacy targets

Using Intended Use in situ
“Taken together, these documents are intended to guide not only the development and implementation, but also the evaluation and revision of all quizzes, tests, examinations, written and oral performances, and other forms of assessment which play an integral role in the success of the GUGD’s educational efforts.”
From: Assessment policies in the GUGD (rev. August, 2002)

Intended use 1: Placement
Intended test use
•Who: Faculty decision makers; incoming students
•What: Estimate of incoming students’ curriculum-related German knowledge/abilities; capacity to benefit from courses
•Why: Placement into curricular level acknowledging abilities and addressing learning needs
•Impact: Efficient/effective teaching and learning for learners grouped by similar ability and need
Constraints
•Wide range of learner abilities
•Administration time (2 hrs.)
•Scoring time (same day)
•Decision-making efficiency
•Transportability (off-site administration)
•Migration to computer-based administration
•Language v. content v. task as basis for placement?

Intended use 1: Placement
Basis for assessment: Learner abilities to process texts (receptively and productively) selected as representative of critical transition points from one curricular level to the next. Focus on language ability in context. Content and task reflected in nature of texts selected (basis for elicitation), but not used as basis for interpretation.
Development: Identification and vetting of texts (aural and written) by teacher curriculum experts
>Investment in placement process, accuracy, impact on teaching and learning
>Increased awareness about learner abilities vis-à-vis curricular expectations
[Diagram labels: LCT score; RCT score; C-test score; Background info form; Adjudication; Curriculum Level Recommendation; Communication]

Intended use 2: Writing development and outcomes
Intended test use
•Who: Faculty, instructors, curriculum developers, learners
•What: Representative samples of writing performance abilities (task + L2 + content) at the end of each curricular level
•Why: Understanding student development & achievement of targeted abilities for improving C&I
•Impact: Feasible curricular expectations supported by effective pedagogy
Constraints
•Explicitness of curricular expectations
•Availability/agreement on ‘prototypical’ performance tasks, content, L2
•Competing uses for assessment (feedback, tracking, etc.)
•End-of-semester timing
•Learner investment, understanding of assessment expectations

Intended use 2: Writing development and outcomes
[Diagram labels: Prototypical performance writing tasks; Level performance profiles; Task assignment sheets; Sample student performances; Deliberation; Development; Analysis; Revision; TEACHERS; LEVEL GROUPS]

Intended use 2: Writing development and outcomes
[Diagram labels: Consistent assignment framework; Curricular level expectations; Task; Content; Language; Semester…; Assignment 1; Assignment 2; Assignment 3; Assignment 4; Explicit performance criteria; Prototypical Performance Writing Task]

Intended use 2: Writing development and outcomes
“Assessment in this kind of a context is, I would almost say probably an indispensable aspect in order to clarify any number of things. Because it is in the discourse about assessment and how we would do that that our knowledge became articulated or the holes in that knowledge became clearer to ourselves.
Or the cover-ups that we had engaged in were no longer possible if we wanted to be honest with ourselves about it.”

Intended use 2: Writing development and outcomes
Using Assessment for Curricular Change
•Forced the curriculum to become real
•Close specification of L2 progress within/across curricular levels
•Disambiguation of learning outcomes in terms of task, content, language
•Curricular ‘map’ for use by teachers and learners (what happens when?); basis for feedback on task, content, language
•Forged agreement between curricular levels on what can and cannot be expected
•Basis for longitudinal developmental analysis (L2 in task with content)

Why Bother to Rethink Assessment?
What we value is what we assess
By adopting intended use as the starting (and ending) point for assessment, we…
•increase the likelihood that assessments will be used and useful
•decrease misuse/abuse of assessment
•force educators to make explicit their assumptions about the relationship between language, content, task
•enable educators to gather empirical information for making decisions and taking actions relevant to curriculum, teaching, learning
•situate assessment practices within specific educational ecologies (localization), rather than situating education within generic ecologies of assessment (globalization).