First Generation Task-based Assessment in Chinese Schools

Re-examining Factors that Affect
Task Difficulty in TBLA
Shaoqian Sheila Luo
English Department, CUHK
English Department, BNU, China
Supervisor: Professor Peter Skehan
Task-Based Language Teaching 2005
The presentation structure…
the rationale of the research
defining the problem
tasks and assessment
Previous findings: weaknesses
research questions and research methods
studies
findings and future plans
implications
Research Rationale: Defining the problem
Identification of valid, user-friendly sequencing
criteria for tasks and test tasks is a pressing but
old problem
Grading task difficulty and sequencing tasks
both appear to be arbitrary processes not based
on empirical evidence (Long & Crookes, 1992)
Little effort has been made to define task
descriptors in operational terms (see Robinson,
1991)
Research Rationale: Tasks and Assessment
Grading and sequencing issues assume great
importance for testing and assessment of
communicative performance
“…to elucidate the potential for using task-based
performance assessment to generalize about students’
second language abilities” (Brown et al., 2002, p. 1).
The Brown-Norris matrix (1998; 2002; influenced
by Skehan, 1996) offers one way of characterising
test task difficulty, but lacks an obvious connection to
a Chinese secondary context
Previous findings: weaknesses
…previous findings (on task difficulty)
were of only moderate support for the
proposed relationships between the
combinations of cognitive factors with
particular task types…
(Elder et al., 2002)
This research…
investigates the development and use of
a prototype task difficulty scheme based
on current frameworks for assessing task
characteristics and difficulty, e.g. Brown et
al, and Skehan (1998).
Hypothesis:
There is a systematic relationship
between task difficulty and hypothesized
task complexity (see also Elder et al., 2002)
Research questions
How can language ability in TBLT in mainland
Chinese middle schools best be assessed?
1. Is the Brown et al. task difficulty framework
appropriate to the mainland Chinese school
context? If it is not, then what is an alternative
framework?
2. Is it possible to have a task difficulty framework
that can be generalized from context to context?
3. What are the teachers’ perceptions of task
difficulty in a Chinese context?
4. What are the factors that are considered to affect
task difficulty in this context?
Research methods
(1) a quantitative analysis of ratings of the tasks on the
modified task difficulty matrix;
(2) a qualitative analysis of verbal self-report
data (introspection) and focus group interviews
on the factors that affect task difficulty.
Methodological triangulation was accomplished by
using (a) an analytical task difficulty rating scheme,
(b) a holistic task difficulty vertical line, (c) verbal
self-report (introspection), and (d) focus group
interviews and questionnaires. Location triangulation
was achieved by collecting data from test writers,
material developers, experienced teachers and
students from different regions of China and
abroad.
Participants and tasks
In the development and refinement of the task difficulty
matrix for prototypical tasks in task-based testing, nine
groups of 48 Chinese, English and Swedish test writers,
experienced teachers and EFL material developers
participated in the rating of 86 tasks, in interviews and
in introspection (verbal self-report).
Data on six tasks from 800 students in eight regions
(randomly chosen from a population of about 500,000)
were collected and analysed.
Tasks were designed by Chinese and English test
writers, EFL material developers and experienced
teachers according to the themes in the Chinese English
Curriculum (experimental version, 2001).
The research stages
1. First stage: April and May 2004
trial of Norris and Brown task difficulty matrix
2. Second stage: developing and refining the matrix:
Oct 2004: trial on the IPO-CFS task difficulty matrix
3. Third stage: refining the matrix on 24 tasks (Nov 2004)
December 2004: data from 800 students on six tasks
Jan 2005: refining the matrix - ratings of 24 tasks
4. A comparison between Brown et al.’s matrix and the
modified matrix
5. Introspection from David, Olov and Prof. B (Feb 2005)
6. Finalizing matrix (Mar – July 2005)
Research studies
1. First stage: April and May 2004
To find out the factors that affect task difficulty among three
groups of 26 mainland Chinese English teachers, using Norris et
al.’s (1998) task difficulty matrix (Appendix 1).
The trial of the Norris et al. approach to task
difficulty with the three groups of mainland Chinese English teachers
shows substantial disagreement between the Chinese
teachers’ ratings and Norris et al.’s predicted difficulty levels (Table 1).
Of the fourteen tasks, the two sides agree on only three:
Planning the weekend, Shopping in supermarket and Radio weather
information, all of which are common everyday topics. The
other tasks generated disagreement, especially in relation to
cognitive skills, because of different assumptions regarding relevant
background and cultural knowledge; different interpretations of the
requirements made by different tasks; and different interpretations
of how more abstract tasks should be handled.
Modified Task Difficulty Matrix
Task    Code C    Cognit C    Comm stress    Task condition
1
2
3
4
5
6
Code complexity: linguistic complexity;
linguistic input
Cognitive complexity: cognitive
familiarity; cognitive processing;
amount of input
Communicative stress: time; interaction;
context
Task conditions: Language proficiency;
language abilities; language skills;
culture & other
2. Second stage of research: Oct. 2004
Two-teacher trial of the IPO-CFS task difficulty scheme and
task analysis on 48 tasks (designed by DL and SL) based on
the 24 themes in the Chinese National Curriculum (Table 2)
Findings
Most of the ratings show agreement between the two teachers.
Correlation between the two teachers’ mean ratings: .65
There is a large gap between the two teachers’ ratings of some
Task group 1 (SL) tasks and Task group 2 (DL) tasks (a
difference above 6 is considered a large gap).
Table 3 analyses nine pairs of tasks in terms of
task requirements, to see how demanding the input,
the processing and the output are in each task.
24 Themes in the Chinese National
English Curriculum (2001)
Personal information; Family, friends and people
around; Personal environments; Daily routines;
School life; Interests and hobbies; Emotions;
Interpersonal relationships; Plans and intentions;
Festivals, holidays and celebrations; Shopping;
Food and drink; Health and fitness; Weather;
Entertainment and sports; Travel and transport;
Language learning; Nature; The world and the
environment; Popular science and modern
technology; Topical issues; History and
geography; Society; Literature and art
3. Third stage: refining the matrix
Nov 2004: Refining the matrix by collecting
data from 5 experienced teachers and test writers
(Sunny, Peter, DL, SL and Simon) on 24 tasks
designed by test writers and experienced
teachers (Tables 4 & 5).
Results of the ratings on the refined matrix again
show agreement on which tasks are easy and which
are difficult (Appendix 2).
December 2004: Six tasks (1, 4, 7 and 13, easy; 6 and 12,
difficult) tested, with data from 800 students in eight
different cities and provinces.
Pj: difficulty level of test items. Difficulty range: 0.3 to 0.7;
above 0.7, difficult; below 0.3, easy.
Task     1       4       6       7       12      13
Total    15      5       20      5       15      15
M        11.09   2.91    7.46    2.77    4.97    10.71
SD       3.81    1.76    4.87    1.15    3.36    4.54
Pj       0.26    0.41    0.63    0.44    0.67    0.28
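The slides do not say how Pj was calculated. The short Python check below is a sketch under one assumption: that Pj is the proportion of marks lost, 1 - M/Total. The function names `estimated_pj` and `band` are hypothetical, as is the "within range" label for values between the two cut-offs. With the figures from the table, the assumed formula comes within 0.01 of every reported Pj value.

```python
# Values copied from the table: task -> (total marks, mean score M, reported Pj).
reported = {
    1: (15, 11.09, 0.26),
    4: (5, 2.91, 0.41),
    6: (20, 7.46, 0.63),
    7: (5, 2.77, 0.44),
    12: (15, 4.97, 0.67),
    13: (15, 10.71, 0.28),
}


def estimated_pj(total, mean):
    """Assumed formula (not stated in the slides): proportion of marks lost."""
    return 1 - mean / total


def band(pj, low=0.3, high=0.7):
    """Apply the slide's cut-offs: below 0.3 easy, above 0.7 difficult."""
    if pj < low:
        return "easy"
    if pj > high:
        return "difficult"
    return "within range"  # hypothetical label for the 0.3 to 0.7 range


for task, (total, mean, pj) in reported.items():
    est = estimated_pj(total, mean)
    print(f"Task {task}: reported Pj={pj:.2f}, 1 - M/Total={est:.3f}, {band(pj)}")
```

Under these cut-offs only Tasks 1 and 13 fall in the easy band. Tasks 6 and 12, which were selected as difficult, sit near the upper end of the 0.3 to 0.7 range rather than above it.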
Six tasks
Task 1: Listen and choose: Where does Linda live?
Task 4: Read the class timetable and fill in the blanks.
Task 6: Suggestion day: chart reading & writing.
Task 7: Listen and put the pictures in order.
Task 12: Complete the ‘Customer Satisfaction Form’.
Task 13: Read and match.
Jan 2005: refining the matrix - ratings of 24
tasks from 6 raters who 1) have an interest; 2) hold at least
a master’s degree, or preferably a PhD;
3) have five years of teaching experience or are
test developers or EFL material writers:
SL, Dodie, Lihy, PS, David, Sunny
Results of the ratings on the matrix (both holistic
and analytical, to validate the matrix) show correlations
ranging from .52 to .83, with only one pair as an
exception: .34.
corr     SL     Dodie   Lihy    PS      David   Sunny
SL       1      .53     .61     .83     .82     .72
Dodie    .53    1       .52     .70     .54     .34
Lihy     .61    .52     1       .79     .64     .76
PS       .83    .73     .80     1       .81     .74
David    .82    .54     .64     .81     1       .70
Sunny    .72    .34     .76     .74     .70     1
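The summary of the correlations (.52 to .83, with one exception at .34) can be checked against the printed values. The Python sketch below assumes the 36 values read row by row in the rater order SL, Dodie, Lihy, PS, David, Sunny; the variable names are illustrative only.

```python
# Correlations as printed on the slide, read row by row (an assumption).
raters = ["SL", "Dodie", "Lihy", "PS", "David", "Sunny"]
corr = [
    [1.00, 0.53, 0.61, 0.83, 0.82, 0.72],
    [0.53, 1.00, 0.52, 0.70, 0.54, 0.34],
    [0.61, 0.52, 1.00, 0.79, 0.64, 0.76],
    [0.83, 0.73, 0.80, 1.00, 0.81, 0.74],
    [0.82, 0.54, 0.64, 0.81, 1.00, 0.70],
    [0.72, 0.34, 0.76, 0.74, 0.70, 1.00],
]
n = len(raters)

# One value per rater pair, taken from the upper triangle.
upper = {
    (raters[i], raters[j]): corr[i][j]
    for i in range(n)
    for j in range(i + 1, n)
}

lowest_pair = min(upper, key=upper.get)
highest_pair = max(upper, key=upper.get)
others = [value for pair, value in upper.items() if pair != lowest_pair]

# Pairs whose upper- and lower-triangle entries differ in the printed matrix.
mismatched = [
    (raters[i], raters[j], corr[i][j], corr[j][i])
    for i in range(n)
    for j in range(i + 1, n)
    if corr[i][j] != corr[j][i]
]

print("lowest:", lowest_pair, upper[lowest_pair])
print("highest:", highest_pair, upper[highest_pair])
print("range without the lowest pair:", min(others), "to", max(others))
print("entries that differ across the diagonal:", mismatched)
```

The check agrees with the summary: the single low pair is Dodie and Sunny at .34, and the remaining pairs run from .52 to .83. It also shows that the printed matrix is not fully symmetric: the Dodie/PS and Lihy/PS entries differ slightly between the two triangles (.70 vs .73 and .79 vs .80).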
4. A comparison between Brown et al.’s matrix
and the modified matrix
Similarities (5):
Primary research question; Similar purposes; similar
design of matrix; an example of an assessment
alternative; Sources
Differences (10):
Test Objects; Task Themes; Task Focus; +(-)related to
curriculum; Task Selection; Definitions/Labels;
Characteristics; Layout; Rating System; Raters
5. Introspection from David, Olov and Prof. B:
they gave detailed verbal self-report data which
identified a variety of strategies followed in
rating the tasks and which helped refine the matrix.
6. Finalizing the matrix
the finalized task difficulty matrix sequences
tasks along three dimensions (Input, Processing
and Output), each with the following components:
Task difficulty matrix for prototypical tasks in task-based language testing:
Dimension        Input                               Processing                          Response/output
Component        content  form  modality/support     content  form  modality/support     content  form  modality/support
Task by theme
1
2
…
24
Please mark in each column 0 (= very easy); 1 (= easy); 2 (= satisfactory); 3 (= very difficult) under each
category for each task.
              Input                     Processing                     Output
Content       Content                   Content                        Content
Form          Form                      Form                           Form
Modality      Modality                  Modality                       Modality
Support       Support (making input     Support (making processing     Support (making oral/written
              clearer)                  more efficient)                expression more accurate and fluent)
A. Content:
1. Information:
• Immediate vs. remote:
1) Here & now vs. there & then;
2) Abstractness vs. concreteness;
3) Familiarity vs. unfamiliarity
2. Amount:
• Total amount
• Organization
3. Transformation:
(retrieval and transformation in PROCESSING;
operations in OUTPUT)
B. Form:
4. Level of syntax
5. Level of vocabulary
C. Modality: Visual/aural Presentation;
Reading; Writing; Listening; Speaking;
Others
D. Support: Pictures; Clues; Situation;
Authenticity; World knowledge; Personal
experience; Common sense; Resources;
Tools; Others
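One way to picture a single rater's entry in the finalized matrix is as a small nested structure: three dimensions, each with the three rating-sheet columns (content, form, modality/support), marked 0 to 3. The Python sketch below is illustrative only. The function names and the example marks are hypothetical, and summing marks per dimension is an assumption, since the slides do not state how marks are aggregated.

```python
DIMENSIONS = ("input", "processing", "output")
COMPONENTS = ("content", "form", "modality/support")
SCALE = {0: "very easy", 1: "easy", 2: "satisfactory", 3: "very difficult"}


def check_rating(rating):
    """Raise ValueError if any cell holds a mark outside the 0-3 scale."""
    for dim in DIMENSIONS:
        for comp in COMPONENTS:
            mark = rating[dim][comp]
            if mark not in SCALE:
                raise ValueError(f"{dim}/{comp}: {mark!r} is not on the 0-3 scale")
    return rating


def dimension_totals(rating):
    """Sum the marks within each dimension (an assumed way of aggregating)."""
    return {dim: sum(rating[dim][comp] for comp in COMPONENTS) for dim in DIMENSIONS}


# Hypothetical marks for one task from one rater, not data from the study.
example = {
    "input": {"content": 1, "form": 1, "modality/support": 0},
    "processing": {"content": 2, "form": 1, "modality/support": 1},
    "output": {"content": 2, "form": 2, "modality/support": 1},
}

totals = dimension_totals(check_rating(example))
overall = sum(totals.values())
print(totals, overall)
```

With nine cells marked 0 to 3, an overall total under this assumed aggregation ranges from 0 to 27.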
Plans for Future Research
1. To define the notion of task difficulty
2. To validate the task difficulty matrix and
sequence the 24 themes and prototypical
tasks in the Chinese National English
Curriculum by collecting more data from
raters and students.
3. To define the task descriptors in operational
terms
Implications (1)
With such a system for estimation of task
difficulty, learner performances on carefully
sampled tasks can be used to predict future
performances on tasks that are constituted by
related difficulty components. (Norris et al.,
1998:58)
Students with greater levels of underlying
ability will be able to successfully complete
tasks which come higher on such a scale of
difficulty. (Skehan, 1998:184)
Implications (2)
A fundamental important reason for
using pedagogic tasks, sequenced in
order of increasing cognitive complexity,
as the basis of syllabus design is such a
sequencing decision should effectively
facilitate L2 development, the
acquisition of new L2 knowledge, and
restructuring of existing L2
representations. (Robinson, 2001:34)
References
Brown, J. D., Hudson, T., Norris, J. & Bonk, W. J. (2002). An investigation
of second language task-based performance assessments. Second
Language Teaching & Curriculum Center, University of Hawai’i at
Manoa.
Elder, C., Iwashita, N., & McNamara, T. (2002). Estimating the difficulty of
oral proficiency tasks: What does the test-taker have to offer?
Language Testing, 19(4), 343-368.
Long, M., & Crookes, G. (1992). Three approaches to task-based syllabus
design. TESOL Quarterly, 26, 27-56.
Norris, J. M., Brown, J. D., Hudson, T. D., & Bonk, W. (2002). Examinee
abilities and task difficulty in task-based second language
performance assessment. Language Testing, 19(4), 395-418.
Robinson, P. (2001). Task complexity, task difficulty, and task production:
Exploring interactions in a componential framework. Applied
Linguistics, 22(1), 27-57.
Skehan, P. (1996). A framework for the implementation of task-based
instruction. Applied Linguistics, 17(1), 38-62.
Skehan, P. (1998). A cognitive approach to language learning. Oxford:
Oxford University Press.