Transcript Slide 1

14th International GALA conference, Thessaloniki, 14-16 December 2007
Behavioural scales of language proficiency: insights from the use of the Common European Framework of Reference
Spiros Papageorgiou
University of Michigan
English Language Institute
Testing and Certification Division
www.lsa.umich.edu/eli
Outline
• Background
• Aims
• Data collection
• Data analysis
• Results
• Implications
Background
• Advent of the CEFR: increased interest in behavioural scales of language proficiency
• Using the CEFR scales: Problems
  - Designing test specifications (Alderson et al., 2006)
  - Measuring progression in grammar (Keddle, 2004)
  - Describing the construct of vocabulary (Huhta & Figueras, 2004)
  - Designing proficiency scales (Generalitat de Catalunya, 2006)
Background (2)
• Using the CEFR scales: Criticism
  - Equivalence of tests constructed for different purposes (Fulcher, 2004b; Weir, 2005)
  - Danger of viewing a test as invalid because it does not claim relevance to the CEFR (Fulcher, 2004a)
  - Progression in language proficiency not based on SLA research but on judgements by teachers (cf. North 2000; North & Schneider 1998)
Aims of the study
• Investigation of three research questions:
  - Can users of the CEFR rank-order the scaled descriptors in the way they appear in the 2001 volume?
  - If differences in scaling exist between the users of the CEFR and the 2001 volume, why does this happen?
  - Can training contribute to more successful scaling?
Data collection
• 12 users of the scales acting as judges in relating
two language examinations to the CEFR
• Data collected during Familiarisation sessions
described in the Manual for relating examinations to
the CEFR
• Part of a doctoral thesis at Lancaster University
(Papageorgiou, 2007) and a research project at
Trinity College London
• Task: sort descriptors into the six levels
Data collection (2)
Number of descriptors (N), number of judges per administration, and total ratings

Descriptors  N    Sept 2005 (1st)  Sept 2005 (2nd)  Nov 2005  Feb 2006  July 2006  Ratings
Speaking     30   12               12               10        11        -          1350
Writing      25   12               12               10        11        -          1125
Listening    19   12               12               10        11        -          855
Reading      20   12               12               10        11        11         1120
Global       30   12               12               10        11        -          1350
Total        124                                                                   5800
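The ratings figures on this slide are consistent with each judge rating every descriptor once per administration, so that ratings = N × (sum of judges across administrations). The following is a minimal Python sketch that checks this arithmetic against the slide's numbers; the names `design` and `total_ratings` are illustrative and do not come from the study.

```python
# Descriptors per scale (N) and judges per administration, copied from the slide.
# Order of administrations: Sept 2005 (1st), Sept 2005 (2nd), Nov 2005,
# Feb 2006, July 2006. None marks an administration with no data for that scale.
design = {
    "Speaking":  (30, [12, 12, 10, 11, None]),
    "Writing":   (25, [12, 12, 10, 11, None]),
    "Listening": (19, [12, 12, 10, 11, None]),
    "Reading":   (20, [12, 12, 10, 11, 11]),
    "Global":    (30, [12, 12, 10, 11, None]),
}


def total_ratings(n_descriptors, judges_per_administration):
    """Assume every judge rates every descriptor once per administration."""
    judge_sessions = sum(j for j in judges_per_administration if j is not None)
    return n_descriptors * judge_sessions


ratings = {scale: total_ratings(n, judges) for scale, (n, judges) in design.items()}
total_descriptors = sum(n for n, _ in design.values())
grand_total = sum(ratings.values())

for scale, count in ratings.items():
    print(scale, count)
print("Total", total_descriptors, grand_total)
```

Under that assumption the per-scale counts and the totals of 124 descriptors and 5800 ratings can be compared directly with the slide.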
Data analysis
• Analysis: FACETS Rasch computer program
• 3 facets: descriptors-raters-occasions
• Rank-ordering of elements of facets on a common scale
• Fit statistics (Bond and Fox, 2001; McNamara, 1996)
  - Overfit: too predictable pattern
  - Misfit: more than expected variance
• Acceptable range of fit statistics
  - Descriptors: .4-1.2 (Linacre & Wright, 1994)
  - Raters: .5-1.5 (Weigle, 1998)
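The acceptable ranges on the Data analysis slide can be read as a simple screening rule. The sketch below assumes the ranges apply to mean-square fit values (the slide does not say which fit statistic is meant), and the function name `classify_fit` and the example values are hypothetical, not output from the study.

```python
# Acceptable fit ranges quoted on the slide, keyed by facet.
# Assumption: these are bounds on a mean-square fit statistic.
ACCEPTABLE_RANGE = {
    "descriptor": (0.4, 1.2),  # Linacre & Wright, 1994
    "rater": (0.5, 1.5),       # Weigle, 1998
}


def classify_fit(mean_square, facet):
    """Label a fit value as overfit, acceptable, or misfit for the given facet."""
    low, high = ACCEPTABLE_RANGE[facet]
    if mean_square < low:
        return "overfit"   # pattern more predictable than the model expects
    if mean_square > high:
        return "misfit"    # more variance than the model expects
    return "acceptable"


# Hypothetical values for illustration only.
print(classify_fit(0.3, "descriptor"))
print(classify_fit(1.3, "descriptor"))
print(classify_fit(1.3, "rater"))
```

The same value can be flagged for a descriptor but accepted for a rater, because the rater range on the slide is wider.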
Results: Writing Levels A1-B1
[Figure: vertical logit scale from -2 to -10 showing Writing descriptors at levels A1-B1. Descriptors in the order listed on the slide: W11 (B1), W15 (B1), W12 (B1), W16 (B1), W4 (A2), W19 (A2), W6 (A2), W24 (A1), W25 (A1).]
Results: Writing Levels B2-C2
[Figure: vertical logit scale from 7 to -1 showing Writing descriptors at levels B2-C2. Descriptors in the order listed on the slide: W18 (C2), W1 (C2), W14 (C2), W23 (C2), W10 (C2), W2 (C1), W13 (C1), W21 (C1), W20 (C1), W3 (C2), W9 (C2), W17 (C1), W5 (B2), W22 (B2), W7 (B2), W8 (B2).]
Results: Raters
[Figure: vertical logit scale from 7 to -1 showing the 12 raters. Raters in the order listed on the slide: Claudia, Matt, Alice, George, Nicola, Andrew, Rita, Kate, Lora, Roseanne, Sally, Tim.]
Results: Occasions
[Figure: vertical logit scale from 7 to -1 showing the four occasions: Feb 06, Nov 05, Sept 05 1st, Sept 05 2nd. Feb 06 is printed on the 0 line of the scale.]
Results: Correlations
Correlations of scaling between the judges and the CEFR volume
Descriptors  Spearman
Speaking     .959
Writing      .946
Listening    .968
Reading      .975
Global       .980
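A Spearman coefficient of this kind compares two rank orders, here the order of descriptors in the CEFR volume and the order produced by the judges. The following is a minimal pure-Python sketch of the computation with tie handling (tied values share their average rank, then Pearson's r is taken on the ranks). The coding of CEFR levels as integers and the example values are hypothetical; they are not the study's data, and the slide does not say how the CEFR values were coded.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: CEFR levels coded 1-3 and made-up judge-based measures.
cefr_levels = [1, 1, 2, 2, 3, 3]
judge_measures = [-9.0, -8.1, -6.5, -5.0, -3.2, -2.4]
rho = spearman(cefr_levels, judge_measures)
print(round(rho, 3))
```

Because several descriptors share one CEFR level, ties are unavoidable on that side, which is why the sketch uses average ranks rather than the shortcut formula based on squared rank differences.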
Summary of results
• Trained judges perceived language ability as intended in the CEFR
• Almost identical scaling
• Cut-offs between B2-C1 and C1-C2 unclear
• Competences other than linguistic: misfitting descriptors
• Unclear and inconsistent wording resulted in level misplacement by the judges
• Mixed effect of training
Implications of findings
• Common understanding of the construct in the CEFR scales can be achieved, but
• How valid is it to claim that a test is linked to B2 instead of C1 and C1 instead of C2?
• How can sociolinguistic and strategic competences be tested in relation to the CEFR?
• Can SLA research help better understand these issues?
Contact details
Spiros Papageorgiou
University of Michigan English Language Institute
500 East Washington Street
Ann Arbor, MI
48104-2028
USA
[email protected]