Developing a vocabulary size test in Greek as a foreign language
James Milton
Thomaï Alexiou
Vocabulary
the core component of all the language skills
(Long and Richards, 2007, xii)
without grammar very little can be conveyed,
without vocabulary nothing can be conveyed
(Wilkins, 1972, 111)
But formal tools to model and measure
vocabulary knowledge are very recent
and mainly restricted to EFL
Vocabulary Size Estimates
tend to…
Sample of the most frequent vocabulary
the most frequent vocabulary tends (but only tends) to be
learned earliest (Alexiou & Konstantakis, forthcoming)
the most frequent vocabulary gives greatest coverage (and
comprehension)
textbook neutral (unless they are very odd)
give reliable, believable estimates of a learner’s knowledge
But
underestimate
Please look at these words. Some of these words are real
French words and some are invented but are made to look like
real words. Please tick the words that you know or can use.
Here is an example.
chien
Thank you for your help.
de
distance
battre
absurde
achevé
manchir
Vocabulary learning profile (Meara, 1992)
[Figure: % words known (0 to 100) across the five 1000-word frequency bands]
Vocabulary and placement
[Figure: Mean annual progress in EFL school; scale 0 to 5000, by class: Junior, A, B, C, D, E, FCE]
Vocabulary size and CEFR (XLex scores, 5000 max)
CEFR level    English        French
A1            <1500          1160
A2            1500 - 2500    1650
B1            2750 - 3250    2422
B2            3250 - 3750    2630
C1            3750 - 4500    3212
C2            4500 - 5000    3525
A Greek vocabulary test
drawn from the Hellenic National Corpus (with
thanks to Dr George Mikros)
derived from NEA, a high-circulation newspaper
in Greece
                  Words      No of Files
Culture           3000048    5967
Sociopolitical    3000275    8480
Sports            3000296    10349
Total             9000619    24796
To give us a workable frequency list
to draw items from…
proper names and other named entities stripped out
corpus is lemmatised
common inflections work differently in English and in Greek
But this process brings the corpus into line with the English and
French corpora and makes them more similar
most frequent 5000 words taken as the basis of a test
equivalent to the EFL and French tests shown
20 words from each 1000 word frequency band
20 pseudo-Greek words
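The test design described above (20 words from each of five 1000-word bands, plus 20 pseudo-words) could be sketched roughly as follows. The slides do not state a scoring rule, so the scoring here is an assumption modelled on the usual X-Lex convention: each ticked real word counts for 50 words and each ticked pseudo-word deducts 250. The word lists, `build_test` and `score_test` are hypothetical placeholders, not the authors' materials.

```python
import random

BAND_SIZE = 1000       # words per frequency band (from the slides)
N_BANDS = 5            # the most frequent 5000 words
WORDS_PER_BAND = 20    # real words sampled from each band
N_PSEUDO = 20          # pseudo-words added to the test


def build_test(frequency_list, pseudo_words, seed=0):
    """Sample 20 lemmas per 1000-word band and mix in 20 pseudo-words.

    frequency_list: lemmas ordered from most to least frequent (>= 5000).
    pseudo_words: invented word-like strings (>= 20).
    Returns a shuffled list of (item, is_real) pairs.
    """
    rng = random.Random(seed)
    items = []
    for band in range(N_BANDS):
        band_words = frequency_list[band * BAND_SIZE:(band + 1) * BAND_SIZE]
        items += [(w, True) for w in rng.sample(band_words, WORDS_PER_BAND)]
    items += [(w, False) for w in rng.sample(pseudo_words, N_PSEUDO)]
    rng.shuffle(items)
    return items


def score_test(items, ticked):
    """Assumed X-Lex-style scoring (not stated in the slides).

    Each ticked real word stands for 50 words (5000 / 100 real items);
    each ticked pseudo-word costs 250 (5000 / 20 pseudo items).
    """
    hits = sum(1 for w, real in items if real and w in ticked)
    false_alarms = sum(1 for w, real in items if not real and w in ticked)
    return max(0, hits * 50 - false_alarms * 250)


# Placeholder data standing in for a real lemmatised frequency list.
frequency_list = [f"w{i:04d}" for i in range(1, 5001)]
pseudo_words = [f"p{i:02d}" for i in range(1, 41)]
test_items = build_test(frequency_list, pseudo_words)

# A hypothetical learner who ticks every real word from the first two
# bands and one pseudo-word.
ticked = {w for w, real in test_items if real and int(w[1:]) <= 2000}
ticked.add(next(w for w, real in test_items if not real))
estimate = score_test(test_items, ticked)
```

With 40 real words ticked and one false alarm, this assumed rule gives 40 x 50 - 250 = 1750.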
Frequency and coverage
[Figure: coverage (0 to 100) against words by frequency (0 to 5000) for French, English and Greek]
Frequency and coverage
[Figure: coverage (0 to 100) against words by frequency (0 to 5000) for French, English and Greek, with CEFR levels A1, A2, B1 and B2 marked]
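As a rough illustration of how a coverage curve of this kind can be computed from a frequency list, the sketch below accumulates token counts over lemmas ranked by frequency. The counts are toy values invented for the example, not figures from the Hellenic National Corpus, and `coverage_curve` is a hypothetical helper name.

```python
from itertools import accumulate


def coverage_curve(counts):
    """Cumulative % of corpus tokens covered by the n most frequent lemmas.

    counts: token counts per lemma, sorted from most to least frequent.
    Entry n-1 of the result is the coverage given by the top n lemmas.
    """
    total = sum(counts)
    return [100.0 * running / total for running in accumulate(counts)]


# Toy Zipf-like counts for ten lemmas (illustrative only).
toy_counts = [1000 // rank for rank in range(1, 11)]
curve = coverage_curve(toy_counts)
```

On real data the same function would be applied to the lemmatised list, and the values at ranks 1000, 2000 and so on read off to compare languages.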
Objectives
To examine:
whether the test is reliable
whether the frequency effects observable in
other languages can be seen in Greek
whether the frequency profile changes with
level and knowledge in the expected manner
whether the test differentiates between
learners of different levels in predictable ways
(and suggests vocabulary knowledge required
for each CEFR level)
Reliability: An individual’s scores
[Figure: an individual’s scores plotted against 1 to 5 on the horizontal axis; labelled values 1500, 1650, 1500, 1500, 1450]
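One simple way to summarise the consistency of repeated scores is sketched below. It assumes that the five labelled values on the slide are one learner's total scores on five sittings or test forms, which the slide itself does not spell out.

```python
from statistics import mean, stdev

# Values labelled on the slide; treating them as one learner's five
# total scores is an assumption.
scores = [1500, 1650, 1500, 1500, 1450]

average = mean(scores)               # central estimate of vocabulary size
spread = max(scores) - min(scores)   # distance between best and worst score
sd = stdev(scores)                   # sample standard deviation
relative_sd = sd / average           # variation relative to the mean
```

Here the mean is 1520 and the range is 200, so the variation is small relative to the size of the estimate.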
A larger pilot study
64 adult students
Learning Greek in Thessaloniki at the School of
Modern Greek
From 1 month to 2 years
They were tested at the end of October
Ranked at 4 CEFR levels: A1, A2, B1, B2
Many thanks go to Mrs Martha Vazaka,
her colleagues and the students.
Frequency effect
[Figure: estimated knowledge (0 to 1000) in each vocabulary band (1000 to 5000) for CEFR levels A1, A2, B1 and B2]
Mean scores by CEFR level
[Figure: estimated vocabulary size (0 to 4500) by CEFR level: A1, A2, B1, B2]
Vocabulary size and CEFR (XLex scores, 5000 max)
CEFR level    English        French    Greek
A1            <1500          1160      1486
A2            1500 - 2500    1650      2237
B1            2750 - 3250    2422      3288
B2            3250 - 3750    2630      3956
C1            3750 - 4500    3212
C2            4500 - 5000    3525
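For placement purposes, a table like this can be turned into a simple lookup. The sketch below uses only the English XLex ranges from the table; `cefr_for_score` is a hypothetical helper. Two choices are assumptions of the sketch: a score on a shared boundary (for example 3250) is given the lower level, and scores in the gap the table leaves between A2 and B1 return no level.

```python
# English XLex ranges copied from the table (5000 max).
ENGLISH_XLEX_BANDS = [
    ("A1", 0, 1499),       # the table gives "<1500"
    ("A2", 1500, 2500),
    ("B1", 2750, 3250),
    ("B2", 3250, 3750),
    ("C1", 3750, 4500),
    ("C2", 4500, 5000),
]


def cefr_for_score(score):
    """Return the first CEFR level whose range contains the score.

    Returns None for scores outside every range, including the
    2501 to 2749 gap that the table leaves between A2 and B1.
    """
    for level, low, high in ENGLISH_XLEX_BANDS:
        if low <= score <= high:
            return level
    return None


levels = [cefr_for_score(s) for s in (1400, 2000, 2600, 3250, 4800)]
```

The French and Greek columns are mean scores rather than ranges, so they would need cut-off points to be chosen before a comparable lookup could be built.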
Conclusions
This frequency-based vocabulary size test in
Greek as a foreign language is very workable
The test successfully distinguishes between
learners at different levels of the CEFR
framework and appears to give believable figures
for learners’ level of vocabulary knowledge
The figures seem to mesh well with the
predictions for vocabulary suggested by the
coverage obtained from the frequency data
Next steps
This study is a first step in validating this testing tool
and, in order to confirm its reliability, we intend to carry
out more tests at the end of this academic year.
We also have some supporting evidence that by using
coverage figures drawn from word frequencies, we can
tie the CEFR levels to vocabulary sizes in a whole
variety of languages other than English, French and
Greek. And that should help to make the CEFR system
both more robust and more transparent.
References
Alexiou, T. & Konstantakis, N. (forthcoming) ‘Lexis for Young Learners: Are
we heading for frequency or just common sense?’, Selection of
papers for the 18th Symposium of Theoretical and Applied
Linguistics, Aristotle University of Thessaloniki.
Long, M.H. and Richards, J.C. (2007) Series Editors’ Preface. In
Daller, H., Milton, J. and Treffers-Daller, J. (eds) Modelling and Assessing
Vocabulary Knowledge. Cambridge: Cambridge University Press, xii-xiii.
Meara, P. (1992) EFL Vocabulary Tests. University College
Swansea: Centre for Applied Language Studies.
Wilkins, D.A. (1972) Linguistics in Language Teaching. London:
Arnold.