Measuring school effectiveness


Measuring School Effectiveness:
Introduction and Background
Lorraine Dearden,
Institute of Education, University of London
Introduction
• Set the scene for the rest of the day
• Introduction to school testing regime in England
• Potted history of how school effectiveness has been
measured in England over time
– Point out the strengths and difficulties with approaches that
have been taken (briefly)
– Give some illustrations using work I am currently doing with
Alfonso Miranda and Sophia Rabe-Hesketh
• Point out the similarities and differences with
experience/approaches in some other countries
Testing/Data in England
• Have forms of testing at age 5 (FSP), age 7 (Key Stage 1),
age 11* (Key Stage 2), age 14 (Key Stage 3 – now
abolished), age 16* (GCSEs – Key Stage 4) and age 18* (A
levels – Key Stage 5)
• Have individual background information on students every
year from 2001/02 (basic background characteristics PLASC)
• Data has been linked to vocational courses done in e.g. FE
colleges (ILR/NISVQ) as well as HE
participation/outcomes (HESA)
* Externally marked
History of School Accountability in
England
• 1988 National Curriculum introduced which all
maintained schools are obliged to follow (but
not private schools)
• For KS1, KS2, KS3 national system of
test/assessment introduced (ages 5 to 14)
• First school league tables published in 1992 for
GCSEs and A levels (secondary schools)
– Aim: to inform parents about choice of school and to
provide an incentive for schools to raise standards
Development of School Accountability
• In 1992 first secondary league tables showed ‘raw’
school GCSE results and one A level indicator of results
• Similar tables introduced for primary schools in 1996
based on KS2 test results in English, Maths and Science
• Problems of using these ‘raw’ measures immediately
pointed out (e.g. Goldstein & Spiegelhalter (1996))
– Simply reflected differences in school intake, background
of students, etc
– Simple change in composition of intake could
improve/worsen results from one year to next
– ......
Value Added Models
• To measure school effectiveness, need a measure of the value
actually being added by the school (e.g. Ladd and Walsh (2000))
– Raw test results do not come close to doing this for most
schools
• Couldn’t actually estimate VA models until individual-level
school data could be linked
– some pilot studies were done in the 1990s but nothing
nationally
– Could only start doing this for secondary schools in 2001/02,
when baseline KS2 performance data became available for children
who sat KS4 exams that year
Other details
• KS4 measure is capped score of 8 best GCSE scores
(maximum mark is 58 for A* in subject) so maximum
score is 464
• KS2 score is based on levels achieved in English, Maths
and Science (not actual test score)
– Level 4 (27 points) is expected level, Level 5 (33 points) is 2
years ahead of where expected to be, level 3 (21 points) is 2
years behind where expected to be, level 2 (15 points) is 4
years behind where meant to be at age 11.
– For CVA now use raw test scores to give individuals APS
between 15 and 36 in each subject
– VA and CVA average 3 KS2 APSs (if one missing just average 2).
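The scoring rules above can be sketched in a few lines. This is a minimal illustration, not the official calculation: the function names are hypothetical, the formula `6 * level + 3` is simply the pattern that reproduces the four level/point pairs quoted on the slide, and returning `None` when fewer than two subjects are available is an assumption (the slide only covers the case of one missing subject).

```python
from statistics import mean

N_BEST_GCSES = 8          # capped score uses the 8 best GCSE results
MAX_POINTS_PER_GCSE = 58  # points for an A* in one subject, per the slide


def capped_ks4_score(gcse_points):
    """Sum of the best eight GCSE point scores (hypothetical helper)."""
    best = sorted(gcse_points, reverse=True)[:N_BEST_GCSES]
    return sum(best)


def level_to_points(level):
    """Crude KS2 points for a level: 2 -> 15, 3 -> 21, 4 -> 27, 5 -> 33."""
    return 6 * level + 3


def ks2_aps(english, maths, science):
    """Average the available KS2 subject scores; None marks a missing subject.

    Assumption: with fewer than two subjects available no average is returned.
    """
    available = [s for s in (english, maths, science) if s is not None]
    if len(available) < 2:
        return None
    return mean(available)
```

For example, a pupil with ten A* grades still scores `capped_ks4_score([58] * 10) == 464`, and a pupil at Level 4 in English and Level 5 in maths with science missing has an APS of 30.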
Value Added
• VA model introduced in 2002/03 and used average of KS2 crude APS in
English, Maths and Science
– Split this KS2 score into 10 groups and looked at the median outcome
for these 10 groups
– join the median points to get the 'national median line'
– Value added was simply the school average of individual deviation
from this line (added to 100)
CVA
• Criticised as still problematic (see Ray (2006) for full
background)
– need to take other factors into account to truly measure
the value added by schools
– Ceiling effects (e.g. KS1 to KS2 VA)
– Stability issues
• Moved to a CVA model which controlled for prior
attainment (fine APS), FSM status, ethnicity, gender,
SEN, deprivation (local area), relative age, EAL, mobility
– Regression model, with school clustering (multilevel/hierarchical model)
– CVA is school average of residuals from this model (+1000)
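As a rough sketch of what the bullets above describe, the model can be written as a two-level random-intercept regression. The notation is an assumption introduced here, not taken from the slides: y_{ij} is the KS4 outcome of pupil i in school j, x_{ij} collects prior attainment and the background controls, u_j is the school effect, n_j is the number of pupils in school j, and the normality of the error terms is the usual textbook assumption.

```latex
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)

\mathrm{CVA}_j = 1000 + \frac{1}{n_j} \sum_{i=1}^{n_j}
\left( y_{ij} - \mathbf{x}_{ij}^{\top}\hat{\boldsymbol{\beta}} \right)
```

On this reading, a school scores above 1000 when its pupils on average do better at KS4 than pupils nationally with the same prior attainment and background characteristics.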
Ready Reckoner
• Model estimated every year on latest data, so the
way CVA is calculated changes every year
• Schools given ready reckoner so they can see
CVA for every child in school
• Pretty obvious that if they have a choice over classifying a student’s background, they know
which is more favourable for CVA
– Do not have to add as much value (other things
being equal) if non-EAL, or White British or
unclassifiable ethnicity
Do schools understand CVA?
1) Has the school CVA measure solved the
problem of informing parents about school
choice and ensuring school accountability?
No – and this will be the subject of some talks today
2) Could we do better?
Yes – and this will be the subject of the other
talks today
3) Should we even attempt to measure school
effectiveness?
.............
Problems with CVA in England
• Not very stable (Leckie and Goldstein)
• Differences between schools rarely statistically
significant (Leckie and Goldstein)
• Not very transparent and difficult to understand
(Gorard; Dearden, Micklewright and Vignoles)
• Evidence of differential effectiveness within school
(Brown and Tzavidis; Dearden, Micklewright and
Vignoles)
• Evidence that parents don’t use it when making
decisions – prefer ‘raw scores’ (Machin and Hansen)
• Newspapers don’t highlight it generally – still go back
to ‘easy to understand’ raw scores
Other problems
• Drawing on work I am doing with Alfonso
Miranda and Sophia Rabe-Hesketh (using linked
survey “Next Steps” data and NPD data for kids
born in 1991/92)
• Big left censoring in KS2 scores used, which
disadvantages schools with kids at bottom of KS2
ability distribution (and some right censoring)
• Also – the way the fine-point KS2 APS is constructed
(transforming raw test scores into the APS system)
is slightly strange
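Raw KS2 Maths Scores
[Figure: distribution of raw KS2 maths scores]
Maths score put in CVA Model
[Figure: distribution of the maths score used in the CVA model]
Left censoring makes it much harder to add value – why do it?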
How serious is it that key covariates
are not measured?
• Key variables, like parental education, are not
measured in the administrative data
• But we know that more highly educated
parents are likely, on average, to provide more
educational input in the home
• Also know that education of parents varies
markedly by school – not random
• Has implications for CVA
Is this a serious problem in England?
• Our work suggests it is (Dearden, Miranda and
Rabe-Hesketh (2010))
• Use “Next Steps” survey data linked to the
NPD for cohort of children who took KS2 in
2001 and KS4 in 2006
• Observe parental education for just over
12,000 children in our sample (total cohort in
NPD just over 550,000)
Regress individual CVA on mother’s education

Mother’s education    Girls             Boys
                      Estimate (SE)     Estimate (SE)
Level 2                 7.5 (2.7)         9.9 (2.2)
Level 3                14.2 (2.9)        18.5 (2.4)
Level 4/5              25.7 (3.3)        30.4 (3.2)
Constant              991.7 (2.6)       998.5 (1.9)
No. obs                6442              6551

Note: Clustered at school level (698 out of a total of just over 3,000 schools); “Next Steps” survey weights used.
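To read the regression, add each coefficient to the constant to get the implied average individual CVA for that group. The sketch below does this arithmetic with the estimates from the table. It assumes the unlabelled pair of estimates (991.7 and 998.5) is the constant, and that the constant refers to the omitted category of mothers with qualifications below Level 2; the function name is hypothetical.

```python
# Estimates copied from the table; the constant is assumed to be the
# omitted (lowest) mother's-education category.
CONSTANT = {"girls": 991.7, "boys": 998.5}
INCREMENT = {
    "Level 2": {"girls": 7.5, "boys": 9.9},
    "Level 3": {"girls": 14.2, "boys": 18.5},
    "Level 4/5": {"girls": 25.7, "boys": 30.4},
}


def implied_mean_cva(sex, level=None):
    """Constant plus the coefficient for the mother's education level."""
    extra = INCREMENT[level][sex] if level is not None else 0.0
    return round(CONSTANT[sex] + extra, 1)


for sex in ("girls", "boys"):
    for level in (None, "Level 2", "Level 3", "Level 4/5"):
        print(sex, level or "omitted category", implied_mean_cva(sex, level))
```

On these figures the gap in individual CVA between children of the most and least educated mothers is about 26 points for girls and 30 points for boys, even though CVA already conditions on prior attainment and the administrative background variables.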
What can we do about this?
• Use model of Miranda and Rabe-Hesketh
(2010) to re-calculate CVA model for whole
NPD sample accounting for missing mother’s
education
– Paper will be presented on Wednesday at Festival
– Exploits the fact that for some individuals we have
both survey and NPD data
– Explicitly models missingness in survey and
administrative data
Results
School league tables in other
countries?
• Increasingly used in a number of US states, but
not nationally
• Australia has introduced national testing (NAPLAN),
with report cards to parents in Years 5, 7 and 9
– Have website where parents can check performance of
schools next to statistically ‘similar’ schools
• School accountability becoming an issue worldwide
• PISA world league tables...
Conclusions
• School League tables here to stay in England
• But very difficult to measure the value added
by schools
• Given not likely to go away, we need to do it
better
• Look forward to rest of today’s talks to take us
forward on this issue
References
• Goldstein, H. and Spiegelhalter, D. J. (1996) League tables and their limitations:
statistical issues in comparisons of institutional performance. Journal of the Royal
Statistical Society: Series A, 159, 385-443.
• Goldstein, H., Rasbash, J., Yang, M., Woodhouse, G., Pan, H., Nuttall, D. and Thomas, S.
(1993) ‘A multilevel analysis of school examination results’, Oxford Review of
Education, 19: 425-33.
• Gorard, S. (2010) All evidence is equal: the flaw in statistical reasoning, Oxford
Review of Education, (forthcoming).
• Ladd and Walsh (2000) ‘Implementing value-added measures of school
effectiveness: getting the incentives right’, Economics of Education Review, vol. 2
part 1 pp. 1–17.
• Leckie, G. and Goldstein, H. (2009) The limitations of using school league tables to
inform school choice. Journal of the Royal Statistical Society: Series A, vol. 172 part
4, pp. 835-52.
• Ray, A. (2006) School Value Added Measures in England. Paper for the OECD
Project on the Development of Value-Added Models in Education Systems.
London, Department for Education and Skills
http://www.dcsf.gov.uk/research/data/uploadfiles/RW85.pdf.