Transcript Document
Oxford University Centre for Educational Assessment
Assessment & Learning: fields apart?
Jo-Anne Baird, Therese Hopfenbeck, David Andrich & Gordon Stobart
July 16, 2015

The need for a review on Assessment and Learning
• Knowledge Economy – economic importance
• The Audit Society – important societal control function – assessment defines what counts as valuable learning through these mechanisms
• Multiple, high-stakes – assessment’s domination over learning
• Assessment is agenda-setting
• 21st Century has seen interesting developments already
• Build cumulatively on what has already been done

What do we mean by theory - functions
• Abstraction
• Abductive reasoning: Wallander has a theory that Schwarzman killed Inga
• Distinguished from practice: It’s just a theory
• Normative: how things ought to be
• Explanatory
• Descriptive
• May be causal
• May be predictive
• May be formalised in logic or mathematical equations

What do we mean by theory - focus
• Scientific theory: relates to empirical phenomena; has an internal logic; should be empirically testable
• Substantive theory: learning; developmental psychology
• Test theory: metrology; psychometrics

Overview
1. Relationships between substantive learning theory & assessment
2. Theoretical and philosophical dilemmas
3. Case studies - applications: international tests; Assessment for Learning
4.
Conclusions

Oxford University Centre for Educational Assessment
Assessment and Learning
Jo-Anne Baird & David Andrich
July 16, 2015

Behaviourist theory of learning
• Learning is demonstrated in behaviour
• Mental processes are not important
• Study of animals tells us about human learning (e.g. rats & pigeons)
• Learning as a reaction to stimuli in the environment, such as teaching

Behaviourist approach to assessment
• Control conditions
• Measure memory for facts
• Compare performance with criteria or norms
• Global score for performance on ability in subject area
• Norm- or criterion-referenced

Cognitive-constructivist theory of learning
• Learning occurs in the brain
• Cognition, especially meta-cognition, important
• Memorisation of facts not so impressive
• Building of mental models of the world
• Integrate and build upon previous knowledge and learning
• Novice-expert differences

Cognitive-constructivist assessments
• Higher order skills: synthesis, evaluation, problem solving
• Extended tasks
• Assessed in terms of novice-expert continuum

Socio-cultural theory of learning
• Learning is a social event
• Learning is situated and context-dependent
• Learning is value-laden
• Learning does not happen within, but between people

Socio-constructivist approach to assessment
• Holistic, qualitative feedback emphasised
• Authentic tasks important
• Groups as well as individuals assessed
• Self- and peer-assessment important
• Engagement with criteria

Theories of learning and assessment practices
• Have learning theory and assessment practice informed each other?
• Or have they been growing apart?
Chronology and relationships
• Learning theories have been contemporaneous
• Cognitive psychology superseded behaviourism
• Vygotsky’s work overlapped considerably with behaviourist thinking
• Cognitive constructivism and social constructivism
• Links between forms of assessment and learning theory not clear, e.g. multiple choice format can be used to assess cognition
• Implications of theories for assessment practice not straightforward, e.g. Skinner (1989): “Good instruction demands two things: students must be told immediately whether what they do is right or wrong and, when right, they must be directed to the step to be taken next.”

Assessment and psychometrics: they are different
• Lawn (2008) Crossing the Atlantic – history of the development of different approaches to assessment in European countries and the US
• Early 80s ‘Rasch wars’ in the UK
• Nuttall, Gipps, Broadfoot, Black, Harlen … argued for a more educationally sound approach to assessment that was learner-centred
• Baird & Black (2013) outlined how psychometrics does not fit well with a range of educational assessment purposes: curricula change (construct), criteria public, correlation between questions, qualification-focus (not item), multiple dimensions in tests, pre-testing not always feasible …
• Psychometrics literature – how has it related to learning theory?

Psychometric models – representational measurement
• The Ferguson Committee – 1940, British Association for the Advancement of Science
• No evidence that psychological assessments were quantitative
• Based upon Campbell’s 1920s arguments against psychophysics
• Rebuttal by Stevens (1946), giving us the different forms of ‘measurement’ – nominal, ordinal, interval, ratio [learning can be measured in each of these ways]
• Measurement as a product of the instrument
• Luce and Tukey’s additive conjoint measurement – mathematical proof that ratio scales could be produced from transformations of ordinal variables.
BUT the assumptions of additive conjoint measurement are not met by assessment data. For example, transitivity (if A>B and B>C then A>C): item parameters change across time and sub-populations
• Michell (1997) says this isn’t measurement in the scientific sense, but he says that about all of psychological assessment and, by implication, educational assessment

Psychometric models – classical test theory
• Based upon centuries of statistical work
• Lord and Novick (1968) made the great leap forward
• BUT suffers from the same problems as representational theory
• True score = learning part of the equation, but what is it? Theoretical

Psychometric models – latent trait theory
• Comes from factor analytic models produced by Spearman (1904)
• One parameter (difficulty), two parameter (& discrimination), three parameter (& guessing) typically used. Multi-dimensional forms also available
• Only the Rasch form (one parameter) can deal with the transitivity problems
• Unlike in the representational theory approach, Rasch is probabilistic, so deviations from the model are handled
• More than one parameter causes problems for transitivity
• BUT still does not deal with Michell’s criticisms
• Are psychological constructs quantifiable? Not just a problem for psychometrics – for the assessment field broadly

What does it mean to assess attainment?
• Cronbach & Meehl (1955): A construct is some postulated attribute of people, assumed to be reflected in test performance. In test validation the attribute about which we make statements in interpreting a test is a construct.
• Educational attainment constructs – grading & scoring scheme
• Assessment by association
• Dimensionality: all based upon correlation. How can we combine different things?
• Invariance: do the scores mean the same thing across tests and groups of students?
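
As a rough illustration of the one-, two- and three-parameter latent trait models named on the earlier slide, the sketch below assumes the usual logistic form of the item response function. The item parameters are hypothetical and chosen only for illustration, and checking whether the easiest-to-hardest order of items is the same at every ability level is just one possible reading of the transitivity and invariance points above, not the presenters' own analysis.

```python
import math

def irf(theta, b, a=1.0, c=0.0):
    """Item response function: probability of a correct answer.

    theta: person ability; b: item difficulty;
    a: discrimination (a=1 with c=0 is the Rasch form);
    c: lower asymptote for guessing (c=0 gives the two-parameter form).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_order(items, theta):
    """Item names sorted from easiest to hardest at ability theta."""
    return tuple(sorted(items, key=lambda name: -irf(theta, **items[name])))

def ordering_is_invariant(items, thetas):
    """True if the easiest-to-hardest order is identical at every theta."""
    return len({item_order(items, t) for t in thetas}) == 1

# Hypothetical item parameters, chosen only for illustration.
rasch_items = {"A": {"b": -0.5}, "B": {"b": 0.0}, "C": {"b": 1.0}}
two_pl_items = {"A": {"b": -0.5, "a": 0.5}, "B": {"b": 0.0, "a": 2.0}}
thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]

print(ordering_is_invariant(rasch_items, thetas))   # True
print(ordering_is_invariant(two_pl_items, thetas))  # False
```

With equal discrimination the response curves never cross, so item A is easier than item B for every student. Once discrimination varies, the two curves cross (here at an ability of about 0.17), so which item counts as harder depends on who is taking the test.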
Physics envy
• Michell (2008) – requirements for real numbers should be satisfied
• Kane (2008) – educational assessments do not meet these strictures
• Physics is held up as an idealistic example of measurement, but
• Realist - numbers exist independently from humans
• Physical measurements took a long time to develop – theories & apparatus
• Measurement in physics shows inconsistency across too (Hedges, 1987)
• Need externality to our measures & multiple ways of measuring to substantiate that there is a real phenomenon

Creation of constructs
1. Theory-based
2. Empirically-driven
3. Subject-matter expert devised
4. Policy-driven
Agenda-setting activity
“Different methods and theories have implications for the ways in which concepts such as learning or educational reform or fairness are formulated, studied and promoted as a practical activity. Perhaps more profoundly and subtly, these methods and theories affect the ways human beings are represented and, ultimately the ways they come to understand themselves and others …” Moss (2005)
“…it may not always be clear to what extent an attribute is conceptually independent of the methods of measurement, especially in human science applications.” Maul (2013)

Philosophical position
Field is essentially modernist, Borsboom claims realist too
• Attributes have an independent existence
• They are discoverable using scientific methods
Neopragmatic, postmodern test theory
• Agnostic as to the nature of the attributes and their independent existence
• Accept that the attributes might be defined in part, or entirely, by the assessment apparatus
• Use triangulation to advance knowledge of the attribute and measurement system

Fields apart
• Goldstein, Laming and others have pointed out that there is no
theory in test theory – only mathematical models
• McGrath (2005) – psychometrics has caused the problem
• Borsboom blames psychologists for lack of theory underlying psychological tests
• Andrich has argued that it should be a collaboration between substantive theorists and psychometricians
• Sijtsma (2006) – models of learning that underpin test design are often either not referred to or remain in puberty, infancy or even at the foetal stage
• Has psychometrics (or assessment) helped us to understand learning?

An answer to somebody else’s problem – Baird & Black (2013)
• Measurement systems have their own, internal logic
• Measurement doesn’t tell us about the phenomenon of interest
• Nothing about a set of numbers tells you what they measure (Maraun, 1988)
• Educational attainment – setting out the construct has an intentional element

How could assessment inform learning theory and vice versa?
• Empirical data can test theory and help to move it forward
• We do not seem to have used it systematically
• Craft knowledge of examiners and other educators – raises questions about what kind/level of theory we expect from assessment data
• Mark Wilson making serious attempts to do this
• Dependency of the data upon curriculum exposure and other population characteristics leads to problems with invariance: frame of reference needs to be taken more seriously
• Sociocultural learning theory does not fit well with standardisation-as-fairness principles of assessment
• Assessment is an expression of educational values

Oxford University Centre for Educational Assessment
International tests & learning
Therese Hopfenbeck & Jo-Anne Baird
July 16, 2015

International tests
Learning theory
• Cognitive
Constructs
• Uni-dimensional
Quantifiability
• Modernist, realist
Invariance
• Crucial to their interpretation

International tests influence learning based upon three processes
1) what counts as
valuable learning
2) how national assessment systems are developed around the world
3) how students approach learning, since there is evidence that students adopt their learning approaches according to the tests given

Example: the case of Norway

THE CURRICULA APPROACH – TIMSS, Science, grade 8
Most underground caves are formed by the action of water on
a) Granite
b) Limestone
c) Sandstone
d) Shale
Cognitive domain: Factual Knowledge; main topic: Earth’s structure and physical features.

How international tests have influenced national assessment systems
1) Norway: Introduced national tests in 2004. The reading tests are based upon the PISA reading framework (Frones et al 2012).
2) Denmark: Introduced national tests after low performance in PISA (Egelund, 2008).
3) Japan: Changed item format on their national tests to more open responses like those in PISA (Schleicher, 2009).
4) Korea: PISA-like tasks on their University Entrance Exam (Schleicher, 2009).
5) Germany: Introduction of national educational standards and more focus upon external assessment (Ertl, 2006).

Literature review in three steps
Step 1: Broad search of AHELO, PIAAC, PIRLS, PISA, TIMSS. More than 1000 articles detected.
Step 2: Narrowing to peer-reviewed (805). No grey literature, but categorized reports, book reviews and conference papers separately.
Step 3: Quality checks of relevant articles based upon reading abstracts. Developing categories based upon a model article and coding in EndNote. Meetings with discussions on categories and coding. Still in process of quality checking.
[Figure: Published peer-reviewed articles from 1993 - 2013]

Interest in Science

Secondary analysis of student questionnaire data

Oxford University Centre for Educational Assessment
Formative assessment & learning
Gordon Stobart & Therese N. Hopfenbeck
July 16, 2015

Assessment for learning
Learning theory
• Socio-cultural? (Cognitive)
Constructs
• Concepts, not constructs
Quantifiability
• Postmodern, qualitative feedback (largely)
Invariance
• Not assumed

Pupils’ conditions for learning can be strengthened if they
1. Understand what they are to learn and what is expected of them.
2. Receive feedback that tells them about the quality of their work or performance.
3. Receive advice on how they can improve.
4. Are involved in their own learning, among other things by assessing their own work and progress.
Four principles of assessment, Utdanningsdirektoratet.

The challenges
• Lack of theoretical consensus on AfL and formative assessment
• Few researchers use the original articles
• Development of stories – which are flawed if you look at the original work

The vast majority of studies on AfL are small-scale action research designs and are published in a wide range of journals. A concern for the review is that current definitions of formative assessment/AfL cover a wide range of teaching and learning practices, while research designs often lack an action theory (what is causing change), often accompanied by a lack of systematic data collection (for example baseline data before a research initiative).

Overselling?
Whilst claims for large effect sizes are regularly made in the literature, the evidence for these has increasingly been critiqued, for example by Bennett (2011) and Kingston and Nash (2011). The effects of formative assessment upon learning have been over-sold by some authors. This is unfortunate because the limited empirical research suggests a modest, but educationally significant, impact on teaching and learning.

“Good instruction demands two things: students must be told immediately whether what they do is right or wrong and, when right, they must be directed to the step to be taken next.” B.F. Skinner, in a letter to Science in 1989
We presume that Skinner meant that students be directed to the next step even if they were wrong, although he did not write that.

Conclusions
• Assessment models need to have a closer relationship with learning
• Educational assessment is a social construct – it is often an intentional, agenda-setting activity
• Test theory provides statistical models which will vary in utility with context
• Cognitive learning theory is the current model in assessment
• International tests influence learning through policy
• Assessment for learning influences learning through practice
• Assessment outcomes have powerful effects – the numbers have a life of their own. Unless we change our practices, the effects will be more detrimental upon learning