Transcript Slide 1
Using Summative Data to Monitor Student Performance: Choosing appropriate summative tests
Presented by Philip Holmes-Smith, School Research Evaluation and Measurement Services

Overview of the session
1. Diagnostic vs. Summative Testing
2. Choosing Appropriate Summative Tests
• The reliability of summative (standardised) tests.
• Choosing appropriate summative tests.
• When should you administer summative tests?

1. Overview of Diagnostic vs. Summative Testing

Examples of Diagnostic Testing
• Assessment tools such as:
– Marie Clay Inventory,
– English Online Assessment (Government schools),
– Maths Online Assessment (Government schools),
– SINE (CEO schools),
– On-Demand Linear Tests,
– Probe.
• Teacher assessments such as:
– teacher questioning in class,
– teacher observations,
– student work (including portfolios).

Diagnostic Testing
• Research shows that our most effective teachers (in terms of improving the learning outcomes of students) constantly use feedback, including diagnostic information, to inform their teaching. Hattie (2003, 2009)* shows that using feedback (including using diagnostic information about what each student can and can't do to inform teaching) has one of the biggest impacts on improving student learning outcomes.

* 2003: http://www.acer.edu.au/documents/RC2003_Hattie_TeachersMakeADifference.pdf
2009: Hattie, John. (2009). Visible Learning: A synthesis of over 800 meta-analyses relating to achievement. NY: Routledge.

Examples of Summative (Standardised) Testing
• Government-sponsored assessment tools such as:
– NAPLAN,
– English Online Assessment (Government schools),
– On-Demand Adaptive Tests.
• Other commercial tests such as:
– TORCH,
– PAT-R,
– PAT-Maths (together with I Can Do Maths).

Summative (Standardised) Testing
• Summative testing is essential to monitor the effectiveness of your teaching.
• But research shows that summative tests themselves do not lead to improved learning outcomes.
As the saying goes: "You don't fatten a pig by weighing it." So, although it is essential, keep summative testing to a minimum.

2. Choosing Appropriate Summative Tests

Summative (Standardised) Testing
• Summative testing is essential to monitor the effectiveness of our teaching, but:
– Is NAPLAN reliable for all students?
– Are the other summative tests you administer reliable for all students?
• We need to maximise the reliability of the tests we use to monitor the effectiveness of our teaching.

Summative (Standardised) Testing
• Summative testing is essential to monitor the effectiveness of our teaching, but do we currently gather enough information to monitor the effectiveness of our teaching of ALL students? e.g.
– Year 3 NAPLAN reflects the effectiveness of your Prep-Yr 2 teaching, but what about the Prep teaching vs. the Yr 1 teaching vs. the Yr 2 teaching?
– Year 9 NAPLAN reflects the effectiveness of your Yr 7-Yr 8 teaching, but what about the Yr 7 teaching vs. the Yr 8 teaching?
• We need to choose appropriate summative tests to monitor the effectiveness of our teaching at all year levels from Prep to Yr 10!

The Reliability of Summative Tests

Three Questions
1. Do you believe that your students' NAPLAN and/or On-Demand results accurately reflect their level of performance?
2. If we acknowledge that the odd student will have a lucky guessing day or a horror day, what about the majority?
– Have your weakest students received a low score?
– Have your average students received a score at about the expected level?
– Have your best students received a high score?
3. Think about your students who received high and low scores:
– Are your low scores too low?
– Are your high scores too high?

High highs and Low lows
[Charts: "Is this reading score reliable?" and "Item difficulties for a typical test"; chart content not reproduced in this transcript.]

Summary Statements about Scores
• Low scores (i.e. more than 0.5 VELS levels below expected) indicate poor performance, but the actual values should be considered indicative only (i.e. such scores are associated with high levels of measurement error).
• High scores (i.e. more than 0.5 VELS levels above expected) indicate good performance, but the actual values should be considered indicative only (i.e. such scores are associated with high levels of measurement error).
• Average scores indicate roughly expected levels of performance, and the actual values are more reliable (i.e. such scores are associated with lower levels of measurement error).

Choosing appropriate summative tests
[Charts: "Item Difficulties for Booklet 6 on the PAT-R (Comprehension) scale score scale (average item difficulty)", "Converting raw test scores for Booklet 6 to PAT-R (Comprehension) scale scores", and "Test difficulties of the PAT-R (Comprehension) tests on the TORCH score scale together with year-level mean scores"; chart content not reproduced in this transcript.]

PAT-C: Selecting the Correct Booklet for a Given Ability Level - November
[Chart: PAT-Comprehension scores (40-160) plotted against year-level averages / average booklet difficulties, with lines for one year above expected (75th percentile), the year-level mean (mean difficulty), and one year below expected (25th percentile).]

Different norm tables for different tests
[Chart: Test difficulties of the PAT-Maths tests on the PATM scale score scale together with year-level mean scores, Year 1 to Year 10.]
Which is the best test for an average Year 4 student?
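The selection rule these charts illustrate can be sketched in code: on a common developmental scale, choose the test booklet whose average difficulty sits closest to the student's estimated ability. This is a minimal sketch; the test names and difficulty values below are invented for illustration and are not the published ACER norms.

```python
# Illustrative sketch of booklet selection on a common scale score scale.
# NOTE: these mean-difficulty values are made up for illustration only;
# they are NOT the actual ACER PAT-Maths norm values.
TEST_MEAN_DIFFICULTY = {
    "Test 1": 20.0,
    "Test 2": 28.0,
    "Test 3": 35.0,
    "Test 4": 41.0,
    "Test 5": 46.0,
}

def best_test(estimated_ability: float) -> str:
    """Return the test whose mean difficulty is closest to the ability estimate."""
    return min(TEST_MEAN_DIFFICULTY,
               key=lambda t: abs(TEST_MEAN_DIFFICULTY[t] - estimated_ability))

# A student whose estimated ability lies between two adjacent tests gets
# whichever test's mean difficulty is nearer.
print(best_test(43.0))
```

The point of the rule is the one the slides make: a test far too easy or far too hard for a student yields a score with high measurement error, so the best-matched booklet gives the most reliable score.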
Source: ACER, 2006

The best test for an average Year 4 student is probably Test 4 or Test 5.

Things to look for in a summative test
• Needs to have a single developmental scale that shows increasing levels of achievement over all the year levels at your school.
• Needs to have "norms" or expected levels for each year level (e.g. the national "norm" for Yr 3 students on TORCH is an average of 34.7).
• Needs to be able to demonstrate growth from one year to the next (e.g. during Yr 4, the average student grows from a score of 34.7 in Yr 3 to an expected score of 41.4 in Yr 4; that is, 6.7 score points).
• As a bonus, the test could also provide diagnostic information.

Norms for Year 3 to Year 10 on the TORCH scale
[Chart: TORCH norms showing the 90th, 50th and 10th percentiles for each year level.]

My Recommended Summative Tests (Pen & Paper)
• Reading Comprehension
– Progressive Achievement Test - Reading (Comprehension) (PAT-R, 4th Edition)
– TORCH and TORCH Plus
• Mathematics
– Progressive Achievement Test - Mathematics (PAT-Maths, 3rd Edition) combined with I Can Do Maths

Selecting the correct PAT-C Test
[Chart: "PAT-C: Selecting the Correct Booklet for a Given Ability Level - November"; PAT-Comprehension scores (40-160) against year-level averages / average booklet difficulties, with the 75th percentile, mean and 25th percentile marked.]

Selecting the correct TORCH Test
[Chart: "TORCH: Selecting the Correct Test for a Given Ability Level - November"; TORCH scores (20.0-80.0) against year-level averages / average test difficulties, with one year above expected (75th percentile), the year-level mean (mean difficulty) and one year below expected (25th percentile) marked.]

Selecting the correct PAT-Math/ICDM Test
[Chart: "PAT-M/ICDM: Selecting the Correct Test for a Given Ability Level - November"; PAT-Math scores (-10.0 to 90.0) against year-level averages / average test difficulties, with one year above expected (75th percentile), the year-level mean (mean difficulty) and one year below expected (25th percentile) marked.]
My Recommended Summative Tests (On-Line)
• On-Demand - Reading Comprehension
– The 30-item "On-Demand" Adaptive Reading test
• On-Demand - Spelling
– The 30-item "On-Demand" Adaptive Spelling test
• On-Demand - Writing Conventions
– The 30-item "On-Demand" Adaptive Writing test
• On-Demand - General English (Comprehension, Spelling & Writing Conventions)
– The 60-item "On-Demand" Adaptive General English test
• On-Demand - Mathematics (Number, Measurement, Chance & Data and Space)
– The 60-item "On-Demand" Adaptive General Mathematics test
• On-Demand - Number
– The 30-item "On-Demand" Adaptive Number test
• On-Demand - Measurement, Chance & Data
– The 30-item "On-Demand" Adaptive Measurement, Chance & Data test
• On-Demand - Space
– The 30-item "On-Demand" Adaptive Space test

Choosing the right starting point is still important (even for "Adaptive" Tests)

Summative Testing and Triangulation
• Even if you give the right test to the right student, sometimes the test score does not reflect the true ability of the student; every measurement is associated with some error.
• To overcome this, we should aim to get at least three independent measures, what researchers call TRIANGULATION.
• This may include:
– Teacher judgment
– NAPLAN results
– Other pen & paper summative tests (e.g. TORCH, PAT-R, PAT-Maths, I Can Do Maths)
– On-line summative tests (e.g. On-Demand "Adaptive" testing, English Online)
• BUT remember, more summative testing does not lead to improved learning outcomes, so keep the summative testing to a minimum.

When should you administer summative tests?
Timing for Summative Testing
• Should be done at a time when teachers are trying to triangulate on each student's level of performance (i.e. mid-year and end-of-year reporting time).
• Should be done at a time that enables teachers to monitor growth, say every six months (i.e. from the beginning of the year to the middle of the year, and from the middle of the year to the end of the year).

Suggested timing
• For Year 1 - Year 6 and Year 8 - Year 10:
– Early June (for mid-year reporting and six-monthly growth*)
– Early November (for end-of-year reporting and six-monthly growth)
• For Prep, Year 7 and new students at other levels:
– Beginning of the year (for base-line data)
– Early June (for mid-year reporting and six-monthly growth)
– Early November (for end-of-year reporting and six-monthly growth)

* November results from the year before form the base-line data for the current year (i.e. February testing is not required for Year 1 - Year 6 or for Year 8 - Year 10).
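The six-monthly growth monitoring described above can be sketched as a simple comparison of observed growth against the norm-implied expectation. The Yr 3 (34.7) and Yr 4 (41.4) TORCH norms come from the slides; splitting the expected annual growth evenly across the two six-month periods is a simplifying assumption made here for illustration.

```python
# Sketch: compare a student's observed six-month growth with the growth
# expected from year-level norms. The Yr 3 and Yr 4 TORCH norms (34.7
# and 41.4) are from the slides; halving the annual growth to get a
# six-monthly expectation is an illustrative assumption.
TORCH_YEAR_NORM = {3: 34.7, 4: 41.4}

def expected_six_month_growth(year_level: int) -> float:
    """Half the expected annual growth from this year level to the next."""
    return (TORCH_YEAR_NORM[year_level + 1] - TORCH_YEAR_NORM[year_level]) / 2

def growth_report(score_then: float, score_now: float, year_level: int) -> str:
    """Summarise observed vs expected six-monthly growth for one student."""
    observed = score_now - score_then
    expected = expected_six_month_growth(year_level)
    status = "at or above" if observed >= expected else "below"
    return (f"observed growth {observed:.1f} vs expected {expected:.1f} "
            f"({status} expected six-monthly growth)")

# A Yr 3 student who grew from 34.7 in June to 38.5 in November:
print(growth_report(34.7, 38.5, 3))
```

This mirrors the testing rhythm suggested above: June and November scores give two six-month growth measures per year, with the previous November's result serving as the base-line.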