Transcript Slide 1

Using Summative Data to
Monitor Student
Performance:
Choosing appropriate summative tests.
Presented by
Philip Holmes-Smith
School Research Evaluation and Measurement Services
Overview of the session
1. Diagnostic vs. Summative Testing
2. Choosing Appropriate Summative Tests
• The reliability of summative (standardised) tests.
• Choosing appropriate summative tests.
• When should you administer summative tests?
1. Overview of
Diagnostic vs. Summative
Testing
Examples of Diagnostic Testing
• Assessment tools such as:
– Marie Clay Inventory,
– English Online Assessment (Government schools),
– Maths Online Assessment (Government schools),
– SINE (CEO schools),
– On-Demand Linear Tests,
– Probe.
• Teacher assessments such as:
– teacher questioning in class,
– teacher observations,
– student work (including portfolios).
Diagnostic Testing
• Research shows that our most effective teachers
(in terms of improving the learning outcomes of students)
constantly use feedback (including the use of
diagnostic information to inform their teaching).
Hattie (2003, 2009)* shows that using feedback
(including using diagnostic information about
what each student can and can’t do to inform
teaching) has one of the biggest impacts on
improving student learning outcomes.
* 2003: http://www.acer.edu.au/documents/RC2003_Hattie_TeachersMakeADifference.pdf
2009: Hattie, John. (2009). Visible Learning: A synthesis of over 800 meta-analyses relating
to achievement. NY: Routledge.
Examples of Summative (Standardised) Testing
• Government sponsored assessment tools such
as:
– NAPLAN,
– English Online Assessment (Government schools),
– On-Demand Adaptive Tests.
• Other commercial tests such as:
– TORCH,
– PAT-R,
– PAT-Math (together with I Can Do Maths).
Summative (Standardised) Testing
• Summative testing is essential to monitor the
effectiveness of your teaching.
• But, research shows that summative testing does not, of itself, lead to improved learning outcomes. As the saying goes:
“You don’t fatten a pig by weighing it”
• So, although it is essential, keep summative
testing to a minimum.
2. Summative Tests
Summative (Standardised) Testing
• Summative testing is essential to monitor
the effectiveness of our teaching, but:
– Is NAPLAN reliable for all students?
– Are the other summative tests you administer
reliable for all students?
• We need to maximise the reliability of the
tests we use to monitor the effectiveness
of our teaching.
Summative (Standardised) Testing
• Summative testing is essential to monitor
the effectiveness of our teaching, but:
– Do we currently gather enough information to monitor
the effectiveness of our teaching of ALL students? e.g.
• Year 3 NAPLAN reflects the effectiveness of your Prep-Yr2
teaching but what about the Prep teaching vs. Yr1 teaching vs.
the Yr2 teaching?
• Year 9 NAPLAN reflects the effectiveness of your Yr7-Yr8
teaching but what about the Yr 7 teaching vs. Yr 8 teaching?
• We need to choose appropriate summative tests to monitor the effectiveness of our teaching at all year levels from Prep – Yr10!
The Reliability of
Summative Tests
Three Questions
1. Do you believe that your students’ NAPLAN and/or On-Demand results accurately reflect their level of performance?
2. If we acknowledge that the odd student will have a lucky guessing day or a horror day, what about the majority?
– Have your weakest students received a low score?
– Have your average students received a score at about expected level?
– Have your best students received a high score?
3. Think about your students who received high and low scores:
– Are your low scores too low?
– Are your high scores too high?
High highs and Low lows
[Chart: item difficulties for a typical test, with the question “Is this reading score reliable?” pointed at both a very high and a very low score]
Summary Statements about Scores
• Low scores (i.e. more than 0.5 VELS levels below expected) indicate poor performance, but the actual values should be considered as indicative only (i.e. such scores are associated with high levels of measurement error).
• High scores (i.e. more than 0.5 VELS levels above expected) indicate good performance, but the actual values should be considered as indicative only (i.e. such scores are associated with high levels of measurement error).
• Average scores indicate roughly expected levels of performance and the actual values are more reliable (i.e. such scores are associated with lower levels of measurement error).
Choosing appropriate
summative tests
[Chart: item difficulties for Booklet 6 on the PAT-R (Comprehension) score scale, with the average item difficulty marked]
[Chart: converting raw test scores for Booklet 6 to PAT-R (Comprehension) scale scores]
Test difficulties of the PAT-R (Comprehension) Tests on the TORCH score scale together with Year Level mean scores
[Chart: “PAT-C: Selecting the Correct Booklet for a Given Ability Level - November”. PAT-Comprehension score (40–160) plotted against Year Level Averages / Average Booklet Difficulties, with lines for one year above expected (75th percentile), the year-level mean (mean difficulty), and one year below expected (25th percentile)]
Different norm tables for different tests
Test difficulties of the PAT-Maths Tests on the PATM score scale together with Year Level mean scores
[Chart: PAT-Maths test difficulties plotted against Year Level mean scores, Year 1 to Year 10. Source: ACER, 2006]
Which is the best test for an average Year 4 student?
Test difficulties of the PAT-Maths Tests on the PATM score scale together with Year Level mean scores
[Chart: PAT-Maths test difficulties plotted against Year Level mean scores, Year 1 to Year 10. Source: ACER, 2006]
The best test for an average Year 4 student is probably Test 4 or 5.
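The selection rule the chart illustrates can be sketched in a few lines: give each student the test whose average difficulty is closest to the student’s estimated ability. The PATM-scale difficulty values below are hypothetical placeholders, not the published ACER norms.

```python
# Hypothetical mean difficulties of PAT-Maths Tests 1-8 on the PATM scale.
# These numbers are illustrative only -- consult the ACER norm tables.
TEST_DIFFICULTY = {1: 20.0, 2: 30.0, 3: 38.0, 4: 45.0,
                   5: 51.0, 6: 57.0, 7: 63.0, 8: 70.0}

def best_test(student_ability):
    """Return the test whose mean difficulty is closest to the student's ability."""
    return min(TEST_DIFFICULTY, key=lambda t: abs(TEST_DIFFICULTY[t] - student_ability))

# An average Year 4 student (ability of about 48 on this made-up scale)
# sits midway between Test 4 and Test 5, matching the advice above.
```

A test that is far too easy or far too hard tells you little, so matching mean difficulty to ability is what keeps the measurement error low.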
Things to look for in a summative test
• Needs to have a single developmental scale that shows
increasing levels of achievement over all the year levels at
your school.
• Needs to have “norms” or expected levels for each year
level (e.g. The National “norm” for Yr 3 students on TORCH
is an average of 34.7).
• Needs to be able to demonstrate growth from one year to
the next (e.g. during Yr 4, the average student grows from a
score of 34.7 in Yr 3 to an expected score of 41.4 in Yr 4 –
that is 6.7 score points).
• As a bonus, the test could also provide diagnostic information.
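The growth arithmetic in the TORCH example above can be checked directly. The 34.7 and 41.4 norms come from the slide; the helper name is my own.

```python
def growth(score_start, score_end):
    """Growth between two points on a single developmental scale."""
    return round(score_end - score_start, 1)

# During Year 4, the average student moves from the Year 3 norm (34.7)
# to the Year 4 norm (41.4) on the TORCH scale:
print(growth(34.7, 41.4))  # -> 6.7 score points
```

This only works because both scores sit on the same developmental scale, which is exactly why the first bullet above insists on one.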
Norms for Year 3 to Year 10 on the TORCH scale
[Chart: TORCH norms, showing the 90th, 50th and 10th percentiles for each year level]
My Recommended Summative Tests
(Pen & Paper)
• Reading Comprehension
– Progressive Achievement Test - Reading (Comprehension)
(PAT-R, 4th Edition)
– TORCH and TORCH plus
• Mathematics
– Progressive Achievement Test - Mathematics
(PAT-Maths, 3rd Edition) combined with the
I Can Do Maths
Selecting the correct PAT-C Test
[Chart: “PAT-C: Selecting the Correct Booklet for a Given Ability Level - November”. PAT-Comprehension score (40–160) plotted against Year Level Averages / Average Booklet Difficulties, with the 75th percentile, mean and 25th percentile marked]
Selecting the correct TORCH Test
[Chart: “TORCH: Selecting the Correct Test for a Given Ability Level - November”. TORCH score (20.0–80.0) plotted against Year Level Averages / Average Test Difficulties, with lines for one year above expected (75th percentile), the year-level mean (mean difficulty), and one year below expected (25th percentile)]
Selecting the correct PAT-Math/ICDM Test
[Chart: “PAT-M/ICDM: Selecting the Correct Test for a Given Ability Level - November”. PAT-Math score (-10.0–90.0) plotted against Year Level Averages / Average Test Difficulties, with lines for one year above expected (75th percentile), the year-level mean (mean difficulty), and one year below expected (25th percentile)]
My Recommended Summative Tests
(On-Line)
• On-Demand - Reading Comprehension
– The 30-item “On-Demand” Adaptive Reading test
• On-Demand - Spelling
– The 30-item “On-Demand” Adaptive Spelling test
• On-Demand - Writing Conventions
– The 30-item “On-Demand” Adaptive Writing test
• On-Demand – General English (Comprehension, Spelling & Writing Conventions)
– The 60-item “On-Demand” Adaptive General English test
• On-Demand - Mathematics (Number, Measurement, Chance & Data and Space)
– The 60-item “On-Demand” Adaptive General Mathematics test
• On-Demand - Number
– The 30-item “On-Demand” Adaptive Number test
• On-Demand – Measurement, Chance & Data
– The 30-item “On-Demand” Adaptive Measurement, Chance & Data test
• On-Demand - Space
– The 30-item “On-Demand” Adaptive Space test
Choosing the right starting point is still important
(even for “Adaptive” Tests)
Summative Testing and Triangulation
• Even if you give the right test to the right student, sometimes, the
test score does not reflect the true ability of the student
– every measurement is associated with some error.
• To overcome this we should aim to get at least three independent
measures – what researchers call TRIANGULATION.
• This may include:
– Teacher judgment
– NAPLAN results
– Other pen & paper summative tests (e.g. TORCH, PAT-R, PAT-Maths, I Can Do Maths)
– On-line summative tests (e.g. On-Demand ‘Adaptive’ testing,
English Online)
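One way to see why triangulation helps: if each of three independent measures of the same ability carries the same measurement error, their average has a standard error smaller by a factor of √3. The sketch below is illustrative; the scores and the error figure are invented, and the function names are my own.

```python
import math

def triangulated_estimate(measures):
    """Average several independent measures of the same ability."""
    return sum(measures) / len(measures)

def standard_error_of_mean(se_single, n):
    """Standard error of the mean of n independent measures with equal SE."""
    return se_single / math.sqrt(n)

# Three hypothetical reading estimates for one student, on a common scale
# (teacher judgment, a NAPLAN-equated score, a PAT-R-equated score):
scores = [41.0, 44.5, 43.2]
print(round(triangulated_estimate(scores), 1))   # -> 42.9
print(round(standard_error_of_mean(3.0, 3), 2))  # -> 1.73
```

The gain only materialises if the measures are genuinely independent, which is why the list above mixes teacher judgment with different test instruments rather than repeating one test three times.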
Summative Testing and Triangulation
• BUT remember, more summative testing does not lead to
improved learning outcomes so keep the summative
testing to a minimum
When should you administer
summative tests?
Timing for Summative Testing
• Should be done at a time when teachers are trying to
triangulate on each student’s level of performance. (i.e.
mid-year and end-of-year reporting time.)
• Should be done at a time that enables teachers to monitor
growth – say, every six months. (i.e. From the beginning of
the year to the middle of the year and from the middle of
the year to the end of the year.)
Suggested timing
• For Year 1 – Year 6 and Year 8 – Year 10
– Early June (for mid-year reporting and six-monthly growth*)
– Early November (for end-of-year reporting and six-monthly
growth)
• For Prep and Year 7 and new students at other levels
– Beginning of the year (for base-line data)
– Early June (for mid-year reporting and six-monthly growth)
– Early November (for end-of-year reporting and six-monthly
growth)
* November results from the year before form the base-line data for the current year.
(i.e. February testing is not required for Year 1 – Year 6 or for Year 8 – Year 10)