Obtaining International Benchmarks for States Through Statistical Linking:


Obtaining International Benchmarks for States Through Statistical Linking: Presentation at the Institute of Education Sciences (IES) National Center for Education Statistics (NCES)

Gary W. Phillips, Chief Scientist, American Institutes for Research, May 30, 2008

This paper is intended to promote the exchange of ideas among researchers and policy makers. The views expressed in it are part of ongoing research and analysis and do not necessarily reflect the position of the National Center for Education Statistics, the Institute of Education Sciences, or the U.S. Department of Education.

Two ways of obtaining International Benchmarks for States

 

Hard Way

• Each State is administered PISA, TIMSS or PIRLS, and each country is administered PISA, TIMSS or PIRLS.

• Very expensive and impractical.

Easy Way (statistical linking)

• Each State is administered NAEP, and each country is administered TIMSS or PIRLS.

• Inexpensive and practical.

Dr. Gary W. Phillips American Institutes for Research 2

PISA Cannot be Statistically Linked to NAEP

PISA is an age-based assessment (with grade 10 as the modal grade), whereas NAEP is grade-based (grade 8), so PISA results cannot be statistically linked to NAEP.

Even if NAEP conducted a special study to make it possible to link NAEP and PISA (e.g., an age-based special assessment of NAEP with a modal grade of 10), State NAEP still could not be compared to PISA because State NAEP is administered in the 8th grade.


TIMSS was Purposefully Designed to be Statistically Linkable to NAEP

• Same grades as NAEP

• Similar content standards to NAEP

• Similar test design
  • Matrix sampling of cognitive test items
  • Policy/research related background questions

• Similar nationally representative sampling

• Similar scaling (item-response theory)

• Similar analysis methods (plausible values)

What is Statistical Linking?

If you link test X to Y, this means:

• We are expressing the scores on one test (X) in terms of the metric of another test (Y).

• For example, we are expressing the scores on NAEP (X) in terms of the metric of TIMSS (Y).

Four types of linking: equating, calibration, projection, moderation.
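As an illustration of what "expressing X in terms of the metric of Y" can mean in practice, here is a minimal sketch of one simple approach: a linear transformation that matches the mean and standard deviation of the two score distributions. This is not necessarily the procedure used in the presentation. The function names and all score values are hypothetical, and the sketch assumes the two score sets come from randomly equivalent samples.

```python
from statistics import mean, pstdev


def linear_link(x_scores, y_scores):
    """Return (slope, intercept) mapping the X metric onto the Y metric.

    Matches the mean and standard deviation of the two distributions.
    Assumes both score sets come from randomly equivalent samples.
    """
    slope = pstdev(y_scores) / pstdev(x_scores)
    intercept = mean(y_scores) - slope * mean(x_scores)
    return slope, intercept


def to_y_metric(x, slope, intercept):
    """Express a single test-X score on the scale of test Y."""
    return intercept + slope * x


# Hypothetical illustrative scores, NOT real NAEP or TIMSS data.
naep_like = [250.0, 270.0, 280.0, 290.0, 310.0]
timss_like = [440.0, 480.0, 500.0, 520.0, 560.0]

slope, intercept = linear_link(naep_like, timss_like)
linked = [to_y_metric(x, slope, intercept) for x in naep_like]
print(slope, intercept, linked)
```

After the transformation, the linked X scores have the same mean and standard deviation as the Y scores, which is the sense in which the two scales share a metric.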


Assumptions in Linking

Statistically linking test X to test Y

                                        Equating   Calibration   Projection   Moderation
High correlation between true scores    x [1]      x [1]         x
Same content                            x          x
Equal reliability                       x

[1] Assumes a perfect correlation between true scores.


Why is it Important to Statistically Link NAEP to TIMSS or PIRLS?

Allows policy makers to see how the U.S. (as a whole), the states, and school districts stack up against the rest of the world.

Provides a common metric that is familiar to U.S. policy makers. It’s like converting world currencies to dollars.

Seeing the results of TIMSS or PIRLS through the lens of NAEP Achievement Levels provides a familiar benchmark for interpreting international educational performance.

What are the Ideal Data Requirements for Statistically Linking NAEP to International Assessments?

NAEP and the international assessment must be administered within the United States to groups of students that are:

• Randomly equivalent

• Nationally representative

• In the same grades

• In the same year

NAEP and the international assessment must cover similar (but not identical) content.

Possible Statistical Linkages With NAEP During This Decade

[Table: assessment schedule by year (1999 to 2009) and grade (4, 8, 12). Rows: NAEP (NCES; reading and mathematics on a 2-year cycle; national, state, and district samples), TIMSS (IEA, 4-year cycle, grades 4 and 8; mathematics and science), PIRLS (IEA, 5-year cycle, grade 4; reading), and PISA (OECD, 3-year cycle, age 15; reading, mathematics, and science). Cell entries list the subjects assessed: R = reading, M = mathematics, S = science, W = writing; * = new framework.]

Using NAEP linked to TIMSS as an International Benchmark in Mathematics (Phillips, “Chance Favors the Prepared Mind”, AIR, 2007)

[Figure: bar chart. Comparisons between grade 8 2007 NAEP state mathematics results for Maryland and grade 8 2003 TIMSS national mathematics results for the percent at and above proficient, based on NAEP achievement levels projected onto the TIMSS scale. Vertical axis: percent, 0 to 100. One bar per TIMSS country, plus Maryland and the United States.]
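The quantity shown in this kind of comparison, the percent at and above proficient with the NAEP achievement level projected onto the TIMSS scale, could be computed along the following lines. This is a simplified sketch only: the linking constants, the cut score, and the country scores are all hypothetical, and an operational analysis would work with plausible values and sampling weights rather than a plain list of scores.

```python
def percent_at_or_above(scores, cut):
    """Percent of scores at or above a cut score on the same scale."""
    return 100.0 * sum(s >= cut for s in scores) / len(scores)


# Hypothetical linking constants (NAEP metric -> TIMSS metric).
slope, intercept = 2.0, -60.0

# Hypothetical "proficient" cut score on the NAEP-like scale.
naep_like_cut = 290.0

# Project the cut score onto the TIMSS-like scale.
projected_cut = intercept + slope * naep_like_cut

# Hypothetical student scores for one country on the TIMSS-like scale.
country_scores = [430.0, 470.0, 500.0, 520.0, 540.0, 560.0, 600.0, 610.0]

pct_proficient = percent_at_or_above(country_scores, projected_cut)
print(projected_cut, pct_proficient)
```

Because the cut score is moved onto the international scale once, the same projected value can be applied to every country's score distribution.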

Using NAEP linked to TIMSS as an International Benchmark in Science (Phillips, “Chance Favors the Prepared Mind”, AIR, 2007)

[Figure: bar chart. Comparisons between grade 8 2005 NAEP state science results for Maryland and grade 8 2003 TIMSS national science results for the percent at and above proficient, based on NAEP achievement levels projected onto the TIMSS scale. Vertical axis: percent, 0 to 100. One bar per TIMSS country, plus Maryland and the United States.]

One potential way of Obtaining State Estimates for PISA is Through Small Area Estimation

If a survey has been carried out for the nation as a whole, the sample size may be too small for each state to generate accurate state estimates from the collected data.

To deal with this problem, it may be possible to supplement the survey data with auxiliary data (such as the CCD) and use regression modeling techniques in order to obtain state estimates.
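The regression idea can be sketched with a so-called synthetic estimator: fit a regression on the national sample, then plug each state's auxiliary values into the fitted equation. The sketch below is a deliberately minimal version with a single covariate. All records, the covariate, and the state names are hypothetical, and it omits the random effects and variance estimation that fuller small area models typically include.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical school-level records from a national sample:
# (auxiliary covariate value, mean assessment score).
national_sample = [(0.10, 540.0), (0.30, 500.0), (0.50, 460.0), (0.70, 420.0)]
aux = [record[0] for record in national_sample]
score = [record[1] for record in national_sample]

a, b = fit_simple_ols(aux, score)

# Hypothetical state-level means of the same covariate, as might be
# taken from an auxiliary administrative data source.
state_aux_mean = {"State A": 0.20, "State B": 0.60}

# Synthetic estimate: predicted score at each state's covariate mean.
state_estimates = {s: a + b * v for s, v in state_aux_mean.items()}
print(state_estimates)
```

The estimates are only as good as the regression model: states that differ from the national pattern in ways the auxiliary data do not capture will be estimated poorly.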

Statistical Linking Versus Small Area Estimation

 

Statistical Linking

- Two assessments (e.g., NAEP and TIMSS), measuring similar content, are administered to randomly equivalent national samples and the scales are statistically linked (sort of like converting Celsius to Fahrenheit).

Small Area Estimation

- One assessment (e.g., PISA) is administered to a national sample, and auxiliary data (such as the CCD) are used with regression modeling techniques to obtain state estimates.