Examining the effects of a highly rated science curriculum


Scaling Up Curriculum for Achievement, Learning, and Equity (SCALE-uP): Highlights from a 7-year research program funded by NSF/IERI

Sharon Lynch, PI

SCALE-uP, a partnership: George Washington University and Montgomery County Public Schools

Co-PIs: Curtis Pyke, Joel Kuipers, Michael Szesze and Bonnie Hansen-Grafton

http://www.gwu.edu/~scale-up/

Prepared for presentation MSP

Deep Background for SCALE-uP

• In the 1990s, AAAS Project 2061 developed a Curriculum Analysis to identify curriculum materials that help students learn a target idea (benchmark/standard), a process based on experts using a single set of criteria to judge written materials.
• At the same time, many science educators called for multicultural science education and for modifying/differentiating curriculum and instruction for diverse learners.

So which was it? Single set of criteria for materials good for all students? Or would some subgroups of students be disadvantaged by curriculum materials highly rated by Project 2061?

Assumptions

• Science curriculum materials matter. They scaffold student learning and improve teachers’ content and pedagogical knowledge, especially with coordinated professional development.
• Curriculum units with a thoughtful sequence of activities provide direct experiences with physical phenomena, pressing students to represent and make sense of data. Sense-making occurs individually and through groups.
• Such materials may serve equity concerns: they build experiences rather than relying on prior knowledge.
– Multi-modal and accessible to diverse learners.
– Students work together in groups for sense-making.
– Mitigate the effects of middle school ability grouping.

Interventions: 3 curriculum units focused on different, challenging target ideas

• State of Michigan’s Chemistry That Applies (CTA) focuses on conservation of matter. 8th grade unit, ~6 weeks long.
• GEMS Lawrence Hall of Science Real Reasons for the Season (Seasons) focuses on the reasons for the Earth’s seasons. 7th grade unit, ~3 weeks long.
• ARIES Harvard Smithsonian Motion and Forces (M&F) focuses on portions of Newton’s Laws. 6th grade unit, ~6 weeks long.

Curriculum Analysis: Instructional Strategies

● =Excellent, ◕ =Very Good, ◒ =Satisfactory, ◔ =Fair, ○ =Poor

Instructional Category; Chemistry That Applies; ARIES Motion & Forces; GEMS Seasons; Macmillan/McGraw Hill

I. Identifying a Sense of Purpose: Conveying unit purpose; Conveying lesson/activity purpose; Justifying lesson/activity sequence

○ ◕ ◒ ○ ◒ ◒ NR ● ◒ ◔ ◔ ○

II. Taking Account of Student Ideas: Attending to prerequisite knowledge and skills; Alerting teacher to commonly held ideas; Assisting teacher in identifying own students’ ideas; Addressing commonly held ideas

◒ ◒ ◒ ◒ ○ ○ ◒ ○ ○ NR ○ ◒ ○ ○ ○ ○


III. Engaging Students with Relevant Phenomena: Providing a variety of phenomena; Providing vivid experiences

● ● ● ● ○ ◒ ○ ○

IV. Developing and Using Scientific Ideas: Introducing terms meaningfully; Representing ideas effectively; Demonstrating use of knowledge; Providing practice

● ◒ ◕ ● ◒ ◒ ○ ○ ◒ ● ◒ ○ ◔ ○ ○ ○

V. Promoting Student Thinking about Phenomena, Experiences, and Knowledge: Encouraging students to explain their ideas; Guiding student interpretation and reasoning; Encouraging students to think about what they’ve learned

● ● ○ ◒ ○ ○ ○ ● ○ ○ ○ ○

SCALE-uP’s 5 Research Questions

Implementation Questions:
1. Would a middle school science unit rated highly by the Project 2061 Curriculum Analysis provide better student outcomes than standard fare? Would outcomes be equitable?
2. How did each unit function in classrooms of diverse learners?

Scale-up Questions:
3. If effective at small scale (5 schools), would a unit be as effective at large scale (35 schools)?
4. Would a unit prove more effective as schools’ experience with it increased?
5. Is there a relationship between fidelity of implementation of a unit and student outcomes?

Methods: Quasi-experimental implementation studies for 3 units

• 5 matched pairs of middle schools chosen. Then random assignment to treatment or comparison condition, resulting in two groups of students equivalent for demographic characteristics, reading and math scores, prior science GPA, and pre-test on target idea. Student-level unit of analysis.

• Each unit was replicated in same schools for two consecutive years.

• Curriculum-independent assessments created for each unit, focusing on its target ideas.

• If the unit was effective and equitable at implementation phase, then scale it up to 35 middle schools.
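The within-pair random assignment described in the first bullet can be sketched in a few lines of Python. This is only an illustration: the school names, the `assign_pairs` helper, and the fixed seed are hypothetical and do not come from the study.

```python
import random

def assign_pairs(pairs, seed=None):
    """Randomly assign one school in each matched pair to treatment.

    `pairs` is a list of (school_a, school_b) tuples (hypothetical names).
    Returns a dict mapping each school to "treatment" or "comparison".
    """
    rng = random.Random(seed)
    assignment = {}
    for school_a, school_b in pairs:
        # Shuffle the two schools; the first one drawn receives the unit.
        treated, comparison = rng.sample([school_a, school_b], 2)
        assignment[treated] = "treatment"
        assignment[comparison] = "comparison"
    return assignment

# Five hypothetical matched pairs of middle schools.
pairs = [(f"School {i}A", f"School {i}B") for i in range(1, 6)]
assignment = assign_pairs(pairs, seed=0)
```

Randomizing inside each matched pair keeps the two conditions balanced on whatever the pairs were matched on, while leaving the choice of which school receives the unit to chance.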

Was each unit effective?

Overview of results for Years 0 through 4: Assessment Levels

• 71-100 -- Flexible understanding of, and commitment to, the benchmark ideas, with few errors or misconceptions.

• 51-70 -- Some fluency with the ideas, but also misconceptions in certain contexts.

• 24-50 -- Some evidence of understanding in specific contexts.

• 0-23 -- No consistent evidence of understanding the benchmark ideas.
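As a small illustration, the four score bands above can be written as a lookup function. The function name and the short labels are hypothetical, and the handling of fractional scores (for example, an adjusted score of 50.5 falling in the 24-50 band) is an assumption, since the slides list whole-number bands only.

```python
def assessment_level(score):
    """Map a 0-100 assessment score to one of the four bands listed above.

    Assumption: a fractional score belongs to the band whose lower bound
    it has reached, so 50.5 is treated as part of the 24-50 band.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 71:
        return "71-100: flexible understanding"
    if score >= 51:
        return "51-70: some fluency, some misconceptions"
    if score >= 24:
        return "24-50: understanding in specific contexts"
    return "0-23: no consistent evidence of understanding"
```

For example, `assessment_level(62)` falls in the 51-70 band.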

Results: Adjusted Post-test Scores for Each Unit

Was each unit equitable?

Effect Sizes: CTA (Year 1)
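The slides do not say which effect-size measure was reported. As a hedged illustration only, the sketch below assumes a standardized mean difference (Cohen's d with a pooled standard deviation) between treatment and comparison scores. The function name and the sample numbers are invented for the example and are not study data.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(treatment, comparison):
    """Standardized mean difference using a pooled standard deviation.

    Assumption: this is one common effect-size definition; the study
    may have used a different one (for example, based on adjusted means).
    """
    n1, n2 = len(treatment), len(comparison)
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = ((n1 - 1) * variance(treatment)
                  + (n2 - 1) * variance(comparison)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(comparison)) / sqrt(pooled_var)

# Invented post-test scores, for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

Computing the same quantity separately for each subgroup (gender, FARMS, ESOL, ethnicity) is one way to compare whether a unit's benefit is similar across groups.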

How did CTA function in a classroom? A picture is worth a thousand words

Roles of students and their interactions?

Role of teacher?

Role of written curriculum materials?

Role of the physical phenomena?

Chemistry That Applies (CTA)

Effect Sizes: Seasons (Year 3)

Real Reasons for Seasons: Seasons

Effect Sizes: M&F (Year 2)

Motion and Forces: M&F

Scale-up M&F or not?

We decided to:

• Replicate M&F a third time in 5 new pairs of schools (later reduced to 4 pairs), rather than scale it up.
• Eliminate pre-test for this trial.
• Focus on fidelity of implementation to better understand whether the independent variable was adequately delivered; M&F and Comparison teachers knew their classes would be observed and that they would be interviewed.
• Purchase M&F Student Guides for each student, to be collected at close of unit, in order to better adhere to unit’s intent.

Effect Sizes: M&F (Year 4)

Scale up or not?

Yes.

M&F scaled up to 30 middle schools.

A comparison group of 5 schools was retained, contrary to the original plan for full scale.

Effect Sizes: M&F (Year 5) at (nearly) Full Scale

Results for Year 5: M&F at (nearly) full scale

M&F scaled up to 30 middle schools, with 5 schools “held out” for comparison. Assessment given in 15 + 5 schools.

• No statistically significant differences in student outcomes. Sustainable? Worthwhile?
• M&F most effective at Year 4 when:
– there was great attention to fidelity.
– professional development was done by developer and GW (~17 hours) rather than via internal professional development during school (~6 hours).
• What about school experience? More experienced schools (4 years) had the same student outcomes as less experienced schools (1 year). At teacher level, about 75% of teachers had left their middle schools, including the “trainers”, after 4 years.
• Acknowledge co-PI Curtis Pyke for taking lead on this aspect of SCALE-uP.

Implications and Speculations for 3 Curriculum Units

• Project 2061 Curriculum Analysis? This study only provides one strong “case” for success, CTA.
• 2 of 3 units were effective and equitable; underserved students had most to gain from the units.
• M&F was effective only with high fidelity.
• Effectiveness of a unit depends on the “comparison condition”.
• Effectiveness at small scale may be hard to sustain at long-term, large scale.
• Professional development should be ongoing for such units, due to teacher turnover.

• Learned much about research design/methods for school-based intervention research. Much more to learn.

• Design of nested studies perhaps a slippery slope unless efficacy of intervention is well established. Unit of analysis ought to make theoretical sense.

Papers may be found at http://www.gwu.edu/~scale-up/

Why were M&F’s results equivocal at (nearly) full scale?

• Professional development was not held during summer by curriculum unit developer for full scale, but done by lead teachers in September according to geographic groups.
• M&F results were best when district attention was focused on the study. At full scale, the end of the research grant was near. Competing mandates.
• “Comparison group” was focused on district guide for four years, and perhaps became more effective.
• Overall, M&F did not seem to be a “powerful” intervention. 3 of 4 trials led to equivocal results. Comparison group classrooms were not so different from M&F.

Standards-based curriculum materials may improve instruction and outcomes: Project 2061 criteria

1. Convey sense of purpose
2. Address student ideas and misconceptions
3. Promote engagement with relevant phenomena
4. Developing, using scientific ideas
5. Encourage student thinking
6. Encourage assessment of progress
7. Creating learning environment: curiosity, all students

AAAS. Project 2061.

The Focus of Chemistry That Applies (CTA)

• The Conservation of Matter Benchmark:

No matter how substances within a closed system interact with one another, or how they combine or break apart, the total weight of the system remains the same. The idea of atoms explains conservation of matter: If the number of atoms stays the same no matter how they are rearranged, then their total mass stays the same.

AAAS. 1993. Benchmarks for Science Literacy. Project 2061.

Motion and Forces Target Idea Adapted from AAAS (2001).

[Figure: Pre to Post Gains by Gender. Conservation of Matter Understanding by Gender, Pretest vs. Posttest (score axis 15 to 65). Series: CTA Males, CTA Females, Comparison Males, Comparison Females.]

[Figure: Pre to Post Gains by FARMS. Conservation of Matter Understanding by FARMS, Pretest vs. Posttest (score axis 15 to 65). Series: CTA FARMS Never, CTA FARMS Prior, CTA FARMS Current, Comp FARMS Never, Comp FARMS Prior, Comp FARMS Current.]

[Figure: Pre to Post Gains by ESOL. Conservation of Matter Understanding by ESOL, Pretest vs. Posttest (score axis 15 to 65). Series: CTA ESOL Never, CTA ESOL Prior, CTA ESOL Current, Comp ESOL Never, Comp ESOL Prior, Comp ESOL Current.]

[Figure: Pre to Post Gains by Ethnicity. Conservation of Matter Understanding by Ethnicity, Pretest vs. Posttest (score axis 15 to 65). Series: CTA White, CTA Asian Amer., CTA Hispanic, CTA African Amer., Comp. White, Comp. Asian Amer., Comp. Hispanic, Comp. African Amer.]

Disclaimer

The instructional practices and assessments discussed or shown in these presentations are not intended as an endorsement by the U.S. Department of Education.