Using GAISE to Create a Better Introductory Statistics Course


What Can We Learn from Quantitative Data in Statistics Education Research?

Sterling Hilton, Brigham Young University
Andy Zieffler, University of Minnesota
John Holcomb, Cleveland State University
Marsha Lovett, Carnegie Mellon University

University of Minnesota Educational Psychology

Introduction

• Components of a research program
  - Generate ideas (pre-clinical)
      Develop a conceptual framework
  - Frame question (pre-clinical, Phase I)
      Constructs and Measurement
      Design and Methods
      Pilot study
  - Examine question (Phase I, Phase II)
      Establish efficacy (small)
  - Generalize findings (Phase III)
      Larger studies in varied settings
  - Extend findings (Phase IV)
      Longitudinal studies
      Different populations

Introduction

• Quantitative methods in research program
  - Framing: measurement development (validity and reliability)
  - Framing: pilot study
  - Examine
  - Generalize
  - Extend
• Statistics education research is primarily in the “generate” and “frame” phases

Introduction

• Purpose: Introduce two instruments that are in different stages of development and discuss how they have been and might be used in statistics education research
  - Comprehensive Assessment of Outcomes in a First Statistics course (CAOS)
  - Survey of Attitudes Toward Statistics (SATS)

Assessment Resource Tools for Improving Statistical Thinking (ARTIST)

• Several online assessments
  - ARTIST Topic Scales
  - Comprehensive Assessment of Outcomes in a First Statistics course (CAOS)
  - Statistics Thinking and Reasoning Test (START)

ARTIST Topic Scales

• 7-15 multiple-choice (MC) items
• Many topics
  - Data Collection
  - Data Representation
  - Measures of Center
  - Measures of Spread
  - Normal Distribution
  - Probability
  - Bivariate Quantitative Data
  - Bivariate Categorical Data
  - Sampling Distributions
  - Confidence Intervals
  - Significance Tests

CAOS Test

• 40 MC items
• Designed to assess students’ statistical reasoning after any first course in statistics.
• Focuses on statistical literacy and conceptual understanding, with an emphasis on reasoning about variability.
• Developed through a three-year process of acquiring and writing items, testing and revising items, and gathering evidence of reliability and validity.

CAOS Test

Reliability Analysis
• Sample of 10,287
• Cronbach’s alpha coefficient of .77
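For readers who want to see where a reliability figure like this comes from, here is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy 0/1 score matrix are illustrative assumptions; they are not CAOS data and not the authors' code.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per student)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across students
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of each student's total score
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong (1/0) scores for 5 students on 4 items
toy_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
print(f"alpha = {alpha:.3f}")
```

The same function applies to any scored item matrix, so it could be run on your own class data before comparing against the published coefficient.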


Content Validity Evidence

• 18 expert raters
• Unanimous agreement that CAOS measures important basic learning outcomes
• All raters agreed with the statement “CAOS measures outcomes for which I would be disappointed if they were not achieved by students who succeed in my statistics courses.”
• Some raters indicated topics that they felt were missing from the scale, but there was no agreement among these raters about which topics were missing.

START Test

• 14 MC items
• Identified through a principal components analysis performed on CAOS data gathered in Fall 2005 and Spring 2006 (n = 1470).
• The alpha coefficient calculated from that data set was 0.74.

Use of Quantitative Measures in a Phase I Study

• Exploratory Studies
  - What can we find out about students’ understanding?
  - Where are students having difficulties?
  - Are there inconsistencies in students’ reasoning?

Example Item 1

Measured Learning Outcome: Understanding the interpretation of a median in the context of boxplots.

The two boxplots below display final exam scores for all students in two different sections of the same course. Which section has a greater percentage of students with scores at or above 80?

a) Section A
b) Section B
c) Both sections are about equal.

How did students answer this item?

Response (N = 754)                 Pretest   Posttest
Section A                          73.7%     65.6%
Section B                           6.6%      6.1%
Both sections are about equal.     19.6%     28.2%

• Is this surprising?
• What can we learn from students’ responses to this item?
• Implications/Directions for research? Teaching?

Example Item 2

Measured Learning Outcome: Understanding that correlation does not imply causation.

Researchers surveyed 1,000 randomly selected adults in the U.S. A statistically significant, strong positive correlation was found between income level and the number of containers of recycling they typically collect in a week. Please select the best interpretation of this result.

a) We cannot conclude whether earning more money causes more recycling among U.S. adults because this type of design does not allow us to infer causation.
b) This sample is too small to draw any conclusions about the relationship between income level and amount of recycling for adults in the U.S.
c) This result indicates that earning more money influences people to recycle more than people who earn less money.

How did students answer this item?

Response (N = 743)   Pretest   Posttest
a)                   54.6%     52.6%
b)                   18.3%     11.4%
c)                   27.1%     35.9%

• Is this surprising?
• What can we learn from students’ responses to this item?
• Implications/Directions for research? Teaching?

Example Item 3

Measured Learning Outcome: Ability to match a scatterplot to a verbal description of a bivariate relationship.

Bone density is typically measured as a standardized score with a mean of 0 and a standard deviation of 1. Lower scores correspond to lower bone density. Which of the following graphs shows that as women grow older they tend to have lower bone density?

a) Graph A
b) Graph B
c) Graph C

How did students answer this item?

Response (N = 748)   Pretest   Posttest
Graph A              90.5%     92.5%
Graph B               6.1%      6.6%
Graph C               3.3%      0.9%

• Is this surprising?
• What can we learn from students’ responses to this item?
• Implications/Directions for research? Teaching?

Example Item 4

Measured Learning Outcome: Understanding of the purpose of randomization in an experiment.

A recent research study randomly divided participants into groups who were given different levels of Vitamin E to take daily. One group received only a placebo pill. The research study followed the participants for eight years to see how many developed a particular type of cancer during that time period. Which of the following responses gives the best explanation as to the purpose of randomization in this study?

a) To increase the accuracy of the research results.
b) To ensure that all potential cancer patients had an equal chance of being selected for the study.
c) To reduce the amount of sampling error.
d) To produce treatment groups with similar characteristics.
e) To prevent skewness in the results.

How did students answer this item?

Response (N = 754)   Pretest   Posttest
a)                   41.4%     31.8%
b)                   13.5%     19.8%
c)                   22.7%     29.4%
d)                    8.5%     12.3%
e)                   13.9%      6.6%
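One simple way to read a pretest/posttest response table is to compute the change in percentage points for each option. The sketch below does this for the Example Item 4 percentages reported on the slide; assigning the values to options a) through e) in that order is an assumption recovered from the slide layout, and the helper name `point_change` is hypothetical.

```python
# Percentages reported for Example Item 4 (N = 754), read in option order
# a) to e). That ordering is an assumption recovered from the slide layout.
pretest = {"a": 41.4, "b": 13.5, "c": 22.7, "d": 8.5, "e": 13.9}
posttest = {"a": 31.8, "b": 19.8, "c": 29.4, "d": 12.3, "e": 6.6}


def point_change(pre, post):
    """Return posttest minus pretest, in percentage points, for each option."""
    return {opt: round(post[opt] - pre[opt], 1) for opt in pre}


change = point_change(pretest, posttest)
for opt, delta in change.items():
    print(f"{opt}) {pretest[opt]:5.1f}% -> {posttest[opt]:5.1f}%  ({delta:+.1f} points)")
```

Listing the shifts side by side makes it easier to see which distractors gained ground over the course, which is a natural starting point for the discussion questions about this item.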

Example Item 4

• Is this surprising?
• What can we learn from students’ responses to this item?
• Implications/Directions for research? Teaching?

How Can We Use the Results?

• Begin to look for underlying reasons students are having difficulties
  - Examine the research literature
  - Interview students to gain a more in-depth understanding of their reasoning
  - Compare results with data from other classes (other teachers, schools)
• They can inform our instruction
  - Reconsider how difficult or easy some concepts are for students
  - Rethink how we currently teach these ideas
  - Add new activities or tools
  - Re-allocate classroom time
• Change the way we assess students
  - Assessment items better aligned with learning outcomes
  - Assessment items that probe students’ reasoning

SATS

• Survey of Attitudes Toward Statistics
• Candace Schau and Tom Dauphinee (http://www.unm.edu/~cschau/satshomepage.htm)
• Twenty-eight item survey
• Seven-point Likert scale response: 1 = Strongly Disagree, 4 = Neither agree nor disagree, 7 = Strongly Agree

Original four subscales
• Value (9 items; α range .80 - .90): “Statistics is worthless.”
• Affect (6 items; α range .80 - .85): “I like statistics.”
• Cognitive Competence (6 items; α range .77 - .85): “I have no idea of what’s going on in statistics.”
• Difficulty (7 items; α range .64 - .79): “Statistics is a complicated subject.”

Two additional subscales
• Interest (4 items): “I am interested in using statistics.”
• Effort (4 items): “I plan to complete all of my statistics assignments.”

Attitude is a multi-faceted outcome
• Issues to consider
  - Pre-existing attitudes
  - Direction and magnitude of changes over a semester
  - Relevance of items to study

Using the SATS: A Case Study

Assessment of a project-rich introductory statistics course
• Fall 2004, at Cleveland State University
  - Class 1: 30 students Pre/Post
  - Class 2: 16 students Pre/Post
  - SATS administered first day and final exam day

Class 1: Project-Rich
• 4 team projects that used/required
  - Real data
  - Computer software
  - Collaboration
  - Writing
• Individualized Mid-Term and Take-home Data Analysis Exams
• http://academic.csuohio.edu/holcombj/eku/index.html

Login: holcomb pwd: projects22

Class 2

• TI-83
• In-class demos
• Homework and Exams

Comparison of Pre Data

• No significant difference between Class 1 and Class 2

[Figures: boxplots of pre-course SATS scores (scale 1 to 7) by Class (1, 2), one panel each for PreAFFECT, PreCOGCOMP, PreVALUE, PreDIFFICULTY, PreINTEREST, and PreEFFORT]

Class 1: Change from Pre to Post (2-sided tests)

• Significant Differences for:
  - Cognitive Competence
  - Value
  - Difficulty*
  - Interest
• Insignificant Differences for:
  - Affect
  - Effort

* Not significant with nonparametric test

[Figure: Six Components for Class 1: Pre - Post. Boxplots of the six difference scores, with p-values as listed.]

Component:  diffAFFECT  diffCOGCOMP  diffVALUE  diffDIFFICULTY  diffINTEREST  diffEFFORT
p-value:    0.541       0.018        0.038      0.049           0.006         0.881

Class 2: Change from Pre to Post (2-sided tests)

• Significant Differences
  - Affect (wrong direction)
• Insignificant Differences
  - Cognitive Competence
  - Value
  - Difficulty
  - Interest
  - Effort

[Figure: Six Components for Class 2: Pre - Post. Boxplots of the six difference scores, with p-values as listed.]

Component:  diffAFFECT  diffCOGCOMP  diffVALUE  diffINTEREST  diffDIFFICULTY  diffEFFORT
p-value:    0.020       0.522        0.247      0.303         0.062           0.051

Multivariate Analysis of Post Data

Class: Significant vs Insignificant
• Significant Differences
  - Affect
  - Value
  - Interest
• Insignificant Differences
  - Cognitive Competence
  - Difficulty
  - Effort

Does SATS Ask the Right Questions?

• Value Component Questions
  - Statistics is worthless.
  - Statistics should be a required part of my professional training.
  - Statistical skills will make me more employable.
  - Statistics is not useful to the typical professional.
  - Statistical thinking is not applicable in my life outside my job.
  - I use statistics in my everyday life.
  - Statistics conclusions are rarely presented in everyday life.
  - I will have no application for statistics in my profession.
  - Statistics is irrelevant in my life.
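Because most of the Value items are negatively worded, a subscale score only makes sense after those items are reverse-coded. The sketch below shows one common approach: flip negative items on the 1 to 7 scale (8 minus the response), then average. Which items count as negative is an assumption read from the wording above, and averaging is assumed as the scoring rule; neither is taken from the official SATS scoring instructions. The names `VALUE_ITEMS` and `value_score` are hypothetical.

```python
# Value items in the order listed on the slide. The boolean marks items that
# read as negatively worded; this keying is an assumption based on wording,
# not the official SATS scoring key.
VALUE_ITEMS = [
    ("Statistics is worthless.", True),
    ("Statistics should be a required part of my professional training.", False),
    ("Statistical skills will make me more employable.", False),
    ("Statistics is not useful to the typical professional.", True),
    ("Statistical thinking is not applicable in my life outside my job.", True),
    ("I use statistics in my everyday life.", False),
    ("Statistics conclusions are rarely presented in everyday life.", True),
    ("I will have no application for statistics in my profession.", True),
    ("Statistics is irrelevant in my life.", True),
]

SCALE_MAX = 7  # seven-point scale: 1 = Strongly Disagree, 7 = Strongly Agree


def value_score(responses):
    """Mean of the nine Value responses after reverse-coding negative items."""
    if len(responses) != len(VALUE_ITEMS):
        raise ValueError("expected one response per Value item")
    if any(r < 1 or r > SCALE_MAX for r in responses):
        raise ValueError("responses must be between 1 and 7")
    adjusted = [
        (SCALE_MAX + 1 - r) if negative else r
        for r, (_, negative) in zip(responses, VALUE_ITEMS)
    ]
    return sum(adjusted) / len(adjusted)


# Hypothetical responses from one student, in the item order above
student = [2, 6, 5, 3, 2, 4, 3, 2, 1]
print(f"Value score = {value_score(student):.2f}")
```

With this coding, a higher score always means a more positive attitude, so pre and post scores can be compared directly.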

What are the Questions You Want to Ask?

Instructors: Do try this at home!

• But first, set your expectations
  - Results may not be as high as you desire by the end of your course (e.g., CAOS)
  - Results may not change from the beginning to the end of your course or in the direction you anticipate (e.g., SATS)
  - Same is true for other instruments, too

How might you use such data?

• To better understand students’ learning of particular concepts and skills
• To identify different patterns of student performance
• To establish a starting point for further inquiry
• To make your teaching and students’ learning more effective
• To assess where students start and to reveal areas of difficulty during the course

Some Practical Considerations

• Motivating students to take these instruments seriously
• Grading?
• Feedback
• Instrument integrity
• Time to administer
• Others?

INQUERI Project

• INQUERI = Initiative for Quantitative Education Research Infrastructure
  - To build a research infrastructure by focusing on the development, deployment, user training, and archiving of high quality research methods, instruments, and data
  - To disseminate these methods and results
  - To catalyze research collaborations
• See www.inqueri.org

Back to the Big Picture

• Focus on the question/goal you want to address and relate that to past research
• Start small
  - Using existing instruments is one way
  - Working within your own course to start
• Share with colleagues, connect with the literature, and then extend

References

• delMas, R., Garfield, J., Ooms, A., & Chance, B. (2006). Assessing students' conceptual understanding after a first course in statistics. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
• Garfield, J., delMas, R., & Chance, B. (n.d.). Assessment Resource Tools for Improving Statistical Thinking. Retrieved May 8, 2007, from https://app.gen.umn.edu/artist/index.html.
• http://www.unm.edu/~cschau/satshomepage.htm
• Dauphinee, T. L., Schau, C., & Stevens, J. J. (1997). Survey of Attitudes Toward Statistics: Factor structure and factorial invariance for females and males. Structural Equation Modeling, 4, 129-141.
• Schau, C., Stevens, J., Dauphinee, T. L., & Del Vecchio, A. (1995). The development and validation of the Survey of Attitudes Toward Statistics. Educational and Psychological Measurement, 55, 868-875.
• Hilton, S. C., Schau, C., & Olsen, J. A. (2003). Survey of Attitudes Toward Statistics: Factor structure invariance by gender and by administration time. Structural Equation Modeling, 11, 92-109.

Contact Information

• Sterling Hilton: [email protected]
• Andy Zieffler: [email protected]
• John Holcomb: [email protected]
• Marsha Lovett: [email protected]