Transcript Slide 1

Advancing Assessment of Quantitative and Scientific Reasoning
Donna L. Sundre
Amy D. Thelk
Center for Assessment and Research Studies (CARS)
James Madison University
www.jmu.edu/assessment/
Overview of talk
Current NSF Research project
History of the test instrument
Phase I: Results from JMU
Phase II: Future directions
Results from some of our partners:
Michigan State
Truman State
Virginia State
Current NSF Project
3-year grant funded by National Science Foundation:
“Advancing assessment of scientific and quantitative
reasoning”
Hersh & Benjamin (2002) listed four barriers to assessing
general education learning outcomes:
• confusion;
• definitional drift;
• lack of adequate measures; and
• misconception that general education cannot be measured
This project addresses all of these concerns with special
emphasis on the dearth of adequate measures
Objective of NSF project
Exploring the psychometric quality and
generalizability of JMU’s Quantitative and
Scientific Reasoning instruments to institutions
with diverse missions and serving diverse
populations.
Partner Institutions
Virginia State University: State-supported; Historically Black
institution
Michigan State University: State-supported; Research
institution
Truman State University: State-supported; Midwestern liberal
arts institution
St. Mary’s University (Texas): Independent; Roman Catholic;
Hispanic-Serving institution
Project phases
Phase I: First Faculty institute (conducted July 2007 at JMU);
followed by data collection, identification of barriers, and
reporting of results
Phase II: Validity studies (to be developed and discussed
during second faculty institute, July 2008), dissemination
of findings and institutional reports
History of the instrument
Natural World test, developed at JMU, currently in its 9th version
Successfully used for assessment of General Education
program effectiveness in scientific and quantitative
reasoning
Generates two subscores: SR and QR
Summary of results since 2001
Table of Results -- 5 Test Versions.doc
Adaptation of an instrument
JMU instrument has been carefully scrutinized for over 10
years
The QR and SR are currently administered at over 25
institutions across the nation
NSF decided to fund this CCLI project to further study
procedures for adoption and adaptation of instruments and
assessment models
Evaluating the generalizability of the instrument
Step 1: Mapping Items to Objectives
Relating test items to stated objectives for each institution
• In the past, the back-translation method was used (Dawis, 1987)
• Participants at the NSF Faculty Institute used a new
content alignment method that was reported at
NCME (Miller, Setzer, Sundre & Zeng, 2007)
• Forms were custom-made for each institution
Example Content Alignment form.doc
Early content validity evidence
Results strongly support generalizability of test items
• Truman State: 100% of items mapped to their objectives
• Michigan State: 98% (1 item not mapped)
• Virginia State: 97% (2 items not mapped)
• St. Mary’s: 92% (5 items not mapped)
Mapping of items alone is not sufficient
Balance across objectives must be obtained
Teams then created additional items to cover identified gaps in
content coverage: 14 for MSU; 11 for St. Mary’s; 10 for Truman State; 4 for VSU
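The two checks described on this slide (what share of items map to at least one objective, and whether every objective is covered) can be sketched as follows. This is a minimal illustration, not the alignment procedure used at the Faculty Institute: the function name, item labels, objective labels, and ratings are all hypothetical.

```python
def alignment_summary(item_to_objectives, objectives):
    """Summarize a content alignment exercise.

    item_to_objectives: dict mapping each item to the list of objectives
    that faculty judged it to measure (empty list = unmapped item).
    objectives: list of the institution's stated objectives.
    """
    counts = {obj: 0 for obj in objectives}
    mapped = 0
    for objs in item_to_objectives.values():
        if objs:
            mapped += 1
        for obj in objs:
            counts[obj] += 1
    pct_mapped = 100.0 * mapped / len(item_to_objectives)
    # Objectives with no items are gaps that new items would need to cover.
    gaps = [obj for obj, n in counts.items() if n == 0]
    return pct_mapped, counts, gaps


# Hypothetical ratings for a four-item test and three objectives.
ratings = {
    "item1": ["Obj1"],
    "item2": ["Obj1", "Obj2"],
    "item3": [],
    "item4": ["Obj2"],
}
pct_mapped, counts, gaps = alignment_summary(ratings, ["Obj1", "Obj2", "Obj3"])
print(pct_mapped, counts, gaps)
```

In this toy case most items map, yet one objective has no items at all, which is the situation that prompts writing additional items.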
Step 2: Data Collection and Analysis
During Fall 2007 semester, test was administered to students at 3 of the 4
partner institutions
Spring 2008 – data collection from students at sophomore level or above
Results so far
• Means not given: This activity is not intended to promote
comparison of students across institutions
• At this stage, reliabilities provide the most compelling
generalizability evidence; of course, the upcoming validity studies
will be informative
Score              JMU freshmen   SMU freshmen   TSU Jrs/Srs   VSU       MSU
                   N=1408         N=426          N=345         N=653     N=1029
QR                 α = .64        α = .63        α = .66       α = .55   --
SR                 α = .71        α = .75        α = .72       α = .65   --
Total Score NW-9   α = .78        α = .81        α = .79       α = .73   α = .71
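The α values above are reliability coefficients, presumably Cronbach's alpha (the slides do not name the coefficient). As a rough sketch of how such a value is computed from item-level data, assuming one row per examinee and one column per item; the response data below are hypothetical and are not drawn from any of the partner institutions.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of examinee rows of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong (1/0) responses: 5 examinees, 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha tends to rise with the number of items, which fits the pattern in the table: the total score is more reliable than either the QR or SR subscore.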
Research at JMU
Standard Setting to aid in interpretation
Validity evidence: Instrument aligns with curriculum
Standard Setting
Used Angoff Method to set standards
Our process was informal, unique
Results look meaningful but we’ll reevaluate as we collect
more data in upcoming administrations
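For readers unfamiliar with the Angoff method, the sketch below shows the textbook version: each judge estimates, item by item, the probability that a minimally competent student answers correctly, and the cut score is the average of the judges' summed estimates. It is not the informal JMU variant described above; the function names, ratings, and student scores are hypothetical.

```python
def angoff_cut_score(ratings):
    """Textbook Angoff cut score.

    ratings: one list per judge; each entry is that judge's estimated
    probability that a minimally competent student answers the item correctly.
    """
    judge_totals = [sum(judge) for judge in ratings]
    return sum(judge_totals) / len(judge_totals)


def proportion_meeting(scores, cut):
    """Share of students whose raw score is at or above the cut score."""
    return sum(score >= cut for score in scores) / len(scores)


# Hypothetical ratings from two judges on a four-item test.
judge_ratings = [
    [0.5, 0.75, 0.25, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
cut = angoff_cut_score(judge_ratings)

# Hypothetical raw scores for four students.
student_scores = [1, 2, 3, 4]
share = proportion_meeting(student_scores, cut)
print(cut, share)
```

The second function corresponds to the quantity reported when results are summarized as the proportion of students meeting a standard.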
Faculty Objective Standards
[Figure: Proportion of students meeting faculty objective standards. Vertical axis: proportion of students meeting standard (0.00 to 1.00). Horizontal axis: Objective 1 through Objective 8, QR-9, NW-9 Total. Series: Freshmen (no CL3 experience); CL3 Package completers.]
Validity evidence for instrument and curriculum at JMU
Variables
Pearson’s r
Freshman QR9 score & AP credits
Freshman QR9 score & DE credits
Freshman SR9 score & AP credits
0.28
Freshman SR9 score & DE credits
0.20
0.21
0.24
Validity evidence for instrument and curriculum at JMU -- 2
Variables                          Pearson’s r
Soph/Jr. NW9 score & AP credits    0.16
Soph/Jr. NW9 score & DE credits    0.01
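The coefficients on these validity slides are Pearson correlations between test scores and credit counts. A minimal sketch of the calculation follows; the function name and the small data set are hypothetical and only illustrate the formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: transfer credits and test scores for three students,
# chosen to be perfectly linear so the result is easy to check by hand.
credits = [1, 2, 3]
test_scores = [2, 4, 6]
r = pearson_r(credits, test_scores)
print(r)
```

Values near zero indicate little linear relationship between prior credits and test performance.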
Phase II studies
Samples of Upcoming Studies:
Correlational Studies: Is there a relationship between scores
on the QR/SR and other standardized tests? … and other
academic indicators?
Comparison of means or models: Is there a variation in the
level of student achievement based upon demographic
variables? Is there a relationship between scores on the
QR/SR and declared majors? Can this instrument be used
as a predictor for success and/or retention for specific
majors?
Qualitative Research: Will institutional differences be
reflected in the results of a qualitative interview that
accompanies the administration of QRSR?
References
Dawis, R. (1987). Scale construction. Journal of
Counseling Psychology, 34, 481-489.
Hersh, R. H., & Benjamin, R. (2002). Assessing
selected liberal education outcomes: A new approach.
Peer Review, 4(2/3), 11-15.
Miller, B. J., Setzer, C., Sundre, D. L., & Zeng, X. (2007,
April). Content validity: A comparison of two
methods. Paper presentation to the National Council on
Measurement in Education. Chicago, IL.