Assessing Students with Severe Disabilities

Transcript: Assessing Students with Severe Disabilities

Assessing Learning for
Students with Disabilities
Tom Haladyna
Arizona State University
Useful Sources

Standards for Educational and Psychological Testing. AERA, APA, & NCME (1999).

Tindal & Haladyna (2002). Large-scale assessment programs for all students: Validity, technical adequacy, and implementation.

Downing & Haladyna (2006). Handbook of test development.

Haladyna & Downing (2004). Construct-irrelevant variance in high-stakes testing. Educational Measurement: Issues and Practice.

Kane (2006). Content-related validity evidence. In Handbook of test development.

Kane (in press). Validation. In Educational Measurement (4th ed.).
Assessment vs. Testing

Assessment is the act of judging the
indicators of student achievement for the
benefit of planning future instruction.

Testing is a way of providing one valid source of information for assessment.

A test is never a valid source of information for assessment unless it is corroborated by other evidence: use multiple indicators.
Validity of a Test Score Interpretation or Use

A way of reasoning about test scores.

Concerned about the accuracy of any interpretation or use.

Involves an argument about how an assessment or a test score can be validly interpreted or used.

Involves a claim by the developer/user.

Involves evidence that might support this claim.
Validation’s Steps

Developmental phase
  State a purpose for the test.
  Define the trait (construct): content and cognitive demand.
  Develop the test.

Investigative phase
  Validate: conduct the study.
Two Types of Evidence
Evidence that supports our claim.

Evidence that weakens or threatens validity:
  Construct underrepresentation
  Construct-irrelevant variance
Two Types of Evidence

Includes procedures known to strengthen our
argument and support our claim.

Includes statistical/empirical information that also strengthens our argument and supports our claim.
More Types of Evidence

Content-related
Reliability
Item quality
Test design
Test administration
Test scoring
Test reporting
Consequences
Content

Structure: subscores?

Concurrent: how it correlates with other information (see the sketch below).

Does it represent the construct (content)?
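One way to look at concurrent evidence is to correlate test scores with another indicator of the same achievement, such as teacher ratings. The following is a minimal sketch in Python; the scores and ratings are hypothetical, invented only for illustration.

```python
# Minimal sketch: correlating test scores with another indicator
# (hypothetical data, for illustration only).
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

test_scores     = [12, 18, 25, 30, 34, 40]   # hypothetical test scores
teacher_ratings = [2, 2, 3, 4, 4, 5]         # hypothetical second indicator

print(f"concurrent correlation r = {pearson(test_scores, teacher_ratings):.2f}")
```

A strong correlation with an independent indicator supports the claim; a weak one is a signal to look for construct underrepresentation or construct-irrelevant variance.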
Reliability

A very important type of validity evidence.

Can be applied to individual or group scores.

Group scores tend to be very reliable.

Can focus on reliability at a decision point, such as a cut score.

Subjective judgment is a factor in reliability.
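As a concrete illustration, one widely used internal-consistency estimate is Cronbach's alpha. The sketch below computes it from a small made-up matrix of 0/1 item responses; it is not from the talk, just an example of the kind of reliability estimate that can serve as validity evidence.

```python
# Minimal sketch: Cronbach's alpha from a small hypothetical response matrix
# (rows = students, columns = items scored 0/1).
def cronbach_alpha(rows):
    k = len(rows[0])                      # number of items
    def var(xs):                          # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([row[i] for row in rows]) for i in range(k)]
    total_var = var([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

responses = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
]
print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
```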
Random Error

Basis for reliability.

Can be large or small.

Can be positive or negative.

We never know its value; we can only estimate it.

Estimation allows us to speculate about where a student's true score lies and what action to take.
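In classical test theory the observed score is modeled as true score plus random error (X = T + E), and the standard error of measurement (SEM) turns a reliability estimate into a band around an observed score. A minimal sketch with hypothetical numbers:

```python
# Minimal sketch: SEM and an approximate 95% band around an observed score.
# The standard deviation, reliability, and score below are hypothetical.
from math import sqrt

score_sd    = 8.0    # standard deviation of observed scores
reliability = 0.90   # reliability estimate for the scores
observed    = 42     # one student's observed score

sem = score_sd * sqrt(1 - reliability)            # SEM = SD * sqrt(1 - r)
low, high = observed - 1.96 * sem, observed + 1.96 * sem

print(f"SEM = {sem:.2f}")
print(f"~95% band for the true score: {low:.1f} to {high:.1f}")
```

Reporting a band rather than a single number keeps decisions near a cut score appropriately cautious.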
Item Quality

Universal item design

Format issues

Item reviews

Field tests
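For the field-test step, a simple item analysis flags weak items before operational use. The sketch below computes item difficulty (proportion correct) and an upper-lower discrimination index from a small hypothetical 0/1 response matrix; it is an illustration, not a procedure from the talk.

```python
# Minimal sketch: field-test item statistics from hypothetical data
# (rows = students, columns = items scored 0/1).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

n_students = len(responses)
totals = [sum(row) for row in responses]
order = sorted(range(n_students), key=lambda i: totals[i], reverse=True)
upper, lower = order[: n_students // 2], order[n_students // 2:]

for item in range(len(responses[0])):
    p = sum(row[item] for row in responses) / n_students           # difficulty
    d = (sum(responses[i][item] for i in upper)
         - sum(responses[i][item] for i in lower)) / len(upper)    # discrimination
    print(f"item {item + 1}: p = {p:.2f}, D = {d:.2f}")
```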
Test Design

Breadth

Scope

Depth

Length

Formats
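Breadth, depth, and length can be pinned down in a test blueprint before any items are written. The sketch below is a hypothetical blueprint crossing content strands with cognitive demand; the strands, levels, and counts are invented for illustration.

```python
# Hypothetical test blueprint: (content strand, cognitive demand) -> item count.
blueprint = {
    ("Number sense",  "Recall"):      6,
    ("Number sense",  "Application"): 4,
    ("Measurement",   "Recall"):      5,
    ("Measurement",   "Application"): 5,
    ("Data analysis", "Recall"):      4,
    ("Data analysis", "Application"): 6,
}

total_items = sum(blueprint.values())
print(f"planned test length: {total_items} items")
for strand in sorted({s for s, _ in blueprint}):
    n = sum(v for (s, _), v in blueprint.items() if s == strand)
    print(f"  {strand}: {n} items ({n / total_items:.0%})")
```

Checking these proportions against the construct definition is one guard against construct underrepresentation.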
Test Administration

Standardized

Accommodations

Standards
Test Scoring

Avoid errors.

Quality control is important.

Invalidate scores when the evidence calls them into question.
Score Reporting

Helpful to teachers for assessment

Meets requirements for accountability

Meets standards for score reporting (Ryan, 2006).
Advice

Document what you do in a technical report.

Build the case for validity.

Do validity studies when possible.

Stay focused on the real reason for assessment and testing: helping students learn, not satisfying someone in DC.