EDP 303 Presentations
Lawrence W. Sherman, Ph.D.
PowerPoint Slides for EDP 303: Assessment and Evaluation in Educational Settings
Linn and Miller (2004) text
Chapter 1: Educational Testing and Assessment: Context, Issues, and Trends

- Accountability demands
- State, national, and international assessment programs
- National content and performance standards
- Global competition
- Fairness of uses and interpretations
Chapter 1: Educational Testing and Assessment: Context, Issues, and Trends

Accountability demands, including state, national, and international assessment programs, national content and performance standards, as well as global competition, all contribute to increased demands for testing and assessment. These factors have both stimulated and reflected new trends in educational measurement. The increased reliance on testing and assessment as educational reform tools has also raised issues concerning the fairness of their uses and their interpretations.
Nature of Assessment (Chapter 2 issues)

- Maximum Performance
  - Function: Determines what individuals can do when performing at their best.
  - Examples: Aptitude and achievement tests
- Typical Performance
  - Function: Determines what individuals will do under natural conditions.
  - Examples: Attitude, interest, and personality inventories; observational techniques; peer appraisal
Form of Assessment

- Fixed-Choice Test
  - Function: Efficient measurement of knowledge and skills; an indirect indicator.
  - Examples: Standardized multiple-choice tests
- Complex-Performance Assessment
  - Function: Measurement of performance in contexts and on problems valued in their own right.
  - Examples: Hands-on laboratory experiments, projects, essays, oral presentations
Tests Used in Classroom Instruction

- Placement
- Formative
- Diagnostic
- Summative
Placement

- Function: Determines prerequisite skills, degree of mastery of course goals, and/or best mode of learning.
- Examples:
  - Readiness tests
  - Aptitude tests
  - Pretests on course objectives
  - Self-report inventories
  - Observational techniques
Formative Assessment

- Function:
  - Determines learning progress
  - Provides feedback to reinforce learning
  - Corrects learning errors
- Examples:
  - Teacher-made tests
  - Custom-made tests from textbook publishers
  - Observational techniques
Diagnostic Assessment

- Function: Determines causes (intellectual, physical, emotional, environmental) of persistent learning difficulties.
- Examples:
  - Published diagnostic tests
  - Teacher-made diagnostic tests
  - Observational techniques
Summative Evaluation

- Function:
  - Determines end-of-course achievement for assigning grades
  - Certifies mastery of objectives
- Examples:
  - Teacher-made survey tests
  - Performance rating scales
  - Product scales
Methods of Interpreting Results

- Criterion-Referenced
  - Function: Describes student performance according to a specified domain of clearly defined learning tasks (e.g., adds single-digit whole numbers).
  - Examples: Teacher-made tests, custom-made tests from test publishers, observational techniques
- Norm-Referenced
  - Function: Describes student performance according to relative position in some known group (e.g., ranks 10th out of 30; top 10 percent).
  - Examples: Standardized aptitude and achievement tests, teacher-made survey tests, interest inventories, adjustment inventories

(A brief sketch contrasting the two interpretations follows.)
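To make the contrast concrete, here is a minimal Python sketch that interprets the same raw score both ways. The class scores, domain size, and 80 percent mastery cutoff are invented for illustration; only the two kinds of statements it prints come from the slide.

# Criterion- vs. norm-referenced interpretation of one score.
# The class scores and the mastery cutoff are hypothetical.

def criterion_referenced(correct: int, domain_size: int, cutoff: float = 0.80) -> str:
    """Describe performance against a defined domain of learning tasks."""
    proportion = correct / domain_size
    mastery = "mastered" if proportion >= cutoff else "not yet mastered"
    return f"{proportion:.0%} of the task domain correct ({mastery})"

def norm_referenced(score: int, group_scores: list[int]) -> str:
    """Describe performance by relative position in a known group."""
    rank = 1 + sum(1 for s in group_scores if s > score)   # 1 = highest
    below = sum(1 for s in group_scores if s < score)
    percentile = round(100 * below / len(group_scores))
    return f"ranks {rank} of {len(group_scores)} (percentile rank ~{percentile})"

class_scores = [12, 15, 18, 20, 21, 22, 24, 25, 27, 28]    # hypothetical group
student = 24
print(criterion_referenced(student, domain_size=30))        # vs. the domain
print(norm_referenced(student, class_scores))               # vs. the group

The same raw score supports both statements; which one is reported depends on the purpose chosen when the assessment was planned.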
Chapter 3: Instructional Goals and Objectives: Foundations for Assessment

What types of learning outcomes do you expect from your teaching?

- Knowledge?
- Understanding?
- Applications?
- Thinking skills?
- Performance skills?
- Attitudes?

Clearly defining desired learning outcomes is the first step in good teaching. It is also essential to the assessment of student learning. Sound assessment requires relating the assessment procedures as directly as possible to intended learning outcomes.
Chapter 3: Instructional Goals
Types of Learning Outcomes to Consider: Bloom's Taxonomy of Educational Objectives

- Cognitive Domain
  - Knowledge and intellectual skills/abilities
- Affective Domain
  - Attitudes, interests, appreciation
- Psychomotor Domain
  - Perceptual and motor skills
Chapter 3: Instructional Goals

Other sources for lists of objectives:
- Professional Association Standards
  - MCREL
- State Content Standards
  - OHIO
Chapter 3: Instructional Goals

Some Criteria for Selecting Appropriate Objectives:
1. Do the objectives include all important outcomes of the course?
2. Are the objectives in harmony with the content standards of the state or district and with general goals of the school?
3. Are the objectives in harmony with sound principles of learning?
4. Are the objectives realistic in terms of the abilities of the students and the time and facilities available?
Chapter 4: Validity

When constructing or selecting assessments, the most important questions are:
1. To what extent will the interpretation of the scores be appropriate, meaningful, and useful for the intended application of the results?
2. What are the consequences of the particular uses and interpretations that are made of the results?
3. Remember: a valid test must be reliable!
Chapter 4: Validity Issues

- Nature of Validity
- Major Considerations in Assessment Validation
- Content Considerations
- Construct Considerations
- Assessment-Criterion Relationships
- Consideration of Consequences
- Factors Influencing Validity
Nature of Validity

- Refers to the appropriateness of the interpretation of the results
- Is a matter of degree
- Is specific to some particular use or interpretation
- Is a unitary concept
- Involves an overall evaluative judgment
Major Considerations in Validation

- Content
  - How well the assessment represents the domain of tasks to be measured
- Construct
  - Interpretation as a meaningful measure of some characteristic or quality
- Assessment-Criterion Relationship
  - Prediction of future performance (the criterion)
- Consequences
  - How well results accomplish intended purposes and avoid unintended effects
Chapter 5: Reliability

Next to validity, reliability is the most important characteristic of assessment results. Reliability (1) provides the consistency that makes validity possible (an unreliable test cannot be valid!), and (2) indicates the degree to which various kinds of generalizations are justifiable. The practicality of the evaluation procedure is, of course, also of concern to the busy classroom teacher.
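The warning that an unreliable test cannot be valid has a precise counterpart in classical test theory. The slides do not state this formula; it is the standard attenuation bound, under the usual classical-test-theory assumptions:

r_{xy} \le \sqrt{r_{xx}\, r_{yy}} \le \sqrt{r_{xx}}

where r_{xy} is the validity coefficient of test x against criterion y, and r_{xx} and r_{yy} are their reliability coefficients. For example, a test with reliability 0.49 can correlate at most \sqrt{0.49} = 0.70 with any criterion, no matter how appropriate its content.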
Chapter 5: Reliability Issues

- Nature of Reliability
- Determining Reliability by Correlation Methods
- Standard Error of Measurement (see the sketch after this list)
- Factors Influencing Reliability Measures
- Reliability of Assessments Evaluated in Terms of a Fixed Performance Standard
- Usability
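As a rough illustration of two items on this list, the sketch below estimates reliability by a correlation method and then the standard error of measurement. The score data are invented; the formulas are the conventional ones (Pearson r between two administrations, and SEM = s * sqrt(1 - r)).

import statistics as st  # st.correlation requires Python 3.10+

# Hypothetical scores for the same ten students on two administrations.
form_a = [55, 62, 68, 70, 74, 78, 81, 85, 90, 95]
form_b = [58, 60, 70, 69, 76, 75, 83, 84, 92, 93]

# Stability (test-retest) reliability estimated by Pearson correlation.
r = st.correlation(form_a, form_b)

# Standard error of measurement: SEM = s * sqrt(1 - r),
# where s is the standard deviation of the obtained scores.
s = st.stdev(form_a)
sem = s * (1 - r) ** 0.5

print(f"reliability (Pearson r) = {r:.2f}")
print(f"SEM = {sem:.1f} score points")
# A rough 68% confidence band for an obtained score X is X +/- 1 SEM.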
Chapter 6: Planning Tests: Timing

- Preparation (Planning)
- Administration
- Grading
- Post-Test Analysis!
Chapter 6: Planning Tests

- Objective Tests
  - A. Supply Type
    - Short Answer
    - Completion
  - B. Selection Type
    - True-False or Alternative-Response
    - Matching
    - Multiple Choice
- Performance Assessment
  - Extended Response
  - Restricted Response
Table of Specifications (similar to Tables 6.2-6.4 in Chapter 6)

- Rows (Content): drawn from national standards, state standards, and specific objectives
- Columns (Bloom's Taxonomy): Knowledge | Understanding | Application
- Item formats to enter in the cells:
  1. Short Answer
  2. True/False
  3. Multiple Choice
  4. Matching

(An illustrative sketch of such a blueprint follows.)
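One way to see how a blueprint keeps item writing aligned with objectives is to treat it as a grid of planned item counts. The Python sketch below is a hypothetical blueprint; the content rows, the levels shown, and the counts are invented, not taken from the text.

# Hypothetical table of specifications: planned item counts for each
# content area (rows) crossed with Bloom levels (columns).
levels = ["Knowledge", "Understanding", "Application"]
blueprint = {
    "Fractions (state standard)":    [4, 3, 3],
    "Decimals (specific objective)": [3, 4, 3],
}

print(f"{'Content':32}" + "".join(f"{lvl:>14}" for lvl in levels) + f"{'Total':>7}")
for content, counts in blueprint.items():
    print(f"{content:32}" + "".join(f"{c:>14}" for c in counts) + f"{sum(counts):>7}")
col_totals = [sum(col) for col in zip(*blueprint.values())]
print(f"{'Total items':32}" + "".join(f"{t:>14}" for t in col_totals) + f"{sum(col_totals):>7}")

Row totals show the weight given to each content area; column totals show the balance across cognitive levels.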
Chapter 7: Simple Forms

- Short-Answer
- True-False
- Matching
Short-Answer Issues (Chapter 7, page 178)

1. Is this the most appropriate type of item for the intended learning outcomes?
2. Can the items be answered with a number, symbol, word, or brief phrase?
3. Has textbook language been avoided?
4. Have the items been stated so that only one response is correct?
5. Are the answer blanks equal in length?
6. Are the answer blanks at the end of the items?
7. Are items free of clues (such as a or an)?
8. Has the degree of precision been indicated for numerical answers?
9. Have the units been indicated when numerical answers are expressed in units?
10. Have the items been phrased so as to minimize spelling errors?
11. If revised, are the items still relevant to the intended learning outcomes?
12. Have the items been set aside for a time before reviewing?
True-False Items (Chapter 7, p. 185)

1. Is this the most appropriate type of item to use?
2. Can each statement be clearly judged true or false?
3. Have specific determiners been avoided (e.g., usually, always)?
4. Have trivial statements been avoided?
5. Have negative statements (especially double negatives) been avoided?
6. Have the items been stated in simple, clear language?
7. Are opinion statements attributed to some source?
8. Are the true and false items approximately equal in length?
9. Is there an approximately equal number of true and false items?
10. Has a detectable pattern of answers been avoided (e.g., T, F, T, F, T, F)?
11. If revised, are the items still relevant to the intended learning outcome?
12. Have the items been set aside for a time before reviewing them?
Matching Items (Chapter 7, p. 190)

1. Is this the most appropriate type of item to use?
2. Is the material in the two lists homogeneous?
3. Is the list of responses longer or shorter than the list of premises?
4. Are the responses brief and on the right-hand side?
5. Have the responses been placed in alphabetical or numerical order?
6. Do the directions indicate the basis for matching?
7. Do the directions indicate that each response may be used more than once?
8. Is all of each matching item on the same page?
9. If revised, are the items still relevant to the intended learning outcomes?
10. Have the items been set aside for a time before reviewing them?
Chapter 8: Multiple-Choice Items (Chapter 8, p. 214)

1. Is this the most appropriate type of item to use?
2. Does each item stem present a meaningful problem?
3. Are the item stems free of irrelevant material?
4. Are the item stems stated in positive terms (if possible)?
5. If used, has negative wording been given special emphasis (e.g., capitalized, underlined)?
6. Are the alternatives grammatically consistent with the item stem?
7. Are the alternative answers brief and free of unnecessary words?
8. Are the alternatives similar in length and form?
9. Is there only one correct or clearly best answer?
10. Are the distracters plausible to low achievers?
11. Are the items free of verbal clues to the answer?
12. Are the verbal (or numerical) alternatives in alphabetical (or numerical) order?
13. Have "none of the above" and "all of the above" been avoided (or used sparingly and appropriately)?
14. If revised, are the items still relevant to the intended learning outcomes?
15. Have the items been set aside for a time before reviewing them?
Chapter 10: Measuring Complex Achievement: Essay Questions

Some important learning outcomes may best be measured by the use of open-ended essay questions or other types of "performance" assessments. Essay questions provide the freedom of response that is needed to adequately assess students' ability to formulate problems; organize, integrate, and evaluate ideas and information; and apply knowledge and skills.
Essay Questions Checklist

1. Is this the most appropriate type of task to use?
2. Are the questions designed to measure higher-level learning outcomes?
3. Are the questions relevant to the intended learning outcomes?
4. Does each question clearly indicate the response expected?
5. Are the students told the bases on which their answers will be evaluated?
6. Have you conceptualized a "rubric" upon which the response will be scored?
7. Are generous time limits provided for responding to the questions?
8. Are students told the time limits and/or point values for each question?
9. Are all students required to respond to the same questions?
10. If revised, are the questions still relevant to the intended learning outcomes?
11. Have the questions been set aside for a time before reviewing them?
CHAPTER 11: PERFORMANCE-BASED ASSESSMENTS

Essay tests are the most common example of a performance-based assessment, but there are many others, including artistic productions, experiments in science, oral presentations, and the use of mathematics to solve real-world problems. The emphasis is on doing, not merely knowing: on PROCESS as well as PRODUCT.
Suggestions for Constructing Performance Tasks

1. Focus on learning outcomes that require complex cognitive skills and student performances.
2. Select or develop tasks that represent both the content and the skills that are central to important learning outcomes.
3. Minimize the dependence of task performance on skills that are irrelevant to the intended purpose of the assessment task.
4. Provide the necessary scaffolding for students to be able to understand the task and what is expected.
5. Construct the task directions so that the student's task is clearly indicated.
6. Clearly communicate performance expectations in terms of the scoring rubrics by which the performances will be judged.
Chapter 12: Portfolios
Key Steps in Defining, Implementing, and Using Portfolios

1. Specify purpose
2. Provide guidelines for selecting portfolio entries
3. Define student role in selection and self-evaluation
4. Specify evaluation criteria
5. Use portfolios in instruction and communication
Chapter 14: Assembling, Administering, and Appraising Classroom Tests and Assessments

Care in preparing an assessment plan and constructing relevant test items and assessment tasks should be followed by similar care in reviewing and editing the items and tasks, preparing clear directions, and administering and appraising the results. Classroom assessments also can be improved by using simple methods to analyze student responses and by building a file of effective items and tasks.
Chapter 14: Assembling, Administering

- Assembling the Classroom Test
- Administering and Scoring Classroom Tests and Assessments
- Appraising Classroom Tests and Assessments
- Building a File of Effective Items and Tasks
Flow Chart of Testing Process

STUDENT LEARNING
PLANNING -> DEVELOPMENT -> APPLICATION
PLANNING

1. CLARIFY INSTRUCTIONAL OBJECTIVES: THE CONTENT OR DOMAIN OBJECTIVES.
2. SPECIFY WHAT WILL BE TESTED (BLOOM'S TAXONOMY):
   KNOWLEDGE
   COMPREHENSION
   APPLICATION
   ANALYSIS
   SYNTHESIS
   EVALUATION
3. DEVELOP A TEST BLUEPRINT.
DEVELOPMENT OF THE TEST

4. SELECT THE TEST ITEM FORMATS:
   TRUE/FALSE
   MULTIPLE CHOICE
   MATCHING
   COMPLETION
   ESSAY
   CONTEXT-DEPENDENT ITEMS
5. PREPARE/COMPOSE/WRITE OR SELECT THE ITEMS FROM AN ARCHIVE FOR THE TEST.
GO TO APPLICATION.
APPLICATION

6. ASSEMBLE THE TEST:
   CONSIDER REASONABLE TIMING BOUNDARIES
   COLLECT TEST ITEMS
   REVIEW TEST ITEMS
   FORMAT TEST
   PREPARE DIRECTIONS
7. ADMINISTER THE TEST:
   PROVIDE ENOUGH TIME
   HAVE ALL MATERIALS (PENCILS, PAPER, SCANNER SHEETS, ETC.)
8. SCORE THE TEST:
   MASTERY SCORING? NORMATIVE SCORING?
   MACHINE SCORE? HAND SCORE?
   DISCUSS RESULTS WITH STUDENTS
9. ANALYZE AND REVISE ITEMS (SEE THE SKETCH BELOW):
   ITEM DIFFICULTY
   DISCRIMINATION INDEX
   REVISE ITEMS AND STORE IN ARCHIVE FOR FUTURE USE
   RETURN TO STEP 5 AND STORE REVISIONS
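To illustrate step 9, here is a minimal item-analysis sketch in Python for a single item. The response data are invented; the formulas are the conventional ones: difficulty p is the proportion answering correctly, and the discrimination index D is p for the upper group minus p for the lower group, using the customary top and bottom 27 percent.

# Item analysis for one test item. Entries are students sorted by total
# test score, highest first; 1 = answered this item correctly, 0 = missed it.
# The data are hypothetical.
responses = [1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

n = len(responses)
k = max(1, round(n * 0.27))            # conventional 27% upper/lower groups
upper, lower = responses[:k], responses[-k:]

difficulty = sum(responses) / n        # p: proportion of all students correct
discrimination = sum(upper) / k - sum(lower) / k   # D = p_upper - p_lower

print(f"item difficulty p = {difficulty:.2f}")     # ~0.50 is maximally informative
print(f"discrimination D = {discrimination:.2f}")  # near-zero or negative items
                                                   # go back for revision (step 5)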
Chapter 15: Grading and Reporting

Grading and reporting student progress is one of the more frustrating aspects of teaching: there are so many factors to consider, and so many decisions to be made. This chapter will remove some of the complexity by describing the various types of grading and reporting systems and providing guidelines for their effective use.
Chapter 15: Grading and Reporting

- Functions of Grading and Reporting Systems
- Types of Grading and Reporting Systems
- Multiple Grading and Reporting Systems
- Assigning Letter Grades (see the sketch below)
- Record-Keeping and Grading Software
- Conducting Parent-Teacher Conferences
- Reporting Standardized Test Results to Parents
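As a small illustration of one topic on this list, assigning letter grades, here is a minimal Python sketch of an absolute (percent-based) grading scale. The cutoffs are hypothetical, not taken from the chapter; the point is only that the scale is fixed in advance and applied uniformly.

# Absolute grading: letter grades from fixed percent cutoffs.
# The cutoffs below are hypothetical.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (0, "F")]

def letter_grade(percent: float) -> str:
    """Return the letter grade for a composite percent score."""
    for cutoff, grade in CUTOFFS:
        if percent >= cutoff:
            return grade
    return "F"

for score in (93.5, 81.0, 69.9, 42.0):       # sample composite scores
    print(f"{score:5.1f}% -> {letter_grade(score)}")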