Assessment - Touchstone Education Systems


Assessment
Week 2
Welcome Back!






Agenda:
Due Dates
Self Assessments
Traditional vs. Current Reasons for Assessment
New Information: Reliability and Validity (zzzzz…)
Stay Awake!
Just for fun…
Meet with your partner and solve this problem.
What pattern do you see?
8, 5, 4, 9, 1, 7, 6, 3, 2, 0
Independently complete:

An Educational Assessment Confidence Inventory

Discussion
What pattern do you see?
8, 5, 4, 9, 1, 7, 6, 3, 2, 0
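For readers who want to verify the classic answer to this puzzle (the digits appear in the alphabetical order of their English names), a quick Python check:

```python
# Map each digit to its English name, then sort the digits by name.
names = {0: "zero", 1: "one", 2: "two", 3: "three", 4: "four",
         5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine"}
ordered = sorted(names, key=lambda d: names[d])
print(ordered)  # [8, 5, 4, 9, 1, 7, 6, 3, 2, 0]
```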
What is Assessment?

Take some time to work in groups. Answer
this question:
What is Assessment?

Educational Assessment is a formal attempt
to determine students’ status with respect to
educational variables of interest.

Students vary in:
How much they know about the subject.
How skilled they are.
How positive their attitudes are.



Independently complete:

A Terse Self-Test about Testing

Discussion
Traditional Reasons to Measure Student
Achievement




Diagnosing Students’ Strengths and
Weaknesses
Monitoring Students’ Progress
Assigning Grades
Determining One’s Own Instructional
Effectiveness
Traditional Reasons to Measure Student
Achievement





Diagnosing Students’ Strengths and Weaknesses
Student weaknesses, once identified via assessment,
can be the focus of future instruction.
Teachers need to know what their students’ prior
accomplishments are. If we know this, we do not
waste our energies teaching content that has already
been mastered.
Assessment can allow teachers to identify students’
current strengths and, as a consequence, can help
teachers avoid superfluous, wasteful instruction.
This is also known as preassessment.
Traditional Reasons to Measure Student
Achievement






Monitoring Students’ Progress
This may be done informally or formally.
It is human nature for teachers to believe that they’re teaching
well and that their students are learning well.
Teachers must systematically monitor progress via some type of
assessment.
Classroom tests help instructors make formative assessments
of their instructional procedures (evaluations intended to
improve unsuccessful yet still modifiable instruction).
Summative assessments, in contrast, refer to tests
whose purpose is to make a final (success/failure) decision.
Traditional Reasons to Measure Student
Achievement


Assigning Grades
Most beginning teachers think of
assessment as a vehicle for assigning
grades, and regard that as the foremost reason to assess.
Traditional Reasons to Measure Student
Achievement





Determining One’s Own Instructional Effectiveness
Pretest and posttest results indicate whether the teacher’s students
have acquired ample knowledge and skills regarding specific
content.
If posttest results show mastery of the content, then we would
assume that the teaching and learning were successful.
If posttest results indicate minimal growth, then what should we
assume?
Poor posttest results should prompt modification of the instructional
activities for that specific content.
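The pretest-to-posttest comparison described above can be sketched numerically. All scores below are hypothetical, purely for illustration:

```python
# Hypothetical pretest and posttest scores for six students in one class.
pretest  = [42, 55, 38, 60, 47, 51]
posttest = [78, 85, 70, 88, 80, 83]

# Per-student growth, then the class average gain.
gains = [post - pre for pre, post in zip(pretest, posttest)]
mean_gain = sum(gains) / len(gains)
print(f"Per-student gains: {gains}")
print(f"Mean gain: {mean_gain:.1f} points")
```

A large mean gain suggests the instruction worked; a gain near zero is the "minimal growth" case that should prompt revising the unit.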
Current Reasons to Measure
Student Achievement

In addition to the four traditional reasons that
teachers need to know about assessment,
there are three new reasons that should
incline teachers to take a dip in the
assessment pool.
Current Reasons to Measure
Student Achievement



Influencing Public Perceptions of Educational Effectiveness
Newspapers, television reports, realtors, taxpayers, politicians,
Board of Education members.
These days, students’ performances on high-visibility tests (such
as state/province-wide administrations of nationally
standardized achievement tests) constitute the single most
influential determinant of citizens’ judgments about an educational
system’s effectiveness.
Current Reasons to Measure
Student Achievement



Helping Evaluate Teachers
A number of statewide and district-wide teacher evaluation
systems now call for teachers to assemble tangible evidence of
student accomplishments based on classroom assessment.
It is clear that today’s teachers need to know enough about
educational assessment so that they can corral compelling
evidence regarding their own students’ growth.
Current Reasons to Measure
Student Achievement






Clarifying Teachers’ Instructional Intentions
In the past, once a unit was over, a teacher wrote the test.
Tests were devised after the instruction.
Today, high-stakes testing tends to serve as a curricular
magnet.
It is beneficial for the teacher, student, and parents to be
keenly aware of the nature of the assessment.
Classroom assessment instruments should always be
prepared prior to any instructional planning so that the
teacher better understands what is being sought of
students. This guides instruction.
What is Assessment?

“Assessment is an on-going process aimed at
understanding and improving student learning. It
involves making our expectations explicit and public;
setting appropriate criteria and high standards for
learning quality; systematically gathering, analyzing,
and interpreting evidence to determine how well
performance matches those expectations and
standards; and using the resulting information to
document, explain, and improve performance.”
What is Assessment?

In today’s high-stakes testing environment,
you can’t wait for annual test scores to tell
whether students are learning. Instead, you
want teachers to continuously check their
students for understanding, adjust classroom
instruction if necessary, and re-teach those
who are falling behind. But traditionally,
teachers haven’t assessed students on a daily
basis and aren’t always prepared to use
assessment data to adjust their instruction.
Welcome Back!

We are going to play a game! We need two
teams.
Select one captain from each team to come
up front and answer the questions on your
behalf. (But with your input!)
There will be ten questions with one bonus
lightning round question.
Good Luck!

What is assessment?



Review:

Teachers should learn about classroom
assessment:
So that they can figure out if they’re
doing a dazzling or dismal job of
teaching.

Traditional Reason Today’s Reason Neither one

Review:

Teachers should learn about classroom
assessment:
So they have a better idea of where
their instruction should be
heading.

Traditional Reason Today’s Reason Neither one

Review:



Teachers should learn about classroom
assessment:
So parents will regard teachers as more
complete professionals.
Traditional Reason Today’s Reason Neither one
Review:

Teachers should learn about classroom
assessment:
So when grading time comes around, a
student’s grade will be based on solid
evidence of achievement.

Traditional Reason Today’s Reason Neither one

Review:

Teachers should learn about classroom
assessment:
So that the public’s perceptions of a
school’s effectiveness will be more
accurate.

Traditional Reason Today’s Reason Neither one

Review:

Teachers should learn about classroom
assessment:
So they can make more accurate,
diagnostically based decisions about
their students’ status.

Traditional Reason Today’s Reason Neither one

Review:

Teachers should learn about classroom
assessment:
So they can, when the timing is
appropriate, communicate directly with
families utilizing the internet.

Traditional Reason Today’s Reason Neither one

Review:



Teachers should learn about classroom
assessment:
So they can systematically keep track of
their students’ progress.
Traditional Reason Today’s Reason Neither one
Review:



Teachers should learn about classroom
assessment:
So they can determine their personal
instructional effectiveness.
Traditional Reason Today’s Reason Neither one
Review:

Teachers should learn about classroom
assessment:
So the initial entry knowledge and skills
of students can be accurately
ascertained.

Traditional Reason Today’s Reason Neither one

Review:

Which of the following is not a
traditionally cited reason that classroom
teachers need to know about
assessment?

So teachers can monitor students’ progress.
So teachers can assign grades to students.
So teachers can diagnose students’ strengths and
weaknesses.
So teachers can clarify their instructional intentions.



Reliability

What is reliability? We hear the term used a
lot in research contexts, but what does it
really mean? If you think about how we use
the word "reliable" in everyday language, you
might get a hint. For instance, we often speak
about a machine as reliable: "I have a reliable
car." Or, news people talk about a "usually
reliable source". In both cases, the word
reliable usually means "dependable" or
"trustworthy."
Reliability

What does the term reliability mean as it
relates to testing?

Reliability refers to the consistency with
which a test measures whatever it’s
measuring.
Reliability


In research, the term reliability means
"repeatability" or "consistency".
A measure is considered reliable if it
would give us the same result over and
over again (assuming that what we are
measuring isn't changing!).
Reliability

Another way of thinking about reliability
is that it refers to the extent to which
students’ scores on tests are free from
errors of measurement.
Reliability




Consistency appears in three varieties:
1. Stability
2. Alternate Form
3. Internal Consistency
Stability

Stability is the consistency of results
among different testing occasions.
Alternate Form

Consistency of results among two or
more different forms of a test.
Internal Consistency

Used to assess the consistency of
results across items within a test
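Each of the three varieties of consistency can be expressed as a statistic: stability and alternate-form reliability as a Pearson correlation between two sets of scores, and internal consistency as Cronbach's alpha. The sketch below uses made-up scores; none of the numbers come from the lecture:

```python
import statistics

def pearson(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# 1. Stability: the same test given on two occasions (made-up scores).
occasion1 = [70, 82, 65, 90, 75]
occasion2 = [72, 80, 66, 91, 74]
print("Stability (test-retest):", round(pearson(occasion1, occasion2), 3))

# 2. Alternate form: form A vs. form B taken by the same students.
form_a = [68, 85, 62, 88, 77]
form_b = [70, 83, 65, 86, 78]
print("Alternate form:", round(pearson(form_a, form_b), 3))

# 3. Internal consistency: Cronbach's alpha across items within one test.
#    rows = students, columns = scores on individual items
items = [[1, 1, 0, 1],
         [1, 0, 1, 1],
         [0, 1, 1, 0],
         [1, 1, 1, 1],
         [0, 0, 1, 0]]
k = len(items[0])
item_vars = [statistics.pvariance([row[i] for row in items]) for i in range(k)]
total_var = statistics.pvariance([sum(row) for row in items])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print("Internal consistency (Cronbach's alpha):", round(alpha, 3))
```

Values near 1.0 indicate high consistency of the corresponding kind; note that a high alpha says nothing about stability, which is exactly why the three kinds are not interchangeable.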
Reliability:

If an assessment procedure fails to yield
consistent results, it is almost
impossible to make any accurate
inferences about what a student’s score
signifies.
Reliability:

As the stakes associated with an
assessment procedure become higher,
there will be typically more attention
given to establishing that the
assessment procedure is, indeed,
reliable.
There are three kinds of
reliability evidence about a test’s
consistency, and they are not
interchangeable. Do not let anybody
throw internal-consistency results at you
and suggest that these results will tell
you about stability.
Error of Measurement:

The standard error of measurement provides an
estimate of how much variability there would be if a
student took an examination on another occasion. In
a sense, therefore, the standard error of
measurement offers test users a notion of the
confidence that can be put in the accuracy of an
individual’s test performance. The lower the number,
the better (e.g., a ±3% margin of error).

The standard error of measurement focuses on the
consistency of an individual’s performance on a test.
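The usual formula behind this estimate is SEM = SD × √(1 − reliability). A minimal sketch, with purely illustrative values:

```python
import math

# Standard error of measurement: SEM = SD * sqrt(1 - reliability).
# The SD and reliability values below are illustrative, not from the lecture.
def sem(sd, reliability):
    """Estimated score variability if the student were retested."""
    return sd * math.sqrt(1 - reliability)

observed = 75          # a student's observed score
sd = 10.0              # standard deviation of scores on the test
reliability = 0.91     # reliability coefficient of the test

band = sem(sd, reliability)
print(f"SEM = {band:.1f}")
print(f"About 68% of retest scores would fall in {observed - band:.0f}-{observed + band:.0f}")
```

Note how a more reliable test shrinks the band: as the reliability coefficient approaches 1.0, the SEM approaches 0.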
Popham says:
“Although I don’t think you should
devote any time to calculating the
reliability of your own classroom tests, I
think you should have a general
knowledge about what it is and why it is
important.” (pg. 41)
Graham says:
Work in small groups and complete the self-check on pages 42-43, questions 1-10.
Here are the answers:
1. IC
2. AF
3. IC
4. S
5. S
6. IC
7. AF
8. S
9. IC
10. AF
Validity

[Cartoon caption: “Boy! A few more
like that and I’ll be ready for Gamblers
Anonymous.”]
Validity:
Validity is the single most important concept in
assessment.
Validity:

The extent to which assessment information is appropriate
for making the desired decision about pupils, instruction,
or classroom climate; the degree to which assessment
information permits correct interpretations of the desired
kind; the most important characteristic of assessment
information.
Validity




Validity of assessment has to do with how closely the test measures
what it was meant to measure. If a test does not show evidence of
validity, then it is useless. This evidence of validity can be discussed in
three categories:
Content-related evidence - looks at how well the content of the test
relates to what is being assessed (e.g., the questions, observations, etc.).
Criterion-related evidence - correlation between performance on
the test and performance on a relevant criterion not in the test.
Construct-related evidence - looks at whether the test matches the
capabilities or psychological construct which it is trying to measure.
Validity:
A classroom teacher needs to understand what the
essential nature of the three kinds of validity
evidence is, but classroom teachers need not go
into a frenzy of evidence gathering regarding
validity.
Popham recommends that:
for important tests, you devote at least
some attention to content-related
evidence of validity.
Content-related evidence:
Content-related evidence provides a
picture of the extent to which an
assessment procedure suitably
samples the content of the
assessment domain it represents.
Content Related Evidence:

To determine whether the test questions, observation
guidelines, etc. are valid before they are used with
students, they are compared to the skills that the test is
meant to measure. A Table of Specifications, as well as clear
performance objectives, is used to do this.

A Table of Specifications is a two-dimensional
chart listing the content areas covered by the
test and the categories of performance the test is to
measure. With performance objectives, the emphasis is on
the intersection of the content and the performance.
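A Table of Specifications can be sketched as a small content-by-performance grid. The history topics, category names, and item counts below are hypothetical, chosen only to show the structure:

```python
# rows = content areas, columns = performance categories,
# cells = number of items planned for that intersection.
table_of_specs = {
    "Causes of WWI":        {"Recall": 4, "Comprehension": 3, "Application": 1},
    "Treaty of Versailles": {"Recall": 3, "Comprehension": 2, "Application": 2},
    "Interwar economics":   {"Recall": 2, "Comprehension": 3, "Application": 0},
}

# Summing every cell gives the planned length of the whole test.
total_items = sum(sum(cells.values()) for cells in table_of_specs.values())
print(f"Planned test length: {total_items} items")
```

Comparing the cell counts against the instructional emphasis on each topic is one way to argue that the test suitably samples its assessment domain.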
Content-related evidence:
If educators thought that 8th-grade students
needed to know 124 specific facts about
history before moving on to 9th grade, they
would create a test with most or all of the
124 facts represented. The more evidence
of students’ knowledge of the 124 facts, the
more evidence that content validity exists.
Content-related evidence:
Content refers to more than just factual
knowledge. These days, content could
refer to skills (such as higher order thinking
skills) or attitudes (such as students’
dispositions toward the study of science).
Criterion-related evidence:
Criterion-related evidence of validity
deals with the degree to which an
exam accurately predicts a student’s
subsequent status.
Criterion Related Evidence:




Criterion-related evidence of validity involves looking at how a student’s
performance on a test relates to how the student has done using related
material in a different (formal or informal) situation or assessment.
To evaluate whether a test is valid, you must be able to identify what the test
can be correlated with. For example:
high score on college admission test → high performance in college
moderate score on published achievement test → moderate performance in school
Criterion-related evidence:
Criterion-related evidence may be collected in this
manner:
The relationship between (1) students’ scores on
an aptitude test and (2) the grades those students
subsequently earned.
An aptitude test is an assessment device
used to predict how well a student will
perform at some later point, like the SAT
(Scholastic Aptitude Test), which is used to predict
how a high school student will perform in college.
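The aptitude-score-versus-later-grades relationship is typically summarized as a correlation, often called a validity coefficient. A sketch with made-up scores (none of these numbers are from the lecture):

```python
import statistics

# Hypothetical aptitude-test scores and the college GPAs those
# students later earned.
aptitude = [1200, 1050, 1340, 980, 1150, 1280]
gpa      = [3.4, 2.9, 3.8, 2.5, 3.1, 3.6]

# Pearson correlation between predictor (test) and criterion (GPA).
mx, my = statistics.mean(aptitude), statistics.mean(gpa)
num = sum((a - mx) * (b - my) for a, b in zip(aptitude, gpa))
den = (sum((a - mx) ** 2 for a in aptitude) *
       sum((b - my) ** 2 for b in gpa)) ** 0.5
validity_coefficient = num / den
print(f"Criterion validity coefficient: {validity_coefficient:.2f}")
```

The closer the coefficient is to +1, the better the test predicts the later criterion; a coefficient near 0 would mean the test has little predictive value.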
Criterion-related evidence:
How accurate were your high school grades
as a predictor of your success in college?
How accurate were your aptitude scores for
college entrance?
Construct-related evidence:
Construct-related evidence of validity
deals with the assembly of empirical
evidence that a hypothetical construct,
such as a student’s ability to generate
written compositions, is accurately
assessed.
Construct Related Evidence:

Construct-related evidence of validity is concerned with
whether or not "a test matches the capabilities or
psychological construct that is to be measured" (Oosterhof
1994). Published achievement tests and aptitude tests
are examples of measures built around such constructs.
Validity evidence:
Work in small groups and answer the Self-Check questions
on pages 68-69.
Here are the answers:
1. B
2. D (reliability)
3. C
4. A
5. D (reliability)
Reliability and Validity

A history teacher, Mrs. Smith, tries to determine the consistency
of her tests by occasionally re-administering them to her
students, then seeing how much similarity there was in the way
her students performed. What kind of reliability evidence is Mrs.
Smith attempting to collect?

Stability
Alternate form
Internal Consistency
None of the above



Reliability and Validity

Which of the following statements best describes the
relationship among the three sanctioned forms of reliability
evidence?

All three types of evidence are essentially equivalent.

Stability reliability evidence is more important than either internal-consistency evidence or alternate-form evidence.

The three forms of evidence represent fundamentally different ways of
representing a test’s consistency.

The three forms of evidence differ in their significance because internal-consistency evidence of a test’s reliability is a necessary condition for
the other two types of consistency.
Reliability and Validity

Test-retest data regarding assessment consistency is an
instance of:

Stability reliability

Alternate-form reliability

Internal-consistency reliability

None of the above
Reliability and Validity

The standard error of measurement is focused chiefly on:

Construct-related evidence of validity

Content-related evidence of validity

Criterion-related evidence of validity

None of the above
Reliability and Validity

Classroom teachers are more apt to focus on which of the
following?

Evidence of alternate-form reliability

Criterion-related evidence of validity.

Evidence of internal-consistency reliability.

Content-related evidence of validity.
Reliability and Validity

What kind of evidence is most eagerly sought by the
commercial testing firms that develop academic aptitude tests?

Evidence of alternate-form reliability

Criterion-related evidence of validity.

Evidence of internal-consistency reliability.

Content-related evidence of validity.