Ch 11: Standardized Tests I: Achievement Tests
What did you learn in school?
The public believes the answer to
this question is best found in
standardized testing . . . in seeing
how “my” child or community
compares to others.
Thus arises one of the hottest
topics in education today: HIGH STAKES TESTING
Every teacher is involved in the
administration and interpretation
of standardized achievement tests
in one form or another.
All educators need to be aware of
the various purposes, strengths
and shortcomings of these tests.
Potential Negative Effects of High Stakes Testing
on the way you teach and on the likely results.
Because of testing, the school may value some subjects at the
expense of others; the curriculum then contains some subjects
“judged” more important than other subjects.
As a teacher, you may be asked to focus more on some aspects of your
subject, which could come at the expense of other aspects.
You may decide to place emphasis on teaching some students at
the expense of other students.
The consequences of these decisions could lead to learning that is:
Shallow or trivial
Short-lived or narrow
By the way, these educational effects could be “nurtured” by
schools and teachers without the insertion of high stakes testing.
Yet, there are potential positive effects
. . . for some teachers and schools in high stakes testing.
The status of having high achieving schools attracts many teacher
applications and new residents.
Teachers may be rewarded in terms of salary and benefits.
Teachers may have more flexibility in terms of how they design
instruction and run
Topics: Standardized Achievement Tests
Review of the various meanings of the term “standardized”
Contrasting standardized achievement tests with teacher-made tests
Six classes of standardized achievement tests
Special procedures to follow when administering
standardized achievement tests
Remember from earlier discussions, the meaning of the term . . .
“Standardized” - A Cloze Review
Standardized Testing usually involves:
Uniform, clearly specified methods and procedures for __________ the test.
Attention to, and written reports on, three technical characteristics of testing to
include consistency or __________, item __________, and test __________.
(Answers: reliability, analysis, bias)
Based on many previous cases, the test has large-group scoring __________.
Often, but not always, a standardized test is group __________, machine
__________, and composed of largely __________ items.
(Answers: administered, scored, multiple-choice)
Achievement tests are designed to measure what one already knows. To ensure
that the standardized tests are based on real school curricula, the makers pay
attention to __________.
Most Standardized Tests are Timed
Howard Gardner says . . .
“Nothing of consequence would be lost by getting rid of timed
tests by the College Board or, indeed, by (schools) in general.
Few tasks in life — and very few tasks in scholarship —
actually depend on being able to read passages or solve math
problems rapidly. As a teacher, I want my students to read,
write and think well; I don't care how much time they spend
on their assignments. For those few jobs where speed is
important, timed tests may be useful.”
- “Testing for Aptitude, Not for Speed,” New York Times, July 18, 2002
Contrasts with Teacher-Made Tests
Level of detail covered – standardized tests are more general; they’re mostly a
sampling of what was studied.
Research base – teachers rarely have the time to prepare items as extensively
as a standardized test company.
Availability of norms – teachers have only their previous students for
comparison, and this is mostly informal; nationally standardized tests have
norms allowing wider comparisons of achievement.
Frequency of occurrence – standardized tests are infrequent although their
variety and their “high stakes” nature may make them “feel” dominant.
Does it make sense that we need both? Yes: the two serve different purposes. By the
way, it is interesting to note that students who do well on teacher-made
tests also tend to do well on standardized tests. Students who don’t “like” tests tend to
dislike both types of testing.
Classification of Standardized Achievement Tests
. . . we will discuss the following six groupings.
1) Achievement batteries
2) Single area tests
3) Licensing and certification exams
4) State testing programs
5) National and international studies
6) Individually administered achievement tests
1) Achievement Batteries
A system of interrelated K-12 tests . . .
Basic Idea: To determine each student’s general achievement standing
with respect to regional or national group performance over time and
across subject areas. Typically used K-12.
The test battery is a group or system of interrelated tests that contain a fairly limited sample
of questions covering many subject areas, many grade levels.
The direct comparability of normed scores across content areas and grade levels is one of the
greatest values of these achievement batteries. These tests are based on high quality
sources of information for their content (e.g., National Learned Societies, Professional
Organizations, State Curricular Guides from large states).
The original intent was to monitor individual progress in the major areas of the school
curriculum, with the school and the teacher being the intended score recipient. This
reporting has been expanded to parents.
Methods of assessment found in a test battery include more than multiple-choice items (e.g.,
writing exercises, open-ended questions, performance measures).
Major batteries are more alike than they are dissimilar (e.g., Stanford Achievement Test
Series; Metropolitan Achievement Tests; Iowa Tests of Basic Skills).
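To illustrate why normed scores make results comparable across content areas, here is a minimal Python sketch. It assumes norm-group scores are roughly normally distributed, and the subject names, means, and standard deviations in `NORMS` are hypothetical values chosen for illustration, not figures from any actual battery.

```python
from statistics import NormalDist

# Hypothetical norm-group statistics (mean, standard deviation) per subject.
# These numbers are illustrative assumptions, not real test norms.
NORMS = {
    "reading": (50.0, 10.0),
    "mathematics": (40.0, 8.0),
}

def percentile_rank(raw_score, mean, sd):
    """Approximate percentile rank, assuming a roughly normal norm group."""
    return 100.0 * NormalDist(mean, sd).cdf(raw_score)

def normed_profile(raw_scores):
    """Convert raw scores in several subjects to percentile ranks."""
    return {
        subject: round(percentile_rank(score, *NORMS[subject]), 1)
        for subject, score in raw_scores.items()
    }

# Raw scores of 60 and 40 are not directly comparable across subjects,
# but their percentile ranks are.
profile = normed_profile({"reading": 60.0, "mathematics": 40.0})
print(profile)
```

In this sketch the reading score sits one standard deviation above its norm-group mean while the mathematics score sits exactly at its mean, so the student's relative standing differs by subject even though the raw scores alone would not show how.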
2) Single Area Achievement Tests
Typically high school and beyond . . .
Basic Idea: To determine each student’s specific achievement standing
with respect to regional or national group performance in a single subject
area. May include criterion-referenced interpretations. Typically used in
high school or diagnostically K-12.
There are single area achievement tests related to nearly all high school subjects. Check
out the Educational Testing Service website: ETS Test Link Overview
Notice that there appears to be a wide range of quality. Check dates.
They even ask if you would like to submit your tests to the Test Collection at ETS
(What does this suggest?).
Two Example Areas:
Diagnostic Tests are highly detailed (and therefore long to take) achievement tests with
extensive subscores. These tests are administered individually and are used for
formative evaluation. Reports may include criterion-referenced interpretations to aid
Advanced Placement (AP) Exams and SAT Subject Tests fall into this category and the
outcome emphasis is on the final total score. These are most often used for
3) Licensing & Certification Exams
Using the Praxis exam series as an example . . .
Basic Idea: To determine each student’s specific achievement in a single
subject area with a cut-off score defining acceptable performance
(government set minimal level in order to protect the public).
The Praxis series is a descendant of the National Teacher Exams.
Really a series of separate exams (e.g., Praxis I - Academic Skills; Praxis II - PLT and
Subject Areas; Praxis III - First-Year Observation).
The scores are characteristically reported as scaled scores (mean and standard deviation
created). The recipient often thinks the score is criterion-based while it is really norm-based.
Norms are based on whatever individuals took the exams in the most recent three-year
Each state sets its own cut-off scores. These scores are typically between the 10th
and 20th percentile.
The Ohio Department of Education (ODE) selects the tests required and sets the qualifying
scores (i.e., cut-off scores). Both the selected tests and the qualifying scores are subject to
change by ODE.
The State of Ohio uses the examinees’ scores as a measure of a university’s teacher
education program’s adequacy (PASS/FAIL rates, not the scores themselves).
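The relationship between a cut-off score, its percentile in the norm group, and a program's pass rate can be sketched in a few lines of Python. All scores below are hypothetical, as are the function names; this is an illustration of the arithmetic under those assumptions, not how any state or testing company actually computes its figures.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring strictly below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

def pass_rate(scores, cut_score):
    """Percent of examinees scoring at or above the cut-off score."""
    passed = sum(1 for s in scores if s >= cut_score)
    return 100.0 * passed / len(scores)

# Hypothetical norm group of 20 scaled scores (illustrative only).
norm_group = [150, 155, 160, 162, 165, 168, 170, 172, 175, 178,
              180, 182, 185, 188, 190, 192, 195, 197, 199, 200]

cut = 160  # hypothetical state cut-off score
cut_percentile = percentile_rank(cut, norm_group)

# Hypothetical scores from one teacher education program.
program_scores = [158, 163, 171, 181]
rate = pass_rate(program_scores, cut)

print(cut_percentile, rate)
```

Here the cut-off falls at the 10th percentile of the norm group, and the program is summarized by its pass rate rather than by the individual scores, which matches the PASS/FAIL reporting described above.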
4) State Assessment Programs
The Ohio Report Card System . . .
Basic Idea: To maintain receipt of federal funds. While some states have a
long history of testing programs, the NCLB Act created a mandate. Now
all states have them. Typically, these programs:
Concentrate on basic skills
Examine grades 3-8 . . . plus one high school grade
Employ existing achievement batteries
Are aimed at state content standards
Utilize a combination of multiple-choice & performance items
Issue reports to the public
Draw on a proficiency basis for reporting
Use high school graduation tests
Include public school students only
We will be discussing Ohio’s Report Card System in some detail.
5) National and International Assessment
Narrowly defined waves of comparison studies . . .
Basic Idea: To create benchmarks regarding the educational achievement of
American students across the nation and across the world. Each
scheduled wave of testing may address only one or a few areas of interest
and the cohorts may be small.
Take a look at the following websites:
National Assessment of Educational Progress (NAEP)
Content areas covered
Nature of reports
Trends in International Mathematics and Science Study (TIMSS) and Progress in
International Reading Literacy Study (PIRLS)
Content areas covered
Nature of reports
6) Individually Administered
. . . may be coupled with aptitude testing
Basic Idea: To diagnose discrepancies among various achievement levels or
between achievement and mental ability. Sometimes these tests are
called psychoeducational batteries. Administered by school
As a teacher, know that these tests do exist. You would never administer these or
use them to provide formative information about your curriculum. You may find
yourself discussing elements of these as part of an IEP process.
Administering Standardized Tests
Same as discussed vis-à-vis teacher made tests, plus . . .
Before the Test
Attend to students’ test taking skills and test taking motivation
Read the directions in test manual to yourself in advance (both what to say and do)
Ensure availability of materials (e.g., know what you need – pencils, watch, etc.)
During the Test
Follow the test manual directions exactly
Time the test session accurately (need clock/watch with second hand)
Monitor the situation and make notes on unusual situations (e.g., distractions)
After the Test
Retrieve all materials
File your notes
Clean up answer documents if requested in test manual (smudges? correctly coded?)
Clearly Unethical Practices
Changing students’ answers (e.g., fill in blanks)
Deliberately not following directions (e.g., allow more time)
Giving the actual test items to students in advance
Keep in mind the strengths and weaknesses of both
standardized and teacher-made tests. Both contribute to a
successful assessment program.
Know basis for norms and performance categories.
Understand the content outline of Ohio Praxis I and II exams.
Be familiar with sources like NAEP and TIMSS.
Know principles of good practice for administering standardized tests.
Terms and Concepts to Review and
Study on Your Own