Test for Written Spelling
Fourth Edition
Wendy L. Gremling
SPE 33-533 Psycho-Educational Testing for Teachers
3 Purposes for using
the TWS-4
• To serve as a measure in research designed to
measure spelling achievement in individuals
with different types of learning disabilities
• To identify students whose spelling ability is
deficient enough to call for direct instruction
designed to improve their spelling
• To document overall improvement in spelling
as a consequence of intervention
Do the individual
test items match the
purpose?
• The use of a standardized test such as the TWS-4 to
measure improvement is legitimate as long as the
testings occur at least 6 months apart. To prevent a
“practice effect,” if the test is given more frequently
than once per year, an alternate form of the test should
be administered. Interpretation of TWS-4 scores
focuses on percentiles and standard scores;
diagnosticians and teachers should be concerned when
a student’s standard score is well below 90. Such a score
means special attention is necessary or additional
assessment is needed to specify the cause of the
problem.
Test Administration
• Takes only about 15 minutes to
administer
• Can be administered individually or in a group
setting
– Extra time should be allowed for group testing
Examiner
Requirements
– Become thoroughly familiar with the contents of the examiner’s
manual
– Practice administering the test to an individual student no fewer
than three times
– Establish rapport with the examinee(s) by explaining the purpose
of the test and approaching the testing session as a pleasurable
undertaking
– Be alert to signs of fatigue in the examinee(s) and cease testing
if there are signs of fatigue or loss of interest
– Consistently praise and encourage the examinee(s)
but avoid prompting or otherwise deviating from testing
procedures
– Allow about 15 minutes to administer the TWS-4. A slightly
longer time may be required when the test is administered in a
group setting.
Special
Considerations
– If a student exhibits a learning disorder, comparison
of the student’s test results with those of a regular
education peer is permitted
– Test may be administered individually to students
with learning disorders
• Special consideration was given to the cultural, racial,
and ethnic diversity of students when determining
the normative sample.
• Adaptable to translation
Student sampling
• Created for regular education students based
upon chronological age
• 2 tests were developed to represent a
normative sample from 2 different regions.
The sample was collected from 23 states and 27
representing 4,952 students residing in one area with
like demographics. (Please refer to the tables
below)
Demographic Characteristics of
the Normative Sample
Characteristics          Percentage of Sample    Percentage of School Age Population
Gender
  Male                   48                      51
  Female                 52                      49
Residence
  Urban                  75                      75
  Rural                  25                      25
Race
  White                  82                      79
  Black                  14                      16
  Other                  4                       5
Geographic Area
  Northeast              21                      18
  North Central          26                      24
  South                  32                      35
  West                   21                      23
Ethnicity
  Native American        1                       1
  Hispanic               11                      13
  Asian                  3                       4
  African American       12                      15
  Other                  73                      67
Stratification by Age of Selected Sample
Characteristics
(Geographic, Gender, Race, and Ethnicity)

Geographic Region and Age
Age interval             Northeast     North Central South         West
                         N       %     N       %     N       %     N       %
6-8                      230     22    262     25    325     31    230     22
9-11                     366     21    454     26    558     32    367     21
12-14                    336     23    395     27    468     32    263     18
15-18                    140     20    175     25    216     31    167     24
Total                    1,072   21    1,286   26    1,567   32    1,027   21
U.S. School Population           18            24            35            23

Gender and Age
Age interval             Male          Female
                         N       %     N       %
6-8                      513     49    534     51
9-11                     855     49    890     51
12-14                    702     47    760     52
15-18                    328     48    370     53
Total                    2,398   48    2,554   52
U.S. School Population           51            49

Race and Age
Age interval             White         Nonwhite
                         N       %     N       %
6-8                      879     84    168     16
9-11                     1,431   82    314     18
12-14                    1,184   81    278     19
15-18                    579     83    119     17
Total                    4,073   82    879     18
U.S. School Population           79            21

Ethnicity and Age
Age interval             African American  Hispanic          Native American   Asian             All Other
                         N       %         N       %         N       %         N       %         N       %
6-8                      101     10        102     10        9       1         37      3         798     76
9-11                     235     13        185     11        17      1         50      3         1,258   72
12-14                    183     12        154     11        29      2         60      4         1,036   71
15-18                    87      12        68      10        12      2         10      1         521     75
Total                    606     12        509     11        67      1         157     3         3,613   73
U.S. School Population           15                13                1                 4                 67
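The stratification counts above can be cross-checked with a few lines of arithmetic. The Python sketch below copies the region and gender counts from the tables and confirms that each breakdown adds up to the same 4,952 students. The variable and function names are illustrative only and do not come from the TWS-4 manual.

```python
# Counts per age interval (6-8, 9-11, 12-14, 15-18), copied from the tables above.
region_counts = {
    "Northeast": [230, 366, 336, 140],
    "North Central": [262, 454, 395, 175],
    "South": [325, 558, 468, 216],
    "West": [230, 367, 263, 167],
}
gender_counts = {
    "Male": [513, 855, 702, 328],
    "Female": [534, 890, 760, 370],
}

def totals_by_age(table):
    """Sum across groups to get the number of students in each age interval."""
    return [sum(column) for column in zip(*table.values())]

def grand_total(table):
    """Sum every cell in the table."""
    return sum(sum(row) for row in table.values())

print(totals_by_age(region_counts))   # students per age interval
print(grand_total(region_counts), grand_total(gender_counts))
```

Both breakdowns describe the same students, so the per-age totals and the grand totals should agree regardless of which characteristic is used.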
Reliability
• Two methods were used to investigate the TWS-4’s content sampling reliability:
• Coefficient alphas, with the associated standard error of
measurement
– Coefficient alphas were calculated at 13 age intervals using data
from the entire normative sample. Average coefficients exceeded
.90, which is indicative of high reliability
• Alternate forms, immediate administration
– This procedure estimates error due to content sampling
when alternate forms of a test are available. Both forms of the
test are given during one session, and the correlation between the
two forms is a reliability index that can be used to estimate the
content sampling error. The correlation coefficients exceeded .90,
indicating high reliability and little content sampling error.
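For readers who want to see what the first method involves, here is a minimal Python sketch of coefficient alpha and the standard error of measurement (SEM), using the standard textbook formulas. The function names and the tiny response matrix are made up for illustration; they are not TWS-4 data, and the manual's own computations may differ in detail.

```python
from math import sqrt
from statistics import pvariance

def coefficient_alpha(item_scores):
    """Cronbach's alpha; item_scores is a list of examinee rows of item scores."""
    k = len(item_scores[0])                          # number of items
    columns = list(zip(*item_scores))                # one tuple per item
    item_variance_sum = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

# Hypothetical responses (1 = correct, 0 = incorrect): 5 examinees x 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(responses)
# On a standard-score scale with SD 15, a reliability of .90 implies this SEM.
sem = standard_error_of_measurement(sd=15, reliability=0.90)
print(round(alpha, 2), round(sem, 1))
```

With a standard deviation of 15, a reliability of .90 corresponds to an SEM of roughly 4.7 standard-score points, so higher coefficients mean a smaller margin of error around each score.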
Test-Retest &
Interscorer
• The test-retest approach involves re-administering the test
at a later date. The intervening time was
approximately 2 weeks. After the testing was complete,
the scores from the two administrations were correlated, and
the resulting correlation coefficients are consistently
large enough to strongly support the conclusion that the TWS-4 has
acceptable test-retest reliability.
• Reliability among scorers of objective tests such as the
TWS-4 is understandably high; the scorer error that does occur stems from clerical errors or
difficulty deciphering handwriting. Such error can be
reduced considerably by having a second
administrator assist in deciphering the responses.
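Test-retest reliability, like alternate-form reliability, comes down to correlating two sets of scores from the same students. A minimal Python sketch of the Pearson correlation follows. The scores are invented for illustration, and the function name is an assumption rather than anything taken from the TWS-4 materials.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical standard scores for five students, tested about 2 weeks apart.
first_testing = [85, 92, 100, 108, 115]
second_testing = [88, 90, 103, 105, 118]
r = pearson_r(first_testing, second_testing)
print(round(r, 2))
```

A coefficient close to 1 means students keep nearly the same rank order across the two testings, which is what acceptable test-retest reliability looks like.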
Validity
– Content
• In selecting the test items, the words had to be typically taught in school
and readily identifiable as either rule governed or irregular in their
spelling. An extensive survey was conducted prior to selecting test items.
To verify that the words on the TWS-4 are still instructionally relevant,
they were checked against five current basal spelling series and
the current EDL list.
– Concurrent
• Time-tested procedures for selecting valid test items
focus on studying an item’s discriminating power and
difficulty. Item difficulty (i.e., the percentage of examinees who pass a
given item) is determined to identify items that are too easy or too
difficult. To demonstrate conclusively the item characteristics of the TWS-4,
an item analysis was undertaken using the entire normative sample as
subjects. A review reported that the test’s items satisfy the requirements,
empirically verifying the content validity.
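The item analysis described above can be illustrated with a short Python sketch. Item difficulty is the proportion of examinees who pass an item. For discriminating power, the sketch uses a simple upper-lower index (top scorers' pass rate minus bottom scorers' pass rate), which is one common choice and is an assumption here, since the slides do not say which statistic the TWS-4 authors used. The data and names are hypothetical.

```python
def item_difficulty(responses, item):
    """Proportion of examinees who answered the given item correctly."""
    return sum(row[item] for row in responses) / len(responses)

def discrimination_index(responses, item, group_size):
    """Pass rate of the top scorers minus pass rate of the bottom scorers."""
    ranked = sorted(responses, key=sum, reverse=True)   # highest totals first
    upper, lower = ranked[:group_size], ranked[-group_size:]
    upper_correct = sum(row[item] for row in upper)
    lower_correct = sum(row[item] for row in lower)
    return (upper_correct - lower_correct) / group_size

# Hypothetical responses (1 = correct): 6 examinees x 3 items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
    [0, 0, 0],
]
for item in range(3):
    print(item, item_difficulty(responses, item),
          discrimination_index(responses, item, group_size=2))
```

An item that nearly everyone passes, or nearly everyone fails, tells the examiner little; an item that strong spellers pass and weak spellers miss discriminates well.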
Criterion-Prediction
Validity
• Criterion-prediction validity refers to the “effectiveness of a test
in predicting the individual’s performance in specific activities”
• Performance is checked against a criterion that is either a direct or
indirect measure of what the test is designed to predict
• 3 separate studies were conducted and can be summarized as
follows:
– The coefficients representing the relationships between the TWS
values and the criterion spelling measures were significant beyond
the .01 level of confidence and of high magnitude; therefore, the
results of the studies demonstrate that the TWS-4 has criterion-related
validity
Construct-Identification
Validity
• The construct validity of a test is the extent to which the test can be said to
measure a theoretical construct or trait
– Relates to the degree to which the underlying traits of the test
can be identified and reflect the theoretical model on
which the test is based
• A 3-step procedure was used to demonstrate the test’s validity:
– Several constructs presumed to account for test performance were identified
– Hypotheses were generated based on the identified constructs
– Hypotheses were verified by logical or empirical methods
• The test demonstrated validity through age differentiation and group
differentiation (those with and without disabilities)
Relationship to Tests of
Aptitude and
Intelligence
• TWS-4 scores were examined for students who
were known to possess a high degree of
scholastic aptitude and, conversely, those with
low aptitude.
• The resulting coefficients, offered as evidence of
concurrent criterion-prediction validity, show
that TWS-4 performance is strongly related to basic
aptitude.
Types of Scores
• Scores are generated through the conversion of raw scores
into normative values (standard scores and percentiles).
• Raw scores are converted to spelling ages and grade
equivalents via a conversion table
• Can also be converted into standard scores (or quotients) and
percentiles
• Consideration is given to deviation and equivalent factors
such as test conditions
Raw Scores
• The number of items correct on each subtest
• Have limited value and cannot be used to make
clinical interpretations about an individual’s
performance
• Are not comparable to raw scores on other tests
Standard Scores
• The clearest indication of a student’s
performance on the TWS-4
• Transformation of raw scores that establishes a
common mean score and standard deviation
• The mean score is 100 with a standard
deviation of 15.
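To show what a mean of 100 and a standard deviation of 15 imply, here is a minimal Python sketch of a linear standard-score transformation. It assumes the raw score is simply rescaled from the norm group's mean and standard deviation. In practice examiners look the value up in the manual's norm tables, so the function and the norm values below are illustrative assumptions only.

```python
def standard_score(raw, norm_mean, norm_sd, mean=100, sd=15):
    """Rescale a raw score so the norm group has the given mean and SD."""
    z = (raw - norm_mean) / norm_sd        # distance from the norm mean in SD units
    return round(mean + sd * z)

# Hypothetical norm group for one age interval: mean raw score 30, SD 6.
print(standard_score(raw=30, norm_mean=30, norm_sd=6))   # at the norm mean
print(standard_score(raw=24, norm_mean=30, norm_sd=6))   # one SD below the mean
```

A student at the norm-group mean receives 100, and each standard deviation above or below the mean moves the score by 15 points.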
Percentiles
• Also referred to as percentile scores or ranks;
they represent values on a scale of 100
that indicate the percentage of the
distribution that is equal to or below the
value.
• Standard scores can be easily converted into
percentile scores
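The conversion from standard scores to percentiles can be sketched in Python as follows. The sketch assumes scores are normally distributed with mean 100 and SD 15, which is an assumption on my part; the manual's conversion table should be treated as the authority. The function name is illustrative.

```python
from statistics import NormalDist

def percentile_rank(standard_score, mean=100, sd=15):
    """Percentage of a normal distribution at or below the standard score."""
    return round(NormalDist(mean, sd).cdf(standard_score) * 100)

for score in (85, 100, 115):
    print(score, percentile_rank(score))
```

Under this assumption a standard score of 100 falls at the 50th percentile, 85 at about the 16th, and 115 at about the 84th.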
Spelling ages and
Grade Equivalents
• Spelling ages (SA) and grade equivalents (GE)
are derived by calculating the average score of
students in the normative group at each age
interval and at each school grade.
• Through a process of interpolation,
extrapolation, and smoothing, SAs and GEs are
generated for each raw score.
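The interpolation step can be pictured with a small Python sketch: given the average raw score at a few ages, a raw score that falls between two averages is assigned an age in proportion to its position between them. The norm points below are invented, the function name is hypothetical, and the extrapolation and smoothing steps are deliberately left out.

```python
def interpolate_spelling_age(raw, norm_points):
    """Linear interpolation; norm_points is a sorted list of (average raw score, age)."""
    for (raw_low, age_low), (raw_high, age_high) in zip(norm_points, norm_points[1:]):
        if raw_low <= raw <= raw_high:
            fraction = (raw - raw_low) / (raw_high - raw_low)
            return age_low + fraction * (age_high - age_low)
    raise ValueError("raw score outside the tabled range (extrapolation not sketched)")

# Hypothetical averages: raw score 10 at age 7, 20 at age 9, 30 at age 11.
norm_points = [(10, 7.0), (20, 9.0), (30, 11.0)]
print(interpolate_spelling_age(15, norm_points))   # halfway between ages 7 and 9
```

The same idea applies to grade equivalents, with school grade in place of age.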
Educational
Significance
• Interpretation focuses upon percentiles and
standard scores
• Concern should arise when a student’s standard
score is below 90.
• Students who score in the 90-110 range are
performing at their expected level for their age
• Students who score over 110 have demonstrated a
mastery of basic spelling patterns and are likely to
be proficient readers and writers
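The interpretation bands above translate directly into a simple rule. The Python sketch below encodes them; the function name and the wording of the returned labels are illustrative and are not quoted from the TWS-4 manual.

```python
def interpret_standard_score(score):
    """Map a TWS-4 standard score to the interpretation bands listed above."""
    if score < 90:
        return "below expected level: cause for concern"
    if score <= 110:
        return "performing at expected level for age"
    return "above expected level: mastery of basic spelling patterns"

for score in (82, 90, 104, 118):
    print(score, interpret_standard_score(score))
```

Scores of exactly 90 and 110 are treated as part of the expected range, following the 90-110 band stated on the slide.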
Is this test Fair?
• Measures were taken to ensure that the normative sample appropriately
represents cultural, age, and gender
groups.
• The normative sample took into consideration the percentages of the
population by race, gender, and geographic location to
ensure the appropriate inclusion of all races.
• Based on the number of studies conducted and the demographics
sampled, the test is age appropriate; should a variation be
necessary, it is easily adaptable.
• Research has demonstrated the test’s reliability and validity beyond
reproach. Even allowing for environmental conditions and bias in the
administration of the test, it has demonstrated its fairness.
• As the test requires very little preparation, is easily administered,
and can be conducted in a re-test environment if necessary, it is more
than adequate for its purposes.
Test References
– According to the American Speech-Language-Hearing Association, the test
“Assesses student's ability to spell words whose spellings are readily
predictable in sound-letter patterns, words whose spellings are less
predictable, and both types of words considered together.”
• Hailed as adaptable, easily administered, and appropriate in the time required for
testing procedures.
• As referenced in the Autism Guide for Effective Teaching, “The test
is administered using a dictated-word format. The words tested are
taken from 10 basal spelling programs and popular graded word
lists. The results of the TWS-4 can be used for three specific
purposes: to identify students whose scores are significantly below
those of their peers and who might need interventions designed to
improve spelling proficiency; to document overall progress in
spelling as a consequence of intervention programs; and to serve as
a measure for research efforts designed to investigate spelling.”
http://www.txautism.net/docs/Guide/Evaluation/AcademicAcheivement.pdf
In Summary
• Overall Strengths: easily administered; little time or preparation
required; easily re-administered if necessary; easily adaptable for
students with learning disabilities; can identify possible learning disorders; age
appropriate; easy to translate if necessary; easy to read and
determine scores
• As the ability to spell is the foundation for language arts and reading
instruction, an understanding of phonics and of the grammatical rules for
structure and semantics is vital to success in many other curricular
areas. This is one of the most fundamental goals in the art of instruction,
and it is also an area in which difficulty can be commonplace. As decoding
disabilities and spelling problems can make up 85% of the learning
disabled community, early identification of issues is imperative to
academic success. Because it is necessary to recognize when students
have a spelling problem, this test is an easy way to begin the process.
Although extrinsic factors may affect spelling achievement, administration
of this particular test is a good beginning for students of all ages.
Thank you for your time
and
Have a Wonderful Day!