Assessment in Language Learning

By
Didi Sukyadi
Evaluation, Assessment, and
Testing (Cameron, 2001:222)
• Testing: One technique or method of
assessment that is concerned with measuring
learning through performance.
• Assessment: is concerned with pupils’ learning or
performance and thus provides one type of
information that might be used in evaluation.
• Evaluation: a process of systematically collecting
information (through documentation, observation,
interviews, questionnaires, etc.) in order to make
a judgment about issues such as lessons and
programs.
The Place of Evaluation in
Curriculum Development
• Evaluation can, and should, be involved in all
phases of curriculum development: needs
analysis, stating the objectives, testing,
materials development, the teaching and
learning process, and evaluation itself.
The Place of Assessment in
Teaching and Learning Process
• Brewster et al. (2003:247): assessment
plays an extremely important part in the
teaching and learning process and may
heavily influence the way the pupils are
taught and the kinds of activities they do.
Assessment in Learning
LEARNING = Teaching + Assessment
• Outcome (statement of competence, statement of achievement)
• Learning (alternative modes, contexts, time-scales)
• Assessment (alternative forms of evidence acceptable)
TEST TYPES
1) Selected response (binary choice,
matching, and multiple-choice)
2) Constructed responses (fill in, short
answer, performance format)
3) Personal responses (conference,
portfolio, self assessment)
Constructed response
• Advantages: virtually no guessing factor,
allows for productive language use, allows
for testing the interactions of receptive and
productive skills.
• Disadvantages: difficult and time
consuming to score and subjective in
scoring.
Selected responses
• Advantages: require a short time to
administer, easy to score, scoring is
objective.
• Disadvantages: relatively difficult to create,
require no language production.
Personal Response Item
• Advantages: directly related and
integrated to curriculum, appropriate for
assessing learning process.
• Disadvantages: difficult to create and
structure, subjective in scoring
INTERPRETING THE OUTCOME
OF ASSESSMENT
1) Norm-referenced tests
Any test that is primarily designed to
disperse the performances of students in a
normal distribution based on their general
abilities or proficiencies, for purposes of
categorizing the students into levels or
comparing students’ performances to the
performances of others who formed the
normative group (Glaser, 1963)
2) Criterion-referenced tests
Measures that assess student achievement in
terms of a criterion standard and thus provide
information on the degree of competence
attained by a particular student, independent
of reference to the performance of
others. They are deliberately constructed to yield
measurements that are directly interpretable in
terms of specified performance standards (Glaser
and Nitko, 1971)
Other names for criterion-referenced tests
1) Domain-referenced tests (tests whose items
are referenced to documents that delineate
a domain of student behaviors and
content).
2) Objective-referenced tests (A test
constructed so that subsets of the
items measure the specific objectives of
a course, program of study or other clearly
delineated subject matter area)
Characteristics of CRT
1) Emphasis on teaching/testing matches.
2) Focus on instructional sensitivity
3) Curricular relevance
4) Absence of normal distribution restrictions
5) No item discrimination restriction
CRT AND LANGUAGE THEORY
Two competing hypotheses
1) The divisibility of language ability
2) The communicative competence
The earlier, rather simplistic views of
language ability have been abandoned, and
the recent focus on performance
assessment has raised new concerns.
Nature of Language and
Assessment
• Language and language acquisition differ
in nature from other educational content, for
example in the relationship between
language proficiency and communicative
competence.
• The difference will have a direct influence on
how the construct of language knowledge is
defined, how language tests are operationalised
and how they are evaluated.
Language proficiency
1) Functional approach (listing the various
uses to which language can be put)
2) General proficiency (individuals differ
basically in the measurable amounts of
some indivisible body of competence they
possess)
• CRT is very appropriate and useful in the
assessment of such clearly definable but
complex language tasks.
Communicative ability
1) Grammatical competence
2) Sociolinguistic competence
3) Strategic competence
4) Organizational competence
5) Pragmatic competence
A language test should reflect:
1) Language is used in interaction
2) Interactions are usually unpredictable
3) Language has a context
4) Language is used for a purpose
5) There is a need to examine a
performance
6) Language is authentic
7) Language success is behavior-based.
Testing communicative language
ability:
1) Be criterion-referenced against the
operational performance of a set of
language tasks.
2) Be concerned with validating itself against
the criteria and be concerned with the
content, construct and predictive validity.
3) Rely on modes of assessment which are
qualitative
4) Subordinate reliability to face validity
Test item:
• A unit of measurement with a prompt and a
prescriptive form for responding, which is
intended to yield a response from an examinee
from which performance in some language
construct may be inferred in order to make some
decision.
• A stem can be the portion of the item (in multiple
choice), a quote the student must respond to, or
the reading passage that the student must
analyze and write about.
WRITING TEST ITEMS
1) Do not explain too much.
2) Do not use trick questions
3) Provide only the information necessary
4) Avoid ambiguity
5) Be orderly in test presentation
Linguistic confoundings
1) Item should be written at the examinee’s
level of proficiency
2) Item should not contain negatives or
double negatives
3) Item should not be ambiguous
• “Family plays an important role in life. It
sometimes complicates matters. Explain
this.” Here “this” may refer to the role of the
family or to the complication involved.
Format confoundings
1) Item should contain only relevant information.
(1) Unnecessary information included
• The following twenty vocabulary items have
been selected from the second reading texts in
Unit 2 of the reading Packet. Your teacher
discussed each of these words in class during
the Wednesday vocabulary lesson ….
(2) Too brief
• Write an essay comparing relationship in two
countries
2) Item should be independent
e.g.
(1) What is the square root of 100?
(2) Multiply this by seven.
3) Item should be clearly organized and
formatted.
• The item and its options should appear on
the same page
VALIDITY
• Hughes (1989, 2003:26): a test is said to be valid if it
measures accurately what it is intended to measure.
• Content validity: the content of a test constitutes a
representative sample of the language skills, structures,
etc.
• Criterion-related validity: the degree to which results on
the test agree with those provided by some independent
and highly dependable assessment of the examinees’
ability.
• Construct validity: the degree to which a test is
measuring the psychological construct or constructs that
it claims to be measuring.
RELIABILITY (NRTs)
1. Test retest reliability
2. Equivalent forms reliability
3. Internal consistency (split-half reliability)
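As a rough illustration of the split-half idea, the sketch below correlates examinees' totals on the odd-numbered items with their totals on the even-numbered items. The odd/even split, the Spearman-Brown step-up to full test length, the function names, and the score matrix are all assumptions made for the example; none of them comes from the slides.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / sqrt(sxx * syy)


def split_half_reliability(matrix):
    """Return (half-test correlation, Spearman-Brown estimate for the full test)."""
    odd = [sum(row[0::2]) for row in matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in matrix]  # items 2, 4, 6, ...
    half_r = pearson(odd, even)
    full_r = 2 * half_r / (1 + half_r)         # Spearman-Brown step-up (assumed here)
    return half_r, full_r


# Hypothetical data: 6 examinees x 6 items, 1 = correct, 0 = incorrect.
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
half_r, full_r = split_half_reliability(scores)
print(f"half-test r = {half_r:.2f}, full-test estimate = {full_r:.2f}")
```

The step-up is applied because each half is only half as long as the real test, so the raw correlation between halves tends to understate the reliability of the whole.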
DEPENDABILITY IN CRTs
Threshold loss agreement
$$P_o = \frac{A + D}{N}$$
$P_o$ = agreement coefficient
A = masters on both administrations of the test
B = masters on the first administration but non-masters on the second
C = non-masters on the first administration but masters on the second
D = non-masters on both administrations of the test
N = total number of examinees (A + B + C + D)
Example
• Of the 45 examinees, 13 are categorized
as A, 2 as B, 5 as C and 25 as D.
$$P_o = \frac{A + D}{N} = \frac{13 + 25}{45} = \frac{38}{45} \approx 0.84$$
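The worked example can be checked with a few lines of Python. This is a minimal sketch: the function name `agreement_coefficient` is hypothetical, and the counts are the ones given in the example (A = 13, B = 2, C = 5, D = 25).

```python
def agreement_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Proportion of examinees classified the same way on both administrations.

    a: masters on both administrations
    b: masters on the first, non-masters on the second
    c: non-masters on the first, masters on the second
    d: non-masters on both administrations
    """
    n = a + b + c + d  # total number of examinees
    if n == 0:
        raise ValueError("at least one examinee is required")
    return (a + d) / n


# Counts from the worked example.
p_o = agreement_coefficient(13, 2, 5, 25)
print(round(p_o, 2))  # 38 / 45, about 0.84
```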
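• Consistency due to the test itself: the agreement expected by chance
alone is estimated so that it can be discounted from the agreement coefficient.
$$P_{\text{chance}} = \frac{(A + B)(A + C) + (C + D)(B + D)}{N^2}$$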
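The expression with N squared in the denominator is the standard estimate of the agreement expected by chance. The sketch below evaluates it for the example counts and then computes Cohen's kappa, (P_o - P_chance) / (1 - P_chance). Kappa is not named on the slide; it is included here on the assumption that it is what "consistency due to the test itself" refers to. The function names are hypothetical.

```python
def chance_agreement(a: int, b: int, c: int, d: int) -> float:
    """Agreement expected by chance, from the marginal totals of the 2x2 table."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("at least one examinee is required")
    return ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2


def kappa(a: int, b: int, c: int, d: int) -> float:
    """Agreement beyond chance (Cohen's kappa; an assumption, not from the slide)."""
    n = a + b + c + d
    p_o = (a + d) / n                      # observed agreement
    p_chance = chance_agreement(a, b, c, d)
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_chance) / (1 - p_chance)


# Counts from the worked example: A = 13, B = 2, C = 5, D = 25.
p_chance = chance_agreement(13, 2, 5, 25)
k = kappa(13, 2, 5, 25)
print(round(p_chance, 2), round(k, 2))  # about 0.53 and 0.67
```

With these counts, roughly 0.53 of the observed 0.84 agreement could arise by chance, which is why the chance-corrected figure is noticeably lower than the raw agreement coefficient.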
Validating Test Items
• Item Analysis
1) Index of difficulty/Item facility/Item
easiness/P-value
2) Difference index
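A small sketch of the two indices follows. It assumes the common reading in which item facility is the proportion of examinees answering an item correctly, and the difference index is the item facility after instruction minus the item facility before instruction. The pre-test/post-test design, the function names, and the response data are assumptions for illustration only.

```python
def item_facility(responses):
    """Proportion of examinees who answered the item correctly (1 = correct, 0 = incorrect)."""
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


def difference_index(pre_responses, post_responses):
    """Item facility after instruction minus item facility before instruction."""
    return item_facility(post_responses) - item_facility(pre_responses)


# Hypothetical responses of 10 examinees to one item.
pre = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]    # 3 of 10 correct before instruction
post = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]   # 8 of 10 correct after instruction

if_pre = item_facility(pre)
if_post = item_facility(post)
di = difference_index(pre, post)
print(f"IF pre = {if_pre:.2f}, IF post = {if_post:.2f}, DI = {di:.2f}")
```

A large positive difference suggests the item is sensitive to instruction; a value near zero suggests the item was either already known or not learned.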
Item validity
1) Add the scores on items 1 to 10 to get the
total score for each examinee (Score)
2) Item validity: Correlate each item with the
score using the point-biserial correlation
(correlating nominal and interval data)
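The two steps can be sketched as follows. The point-biserial correlation is computed here as a Pearson correlation between a 0/1 item score and the total score, to which it is numerically equivalent. The matrix has four items rather than ten to keep the example short; the data and function names are hypothetical. As in the steps above, the total includes the item being correlated.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; with a 0/1 variable this equals the point-biserial."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / sqrt(sxx * syy)


def item_validities(matrix):
    """Correlate each item column with the examinees' total scores."""
    totals = [sum(row) for row in matrix]          # step 1: total score per examinee
    n_items = len(matrix[0])
    return [
        pearson([row[i] for row in matrix], totals)  # step 2: item vs. total
        for i in range(n_items)
    ]


# Hypothetical data: 6 examinees x 4 items, 1 = correct, 0 = incorrect.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
validities = item_validities(scores)
for number, r in enumerate(validities, start=1):
    print(f"item {number}: r_pbi = {r:.2f}")
```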
Rater Consistency
1) Correlate the scores of each rater with those of
the other two raters using the Pearson
product-moment correlation.
2) If the correlation is significant, the rating is
consistent.
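A minimal sketch of the pairwise comparison for three raters is given below. The rater labels and scores are hypothetical. The code only computes the correlation coefficients; judging significance would additionally require a critical-value table or a statistics package, which is not shown here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores given by three raters to the same six performances.
ratings = {
    "rater_a": [70, 65, 80, 90, 55, 75],
    "rater_b": [72, 60, 85, 88, 50, 78],
    "rater_c": [68, 66, 79, 92, 58, 70],
}

# Correlate every pair of raters.
pairwise = {
    (first, second): pearson(ratings[first], ratings[second])
    for first, second in combinations(ratings, 2)
}
for (first, second), r in pairwise.items():
    print(f"{first} vs {second}: r = {r:.2f}")
```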
AUTHENTIC ASSESSMENT
1) Real life, normal communication (the
ability to perform particular tasks)
2) Interactional ability (total communicative
effect)
What is meant by authentic?
• Measures student’s knowledge and skills
• Requires application of knowledge
• Product or performance assessment
• Relevant contextualized tasks
• Process and products can both be measured
• Part of learning process
• Render holistic description
• Reflection of real world
Types of Authentic Assessment
• Oral interviews
• Story retelling
• Teacher observation
• Experiments
• Demonstration
• Projects/Exhibition
• Writing samples
• Portfolios