Creating Common Assessments


The LCSD APPR Story
The Local Assessment Chapter
December 13, 2012
The Third Piece 15-20pts
The straw that broke. . .
The local assessment decision
more NYS assessment
an approved 3rd party assessment
region or district developed
Our own assessments
we decide what learning is important
diversify measures of student learning
Measures student growth for SLOs too
Two Parts

I - Best practices in assessment development

II - Superintendent Approval Process


Assurances
Assessment Maps/Blueprints/Specs



Scoring Guides
Test Specifications
Testing Guides
Creating Summative Assessments
Cheryl Covell
Tompkins-Seneca-Tioga BOCES Model Schools Program
Accountability Based Assessment
• Summative
• Public
• Infrequent
• Assesses broad standards
• Reliability most important
• Evaluate if students are “meeting standard”
• Sorts, sanctions, rewards

Instructionally Supportive Assessment
• Formative
• Private
• Frequent
• Assess particular skills and content
• Validity most important
• Evaluate students at all levels
• Informs curriculum and instruction
What kind of test?




Summative
Standardized
Coverage of the “most important” standards
Selected response and constructed response
Summative Assessment


Summarizes learning at a particular point in time.
Can be used to compare how well students mastered a
set of learning goals.
Standards-based Assessment



Student performance is compared to a desired level of
performance on the content.
Questions on the assessment are selected to cover
important content.
Questions on the assessment are selected to match
national, state or local content standards.
APPR Evidence – Considerations
• How aligned and authentic are the assessment items to the learning content?
• How valid and reliable are the assessments?
• Are they verified as comparable and rigorous?
• What, if any, administration accommodations must legally be made for students?
• How are the assessments scored in terms of point values assigned per item and method of summarizing scores?
• Have procedures been established to ensure those with vested interest do not score students’ assessments?
Selected Response
Any item which asks students to select from several
options.
Multiple choice
True/false
Matching
Fill in the blank, with word bank
Multiple Choice – Stems



Make the question as clear and direct as possible.
Make sure students know what you are asking by just
reading the question – a proficient student should be able
to come up with the right answer without first reading all
of the possible choices.
Use negatives sparingly and emphasize them if used.


Which of the following choices is NOT a mammal?
Ask more than one question when students must first
take in a lot of information.
Multiple Choice – Alternatives


Make sure there is only one unambiguously right answer.
Create plausible distracters:
• Common misconceptions
• Common mistakes
• Technical jargon
Do not use emphasis (bold, capital letters).
Avoid repeating words from the stem.
Use “none of the above” with caution.
List choices in a logical order.
All of the following are correct procedures for
putting out a fire in a pan on the stove except:
a. Do not move the pan.
*b. Pour water into the pan.
c. Slide a fitted lid onto the pan.
d. Turn off the burner controls.
California:
a. Contains the tallest mountain in the United
States
b. Has an eagle on its state flag.
c. Is the second largest state in terms of area.
*d. Was the location of the Gold Rush of 1849.
What is the main reason so many people
moved to California in 1849?
a. California land was fertile, plentiful, and
inexpensive.
*b. Gold was discovered in central California.
c. The east was preparing for a civil war.
d. They wanted to establish religious settlements.
When conducting library research in
education, which of the following is the best
source to use for identifying pertinent journal
articles?
a. A Guide to Sources of Educational Information.
*b. Current Index to Journals in Education.
c. Resources in Education
d. The International Encyclopedia of Education.
When conducting library research in
education, which of the following is the best
source to use for identifying pertinent journal
articles?
a. A Guide to Sources of Educational Information.
*b. Education Index.
c. Resources in Education.
d. The International Encyclopedia of Education.
Suppose you are a mathematics professor who wants
to determine whether or not your teaching of the unit
on probability has had a significant effect on your
students. You decide to analyze their scores from a
test they took before the instruction and their scores
from another exam taken after the instruction. Which
of the following t-tests is appropriate to use in this
situation?
*a. Dependent samples.
b. Heterogeneous samples.
c. Homogeneous samples.
d. Independent samples.
When analyzing your students’ pretest and
posttest scores to determine if your teaching
has had a significant effect, an appropriate
statistic to use is the t-test for:
*a. Dependent samples.
b. Heterogeneous samples.
c. Homogeneous samples.
d. Independent samples.
How long does an annual plant generally live?
a. It dies after the first year.
b. It lives for many years.
c. It lives for more than one year.
d. It needs to be replanted each year.
How long does an annual plant generally live?
a. Only one year.
b. Only two years.
c. Several years.
Constructed Response
A constructed response requires students to use
creativity, organization skills, and logic to develop an
answer.
Essay
Short answer
Diagram
Performance
Fill in the blank – no word bank
Four Elements
1. Task
What students should do.

2. Evaluative Criteria
The essential ingredients of the skill being measured.

3. Quality Definitions
The description of how a student’s performance on each
evaluative criterion would look at each scoring level.

4. Scoring Strategy
How a final score will be derived for the task.
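
A scoring strategy can be written down as a small calculation so that every scorer derives the final score the same way. The sketch below is only an illustration: it assumes a hypothetical task with three evaluative criteria, each worth 0 to 2 points, and a simple sum as the scoring strategy. The criterion names and point ranges are assumptions, not part of the district's materials.

```python
# Hypothetical rubric: criterion name -> maximum points (assumed for illustration).
MAX_POINTS = {"claim": 2, "evidence": 2, "organization": 2}


def final_score(criterion_scores):
    """Derive a final task score by summing the points earned on each criterion."""
    missing = set(MAX_POINTS) - set(criterion_scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name, points in criterion_scores.items():
        if name not in MAX_POINTS:
            raise ValueError(f"unknown criterion: {name}")
        if not 0 <= points <= MAX_POINTS[name]:
            raise ValueError(f"{name} score {points} is outside 0-{MAX_POINTS[name]}")
    return sum(criterion_scores.values())


# One student's scores on each criterion: 2 + 1 + 2 = 5 of 6 possible points.
print(final_score({"claim": 2, "evidence": 1, "organization": 2}))
```

A weighted sum or a "lowest criterion" rule would be equally valid strategies; what matters is that the rule is stated before scoring begins.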
Smarter Balanced Assessments
Analyze/Integrate Information
2: The response gives sufficient evidence of the ability to
gather, analyze and integrate information within and
among multiple sources of information.
1: The response gives limited evidence of the ability to
gather, analyze and integrate information within and
among multiple sources of information.
0: A response gets no credit if it provides no evidence of
the ability to gather, analyze and integrate information
within and among multiple sources of information.
Smarter Balanced Consortium
http://www.smarterbalanced.org/wordpress/wp-content/uploads/2012/05/TaskItemSpecifications/EnglishLanguageArtsLiteracy/ELARubrics.pdf
Rubistar
http://rubistar.4teachers.org/
Lansing Central School District
Criteria and Assurances for Superintendent Approval
Of Local End-of-Course Assessments
Teacher(s) ________________________________
Course Name __________________________________
Assurances of Validity – Please check all boxes.
1. I have reviewed this assessment to determine that each item measures the learning target that I intend to measure.
2. I have reviewed this assessment to determine that, when taken together, these items assess curriculum that I have prioritized as important to students’ adult
lives, important to students’ future learning, important to students’ learning in other content areas, or relevant to NYS Assessments.
Assurances of Reliability – Please check all boxes.
3. I have reviewed this assessment, and I expect that this collection of assessment items will reliably assess student learning for the affiliated course. I.e., student
scores on this assessment should correctly classify student learning (e.g., as developing, proficient, mastered, advanced).
4. I will immediately report any indications that student scores on this assessment do not reliably describe levels of student learning, if such irregular results occur.
5. This assessment includes scoring materials (e.g., answer key, rubric(s), points distribution for short answers, etc.) that support consistent scoring of all students’
assessments.
Assurances of Bias Review – Please check all boxes.
6. I have reviewed this assessment for assessment bias and have taken all measures available to me to minimize bias in every item.
Assurances of Rigor and Comparability – Please check box 7 and either 8 or 9.
7. I have reviewed this assessment and assure that it is aligned with the appropriate NYS Learning Standards, including grade-level aligned CCLS Literacy Standards
for Social Studies or for Technical Subjects.
8. I have reviewed this assessment for rigor, using the levels in Bloom’s Taxonomy, and assure that these items address a range of thinking skills, but most items
require students to work above Bloom’s Comprehension level. OR
9. I assure that the assessment requires students to deeply comprehend complex grade level text (as defined in the CCLS) and to demonstrate that comprehension
through informational or argumentative writing using text-based references. (Grades K-2 tasks may require narrative writing.)
The signature(s) below indicate that the end-of-course assessment for the course listed above has been reviewed and revised for validity, reliability, bias, and rigor.
Teacher Signature(s) ________________________________________
Date _____________________
________________________________________
Date _____________________
The signature below indicates Superintendent approval for the end-of-course assessment listed above.
Superintendent
(or designee) Signature
______________________________________
Date _____________________
Validity and Reliability
Validity = Accurate Inferences
Reliability = Consistent Results
A Valid Assessment….
Assurances of Validity
1.
I have reviewed this assessment to determine that each
item measures the learning target that I intend to
measure.
2.
I have reviewed this assessment to determine that,
when taken together, these items assess curriculum that
I have prioritized as important to students’ adult lives,
important to students’ future learning, important to
students’ learning in other content areas, or relevant to
NYS Assessments.
Standards

Statements of what we want students to know and be
able to do.
Learning Targets

The specific knowledge and skills a student must have in
order to master a given standard.
A Reliable Assessment….
Assurances of Reliability
3. I have reviewed this assessment, and I expect that this
collection of assessment items will reliably assess
student learning for the affiliated course. I.e., student
scores on this assessment should correctly classify
student learning (e.g., as developing, proficient, mastered,
advanced).
4. I will immediately report any indications that student
scores on this assessment do not reliably describe levels
of student learning, if such irregular results occur.
5. This assessment includes scoring materials (e.g., answer
key, rubric(s), points distribution for short answers, etc.)
that support consistent scoring of all students’
assessments.
For objective assessments like multiple choice tests, to
increase reliability, you should:



Include enough items.
Allow enough time for students to complete the test.
Standardize administration.
For papers, essays, and projects, to increase reliability, you
should:





Have clear enough directions for students that all are
likely to produce work you can score.
Have a systematic scoring procedure.
Have multiple markers (scorers) when possible.
Use broad categories for classification.
Standardize administration.
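
When two scorers mark the same set of papers, a quick consistency check is the share of papers on which they gave the same score. The sketch below is a minimal illustration with made-up rubric scores; the function name and the data are assumptions, and exact agreement is only one of several possible consistency measures.

```python
def exact_agreement(scores_a, scores_b):
    """Return the fraction of papers on which two scorers gave the same score."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("both scorers must score the same non-empty set of papers")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


# Hypothetical rubric scores (0-2) from two scorers on the same eight essays.
scorer_1 = [2, 1, 0, 2, 1, 1, 2, 0]
scorer_2 = [2, 1, 1, 2, 1, 0, 2, 0]

# The scorers agree on 6 of the 8 essays.
print(f"{exact_agreement(scorer_1, scorer_2):.0%}")  # 75%
```

Papers on which the scorers disagree are good candidates for a second look at the rubric wording.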
Assurances of Bias Review
6. I have reviewed this assessment for assessment bias and
have taken all measures available to me to minimize bias
in every item.
Bias
Anything in the assessment or way the assessment is carried
out that might not allow students to be able to adequately
demonstrate what they know and can do.
• offensive language or topics
• assumed background knowledge
• requiring skills outside of the area being assessed
• noise – typos, poor layout, confusing directions

Topics to Avoid









abortion
abuse of people or animals
contraception
deportation of immigrants
experimentation on people or animals that is dangerous or painful
killing of animals for sport
the occult, witches, ghosts, vampires
pregnancy of human beings
rape
sexual behavior or sexual innuendo
suicide
torture
euthanasia
gun control
climate change caused by human behavior
prayer in school
current or recent partisan political issues, ethnic conflicts, and religious disputes
Topics to be Treated with Care













Accidents and natural disasters
Advocacy
Alcohol, tobacco, and illegal drugs
Animals that are frightening to children
Antisocial, criminal, or inappropriate behaviors
Biographical materials
Dancing
Dangerous activities
Death and dying
Evolution
Family problems
Gambling
Holidays and birthdays
Homelessness and evictions
Immigration
Junk food
Luxuries
Medicines, including diet supplements
Obesity and body-image problems
Personal questions
Religion
Serious illnesses
Slavery
Terrorism, wars, violence, suffering
Looking for bias….


Shaquan helps assemble food packages for poor people at
Christmas. Each box holds 6 cans in a row. There is room
for 4 rows in a box. Write the expression that best
describes the number of cans in one full box.
Unacceptable. The first sentence adds to the reading load
but adds no useful information. The references to
“Christmas” and “poor people” are inappropriate and
unnecessary.
Looking for bias….


Lee’s father and Juan’s father are both policemen.
Unacceptable. Even though both officers are male, “police
officers” is preferred to “policemen” to avoid the
impression that only men are police officers.
Looking for bias….


If one card is taken at random from a deck of playing
cards, what is the probability that the card will be an ace?
Unacceptable. The question assumes knowledge of the
number of aces and the total number of cards in a deck
of playing cards. It is acceptable to ask about probability,
and it is acceptable to use playing cards in math problems.
According to the guideline about gambling, however, it is
not acceptable to assume that test takers have knowledge
of the characteristics of a deck of playing cards.
Looking for bias….


When Ms. Luna pulled her car into the parking garage, the
machine at the gate issued a ticket stamped with the time,
11:30 a.m. When she left the garage that afternoon, her
ticket was stamped with the time she left, 12:15 p.m.
What was the total length of time that Ms. Luna’s car was
in the parking garage?
Unacceptable. The question is very wordy and uses an
unfamiliar context for many children. In addition, “pulled
her car” is an idiom that children may not know.
Looking for bias….


People who drive gas-guzzling SUVs contribute to global
warming.
Unacceptable. This excerpt is a clear violation of the
guideline against advocating for one side in a controversial
situation.
Assurances of Rigor and Comparability
7. I have reviewed this assessment and assure that it is aligned
with the appropriate NYS Learning Standards, including
grade-level aligned CCLS Literacy Standards for Social
Studies or for Technical Subjects.
8. I have reviewed this assessment for rigor, using the levels in
Bloom’s Taxonomy, and assure that these items address a
range of thinking skills, but most items require students to
work above Bloom’s Comprehension level.
****OR****
9. I assure that the assessment requires students to deeply
comprehend complex grade level text (as defined in the
CCLS) and to demonstrate that comprehension through
informational or argumentative writing using text-based
references. (Grades K-2 tasks may require narrative
writing.)
Rigor…
Rigor Definition – APPR Field Guidance
Rigorous means that the locally-selected measure is
aligned to the New York State learning standards or, in
instances where there are no such learning standards that
apply to a subject/grade level, evidence of alignment to
research-based learning standards.
Common Core Learning Standard
7.RP - Ratios & Proportional Relationships
“Use proportional relationships to solve multistep ratio
and percent problems.”
1. 50% of 20:
2. 67% of 81:
3. Shawn got 7 correct answers out of 10 possible answers on his science test. What
percent of questions did he get correct?
4. J.J. Redick was on pace to set an NCAA record in career free throw percentage.
Leading into the NCAA tournament in 2004, he made 97 of 104 free throw attempts.
What percentage of free throws did he make?
5. J.J. Redick was on pace to set an NCAA record in career free throw percentage.
Leading into the NCAA tournament in 2004, he made 97 of 104 free throw attempts.
In the first tournament game, Redick missed his first five free throws. How far did his
percentage drop from before the tournament game to right after missing those free
throws?
6. J.J. Redick and Chris Paul were competing for the best free-throw shooting
percentage. Redick made 94% of his first 103 shots, while Paul made 47 out of 51
shots.
– Which one had a better shooting percentage?
– In the next game, Redick made only 2 of 10 shots while Paul made 7 of 10 shots.
What are their new overall shooting percentages? Who is the better shooter?
– Jason argued that if Paul and J.J. each made the next ten shots, their shooting
percentages would go up the same amount. Is this true? Why or why not?
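
Working the first five items shows how the number of steps grows even though every item addresses the same standard. The sketch below is a rough answer-key calculation, not an official key; the helper name `percent` is an assumption introduced for illustration.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


# Items 1-2: a percent of a number (one step).
item_1 = 50 * 20 / 100
item_2 = 67 * 81 / 100

# Item 3: 7 correct out of 10 (one step).
item_3 = percent(7, 10)

# Item 4: 97 made out of 104 attempts (one step, with context to read through).
before = percent(97, 104)

# Item 5: five more misses means 97 made out of 109 attempts (multistep).
after = percent(97, 104 + 5)
drop = before - after

print(item_1, item_2, item_3)                              # 10.0 54.27 70.0
print(round(before, 2), round(after, 2), round(drop, 2))   # 93.27 88.99 4.28
```

Item 6 adds comparison and justification on top of the same arithmetic, which is where an answer key alone stops being sufficient and a rubric is needed.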
Common Core Learning Standard
Reading Standards for Informational Text
Grade 3
“Determine the main idea of a text; recount the key details and
explain how they support the main idea.”
LITTLE RED RIDING HOOD:
1. What is the main idea?
2. This story is mostly about:
A. Two boys fighting
B. A girl playing in the woods
C. Little Red Riding Hood’s adventures with a wolf
D. A wolf in the forest
3. This story is mostly about:
A. Little Red Riding Hood’s journey through the woods
B. The pain of losing your grandmother
C. Everything is not always what it seems
D. Fear of wolves
In an open-ended question, the rubric
defines the rigor.
In a multiple choice question, the options
define the rigor.
Test Map
A summary of the content and method of assessment.



What are the prioritized standards?
What types of items are included and how many of
each?
How will the items be weighted?
Lansing Test Map
Columns: Category of Assessed Student Learning | Question Type (e.g., 2 credit multiple choice, 4 credit short response, 20 credit essay, etc.) | Number of Questions | Total Credits | Percent of Total Credit Assigned to Area
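
The last two columns of a test map follow directly from the first three. The sketch below shows one way to compute them, assuming a made-up map for a hypothetical math course; the categories, question counts, and credit values are illustrative assumptions only.

```python
# Hypothetical test map rows:
# (category, question type, credits per question, number of questions)
test_map = [
    ("Ratios and proportions", "multiple choice", 2, 10),
    ("Ratios and proportions", "short response", 4, 3),
    ("Expressions and equations", "multiple choice", 2, 8),
    ("Expressions and equations", "essay", 20, 1),
]


def summarize(rows):
    """Return (total credits, percent of total credit) for each category."""
    credits_by_category = {}
    for category, _question_type, credits_each, count in rows:
        earned = credits_each * count
        credits_by_category[category] = credits_by_category.get(category, 0) + earned
    total = sum(credits_by_category.values())
    return {
        category: (credits, round(100 * credits / total, 1))
        for category, credits in credits_by_category.items()
    }


for category, (credits, pct) in summarize(test_map).items():
    print(f"{category}: {credits} credits, {pct}% of total")
```

Comparing the resulting percentages against the prioritized standards is a quick check that the weighting matches what the team decided was most important.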
Some Details



Assessment Item Security
Proctoring/Administrating Exams
Assessment Maps/Blueprints/Specs





NYS Scoring Guides
NYS Test Specifications
NYS Testing Guides
First Assessment due date – January 31, 2013
All students in a course, e.g., 4th Grade ELA, must take the
same assessment.
The End

Next steps = begin work in grade level/department teams
tomorrow

Ask lots of questions