Best Practices in Assessment


Go to www.livetext.com

Click on “Use Visitor Pass”

Enter “9409ACEF” in the Pass Code field and click “Visitor Pass Entry”

Dr. Lance Tomei
• Educational consultant specializing in assessment and accreditation
• Retired Director for Assessment, Accreditation, and Data Management, University of Central Florida (UCF), College of Education and Human Performance
• Former UCF NCATE Coordinator
• Experienced NCATE BOE Team Chair/Designated CAEP Site Team Leader
• Experienced Florida State Site Visit Team Chair
• Former member, FL DOE Student Growth Implementation Committee (SGIC)
• Former member, FL DOE Teacher and Leader Preparation Implementation Committee (TLPIC)
• Former chair, FL DOE TLPIC Site Visit Protocol Subcommittee

• CAEP Standards/Components addressing assessment of candidate learning & other CAEP requirements
• Implications for your assessment system and key assessments
• Discussion on rubric design
• Rubric workshop
• Rubric templates and design strategies
• Summary/reflection

1. Evaluate initial candidates’ progressive acquisition and mastery of knowledge and skills in the following four categories of InTASC standards:
   ▪ The learner and learning
   ▪ Content knowledge
   ▪ Instructional practice
   ▪ Professional responsibility
2. Evaluate advanced candidates’ progressive acquisition and mastery of knowledge and skills specific to their discipline.

Summative assessments should ensure that candidates nearing program completion:

• Apply research and evidence in their practice
• Apply content and pedagogical knowledge in a manner consistent with professional standards
• Demonstrate skills and commitment that afford all P-12 students access to rigorous college- and career-ready standards
• Model and apply technology standards to engage students and enhance learning

. . . Clinical experiences, including technology-enhanced learning opportunities, are structured to have multiple performance-based assessments at key points within the program to demonstrate candidates’ development of the knowledge, skills, and professional dispositions, as delineated in Standard 1, that are associated with . . .

• . . . a positive impact on the learning and development of all P-12 students. (Initial)
• . . . creating a supportive school environment that results in a positive impact on the learning and development of all P-12 students. (Advanced)

• 3.2 Program admission
• 3.3 Professional dispositions/non-academic attributes
• 3.4 The provider creates criteria for program progression and monitors candidates’ advancement from admissions through completion . . . Providers present multiple forms of evidence to indicate candidates’ developing content knowledge, pedagogical content knowledge, pedagogical skills, and the integration of technology in all of these domains.
• 3.5 & 3.6 Program exit

The provider’s quality assurance system relies on relevant, verifiable, representative, cumulative and actionable measures, and produces empirical evidence that interpretations of data are valid and consistent [emphasis added].

At its fall 2014 conference, CAEP announced that its accreditation process will require the early submission of all key assessment instruments (rubrics, surveys, etc.) used by an Educator Preparation Provider (EPP) to generate data provided as evidence in support of CAEP accreditation. Once CAEP accreditation timelines are fully implemented, this will occur three years prior to the on-site visit.

1. Validity and Reliability
2. Relevance
3. Verifiability
4. Representativeness
5. Cumulativeness
6. Fairness
7. Stakeholder Interest
8. Benchmarks
9. Vulnerability to Manipulation
10. Actionability

• Your overall assessment system needs to ensure that you can demonstrate the validity and reliability of your key assessment data as well as your analysis, interpretation, and application of those data to evaluate program impact and support continuous quality improvement.

• The quality of your key assessment instruments will be a critical factor in meeting many components of the new CAEP standards.

• Build a solid arch! (arch image from http://www.bing.com/image)

[Figure: continuous-improvement cycle with stages Plan, Measure, Analyze, Evaluate & Integrate, and Change]

• Who should participate, and who should take the lead?
• Self-selected or directed artifacts (major implications for rubric design)?
• Do formative assessments collectively address all applicable competencies?
• Do summative assessments collectively address all applicable competencies?
• Are formative assessments and summative assessments well-articulated?
• Are key assignments fully aligned with key assessments?
• Can you demonstrate the validity and reliability of your current supporting evidence?

• Designing high-quality rubrics is difficult and time-consuming, but . . .
• Well-designed rubrics enhance teaching and learning, and . . .
• . . . improve validity and inter- and intra-rater reliability in assessing student learning
• Bottom line: good rubrics = good data!

• Enhance student learning outcomes
• Serve as a learning scaffold by clarifying formative and summative learning objectives
• For each target learning outcome, establish critical indicators aligned to applicable standards/competencies (= construct and content validity)
• Facilitate self- and peer-evaluations
• Provide actionable assessment data for individual students

• Provide a consistent and effective framework for key assessments
• Establish clear, concrete performance descriptors for each assessed criterion at each performance level
• Help ensure articulation of formative and summative assessments
• Improve validity and reliability of assessment data
• Produce actionable program-level data (see the sketch below)
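To make that last point concrete, here is a minimal sketch of how criterion-level rubric scores can be aggregated across candidates to surface program-level improvement opportunities. All names and numbers are hypothetical illustrations, not taken from the presentation.

from statistics import mean

def program_level_summary(scores):
    """scores: one {criterion: level_index} dict per candidate, where
    0 = Unsatisfactory ... 3 = Distinguished. Returns the mean level
    per criterion so that weak criteria stand out."""
    criteria = scores[0].keys()
    return {criterion: mean(s[criterion] for s in scores) for criterion in criteria}

# Hypothetical example: three candidates scored on two rubric criteria.
candidates = [
    {"Standards alignment": 2, "Instructional materials": 1},
    {"Standards alignment": 3, "Instructional materials": 1},
    {"Standards alignment": 2, "Instructional materials": 2},
]
print(program_level_summary(candidates))

A criterion with a consistently low mean (here "Instructional materials") flags a program-level improvement opportunity rather than an individual remediation need, which is the sense in which well-designed rubrics produce actionable program-level data.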

The AAC&U VALUE rubrics are available online at http://www.aacu.org/value/rubrics/index_p.cfm

1. Rubric and the assessed activity or artifact are well-articulated.
2. Rubric has construct validity (e.g., standards-aligned) and content validity (rubric criteria represent all critical indicators for the competency to be assessed).
3. Each criterion assesses an individual construct:
   • No overly broad criteria
   • No double- or multiple-barreled criteria

Criterion: Assessment
Unsatisfactory: No evidence of review of assessment data. Inadequate modification of instruction. Instruction does not provide evidence of assessment strategies.
Developing: Instruction provides evidence of alternative assessment strategies. Some instructional goals are assessed. Some evidence of review of assessment data.
Proficient: Alternative assessment strategies are indicated (in plans). Lessons provide evidence of instructional modification based on learners’ needs. Candidate reviews assessment data to inform instruction.
Distinguished: Candidate selects and uses assessment data from a variety of sources. Consistently uses alternative and traditional assessment strategies. Candidate communicates with learners about their progress.

Criterion: Alignment to Applicable State P-12 Standards and Identification of Appropriate Instructional Materials
Unsatisfactory: Lesson plan does not reference P-12 standards or instructional materials.
Developing: Lesson plan references applicable P-12 standards OR appropriate instructional materials, but not both.
Proficient: Lesson plan references applicable P-12 standards AND identifies appropriate instructional materials.

4. To enhance reliability, performance descriptors should:
   • Provide concrete/objective distinctions between performance levels (no overlap between levels)
   • Collectively address all possible performance levels (no gap between levels)
   • Eliminate or minimize double/multiple-barreled narratives (exception: progressive addition of barrels)
   (A sketch that mechanically checks the no-overlap and no-gap properties follows the examples below.)

Criterion: Communicating Learning Activity Instructions to Students
Unsatisfactory: Makes two or more errors when describing learning activity instructions to students
Developing: Makes no more than two errors when describing learning activity instructions to students
Proficient: Makes no more than one error when describing learning activity instructions to students
Distinguished: Provides complete, accurate learning activity instructions to students

Criterion: Instructional Materials
Unsatisfactory: Lesson plan does not reference any instructional materials
Developing: Instructional materials are missing for one or two parts of the lesson
Proficient: Instructional materials for all parts of the lesson are listed and directly relate to the learning objectives.
Distinguished: Instructional materials for all parts of the lesson are listed, directly relate to the learning objectives, and are developmentally appropriate.

Criterion: Alignment to Applicable State P-12 Standards and Identification of Appropriate Instructional Materials
Unsatisfactory: Lesson plan does not reference P-12 standards or instructional materials.
Developing: Lesson plan references applicable P-12 standards OR appropriate instructional materials, but not both.
Proficient: Lesson plan references applicable P-12 standards AND identifies appropriate instructional materials.
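The overlap in the “Communicating Learning Activity Instructions” example (“two or more errors” and “no more than two errors” both cover exactly two) can be caught mechanically once descriptors are reduced to numeric ranges. Here is a minimal sketch of such a check; the encoding and names are hypothetical illustrations, not from the presentation.

def check_bands(bands):
    """bands: (label, low, high) inclusive error-count ranges, ordered
    from fewest to most errors. Returns overlap/gap findings."""
    findings = []
    for (name_a, lo_a, hi_a), (name_b, lo_b, hi_b) in zip(bands, bands[1:]):
        if lo_b <= hi_a:  # next level starts inside the previous one
            findings.append(f"overlap: '{name_a}' and '{name_b}' both allow "
                            f"{max(lo_a, lo_b)}-{min(hi_a, hi_b)} errors")
        elif lo_b > hi_a + 1:  # some error counts match no level at all
            findings.append(f"gap: {hi_a + 1}-{lo_b - 1} errors match no level")
    return findings

# The "Communicating Learning Activity Instructions" criterion above,
# encoded as inclusive error-count ranges per level:
as_written = [
    ("Distinguished", 0, 0),    # "complete, accurate" = zero errors
    ("Proficient", 0, 1),       # "no more than one error" includes zero
    ("Developing", 0, 2),       # "no more than two errors" includes zero to two
    ("Unsatisfactory", 2, 10),  # "two or more errors" includes exactly two
]
for finding in check_bands(as_written):
    print(finding)
# Prints three overlaps, including the 'Developing'/'Unsatisfactory' clash at
# exactly two errors. Rewriting the descriptors as 0 / exactly 1 / exactly 2 /
# 3-or-more errors yields no findings: no overlaps and no gaps.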

5. Rubric contains no unnecessary performance levels (e.g., multiple levels of mastery).

Common problems:
• Use of subjective terms to differentiate performance levels
• Use of performance level labels or surrogates
• Use of inconsequential terms to differentiate performance levels
• Worst case scenario: failure to maintain the integrity of target learning outcomes

Criterion: Knowledge of Laboratory Safety Policies
Unsatisfactory: Candidate shows a weak degree of understanding of laboratory safety policies
Developing: Candidate shows a relatively weak degree of understanding of laboratory safety policies
Proficient: Candidate shows a moderate degree of understanding of laboratory safety policies
Distinguished: Candidate shows a high degree of understanding of laboratory safety policies

Criterion: Analyze Assessment Data
Unacceptable: Fails to analyze and apply data from multiple assessments and measures to diagnose students’ learning needs, inform instruction based on those needs, and drive the learning process in a manner that documents acceptable performance.
Acceptable: Analyzes and applies data from multiple assessments and measures to diagnose students’ learning needs, informs instruction based on those needs, and drives the learning process in a manner that documents acceptable performance.
Target: Analyzes and applies data from multiple assessments and measures to diagnose students’ learning needs, informs instruction based on those needs, and drives the learning process in a manner that documents targeted performance.

Criterion: Quality of Writing
Unsatisfactory: Poorly written
Developing: Satisfactorily written
Proficient: Well written
Distinguished: Very well written

Criterion: Alignment of Assessment to Learning Outcome(s)
Unacceptable: The content of the test is not appropriate for this learning activity and is not described in an accurate manner.
Acceptable: The content of the test is appropriate for this learning activity and is described in an accurate manner.
Target: The content of the test is appropriate for this learning activity and is clearly described in an accurate manner.

Criterion: Alignment to Applicable State P-12 Standards
Unsatisfactory: No reference to applicable state P-12 standards
Developing: Referenced state P-12 standards are not aligned with the lesson objectives and are not age appropriate
Proficient: Referenced state P-12 standards are age appropriate but are not aligned to the learning objectives.
Distinguished: Referenced state P-12 standards are age appropriate and are aligned to the learning objectives.

6. Resulting data are actionable:
   • To remediate individual candidates
   • To help identify opportunities for program quality improvement

Based on the first four attributes, the following meta-rubric has been developed for use in evaluating the efficacy of other rubrics:

Criterion: Rubric Alignment to Assignment
Unsatisfactory: The rubric includes multiple criteria that are not explicitly or implicitly reflected in the assignment directions for the learning activity to be assessed.
Developing: The rubric includes one criterion that is not explicitly or implicitly reflected in the assignment directions for the learning activity to be assessed.
Mastery: The rubric criteria accurately match the performance criteria reflected in the assignment directions for the learning activity to be assessed.

Criterion: Comprehensiveness of Criteria
Unsatisfactory: More than one critical indicator for the competency or standard being assessed is not reflected in the rubric.
Developing: One critical indicator for the competency or standard being assessed is not reflected in the rubric.
Mastery: All critical indicators for the competency or standard being assessed are reflected in the rubric.

Criterion: Integrity of Criteria
Unsatisfactory: More than one criterion contains multiple, independent constructs (similar to a “double-barreled” survey question).
Developing: One criterion contains multiple, independent constructs; all other criteria each consist of a single construct.
Mastery: Each criterion consists of a single construct.

Criterion: Quality of Performance Descriptors
Unsatisfactory: Performance descriptors are not distinct (i.e., mutually exclusive) AND collectively do not include all possible learning outcomes.
Developing: Performance descriptors are not distinct (i.e., mutually exclusive) OR collectively do not include all possible learning outcomes.
Mastery: Performance descriptors are distinct (mutually exclusive) AND collectively include all possible learning outcomes.

Each workshop participant was asked to bring a rubric currently in use in their program, along with the assignment instructions for the artifact or activity to be assessed using that rubric. During the workshop, each participant should:

1. Evaluate your rubric using the meta-rubric as a guide.
2. Identify any perceived opportunities to improve the quality of your rubric and/or assignment instructions.
3. Determine what actions you would take to improve the quality of your rubric, if any.
4. At the conclusion of individual work, report out to the group at least one finding regarding your rubric, along with your thoughts about how you might respond to that finding.


Common rubric design pitfalls:
• Including more performance levels than are needed to accomplish the desired assessment task
• Using double- or multiple-barreled criteria or performance descriptors
• Failing to include all possible performance outcomes
• Using overlapping performance descriptors
• Failing to include performance descriptors, or including descriptors that are simply surrogates for performance level labels

• In designing rubrics for key formative and summative assessments, think about both effectiveness and efficiency
• Identify critical indicators for target learning outcomes and incorporate them into your rubric
• Limit the number of performance levels to the minimum needed to meet your assessment requirements
• Populate the target learning outcome column first (Proficient, Mastery, etc.)
• Make clear (objective/concrete) distinctions between performance levels; avoid subjective terms in performance descriptors
• Be sure to include all possible outcomes
• Don’t leave validity and reliability to chance:
   • Your most knowledgeable faculty should lead program-level assessment work; engage stakeholders; align key assessments to applicable standards/competencies; focus on critical indicators
   • Train faculty on the use of rubrics
   • Conduct and document inter-rater reliability and fairness studies (a minimal sketch follows)
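For that last point, a simple way to document inter-rater reliability is to compute percent agreement and Cohen’s kappa for two raters who scored the same artifacts. The sketch below is a hypothetical illustration, not part of the presentation; the scores are invented.

from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Fraction of artifacts on which both raters chose the same level."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same level,
    # given each rater's observed level frequencies.
    p_e = sum((counts_a[lvl] / n) * (counts_b[lvl] / n)
              for lvl in set(rater_a) | set(rater_b))
    return (p_o - p_e) / (1 - p_e)

# Two raters scoring ten artifacts on a four-level rubric (0-3):
a = [2, 3, 1, 2, 2, 0, 3, 2, 1, 2]
b = [2, 3, 1, 3, 2, 0, 3, 2, 2, 2]
print(f"agreement = {percent_agreement(a, b):.2f}, kappa = {cohens_kappa(a, b):.2f}")
# agreement = 0.80, kappa = 0.70

Kappa discounts the agreement expected by chance, so it is a more conservative reliability measure than raw percent agreement; values closer to 1 indicate stronger inter-rater reliability.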