Transcript Document
Technology to mark scripts
Scanners read the multiple-choice responses and store the data in a database. The software then scores the responses.

Technology to mark scripts
ETS
The Criterion℠ Online Essay Evaluation Service automatically evaluates essay responses using the e-rater and Critique writing analysis tools. E-rater® gives holistic scores for essays. Critique™ provides real-time feedback about grammar, usage, mechanics and style, and organization and development. C-rater™ offers automated analysis of conceptual information in short-answer, free responses.
FILE 4 examens-leacock.pdf (highlighted pages)

Using automated marking
• multiple-choice marking done by machine
• automated short-answer marking (character recognition)
• process-based marking (increasing intelligence in the marking engine and redesign of the tests)

Technology to mark scripts
E-marking by human raters
What is e-marking?
The story of the Assessment and Qualifications Alliance (AQA)
First e-marking in January 2005 (83 GCSE papers)
General / Expert marking (overseas?)

Technology to mark scripts
AQA
“This summer we intend to use electronic marking for 2 million scripts involving 2812 examiners in 102 GCSE papers and eight GCE papers. A full list is provided in the attached update to the questions and answers initially provided last spring”. (2007)
(My personal involvement – describe how it works, table leaders, seeded items, micro-management etc.)

Technology to mark scripts
AQA
An alternative: automated marking. Key in all short responses (double entry) and then decide which are correct.
Electronic mark collection
E-marking research
AQA: we have yet to establish that examiners are able to assess long pieces of continuous writing as effectively on screen as they are able on paper.

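The "seeded items" mentioned in the AQA notes above can be illustrated with a short sketch. It assumes that a seeded item is a response whose definitive mark was agreed in advance, and that a marker is flagged when too many of their seed marks fall outside a tolerance. The names `SeedResult` and `flag_aberrant_markers`, the data layout, and both thresholds are hypothetical and are not taken from AQA's or any other system.

```python
from dataclasses import dataclass


@dataclass
class SeedResult:
    marker_id: str
    item_id: str
    awarded: int     # mark the marker gave to the seeded response
    definitive: int  # mark agreed in advance (assumed to exist for every seed)


def flag_aberrant_markers(results, tolerance=1, max_miss_rate=0.25):
    """Return sorted marker ids whose share of out-of-tolerance seeds
    exceeds max_miss_rate. Both thresholds are illustrative assumptions."""
    totals = {}
    misses = {}
    for r in results:
        totals[r.marker_id] = totals.get(r.marker_id, 0) + 1
        if abs(r.awarded - r.definitive) > tolerance:
            misses[r.marker_id] = misses.get(r.marker_id, 0) + 1
    return sorted(
        marker
        for marker, seen in totals.items()
        if misses.get(marker, 0) / seen > max_miss_rate
    )


# Hypothetical data: M1 stays within tolerance, M2 misses two of four seeds.
sample = [
    SeedResult("M1", "q1", 3, 3),
    SeedResult("M1", "q2", 4, 5),
    SeedResult("M1", "q3", 2, 2),
    SeedResult("M1", "q4", 6, 6),
    SeedResult("M2", "q1", 0, 3),
    SeedResult("M2", "q2", 5, 5),
    SeedResult("M2", "q3", 6, 2),
    SeedResult("M2", "q4", 6, 6),
]
flagged = flag_aberrant_markers(sample)
```

In a live system the flag would presumably trigger a review by a senior examiner, not an automatic stop; that workflow is outside this sketch.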
Using on-screen marking
• real-time quality marking, through early detection and remediation of aberrant marking;
• random distribution of scripts and items to markers;
• specialisation of markers in a limited number of items;
• reduction of clerical errors, because the computer sums the marks;
• elimination of paper distribution;
• greater security (no papers lost or compromised); and
• item-level data available for diagnosis / reporting.

Using on-screen marking
Easy to use
Less bureaucracy
(The markers have been trained)

Using on-screen marking
Remote Standardization: In face-to-face meetings, all examiners must work through the standardisation material at the pace determined by the Principal Examiner. The online system allows each individual to work at their own pace, revisiting those aspects they find most challenging. Instant feedback at individual item level is provided, and the examiner can refer to this at any point during the marking period.
E-marking = on-screen marking

Technology to mark scripts
ETS Online Scoring Network (OSN)
ETS employs raters to evaluate essays online. Prospective raters are trained to apply scoring criteria and must complete a certification test. Raters begin each session by practice scoring at least one set of essays. If questions arise during calibration or actual scoring, the rater and scoring leader can telephone each other to discuss issues.

Quality of marking
“Quality control is a formal systematic process designed to ensure that expected quality standards are achieved during scoring, equating, and reporting of test scores”.
Allalouf, A. (2007). Quality control procedures in the scoring, equating, and reporting of test scores. Educational Measurement: Issues and Practice, 26 (1), 36-43.

Quality of marking
The following quotation shows the effect of time pressure on staff. It is attributed to Joanne M. 
Lenke, the former president of Harcourt, in referring to a huge error of judgment that affected some 257,000 examinees in 1999. “[The error] might have been caught if the company had more than two days to analyze data from 4.3 million test forms . . . before Wednesday’s deadline for passing results in the Internet” (Colvin & Groves, 1999; Rhoades & Madaus, 2003).

Quality of marking
In May 2001, the ‘New Yorker Times’ published an article with the title “Speed over accuracy?”

Quality of marking
USA: On September 15, 1999, days after school started, McGraw Hill admitted that the same error that affected Tennessee scores had also incorrectly lowered New York City students’ percentile rankings at the lower end of the scale. Thus, 8,668 children whose correct scores were above the cutoff had mistakenly been compelled to go to summer school.

Quality of marking
Scotland: In the summer of 2000, exam scores arrived late, and inaccuracies were reported. By mid-August, over 5,000 potential university students had received incomplete or inaccurate results. University placements (admittance decisions) were down by almost 9%, … students might not be granted their first choice

Quality of marking
In 2000, Pearson's NCS subsidiary made a scoring error on the Minnesota high school graduation test that wrongly denied 50 students the diplomas they had earned. After a Minnesota court concluded that "the error was preceded by years of quality control problems at NCS" due to a "culture, emphasizing profitability and cost-cutting," Pearson created a settlement pool of $7 million for test takers and paid $16,000 to each of the seniors who missed graduation.
Schaeffer, R. (2006). Testimony of Robert Schaeffer before the New York Senate Higher Education Committee. 
Retrieved from http://www.fairtest.org/univ/SAT_Error_testimony.html

Consistency of marking
Double marking: Cyprus, Slovenia, Greece, Australia, New Zealand, Russia
In England, PricewaterhouseCoopers (2005) reported that the awarding bodies double-mark an average of 12% of scripts as part of their quality assurance procedures.

Consistency of marking
Disagreement between markers. Why it happens.
Martin, D. (2005). Report on the performance of the New Zealand Qualifications Authority in the delivery of secondary school qualifications. State Services Commission. Retrieved from http://www.ssc.govt.nz/upload/downloadable_files/part-two-reviewnzqa.pdf on 13/2/2007. (code 596, check page 49)
Discrepancy resolution techniques
Johnson, R. L., Penny, J., Fisher, S., & Kuhs, T. (2003). Score resolution: an investigation of the reliability and validity of resolved scores. Applied Measurement in Education, 16 (4), 299-322.

Consistency of marking
The QCS Test (Queensland Core Skills Test)
Writing Task paper: Each response is marked at least three times by different markers working independently. Monitoring marker consistency.
Short Response paper: Each candidate response is marked at least twice by different markers working independently (referee marking). Monitoring marker consistency.
Multiple Choice papers are marked electronically.

Assessment for learning
In classrooms where assessment for learning is practiced, students know at the outset of a unit of study what they are expected to learn.

Assessment for learning
The teacher will work with the student to understand what she or he knows about the topic as well as to identify any gaps or misconceptions.

Assessment for learning occurs at all stages of the learning process.

Demonstration of Reports
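
The double-marking and referee-marking procedures listed under "Consistency of marking" can be sketched in code. The rule below is an assumption made purely for illustration: two independent marks are averaged when they agree within a tolerance; otherwise the response is sent to a referee, whose mark is averaged with the closer of the two original marks. It is not the documented procedure of the QCS Test or of any awarding body named in this transcript, and the function name `resolve_score` and the default tolerance are hypothetical.

```python
def resolve_score(first, second, tolerance=1, referee=None):
    """Resolve two independent marks for one response.

    Returns (score, needs_referee). The resolution rule is an
    illustrative assumption, not a documented procedure.
    """
    # Marks agree closely enough: report their mean.
    if abs(first - second) <= tolerance:
        return (first + second) / 2, False
    # Marks disagree and no referee mark is available yet.
    if referee is None:
        return None, True
    # Average the referee mark with whichever original mark is closer to it
    # (ties go to the first marker).
    closer = first if abs(first - referee) <= abs(second - referee) else second
    return (referee + closer) / 2, False


agreed = resolve_score(4, 5)
pending = resolve_score(2, 6)
refereed = resolve_score(2, 6, referee=5)
```

Other resolution rules, such as taking the referee's mark alone, would change only the last two lines of the function.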