

Legal Issues in Testing (Part One)

A legal framework for the appropriate and defensible use of Caveon Data Forensics™

June 21, 2006

Bob Hunt, J.D., Ph.D.
Vice President of Legal Services and General Counsel, Caveon Test Security

© Caveon, LLC

Overview …

- What is “data forensics”?
- The need for competitive fairness in testing
- Adequacy of proctoring
- Legal rights to use statistics to:
  - monitor testing behavior
  - cancel scores and take other actions
- Examples of Concepts

Fairness in Testing

- Does the test objectively evaluate relevant knowledge and skills?
- Does the test provide consistent results?
- Is the testing process equally fair?

Why is Competitive Fairness Important?

- Without competitive fairness, tests are simply biased rituals
- The validity and reliability of test results hinge on fair competition

Consistent results (reliability); test relevance (validity); relational fairness (level playing field)

The (in)Adequacy of Proctoring

- Cheating behaviors are increasingly “invisible” to proctors
  - Theft of test material
  - Wider use of stolen test content obtained from the Internet
  - Wireless communications
- Human proctoring is hampered by:
  - Scope: one to many
  - Errors of observation, memory and bias

Data Forensics Promotes Competitive Fairness

- More comprehensive (no limits on effectiveness due to volume of cheating)
- Fewer errors of observation, memory and bias (need to elaborate why scientific error is more trustworthy)
- Flags “trusted” test results
- Reinforces other detection efforts
  - Confirm or clear other suspicions

Caveon Data Forensics

- “Low-resolution” analyses
  1. Gain-score
  2. Response aberrance
  3. Latency aberrance
- “High-resolution” analyses
  1. Copying indices
  2. Erasures
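
As an illustration of the simplest analysis listed above, a gain-score check can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Caveon's method: the function name `flag_gain_scores`, the z-score rule, the 2.5 threshold and the cohort data are all hypothetical and exist only for this example.

```python
from statistics import mean, stdev

def flag_gain_scores(score_pairs, z_threshold=2.5):
    """Return (examinee_id, gain, z) for gains far above the cohort's typical gain.

    score_pairs maps a hypothetical examinee id to (earlier_score, later_score).
    The z-score rule and the threshold are illustrative assumptions.
    """
    gains = {eid: later - earlier for eid, (earlier, later) in score_pairs.items()}
    if len(gains) < 2:
        return []  # a spread cannot be estimated from fewer than two gains
    mu = mean(gains.values())
    sigma = stdev(gains.values())
    if sigma == 0:
        return []  # every gain is identical, so nothing stands out
    flagged = []
    for eid, gain in gains.items():
        z = (gain - mu) / sigma
        if z >= z_threshold:
            flagged.append((eid, gain, round(z, 2)))
    return flagged

# Invented cohort: eleven ordinary retakes and one very large gain.
cohort = {
    "E01": (900, 910), "E02": (1010, 1030), "E03": (850, 880),
    "E04": (1200, 1200), "E05": (990, 980), "E06": (1100, 1120),
    "E07": (760, 770), "E08": (940, 970), "E09": (1050, 1090),
    "E10": (880, 900), "E11": (1150, 1160), "E12": (700, 1300),
}
flagged = flag_gain_scores(cohort)
print(flagged)
```

A flag of this kind is only a prompt for closer review, not a finding of misconduct.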

Questions?


The $1,000,000 Legal Question

Can statistics improve test security if they don’t provide “first-hand” evidence of cheating?


Answer: Cheating is Overrated

- Testing programs have broad authority to manage the validity of test results
- Courts only require a general “lack of confidence” in a test result
- Lots of (other) reasons to lack confidence in test results:
  - Disruption
  - Illness
  - Technology issues
  - Mishandling, etc.

Validity vs. Cheating

“We are satisfied that the relevant public and private interests are fairly accommodated by a procedure which permits ETS to cancel scores upon an adequate showing of substantial question as to their validity, without any necessity for a showing of actual cheating or other misconduct.”

Scott v. ETS (N.J. Sup. 1991)

Courts Understand the Need for a Broader Solution

“To demand that ACT prove by eyewitness testimony that an individual cheated before invalidating a score would undermine ACT's primary function of providing colleges with scores that are highly reliable.”

“ACT could not possibly catch every student who cheats on its exams if it had to produce an eyewitness to confirm every instance of misconduct.”

Langston v. ACT (11th Cir. 1989)

Question: Does that Mean that Test Scores can be Cancelled for ANY Reason?

Answer: Nearly

- Testing programs enjoy broad discretion in evaluating the validity of test scores
- Score cancellations based on statistical evidence have been upheld
- Courts that have addressed the issue directly have stated that it is not necessary to prove wrongdoing

Implement Uncertainty!

“Your scores may be classified as indeterminate if the scores are at or above the passing level and the USMLE program cannot certify that they represent a valid measure of your knowledge or competence as sampled by the examination.”

U.S. Medical Licensing Board

- Test retakes replace committee reviews, hearings, etc. as the chief due process opportunity

Questions?

Case Studies:

- Scott v. ETS: Public/Constitutional law
- Murray v. ETS: Private/Contract law

Scott v. ETS (N.J. Sup. 1991)

- National Teachers Examination
  - administered by ETS for the State of New Jersey
- Decided on constitutional grounds

Scott v. ETS (cont.)

- ETS detected a score gain of 42 points between Scott’s second and third attempts.
- ETS then detected similarity between Scott’s test responses and those of another examinee (not seated near her).
- ETS cancelled Scott’s score but offered her an opportunity to retest.

Scott v. ETS (cont.)

The issue: Whether Scott had been denied due process by ETS’s failure to prove that she had actually cheated.

Scott v. ETS (cont.)

The Decision:

“We are satisfied that the relevant public and private interests are fairly accommodated by a procedure which permits ETS to cancel scores upon an adequate showing of substantial question as to their validity, without any necessity for a showing of actual cheating or other misconduct.”

Scott v. ETS (cont.)

The “relevant public and private interests”:

- ETS: accuracy of its test results and predictions
- Test-takers: no unfair advantage
- School officials: reliability
- Public: reserving teacher certification for those who fulfill its requirements

Scott v. ETS (cont.)

Meaning of “Substantial Question of validity”:

“Here ETS questioned plaintiff's scores on the basis of a statistical analysis showing hardly more than a 4 in 10 million chance that they were fairly earned … That gave ETS, and would give any other observer, substantial grounds for doubting the reliability of the scores.”
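
To make a figure like “4 in 10 million” concrete, the chance of an unusually high number of matching responses can be sketched as a binomial tail probability. This is a minimal sketch under stated assumptions, not the analysis ETS ran in Scott: it assumes each item matches independently with the same probability, and the function name `chance_of_matches` and every number below are hypothetical.

```python
from math import comb

def chance_of_matches(n_items, n_matches, p_match):
    """P(X >= n_matches) for X ~ Binomial(n_items, p_match).

    Assumption: each of the n_items responses matches independently
    with the same probability p_match. Real answer-similarity indices
    model this far more carefully.
    """
    if not 0 <= n_matches <= n_items:
        raise ValueError("n_matches must be between 0 and n_items")
    if not 0.0 <= p_match <= 1.0:
        raise ValueError("p_match must be a probability")
    return sum(
        comb(n_items, k) * p_match**k * (1 - p_match) ** (n_items - k)
        for k in range(n_matches, n_items + 1)
    )

# Hypothetical case: 30 matching answers on 40 items when a match
# is expected by chance on one item in four.
p = chance_of_matches(n_items=40, n_matches=30, p_match=0.25)

# Compare against the "4 in 10 million" level quoted by the court.
is_rarer_than_quoted = p < 4 / 10_000_000
print(p, is_rarer_than_quoted)
```

A small probability of this kind supports a “substantial question” about validity. It does not show how the match arose.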

Case Study: Murray v. ETS (5th Cir. 1999)

- U. Texas El-Paso basketball recruit
- Needed an SAT score of 820 (of 1600) to receive a scholarship
- Scored 700 on his first attempt in March
- Enrolled in “Testbusters”
- Scored 1300 in June

Murray v. ETS (cont.)

Test-use agreement:

“ETS reserves the right to cancel any test score if … the student engages in misconduct, if there is a testing irregularity, or if ETS believes there is a reason to question the score's validity.”

SAT I Registration Bulletin

Murray v. ETS (cont.)

The Decision:

“The only contractual duty ETS owed to Murray was to investigate the validity of Murray's scores in good faith.”

Good faith means:

- Having a reason to lack confidence in a test result (from statistics or some other source)
- Providing an opportunity to retake the test and to present other evidence supporting scores
- Allowing independent review

Murray v. ETS (cont.)

“ETS dutifully fulfilled its contract [...] by following established procedures for determining the validity of questionable scores.”

Conclusions

- Statistical analyses of test data like Caveon Data Forensics provide powerful test security information
- Managing cheating with test-use agreements, statistics and retakes is within the legitimate power of testing programs to manage the “validity” of their tests

THANK YOU!

Bob Hunt Caveon Test Security [email protected]
