
Usability of CAPTCHAs Or “usability issues in CAPTCHA design”

Jeff Yan

School of Computing Science, Newcastle University, UK
(Joint work with Ahmad Salah El Ahmad)

Apology

- 2nd time to miss SOUPS …
- nth (n > 2) time to be unable to present my paper …
- All due to the same problem:
  - A US visit visa! (I started my application in April and have not heard the result yet …)

SOUPS’08 (CMU, July 2008)

Does this man look like a terrorist?! ;-)

CAPTCHA

- Why was it invented?
  - Ask anyone at CMU, or read the cartoon
- Automated Turing tests
  - that computers cannot pass, but humans can
- Almost standard security technology (e.g. for anti-spam)
  - widespread application on commercial websites

Main CAPTCHAs

- Text-based schemes
  - typically require users to solve a text recognition task
  - the most widely deployed
- Sound-based schemes
  - typically require users to solve a speech recognition task
- Image-based schemes
  - typically require users to perform an image recognition task
  - Example: Microsoft’s Asirra

This paper is about understanding how to design usable and robust CAPTCHAs, with a focus on usability

- Isn’t that … CAPTCHAs with poor usability should not exist by definition?
  - Yes, but … many deployed CAPTCHAs, including famous ones, are still not that usable …

- How about robustness?
  - When necessary, it will be covered
  - However, our major attacks are discussed elsewhere
    - Low-cost attacks on schemes by Microsoft, Yahoo and Google (CCS’08, to appear)
    - The pixel count attack (ACSAC’07)
      - Breaking CAPTCHAs by counting the number of pixels!
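
The pixel count idea can be illustrated with a toy sketch. It is not the ACSAC’07 implementation: it assumes the challenge has already been segmented into single characters, that a fixed font is used, and that every character has a distinct number of foreground pixels. The glyph bitmaps, the `shear` distortion and all function names are made up for illustration.

```python
# Toy 3x5 glyphs in a made-up font; '#' marks a foreground pixel.
GLYPHS = {
    "H": ["#.#", "#.#", "###", "#.#", "#.#"],  # 11 pixels
    "I": ["###", ".#.", ".#.", ".#.", "###"],  # 9 pixels
    "L": ["#..", "#..", "#..", "#..", "###"],  # 7 pixels
    "O": ["###", "#.#", "#.#", "#.#", "###"],  # 12 pixels
}


def pixel_count(bitmap):
    """Count the foreground pixels in one character bitmap."""
    return sum(row.count("#") for row in bitmap)


def build_lookup(glyphs):
    """Map each pixel count to its character; fail if two characters collide."""
    table = {}
    for char, bitmap in glyphs.items():
        count = pixel_count(bitmap)
        if count in table:
            raise ValueError(f"{char!r} and {table[count]!r} share count {count}")
        table[count] = char
    return table


def shear(bitmap):
    """Shift each row sideways: the shape changes, the pixel count does not."""
    height = len(bitmap)
    return ["." * i + row + "." * (height - 1 - i) for i, row in enumerate(bitmap)]


def recognise(segments, table):
    """Read a segmented challenge by pixel count alone ('?' if unknown)."""
    return "".join(table.get(pixel_count(seg), "?") for seg in segments)


table = build_lookup(GLYPHS)
challenge = [shear(GLYPHS[c]) for c in "HOLI"]
print(recognise(challenge, table))  # prints HOLI
```

The point of the sketch is that a distortion which only moves pixels around leaves the count untouched, so no shape recognition is needed once the characters are separated.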

A framework for CAPTCHA usability

- Distortion
  - distortion techniques employed and their impact on usability
- Content
  - content embedded in CAPTCHA challenges and its impact on usability
  - e.g. how should the content be organized?
- Presentation
  - the way that CAPTCHA challenges are presented and its impact on usability

Distortion | confusing characters

- It is well known that, under common distortions, characters such as 1 and l, o and 0, 5 and s cause confusion
- To be secure (i.e. resistant to segmentation attacks), Google and Yahoo CAPTCHAs introduced new confusing characters
  - vv or w? rm or nn?
  - cl or d? cm or an?
  - rn or m? nn or m? …
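
A designer could screen generated challenges for these sequences before serving them. The following is a minimal sketch of such a check, assuming lower-case challenge strings; the `AMBIGUOUS` table only contains the pairs listed above, and the function name is hypothetical.

```python
# Character sequences from the slide and the look-alike each may be read as.
AMBIGUOUS = {"vv": "w", "rm": "nn", "cl": "d", "cm": "an", "rn": "m", "nn": "m"}


def find_confusions(challenge):
    """Return (index, sequence, look-alike) for every ambiguous sequence found."""
    hits = []
    for seq, lookalike in AMBIGUOUS.items():
        start = challenge.find(seq)
        while start != -1:
            hits.append((start, seq, lookalike))
            # Continue one position later so overlapping matches are also found.
            start = challenge.find(seq, start + 1)
    return sorted(hits)


print(find_confusions("burnclaw"))  # [(2, 'rn', 'm'), (4, 'cl', 'd')]
print(find_confusions("hello"))     # []
```

A generator could reject any challenge for which the returned list is non-empty, at the cost of a slightly smaller challenge space.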

Distortion | confusing characters

- About 6% of challenges in the Google CAPTCHA, and about 10% in the latest Yahoo scheme (rolled out since March 2008), were observed to contain such confusing characters.

Content | string length

- A design issue: should the string length be predictable or not?
- Case study:
  - The Microsoft CAPTCHA used a fixed length of 8 characters, which helped its usability

The first object is “7”?

The first object is “L”?

With the length information, users can be fairly sure that the first objects in the above examples are noise.

Content | string length

- However, the length information also helped our automated segmentation attack (success rate: >92%)
  - Our program knows when to stop!

Start point

Stop: 8 characters identified already
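
How a known length gives a stopping rule can be sketched as follows. This is not the authors' segmentation attack: it swaps in a much simpler technique (cutting at empty columns of a vertical projection) and assumes characters are separated by at least one blank column. The function name and the `expected_length` parameter are hypothetical.

```python
def segment_columns(ink_per_column, expected_length=8):
    """Split a challenge into (start, end) column ranges, one per object.

    ink_per_column[x] is the number of foreground pixels in column x.
    Scanning stops once expected_length objects are found; pass None to
    scan the whole image.
    """
    segments = []
    start = None
    for x, ink in enumerate(ink_per_column):
        if ink > 0 and start is None:
            start = x  # an object begins here
        elif ink == 0 and start is not None:
            segments.append((start, x))  # the object ended at column x - 1
            start = None
            if len(segments) == expected_length:
                return segments  # known length reached: stop scanning
    if start is not None:
        segments.append((start, len(ink_per_column)))
    return segments


# Eight two-column characters, then a trailing noise blob.
ink = [3, 2, 0] * 8 + [5, 5, 0]
print(len(segment_columns(ink)))        # 8: the noise blob is never reached
print(len(segment_columns(ink, None)))  # 9: without the length, noise is included
```

With a fixed length the scan never has to decide whether trailing objects are characters or noise, which is the same information that helps human users.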

Presentation | the use of colour

- Using colour is common practice in CAPTCHA design (for all sorts of reasons)
- However, we have seen many cases in which the use of colour
  - is unhelpful for usability,
  - has had a negative impact on security, or
  - is problematic in terms of both usability and security

Presentation | the use of colour

- Case 1: Gimpy-r (a well-known early scheme)

How humans see it

How machines see it

Presentation | the use of colour

Case 1: Gimpy-r

- The dominant colour of the distorted text (often black) is distinguishable:
  - it always has the lowest intensity, and
  - it never appears in the background
- So it is easy to extract the text
- Colour background:
  - not much use in terms of security
  - negative effect on usability (e.g. confusing people)
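
The two properties above suggest a very short extraction step, sketched here under stated assumptions: the image is given as a greyscale grid of intensities (0 is black), and the text is exactly the darkest value present. The function name and the `tolerance` parameter are illustrative and not taken from the paper.

```python
def extract_text_mask(gray, tolerance=0):
    """Mark pixels at (or within tolerance of) the lowest intensity as text.

    gray is a list of rows of integer intensities; the result has the same
    shape, with 1 for text pixels and 0 for background.
    """
    darkest = min(min(row) for row in gray)
    return [[1 if pixel <= darkest + tolerance else 0 for pixel in row]
            for row in gray]


# A tiny made-up image: text pixels have intensity 10, background varies.
image = [
    [200, 90, 180, 120],
    [10, 10, 150, 10],
    [220, 10, 60, 10],
]
for row in extract_text_mask(image):
    print(row)
# [0, 0, 0, 0]
# [1, 1, 0, 1]
# [0, 1, 0, 1]
```

Because the text intensity never occurs in the background, a single threshold removes the whole colourful background, however busy it looks to a human.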

Presentation | the use of colour

- Case 2: BotBlock

How humans see it

How machines see it

Presentation | the use of colour

Case 2: BotBlock

- Sophisticated colour management provides resistance to OCR
- However, colour is misused:
  - texts have distinguishable colour patterns
  - the same foreground colour occurs repeatedly
- So it is easy to extract the text automatically
- Negative effect on usability, and a false sense of security

Presentation | the use of colour

- It seems that the “Las Vegas effect” also applies to CAPTCHA design
  - No colour might be better than too much colour
- Major CAPTCHAs have started to avoid fancy colour management, including
  - Microsoft
  - Yahoo
  - Google
  - reCAPTCHA

The framework: applied to text CAPTCHAs

Category, with its usability issues:

- Distortion
  - Distortion method and level
  - Confusing characters
  - Friendly to foreigners?
- Content
  - Character set
  - String length
    - How long?
    - Predictable or not?
  - Random string or dictionary word?
  - Offensive word
- Presentation
  - Font type and size
  - Image size
  - Use of colour
  - Integration with web pages

The framework

- Inspired by text-based CAPTCHAs
- Applicable to sound-based schemes
  - For details, see our paper
- Also applicable to image-based schemes (e.g. IMAGINATION)
  - For schemes such as Asirra and Bongo, in which distortion is absent, only the dimensions of content and presentation apply.

Summary

- First attempt towards a systematic analysis of usability issues in CAPTCHA design (in particular, text-based schemes)
- Proposed a simple but novel framework, which accommodates both
  - novel issues we have identified, and
  - known issues scattered in the literature
- The framework is applicable to text-, sound- and (some) image-based CAPTCHAs.