Optical Character Recognition
By Vicky Shunkwiler, Heather Hurd, and Sam Stone
What is OCR?
OCR is the electronic identification and digital encoding of printed or handwritten characters by means of an optical scanner and specialized software.
Informational Video
OCR FOR DUMMIES
How does OCR work?
A scanner captures printed text on a computer, and the software interprets the material and “reads” it aloud through synthetic speech on the computer.
OCR Software
The OCR software scans the page and determines whether it is reading images or text. The machine identifies letters and words by recognizing the shapes of letters through repetitions and patterns of familiar forms.
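The idea of recognizing letters by their familiar shapes can be sketched as simple template matching. This is a minimal illustration, not the method any particular OCR product uses: the 3x5 bitmaps in `TEMPLATES` and the `match_score` and `recognize` helpers are hypothetical names made up for this example.

```python
# Hypothetical 3x5 bitmaps for a few letters ("#" = ink, "." = paper).
# Real OCR software uses a far larger catalog and more robust features.
TEMPLATES = {
    "I": ["###",
          ".#.",
          ".#.",
          ".#.",
          "###"],
    "L": ["#..",
          "#..",
          "#..",
          "#..",
          "###"],
    "T": ["###",
          ".#.",
          ".#.",
          ".#.",
          ".#."],
}


def match_score(glyph, template):
    """Count the pixels where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the letter whose template agrees with the glyph on the most pixels."""
    return max(TEMPLATES, key=lambda letter: match_score(glyph, TEMPLATES[letter]))


# A scanned "T" with one stray pixel still agrees with the "T" template most.
noisy_t = ["###",
           ".#.",
           ".#.",
           "##.",
           ".#."]
print(recognize(noisy_t))  # prints T
```

The stray pixel costs the "T" template one point, but the other templates disagree on more pixels, so the closest familiar form still wins.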
History of OCR
OCR was originally developed in 1929 by G. Tauschek in Germany. In the 1950s, the US funded development of its own version of OCR, first through the American Bankers Association and financial services, to process checks.
History
In 1953, David Shepard founded Intelligent Machines Research Corporation (IMR). Shepard came up with “Gismo,” which proved limited compared with later IMR systems that could scan and recognize most documents.
History
OCR/IMR systems were first used by Reader's Digest, IBM, Standard Oil Company, the US Air Force, and credit card companies. They also became widely used by the US, British, and Canadian postal services starting in the 1970s.
OCR Today
OCR software can distinguish most fonts and some handwritten text. The current price is between $3,000 and $10,000, but it is decreasing as OCR becomes more popular with businesses.
OCR Today
OCR software has become a popular aid for people with visual impairments because it scans in text and can read it aloud to them.
Factors Affecting OCR Accuracy
An accuracy rate exceeding 98% is necessary for OCR to be more effective than rekeying.
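To make the 98% figure concrete, here is a rough sketch of one way to measure accuracy: compare the OCR output against a correct reference text and count character-level errors with edit distance. The slides do not say how accuracy is measured, so this definition is an assumption, and the function names `edit_distance`, `character_accuracy`, and `beats_rekeying` are invented for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference, ocr_output):
    """Assumed definition: 1 minus character errors per reference character."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1 - errors / len(reference))


def beats_rekeying(reference, ocr_output, threshold=0.98):
    """Apply the rule of thumb above: accuracy must exceed 98%."""
    return character_accuracy(reference, ocr_output) > threshold


# One misread character ("i" read as "1") on a short passage.
page = "The quick brown fox jumps over the lazy dog. " * 5
scanned = page.replace("quick", "qu1ck", 1)
print(beats_rekeying(page, scanned))  # prints True
```

Under this definition, a page of 2,000 characters at exactly 98% accuracy would still contain about 40 errors to find and correct by hand, which suggests why the bar sits so high.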
Hardware and Software Variables
Scanner Quality; Recognition Method and Algorithm; Type of Font; Scan Resolution; Generation of Original; Type of Binding
Paper Quality and Typeface Clarity
Pale, broken, or touching characters may not be recognized. Stains, marks, or any other non-character content may be picked up and misinterpreted by OCR. Shaded or colored backgrounds can also reduce accuracy. Variations in typeface may be lost or misunderstood.
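One way to see why pale characters and stains cause trouble is to look at thresholding, a common step that turns a grayscale scan into ink-or-paper pixels before recognition. The slides do not say which preprocessing the software uses, so treat this as an assumed illustration; `binarize` and the fixed threshold of 128 are hypothetical choices for this sketch.

```python
def binarize(gray_rows, threshold=128):
    """Mark a pixel as ink (1) when it is darker than the threshold, else paper (0)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]


# Grayscale values: 0 is black ink, 255 is white paper.
dark_stroke = [[255, 20, 255]]   # a well-printed character stroke
pale_stroke = [[255, 170, 255]]  # a faded stroke, lighter than the threshold
stain = [[255, 255, 90]]         # a coffee stain, darker than the threshold

print(binarize(dark_stroke))  # prints [[0, 1, 0]]
print(binarize(pale_stroke))  # prints [[0, 0, 0]]
print(binarize(stain))        # prints [[0, 0, 1]]
```

The faded stroke disappears entirely, while the stain is kept as if it were ink, which matches the two failure cases described on this slide.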
Formatting
Unusual fonts or characters may not be in the software’s catalog and therefore may not be recognized. Typed characters are currently recognized most accurately; research into OCR that accurately recognizes handwritten and cursive characters is underway. Tables, indents, footnotes, and similar formatting may not be recognized.
Ray Kurzweil
Developed the first OCR system that could recognize all kinds of printed text. Continues to advance technology for the blind.
Kurzweil Music Systems
Kurzweil, working with Stevie Wonder, developed a synthesizer that could reproduce the sound of grand pianos and other instruments.
Original Kurzweil Reading Machine
Kurzweil-National Federation of the Blind Reader