people.cis.ksu.edu

Statistical Language Identifier
By Klinton Brown
Project Description
● Gather statistical data on various languages by analyzing large bodies of text.
● Use this data to reliably determine the language of text input by a user.
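The slides do not say which statistics are gathered. As one possible illustration, the sketch below assumes character trigram frequencies, a common choice for language identification. The class name `TrigramProfile` and the method `analyze` are hypothetical and not taken from the project.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: the trigram statistic is an assumption, not the project's stated method.
final class TrigramProfile {
    private TrigramProfile() {}

    /** Returns the relative frequency of each character trigram in the text. */
    static Map<String, Double> analyze(String text) {
        // Lowercase the text and collapse every run of non-letters into one space,
        // so punctuation and digits do not affect the counts.
        String normalized = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}]+", " ")
                .trim();

        // Count every overlapping window of three characters.
        Map<String, Integer> counts = new HashMap<>();
        int total = 0;
        for (int i = 0; i + 3 <= normalized.length(); i++) {
            counts.merge(normalized.substring(i, i + 3), 1, Integer::sum);
            total++;
        }

        // Convert raw counts to relative frequencies so that corpora of
        // different sizes are comparable. Empty input yields an empty map.
        Map<String, Double> frequencies = new HashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            frequencies.put(entry.getKey(), entry.getValue() / (double) total);
        }
        return frequencies;
    }
}
```

Calling `TrigramProfile.analyze(corpusText)` once per language would give one frequency table per language, which could then be stored for later comparison.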
Requirements
● A feature to analyze large bodies of text
● A feature to compare input to known languages
● A user-friendly interface
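The comparison requirement could be met in several ways, and the slides do not name one. The sketch below assumes each known language is stored as a map from some text feature to its relative frequency, and scores the input against each language with cosine similarity. The class `LanguageComparer` and both method names are hypothetical.

```java
import java.util.Map;

// Hypothetical comparer: cosine similarity is an assumed metric, not the project's stated one.
final class LanguageComparer {
    private LanguageComparer() {}

    /** Cosine similarity of two frequency tables; 0.0 if either table is empty. */
    static double cosineSimilarity(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            normA += entry.getValue() * entry.getValue();
            Double other = b.get(entry.getKey());
            if (other != null) {
                dot += entry.getValue() * other;
            }
        }
        for (double value : b.values()) {
            normB += value * value;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Returns the name of the best-scoring known language, or null if none are known. */
    static String closestLanguage(Map<String, Double> input,
                                  Map<String, Map<String, Double>> knownLanguages) {
        String best = null;
        double bestScore = -1.0;
        for (Map.Entry<String, Map<String, Double>> language : knownLanguages.entrySet()) {
            double score = cosineSimilarity(input, language.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = language.getKey();
            }
        }
        return best;
    }
}
```

Cosine similarity ignores the overall length of the input, so a short user sentence can still be compared against tables built from a much larger corpus.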
Tools
● Java
● Eclipse
● Wikipedia for text corpora
Timeline
● October
  – Choose project
  – Presentation 1
● November
  – Implement text analyzer
  – Implement language comparer
● December
  – Create GUI
  – Test program
Questions?