Transcript: Paper Reading in Computational Linguistics

Natural Language Processing Course Project
Zhao Hai 赵海
Department of Computer Science and Engineering
Shanghai Jiao Tong University

Goals
• Develop an English grammatical error checker
  – Only consider tense errors for verbs

Examples
• 2 I plays football yesterday .
• 2 I drink tea last week .
• 2 Mary visits the factory last month .
• 2 I finished reading the novel by nine o'clock last night .
• 2 We has learned over two thousand English words by the end of last term .
• 3 They had plant six hundred trees by the end of last Wednesday .

Data Format
• The input file contains one sentence per line, for example:
  – I likes this bicycle.
• Your program must accept a test input file in this format and write its results as follows, where the leading numbers indicate which words contain errors (-1 means no error):
  – 2 I likes bicycle as I was a boy.
  – 2 7 He follow the great idea that have made a great success.
  – -1 I enjoy the dinner.
• All submitted systems should accept their arguments on the command line:
  – Your_program test.input output.test

Evaluation Metric: Definition
• Our evaluation program compares your system outputs with the gold-standard test data and computes an F-score to score your outputs:
  F = 2RP / (R + P)
  R = number of correctly marked words / number of problematic words in the gold set
  P = number of correctly marked words / number of marked words in the output

Schedule
• You have five weeks to build your system.
• The test dataset will be released 24 hours before the submission deadline for your system outputs.

Submission
• Four parts are required for the submission (please package all your files and then upload):
  – The complete source code of your system, and at least one executable file for a specific OS.
  – Document 1: your code infrastructure, compiling options, environment, and run settings.
  – Document 2: the principles of your system, including which classifier, features, and decoding algorithm you chose.
  – If available: the models you trained from the provided corpus and your system outputs for the given test data.

Groups and Scoring
• Grouping
  – 1 member per team, 100%
• The team with the highest F-score will receive a score of 100 and the team with the lowest will receive 60; the other teams will be scored by interpolating between these two scores.
• Plus
  – Document quality
• You may adopt any open-source toolkit in your system.
  – It has no impact on your system's score, but
  – We must see a footnote stating where the toolkit is from
• Compile errors, an incomplete document, or an incorrect data format may cost you points.

Attention
• We will compare all system outputs; an exact match between outputs will cause all teams involved to receive ZERO points.
• A system that fails to reproduce the output included in its submission package will receive ZERO points.

Tips
• It is expected to be a rule-based system
• Write your own scoring program

Techniques
• When building your checker, you may need the part of speech of each word to design your rules.
• POS tagging toolkits are available online. Consider using them!
• If you adopt an existing toolkit, you must say so and provide the necessary information in the document.

Techniques: building your own POS tagger
• Machine learning model
  – HMM, or
  – Maximum entropy Markov model
• Decoding algorithm
  – Viterbi
• Reference
  – http://www.aclweb.org/anthology/I/I08/I08-4011.pdf
  – For the best performance, two-pass decoding was adopted in the above paper. However, you may consider one-pass decoding only, for better efficiency.
• Tip: there are many open-source POS taggers online; consider revising them and integrating them into your system.
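
The HMM-plus-Viterbi option named in the POS tagger slide can be sketched as follows. This is a minimal illustration of one-pass Viterbi decoding for a first-order HMM, not the method of the referenced paper. The tag set, the toy probability tables (TAGS, START_P, TRANS_P, EMIT_P), and the `unk` fallback probability for unseen words are all assumptions made up for the example; a real tagger would estimate these from a tagged corpus.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most likely tag sequence for `words` under a first-order HMM.

    `unk` is an assumed fallback emission probability for unseen words.
    """
    if not words:
        return []

    def lp(p):
        # Work in log space so long sentences do not underflow.
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: log-probability of the best tag path ending in tag t at word i
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], unk))
             for t in tags}]
    back = [{}]  # back[i][t]: previous tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], unk))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model (hypothetical numbers, for illustration only).
TAGS = ["PRON", "VERB", "NOUN"]
START_P = {"PRON": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "PRON": {"VERB": 0.8, "NOUN": 0.1, "PRON": 0.1},
    "VERB": {"NOUN": 0.6, "PRON": 0.3, "VERB": 0.1},
    "NOUN": {"VERB": 0.5, "NOUN": 0.3, "PRON": 0.2},
}
EMIT_P = {
    "PRON": {"I": 0.9},
    "VERB": {"plays": 0.5, "drink": 0.4},
    "NOUN": {"football": 0.5, "tea": 0.4, "plays": 0.05},
}

if __name__ == "__main__":
    print(viterbi("I plays football".split(), TAGS, START_P, TRANS_P, EMIT_P))
```

With these toy tables, "plays" is tagged as a verb rather than a noun because the PRON-to-VERB transition dominates. The resulting tags are what a tense rule would then inspect.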

CoNLL 2013 shared task
• Survey paper:
  – http://www.comp.nus.edu.sg/~nlp/conll13st/CoNLLST01.pdf
• Proceedings:
  – http://wing.comp.nus.edu.sg/~antho/signll.html
• Note: this project requires a rule-based system rather than a supervised learning system like those in the CoNLL 2013 shared task.
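
Since the tips suggest writing your own scoring program, here is a minimal sketch of one, built from the F-score definition and the output format given in the slides. The function names `parse_line` and `f_score` are hypothetical, and the sketch assumes that every leading integer token on a line is an error index. A sentence that itself begins with a number would therefore be misread. The official evaluation program may differ.

```python
def parse_line(line):
    """Split '2 7 He follow ...' into ({2, 7}, 'He follow ...').

    Assumption: all leading integer tokens are error indices; -1 means no error.
    """
    tokens = line.split()
    marks = set()
    i = 0
    while i < len(tokens) and tokens[i].lstrip("-").isdigit():
        marks.add(int(tokens[i]))
        i += 1
    marks.discard(-1)  # -1 marks an error-free sentence, so it is not an index
    return marks, " ".join(tokens[i:])


def f_score(gold_lines, system_lines):
    """Return (P, R, F) with F = 2RP / (R + P), as defined in the slides."""
    if len(gold_lines) != len(system_lines):
        raise ValueError("gold and system files must have the same number of lines")
    correct = marked = problematic = 0
    for gold, system in zip(gold_lines, system_lines):
        gold_marks, _ = parse_line(gold)
        system_marks, _ = parse_line(system)
        correct += len(gold_marks & system_marks)  # correctly marked words
        marked += len(system_marks)                # marked words in the output
        problematic += len(gold_marks)             # problematic words in the gold set
    p = correct / marked if marked else 0.0
    r = correct / problematic if problematic else 0.0
    f = 2 * r * p / (r + p) if (r + p) else 0.0
    return p, r, f


if __name__ == "__main__":
    gold = [
        "2 I likes bicycle as I was a boy.",
        "2 7 He follow the great idea that have made a great success.",
        "-1 I enjoy the dinner.",
    ]
    system = [
        "2 I likes bicycle as I was a boy.",
        "2 He follow the great idea that have made a great success.",
        "3 I enjoy the dinner.",
    ]
    print(f_score(gold, system))
```

In this made-up example the system marks three words, two of them correctly, and the gold set contains three problematic words, so precision, recall, and F-score all equal 2/3.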