Computational Linguistics for Linguists
Feature Structures
2007, CLINT-LIN-FEATSTR

Example PATR-II Grammar and Lexicon

Grammar (grammar.grm)
    Rule s -> np vp
    Rule np -> n
    Rule vp -> v

Lexicon (lexicon.lex)
    \w uther
    \c n

    \w sleeps
    \c v

    \w sleep
    \c v

Example PATR-II Grammar and Lexicon

%Grammar (grammar.grm)
    Rule s -> npsg vpsg
    Rule s -> nppl vppl
    Rule npsg -> nsg
    Rule nppl -> npl
    Rule vpsg -> vsg
    Rule vppl -> vpl

%Lexicon (lexicon.lex)
    \w uther
    \c nsg

    \w cows
    \c npl

    \w sleeps
    \c vsg

    \w sleep
    \c vpl

Grammar and Lexicon with Pronouns

%Grammar (grammar.grm)
    Rule s -> npsg vpsg
    Rule s -> nppl vppl
    Rule npsg -> nsg
    Rule nppl -> npl
    Rule vpsg -> vsg
    Rule vppl -> vpl

%Lexicon (lexicon.lex)
    \w he
    \c nsg

    \w him
    \c nsg

    \w she
    \c nsg

    \w her
    \c nsg

    \w they
    \c npl

    \w them
    \c npl

    \w sleeps
    \c vsg

    \w sleep
    \c vpl

Problem with the Grammar
• The grammar allows:
    he/him/she/her sleeps
    they/them sleep

Grammar and Lexicon with Pronouns

%Grammar (grammar.grm)
    Rule s -> npsgnom vpsg
    Rule s -> npplnom vppl
    Rule npsgnom -> nsgnom
    Rule npplnom -> nplnom
    Rule npsgacc -> nsgacc
    Rule npplacc -> nplacc
    Rule vpsg -> vsg
    Rule vppl -> vpl

%Lexicon (lexicon.lex)
    \w he
    \c nsgnom

    \w him
    \c nsgacc

    \w she
    \c nsgnom

    \w her
    \c nsgacc

    \w they
    \c nplnom

    \w them
    \c nplacc

    \w sleeps
    \c vsg

    \w sleep
    \c vpl

Remarks
• The only mechanism available to a CFG to prevent overgeneration is the creation of new categories.
• Whenever we add new categories, the grammar gets longer and less understandable.
• Is there another way?

Constraints and Information Structures
• PATR-II handles this problem by associating words with feature structures.
• Feature structures are commonly written as attribute-value matrices, e.g.
    [ cat  noun
      num  sing ]
• Items on the left are attributes.
• Items on the right are the corresponding values.

Constraints and Information Structures
• Rules are then augmented with constraint equations between the feature structures associated with constituents.
• These can be used to express constraints between constituents (e.g. subject/verb agreement),
• or to pass information from words up to higher constituents (e.g. np inherits information from n).

Example of PATR Rules with Constraints
    Rule s -> np vp
        <np num> = <vp num>
    Rule np -> n
        <np head> = <n head>

Feature Constraints
Feature constraints comprise three parts, in this order:
1. a feature path, the first element of which is one of the symbols from the phrase structure rule
2. an equal sign (=)
3. either a simple value, or another feature path that also starts with a symbol from the phrase structure rule

Unification
• Unification is the basic operation applied to feature structures in PC-PATR.
• It consists of merging the information from two feature structures.
• Two feature structures unify if their common features have the same values; they do not unify if any feature values conflict.

Examples
    [num sg] unified with [person first] gives [num sg person first]
    [num sg] unified with [num sg] gives [num sg]
    [num sg] unified with [num pl] gives NOTHING (unification fails)

Complex-Valued FS
• Feature structures can have either simple values or complex values, such as this:
    [ cat   np
      head  [ agr      [ num  sg
                         gen  masc ]
              deftype  indef ] ]
• Feature structures can be arbitrarily nested and used to build linguistic representations.
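
The unification behaviour described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how PC-PATR is implemented: feature structures are modelled as plain nested dictionaries, atomic values as strings, and failure as `None`. The function name `unify` and the sample structures are made up for this example, and structure sharing between paths is not modelled.

```python
def unify(a, b):
    """Return the unification of a and b, or None if their values conflict.

    Assumption: a feature structure is a dict mapping attribute names to
    either an atomic value (a string) or another dict. Neither input is
    modified.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy so that `a` itself is left untouched
        for attr, b_val in b.items():
            if attr in result:
                merged = unify(result[attr], b_val)
                if merged is None:
                    return None  # conflict somewhere below this attribute
                result[attr] = merged
            else:
                result[attr] = b_val  # information only present in `b`
        return result
    # Atomic values (or an atom against a structure) unify only if equal.
    return a if a == b else None


# The three flat examples from the slides.
merged = unify({"num": "sg"}, {"person": "first"})
same = unify({"num": "sg"}, {"num": "sg"})
clash = unify({"num": "sg"}, {"num": "pl"})

# A complex-valued example: information is merged at every level of nesting.
np_partial = {"cat": "np", "head": {"agr": {"num": "sg"}}}
np_extra = {"head": {"agr": {"gen": "masc"}, "deftype": "indef"}}
np_full = unify(np_partial, np_extra)

print(merged)   # {'num': 'sg', 'person': 'first'}
print(same)     # {'num': 'sg'}
print(clash)    # None
print(np_full)
```

Returning a new dictionary rather than updating the inputs keeps the sketch easy to test, at the cost of not capturing the identity of unified structures.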

Building Up Structures
• Agreement features
  – 3rd person singular
        [ num     sing
          person  3 ]
• Noun phrase
  – 3rd person singular noun phrase
        [ cat  np
          agr  [ num     sing
                 person  3 ] ]
• Sentence
  – with 3rd person singular subject
        [ cat   s
          subj  [ cat  np
                  agr  [ num     sing
                         person  3 ] ] ]

Simple Unification Examples
1.  [ agreement: [ number: singular
                   person: first ] ]
2.  [ agreement: [ number: singular
                   case:   nominative ] ]
3.  [ agreement: [ number: singular
                   person: third ] ]
4.  [ agreement: [ number: singular
                   person: first
                   case:   nominative ] ]
5.  [ agreement: [ number: singular
                   person: third
                   case:   nominative ] ]

Checkpoint
Satisfy yourself that, using the previous examples:
• unify(1,2) = 4
• unify(2,3) = 5
• unify(1,3) = fail

Paths
• Portions of a feature structure can be referred to using path notation.
• A path is a sequence of one or more feature names enclosed in angle brackets (< >). For instance:
    (1) <head>
    (2) <head deftype>
    (3) <head agr num>
• Paths are used to express feature constraints.

Examples of Constraints
• <head deftype> = indef
• <np head agr> = <vp head agr>

Constraint Equations
• The feature constraints associated with phrase structure rules in PC-PATR consist of a set of unification expressions.
• Each expression has three parts, in this order:
  – a feature path, the first element of which is one of the symbols from the phrase structure rule
  – an equal sign (=)
  – either a simple value, or another feature path that also starts with a symbol from the phrase structure rule

Execution of Equations
• Each equation is interpreted as an instruction to unify the left-hand and right-hand sides.
• First, each side is "evaluated" before any unification is attempted. If a path does not exist, it is created.
• After successful unification, the two structures are not merely equivalent but identical, so that any change to one also changes the other.

Lexical Entries
• Lexical entries define the basic properties of words.
• Each definition is divided into fields, each of which begins with a standard format marker at the beginning of a line:
  – \w the lexical form of the word
  – \c word category (part of speech)
  – \g word gloss
  – \f additional features of this word

Lexical Entry Examples
    \w fox
    \c N
    \g canine
    \f <number> = singular

    \w foxes
    \c N
    \g canine+PL
    \f <number> = plural

Corresponding Feature Structures
• When these entries are used by the grammar, they are represented by these feature structures:
    [ cat:    N
      gloss:  canine
      lex:    fox
      number: singular ]

    [ cat:    N
      gloss:  canine+PL
      lex:    foxes
      number: plural ]
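
The mapping from lexical entries to feature structures can be made concrete with a small Python sketch. It is an illustration only and does not reproduce PC-PATR's own lexicon loader: the names `parse_lexicon`, `set_path` and `FIELD_TO_ATTR` are hypothetical, only the four markers listed above are handled, and a `\f` field is assumed to contain equations of the simple form `<path> = value`.

```python
import re

# Assumed mapping from field markers to attribute names, following the slides.
FIELD_TO_ATTR = {"c": "cat", "g": "gloss"}

# Matches equations such as "<number> = singular" or "<head agr num> = sg".
FEATURE_RE = re.compile(r"<\s*([^<>]+?)\s*>\s*=\s*(\S+)")


def set_path(fs, path, value):
    """Store value at the end of path, creating intermediate structures."""
    for attr in path[:-1]:
        fs = fs.setdefault(attr, {})
    fs[path[-1]] = value


def parse_lexicon(text):
    """Turn lexicon text into a list of feature structures (nested dicts)."""
    entries = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # blank lines and anything that is not a field
        marker, _, content = line[1:].partition(" ")
        content = content.strip()
        if marker == "w":
            current = {"lex": content}  # \w starts a new entry
            entries.append(current)
        elif current is None:
            continue  # field before any \w: ignored in this sketch
        elif marker in FIELD_TO_ATTR:
            current[FIELD_TO_ATTR[marker]] = content
        elif marker == "f":
            for path, value in FEATURE_RE.findall(content):
                set_path(current, path.split(), value)
    return entries


LEXICON = r"""
\w fox
\c N
\g canine
\f <number> = singular

\w foxes
\c N
\g canine+PL
\f <number> = plural
"""

entries = parse_lexicon(LEXICON)
for entry in entries:
    print(entry)
```

Each entry comes out as one dictionary with `lex`, `cat`, `gloss` and `number` attributes, matching the attribute-value matrices shown for fox and foxes.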