Computational Linguistics for Linguists
Feature Structures
2007
CLINT-LIN-FEATSTR
Example PATR-II Grammar and Lexicon
Grammar (grammar.grm)
Rule
s -> np vp
Rule
np -> n
Rule
vp -> v
Lexicon (lexicon.lex)
\w uther
\c n
\w sleeps
\c v
\w sleep
\c v
Example PATR-II Grammar and Lexicon
%Grammar (grammar.grm)
Rule
s -> npsg vpsg
Rule
s -> nppl vppl
Rule
npsg -> nsg
Rule
nppl -> npl
Rule
vpsg -> vsg
Rule
vppl -> vpl
%Lexicon (lexicon.lex)
\w uther
\c nsg
\w cows
\c npl
\w sleeps
\c vsg
\w sleep
\c vpl
Grammar and Lexicon with Pronouns
%Grammar (grammar.grm)
Rule
s -> npsg vpsg
Rule
s -> nppl vppl
Rule
npsg -> nsg
Rule
nppl -> npl
Rule
vpsg -> vsg
Rule
vppl -> vpl
%Lexicon (lexicon.lex)
\w he
\c nsg
\w him
\c nsg
\w she
\c nsg
\w her
\c nsg
\w they
\c npl
\w them
\c npl
\w sleeps
\c vsg
\w sleep
\c vpl
Problem with the Grammar
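• The grammar overgenerates. It accepts the correct
he/she sleeps
they sleep
but also the ungrammatical
*him/*her sleeps
*them sleep
because nominative and accusative pronouns share the categories nsg and npl.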
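The overgeneration can be checked mechanically. The following is a minimal Python sketch, not PC-PATR itself: the names `LEXICON`, `SENTENCE_PATTERNS` and `accepts` are made up for illustration, and the recognizer handles only the two-word sentences this toy grammar can produce.

```python
# Lexicon from the slide: word -> category (case is not recorded anywhere).
LEXICON = {
    "he": "nsg", "him": "nsg", "she": "nsg", "her": "nsg",
    "they": "npl", "them": "npl",
    "sleeps": "vsg", "sleep": "vpl",
}

# The rules npsg -> nsg, vpsg -> vsg, etc. are unary, so a sentence is
# accepted exactly when its two word categories match one of these pairs.
SENTENCE_PATTERNS = {("nsg", "vsg"), ("npl", "vpl")}


def accepts(sentence: str) -> bool:
    """Return True if the toy grammar licenses the two-word sentence."""
    words = sentence.split()
    if len(words) != 2 or any(w not in LEXICON for w in words):
        return False
    return (LEXICON[words[0]], LEXICON[words[1]]) in SENTENCE_PATTERNS


for s in ["he sleeps", "him sleeps", "them sleep", "they sleeps"]:
    print(s, "->", accepts(s))
```

Number agreement is enforced ("they sleeps" is rejected), but "him sleeps" and "them sleep" are accepted because the categories carry no case information.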
Grammar and Lexicon with Pronouns
%Grammar (grammar.grm)
Rule
s -> npsgnom vpsg
Rule
s -> npplnom vppl
Rule
npsgnom -> nsgnom
Rule
npplnom -> nplnom
Rule
npsgacc -> nsgacc
Rule
npplacc -> nplacc
Rule
vpsg -> vsg
Rule
vppl -> vpl
%Lexicon (lexicon.lex)
\w he
\c nsgnom
\w him
\c nsgacc
\w she
\c nsgnom
\w her
\c nsgacc
\w they
\c nplnom
\w them
\c nplacc
\w sleeps
\c vsg
\w sleep
\c vpl
Remarks
• The only mechanism available to a CFG to
prevent overgeneration is the creation of
new categories.
• Whenever we add new categories, the
grammar gets longer and harder to
understand.
• Is there another way?
Constraints and Information Structures
• PATR-II handles this problem by associating
words with feature structures.
• Feature structures are commonly written as
attribute-value matrices, e.g.
[cat noun
 num sing ]
• Items on the left are attributes.
• Items on the right are the corresponding
values.
Constraints and Information Structures
• Rules are then augmented with constraint
equations between the feature structures
associated with constituents.
• These can be used to express constraints
between constituents (e.g. subject/verb
agreement),
• or to pass information from words up to
higher constituents (e.g. np inherits
information from n).
Example of PATR Rules with Constraints
Rule
s -> np vp
<np num> = <vp num>
Rule
np -> n
<np head> = <n head>
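As a rough illustration of what the constraint on the first rule does, here is a small Python sketch. It is a simplification, not PC-PATR's mechanism: it only compares atomic num values instead of performing full unification, and the function name `s_rule_applies` and the feature values assigned to the example constituents are assumptions made for the example.

```python
def s_rule_applies(np, vp):
    """Model s -> np vp with <np num> = <vp num>.

    The rule succeeds unless both constituents specify num and the
    values clash. An unspecified value is compatible with anything.
    """
    np_num, vp_num = np.get("num"), vp.get("num")
    return np_num is None or vp_num is None or np_num == vp_num


# Hypothetical feature structures for the constituents of three sentences.
uther = {"num": "sg"}
cows = {"num": "pl"}
sleeps = {"num": "sg"}

print(s_rule_applies(uther, sleeps))  # singular subject, singular verb
print(s_rule_applies(cows, sleeps))   # plural subject, singular verb
print(s_rule_applies({}, sleeps))     # subject unspecified for num
```

A single rule with one equation replaces the two separate rules s -> npsg vpsg and s -> nppl vppl of the category-based grammar.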
Feature Constraints
Feature constraints comprise three parts,
in this order:
1. a feature path, the first element of which is one of
the symbols from the phrase structure rule
2. an equal sign (=)
3. either a simple value, or another feature path that
also starts with a symbol from the phrase
structure rule
Unification
• Unification is the basic operation applied
to feature structures in PC-PATR
• It consists of the merging of the
information from two feature structures.
• Two feature structures can unify if their
common features have the same values,
but do not unify if any feature values
conflict.
Examples
[num sg] unified with [person first] gives
[num sg
person first]
[num sg] unified with [num sg] gives
[num sg]
[num sg] unified with [num pl] gives
NOTHING: unification fails because the two num values conflict.
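The three cases above can be reproduced with a few lines of Python. This is a minimal sketch for flat feature structures only (atomic values, no nesting). The name `unify_flat` and the use of `None` to signal failure are choices made for this illustration, not part of PC-PATR.

```python
def unify_flat(a, b):
    """Unify two flat feature structures given as dicts of atomic values.

    Returns the merged dict, or None if a shared feature has
    conflicting values.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # conflicting values: unification fails
        result[feature] = value
    return result


print(unify_flat({"num": "sg"}, {"person": "first"}))
print(unify_flat({"num": "sg"}, {"num": "sg"}))
print(unify_flat({"num": "sg"}, {"num": "pl"}))
```

The first call merges the two structures, the second returns the shared information unchanged, and the third returns `None`.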
Complex-Valued FS
• Feature structures can have either simple
values or complex values, such as this:
[cat np
 head [agr [num sg
            gen masc]
       deftype indef]]
• Feature structures can be arbitrarily
nested and used to build linguistic
representations.
Building Up Structures
• Agreement Features – 3rd person singular
[ num sing
  person 3 ]
• Noun Phrase – 3rd person sing noun phrase
[ cat np
  agr [ num sing
        person 3 ]]
• Sentence – with 3rd person singular subject
[ cat s
  subj [ cat np
         agr [ num sing
               person 3 ]]]
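The same step-by-step construction can be mirrored with nested Python dicts, where each larger structure embeds the smaller one as a value. This is only one possible encoding. The variable names are illustrative, and atomic values are written as strings.

```python
# Agreement features: 3rd person singular.
agr = {"num": "sing", "person": "3"}

# Noun phrase: embeds the agreement structure under the feature agr.
noun_phrase = {"cat": "np", "agr": agr}

# Sentence: embeds the noun phrase under the feature subj.
sentence = {"cat": "s", "subj": noun_phrase}

# Values at any depth are reached by following feature names in turn.
print(sentence["subj"]["agr"]["num"])
```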
Simple Unification Examples
1. [ agreement: [ number: singular
                  person: first ] ]
2. [ agreement: [ number: singular ]
     case: nominative ]
3. [ agreement: [ number: singular
                  person: third ] ]
4. [ agreement: [ number: singular
                  person: first ]
     case: nominative ]
5. [ agreement: [ number: singular
                  person: third ]
     case: nominative ]
Checkpoint
Satisfy yourself that, using the previous
examples:
• unify(1,2) = 4
• unify(2,3) = 5
• unify(1,3) = fail
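The checkpoint can also be verified with a short recursive unifier. This is a sketch under stated assumptions: feature structures are encoded as nested Python dicts, failure is signalled by `None`, and case is taken to sit at the top level next to agreement in structures 2, 4 and 5. The function `unify` is written for this illustration and is not PC-PATR's implementation.

```python
def unify(a, b):
    """Recursively unify two feature structures; return None on failure."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflict somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values (or an atom against a complex value) must be equal.
    return a if a == b else None


# Assumed encoding of structures 1 to 5 (case at the top level).
fs1 = {"agreement": {"number": "singular", "person": "first"}}
fs2 = {"agreement": {"number": "singular"}, "case": "nominative"}
fs3 = {"agreement": {"number": "singular", "person": "third"}}
fs4 = {"agreement": {"number": "singular", "person": "first"},
       "case": "nominative"}
fs5 = {"agreement": {"number": "singular", "person": "third"},
       "case": "nominative"}

print(unify(fs1, fs2) == fs4)
print(unify(fs2, fs3) == fs5)
print(unify(fs1, fs3))
```

Structures 1 and 3 fail to unify because their person values (first vs. third) conflict inside agreement.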
Paths
• Portions of a feature structure can be referred to
using path notation.
• A path is a sequence of one or more feature
names enclosed in angle brackets (< >). For
instance,
(1) <head>
(2) <head deftype>
(3) <head agr num>
• Paths are used to express feature constraints.
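To make path notation concrete, the sketch below follows a path through a nested Python dict. The helper `get_path` is hypothetical, and the example structure is an assumed dict encoding of the np structure from the Complex-Valued FS slide.

```python
def get_path(fs, path):
    """Return the value reached by following a path such as '<head agr num>'.

    Raises KeyError if a feature name on the path is missing.
    """
    for name in path.strip("<>").split():
        fs = fs[name]
    return fs


np_fs = {
    "cat": "np",
    "head": {
        "agr": {"num": "sg", "gen": "masc"},
        "deftype": "indef",
    },
}

print(get_path(np_fs, "<head>"))          # a complex value
print(get_path(np_fs, "<head deftype>"))  # a simple value
print(get_path(np_fs, "<head agr num>"))  # a simple value, two levels down
```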
Examples of Constraints
• <head deftype> = indef
• <np head agr> = <vp head agr>
Constraint Equations
• The feature constraints associated with phrase
structure rules in PC-PATR consist of a set
of unification expressions.
• Each expression has three parts, in this order:
– a feature path, the first element of which is one
of the symbols from the phrase structure rule
– an equal sign (=)
– either a simple value, or another feature path
that also starts with a symbol from the phrase
structure rule
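The three-part shape of an expression can be captured with a small parser. This is an illustrative Python sketch, not PC-PATR's grammar-file reader: the names `EQUATION` and `parse_constraint` are made up, only the simple forms shown on these slides are handled, and the sketch does not check that a path starts with a symbol from the rule.

```python
import re

# left path, "=", then either another path or a simple value
EQUATION = re.compile(r"^\s*<([^<>]+)>\s*=\s*(?:<([^<>]+)>|(\S+))\s*$")


def parse_constraint(text):
    """Split a constraint into (left_path, right).

    right is ("path", [names...]) or ("value", atom).
    """
    m = EQUATION.match(text)
    if m is None:
        raise ValueError(f"not a constraint equation: {text!r}")
    left = m.group(1).split()
    if m.group(2) is not None:
        return left, ("path", m.group(2).split())
    return left, ("value", m.group(3))


print(parse_constraint("<np head agr> = <vp head agr>"))
print(parse_constraint("<head deftype> = indef"))
```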
Execution of Equations
• Each equation is interpreted as an instruction to
unify the left and right hand sides
• First, each side is "evaluated" before any
unification is attempted. If the path does not
exist it is created.
• After successful unification, the two structures
are not merely equivalent but identical, so that
any change to one also changes the other.
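The difference between "equivalent" and "identical" can be shown with Python object identity. This is a deliberately simplified sketch of the equation <np head> = <n head> for the case where <np head> does not yet exist. A real unifier must also handle the case where both sides already have content. The variable names are illustrative.

```python
n = {"head": {"num": "sg"}}
np_fs = {}  # <np head> does not exist yet

# An equal but separate copy, kept only for contrast.
snapshot = dict(n["head"])

# <np head> = <n head>: the missing path is created and made to point at
# the very same object as <n head> (structure sharing), not at a copy.
np_fs["head"] = n["head"]

# A later change made through np ...
np_fs["head"]["person"] = "third"

# ... is visible through n as well, but not in the separate copy.
print(np_fs["head"] is n["head"])
print(n["head"])
print(snapshot)
```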
Lexical Entries
• Lexical entries define the basic properties
of words.
• Each definition is divided into fields, each of
which begins with a standard format
marker at the beginning of a line:
– \w the lexical form of the word,
– \c word category (part of speech)
– \g word gloss
– \f additional features of this word
Lexical Entry Examples
\w fox
\c N
\g canine
\f <number> = singular
\w foxes
\c N
\g canine+PL
\f <number> = plural
Corresponding Feature Structures
• When these entries are used by the grammar,
they are represented by these feature structures:
[ cat: N
  gloss: canine
  lex: fox
  number: singular ]
[ cat: N
  gloss: canine+PL
  lex: foxes
  number: plural ]
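The mapping from lexicon fields to feature structures can be sketched in Python as follows. This is an assumption-laden illustration rather than PC-PATR's loader: the function `parse_lexicon` is hypothetical, \w is assumed to start each entry, and only \f fields of the simple form <feature> = value are handled.

```python
FIELD_NAMES = {"\\w": "lex", "\\c": "cat", "\\g": "gloss"}


def parse_lexicon(text):
    """Turn standard-format lexicon text into one dict per entry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        marker, _, rest = line.partition(" ")
        rest = rest.strip()
        if marker == "\\w":
            entries.append({})  # assumed: \w starts a new entry
        if not entries:
            raise ValueError("field found before the first \\w marker")
        if marker in FIELD_NAMES:
            entries[-1][FIELD_NAMES[marker]] = rest
        elif marker == "\\f":
            # Only the simple form '<feature> = value' is handled here.
            path, _, value = rest.partition("=")
            entries[-1][path.strip().strip("<>").strip()] = value.strip()
    return entries


LEXICON_TEXT = r"""
\w fox
\c N
\g canine
\f <number> = singular
\w foxes
\c N
\g canine+PL
\f <number> = plural
"""

for entry in parse_lexicon(LEXICON_TEXT):
    print(entry)
```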