Representing Meaning Lecture 19 13 Sep 2007 Semantic Analysis Semantic analysis is the process of taking in some linguistic input and assigning a meaning.
Representing Meaning
Lecture 19
13 Sep 2007
Semantic Analysis
Semantic analysis is the process of taking in some
linguistic input and assigning a meaning representation
to it.
There are a lot of different ways to do this that make more
or less (or no) use of syntax
We’re going to start with the idea that syntax does
matter
The compositional rule-to-rule approach
Compositional Semantics
Syntax-driven methods of assigning semantics to
sentences
Semantic Processing
We’re going to discuss 2 ways to attack this problem
(just as we did with parsing)
There’s the theoretically motivated correct and
complete approach…
Computational/Compositional Semantics
Create a FOL representation that accounts for all the entities,
roles and relations present in a sentence.
And there are practical approaches that have some
hope of being useful and successful.
Information extraction
Do a superficial analysis that pulls out only the entities,
relations and roles that are of interest to the consuming
application.
Compositional Analysis
Principle of Compositionality
The meaning of a whole is derived from the meanings
of the parts
What parts?
The constituents of the syntactic parse of the input
What could it mean for a part to have a meaning?
Better
Turns out this representation isn’t quite as useful as it
could be.
Giving(John, Mary, Book)
Better would be one where the “roles” or “cases” are
separated out. E.g., consider:
∃x,y Giving(x) ∧ Giver(John, x) ∧ Given(y, x)
∧ Givee(Mary, x) ∧ Isa(y, Book)
Note: essentially Giver=Agent, Given=Theme,
Givee=To-Poss
Predicates
The notion of a predicate just got more complicated…
In this example, think of the verb/VP providing a template like
the following
∃w,x,y,z Giving(x) ∧ Giver(w, x) ∧ Given(y, x) ∧ Givee(z, x)
The semantics of the NPs and the PPs in the sentence plug into
the slots provided in the template
Advantages
Can have a variable number of arguments associated with an
event: events have many roles and fillers can be glued on as
they appear in the input.
Specifies categories (e.g., book) so that we can make
assertions about categories themselves as well as their
instances. E.g., Isa(MobyDick, Novel), AKO(Novel, Book).
Reifies events so that they can be quantified and related to
other events and objects via sets of defined relations.
Can see logical connections between closely related examples
without the need for meaning postulates.
Example
AyCaramba serves meat
∃e Serving(e) ∧ Server(e, AyCaramba) ∧ Served(e, Meat)
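A minimal Python sketch of the reified-event idea, for illustration only. The `Event` class and its `to_formula` method are hypothetical names, not part of the lecture; the point is that roles are stored as an open-ended set of role/filler pairs rather than as a fixed argument list.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """A reified event: a category plus any number of role/filler pairs."""
    category: str
    roles: dict = field(default_factory=dict)

    def to_formula(self, var="e"):
        # One conjunct for the event category, then one per role.
        parts = [f"{self.category}({var})"]
        parts += [f"{role}({var}, {filler})" for role, filler in self.roles.items()]
        return f"∃{var} " + " ∧ ".join(parts)


serving = Event("Serving", {"Server": "AyCaramba", "Served": "Meat"})
print(serving.to_formula())
```

Because roles live in a dictionary, another role can be glued on later without changing the arity of any predicate.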
Compositional Analysis
Augmented Rules
We’ll accomplish this by attaching semantic formation rules to our
syntactic CFG rules
Abstractly
A → α1 ... αn
{ f(α1.sem, ..., αn.sem) }
This should be read as the semantics we attach to A can be
computed from some function applied to the semantics of A’s parts.
Example
Easy parts…
Rule                      Attachment
NP -> PropNoun            {PropNoun.sem}
NP -> MassNoun            {MassNoun.sem}
PropNoun -> AyCaramba     {AyCaramba}
MassNoun -> meat          {MEAT}
Example
S -> NP VP           {VP.sem(NP.sem)}
VP -> Verb NP        {Verb.sem(NP.sem)}
Verb -> serves       ???
λx λy ∃e Serving(e) ∧ Server(e, y) ∧ Served(e, x)
Lambda Forms
A simple addition to FOPC
Take a FOPC sentence
with variables in it that are
to be bound.
Allow those variables to be
bound by treating the
lambda form as a function
with formal arguments
λx P(x)
λx P(x)(Sally)
P(Sally)
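As a rough illustration, Python's own `lambda` behaves like the lambda forms above. Here `P` is a stand-in predicate (an assumption made for this sketch) that simply builds the string for the resulting formula.

```python
# P is a stand-in predicate that just builds the formula string "P(arg)".
def P(arg):
    return f"P({arg})"


lambda_form = lambda x: P(x)   # corresponds to λx P(x)
result = lambda_form("Sally")  # corresponds to λx P(x)(Sally)
print(result)                  # the reduced form, P(Sally)
```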
Syntax/Semantics Interface:
Two Philosophies
1. Let the syntax do what syntax does well and don’t expect it to
know much about meaning
In this approach, the lexical entry’s semantic attachments
do all the work
2. Assume the syntax does know something about meaning
Here the grammar gets complicated and the lexicon
simpler (constructional approach)
Example
Mary freebled John the nim.
Who has it?
Where did he get it from?
Why?
Example
Consider the attachments for the VPs
VP -> Verb NP NP (gave Mary a book)
VP -> Verb NP PP (gave a book to Mary)
Assume the meaning representations should be the
same for both. Under the lexicon-heavy scheme, the
VP attachments are:
VP.Sem (NP.Sem, NP.Sem)
VP.Sem (NP.Sem, PP.Sem)
Example
Under a syntax-heavy scheme we might want to do
something like
VP -> V NP NP
V.sem ^ Recip(NP1.sem) ^ Object(NP2.sem)
VP -> V NP PP
V.Sem ^ Recip(PP.Sem) ^ Object(NP1.sem)
i.e., the verb only contributes the predicate; the grammar
“knows” the roles.
Integration
Two basic approaches
Integrate semantic analysis into the parser (assign
meaning representations as constituents are
completed)
Pipeline… assign meaning representations to
complete trees only after they’re completed
Semantic Augmentation to CFG Rules
CFG rules are augmented with semantic attachments.
These semantic attachments specify how to compute the meaning
representation of a construction from the meanings of its constituent parts.
A CFG rule with semantic attachment will be as follows:
A → α1 … αn
{ f(αj.sem, …, αk.sem) }
The meaning representation of A, A.sem, will be calculated by applying the
function f to the semantic representations of some constituents.
Naïve Approach
ProperNoun → Anarkali     { Anarkali }
MassNoun → meat           { Meat }
NP → ProperNoun           { ProperNoun.sem }
NP → MassNoun             { MassNoun.sem }
Verb → serves             { ∃e,x,y ISA(e,Serving) ∧ Server(e,x) ∧ Served(e,y) }
But we cannot propagate this representation to upper levels.
Using Lambda Notations
ProperNoun → Anarkali     { Anarkali }
MassNoun → meat           { Meat }
NP → ProperNoun           { ProperNoun.sem }
NP → MassNoun             { MassNoun.sem }
Verb → serves             { λx λy ∃e ISA(e,Serving) ∧ Server(e,y) ∧ Served(e,x) }    (lambda expression)
VP → Verb NP              { Verb.sem(NP.sem) }    (application of lambda expression)
S → NP VP                 { VP.sem(NP.sem) }    (application of lambda expression)
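The rule-to-rule computation for "Anarkali serves meat" can be sketched with nested Python lambdas. This is only an illustration: formulas are plain strings, and the variable names (`verb_sem`, `vp_sem`, `s_sem`) are chosen here to mirror the `.sem` attachments.

```python
# Lexical attachments (the .sem values at the leaves).
proper_noun_sem = "Anarkali"
mass_noun_sem = "Meat"

# Verb -> serves: λx λy ∃e ISA(e,Serving) ∧ Server(e,y) ∧ Served(e,x)
verb_sem = lambda x: lambda y: (
    f"∃e ISA(e,Serving) ∧ Server(e,{y}) ∧ Served(e,{x})"
)

# NP -> ProperNoun and NP -> MassNoun just pass the semantics up.
subject_np_sem = proper_noun_sem
object_np_sem = mass_noun_sem

# VP -> Verb NP   { Verb.sem(NP.sem) }
vp_sem = verb_sem(object_np_sem)

# S -> NP VP      { VP.sem(NP.sem) }
s_sem = vp_sem(subject_np_sem)
print(s_sem)
```

The object NP is consumed first (binding x), and the subject NP second (binding y), which is why the verb's lambda lists x before y.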
Quasi-Logical Form
During semantic analysis, we may use quantified expressions as terms. In
this case, our formula will not be a FOPC formula. We call such a
formula a quasi-logical form.
A quasi-logical form should be converted into a normal FOPC formula by
applying simple syntactic translations.
Server(e, <∃x ISA(x,Restaurant)>)
a quasi-logical formula
∃x ISA(x,Restaurant) ∧ Server(e,x)
a normal FOPC formula
Parse Tree with Logical Forms
S: write(bertrand, principia)
  NP: bertrand
    bertrand
  VP: λy.write(y, principia)
    V: λx.λy.write(y, x)
      writes
    NP: principia
      principia
Pros and Cons
If you integrate semantic analysis into the parser as it is
running…
You can use semantic constraints to cut off parses
that make no sense
But you assign meaning representations to
constituents that don’t take part in the correct (most
probable) parse
Complex Terms
Allow the compositional system to pass around representations
like the following as objects with parts:
Complex-Term → <Quantifier var body>
<∃x Isa(x, Restaurant)>
Example
Our restaurant example winds up looking like
∃e Serving(e) ∧ Server(e, <∃x Isa(x, Restaurant)>) ∧ Served(e, Meat)
Big improvement…
Conversion
So… complex terms wind up being embedded inside
predicates. So pull them out and redistribute the parts in
the right way…
P(<quantifier, var, body>)
turns into
Quantifier var body connective P(var)
Example
Server(e, <∃x Isa(x, Restaurant)>)
∃x Isa(x, Restaurant) ∧ Server(e, x)
Quantifiers and Connectives
If the quantifier is an existential, then the connective is
an ^ (and)
If the quantifier is a universal, then the connective is an
-> (implies)
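The conversion rule can be sketched as a small function. This is an illustrative sketch, not a full implementation: `ComplexTerm` and `convert` are hypothetical names, formulas are plain strings, and the implication is written as ⇒.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComplexTerm:
    quantifier: str  # "∃" or "∀"
    var: str
    body: str        # e.g. "Isa(x, Restaurant)"


# Existential terms combine with "and", universal terms with "implies".
CONNECTIVE = {"∃": "∧", "∀": "⇒"}


def convert(predicate, args):
    """Pull every ComplexTerm out of `args` and prefix it to the predicate."""
    plain_args = [a.var if isinstance(a, ComplexTerm) else a for a in args]
    formula = f"{predicate}({', '.join(plain_args)})"
    # Wrap from the last term to the first so the first term scopes widest.
    for term in reversed([a for a in args if isinstance(a, ComplexTerm)]):
        connective = CONNECTIVE[term.quantifier]
        formula = f"{term.quantifier}{term.var} {term.body} {connective} {formula}"
    return formula


restaurant = ComplexTerm("∃", "x", "Isa(x, Restaurant)")
print(convert("Server", ["e", restaurant]))
```

Note that this sketch fixes one quantifier order (left to right); with several complex terms, other orders are also possible readings.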
Multiple Complex Terms
Note that the conversion technique pulls the quantifiers
out to the front of the logical form…
That leads to ambiguity if there’s more than one complex
term in a sentence.
Quantifier Ambiguity
Consider
Every restaurant has a menu
That could mean that
every restaurant has a menu
Or that
There’s some uber-menu out there and all restaurants have
that menu
Quantifier Scope Ambiguity
∀x Restaurant(x) ⇒
∃e,y Having(e) ∧ Haver(e, x) ∧ Had(e, y) ∧ Isa(y, Menu)
∃y Isa(y, Menu) ∧ ∀x Isa(x, Restaurant) ⇒
∃e Having(e) ∧ Haver(e, x) ∧ Had(e, y)
Ambiguity
This turns out to be a lot like the prepositional phrase
attachment problem
The number of possible interpretations goes up
exponentially with the number of complex terms in the
sentence
The best we can do is to come up with weak methods to
prefer one interpretation over another
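To get a feel for how quickly readings multiply, the sketch below enumerates every ordering of the quantifiers for "Every restaurant has a menu". It is a brute-force illustration under simplifying assumptions (formulas as strings, every ordering treated as a candidate reading, hypothetical function name `scopings`); n complex terms give n! orderings.

```python
from itertools import permutations

# Each complex term is (quantifier, variable, restriction).
terms = [("∀", "x", "Isa(x, Restaurant)"), ("∃", "y", "Isa(y, Menu)")]
core = "∃e Having(e) ∧ Haver(e, x) ∧ Had(e, y)"
CONNECTIVE = {"∃": "∧", "∀": "⇒"}


def scopings(terms, core):
    """Return one formula per ordering of the quantifiers (outermost first)."""
    readings = []
    for order in permutations(terms):
        formula = core
        # Build inside-out: the last term in the order is wrapped first.
        for quantifier, var, body in reversed(order):
            formula = f"{quantifier}{var} {body} {CONNECTIVE[quantifier]} ({formula})"
        readings.append(formula)
    return readings


for reading in scopings(terms, core):
    print(reading)
```

The first printed reading gives each restaurant its own menu; the second is the "uber-menu" reading.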
Non-Compositionality
Unfortunately, there are lots of examples where the
meaning (loosely defined) can’t be derived from the
meanings of the parts
Idioms, jokes, irony, sarcasm, metaphor, metonymy,
indirect requests, etc
English Idioms
Kick the bucket, buy the farm, bite the bullet, run the
show, bury the hatchet, etc…
Lots of these… constructions where the meaning of the
whole is either
Totally unrelated to the meanings of the parts (kick
the bucket)
Related in some opaque way (run the show)
The Tip of the Iceberg
Describe this construction
1. A fixed phrase with a particular meaning
2. A syntactically and lexically flexible phrase with a
particular meaning
3. A syntactically and lexically flexible phrase with a
partially compositional meaning
4. …
Example
Enron is the tip of the iceberg.
NP -> “the tip of the iceberg”
Not so good… attested examples…
the tip of Mrs. Ford’s iceberg
the tip of a 1000-page iceberg
the merest tip of the iceberg
How about
That’s just the iceberg’s tip.
Example
What we seem to need is something like
NP ->
An initial NP with tip as its head followed by
a subsequent PP with of as its head and that has
iceberg as the head of its NP
And that allows modifiers like merest, Mrs. Ford, and
1000-page to modify the relevant semantic forms
Quantified Phrases
Consider
A restaurant serves meat.
Assume that A restaurant looks like
∃x Isa(x, Restaurant)
If we do the normal lambda thing we get
∃e Serving(e) ∧ Server(e, ∃x Isa(x, Restaurant)) ∧ Served(e, Meat)
Semantic analysis
Goal: to form the formal structures from smaller pieces
Three approaches:
Syntax-driven semantic analysis
Semantic grammar
Information extraction: filling templates
Semantic grammar
Syntactic parse trees contain many parts that are unimportant in
semantic processing.
Ex: Mary wants to go to eat some Italian food
Rules in a semantic grammar
InfoRequest → USER want to go to eat FOODTYPE
FOODTYPE → NATIONALITY FOODTYPE
NATIONALITY → Italian | Mexican | …
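A semantic grammar this small can be approximated with regular expressions. The sketch below is illustrative only: the base case `food` for FOODTYPE, the optional "some", and the function name `parse_request` are assumptions added to make the example run.

```python
import re

# NATIONALITY -> Italian | Mexican
NATIONALITY = r"(?:Italian|Mexican)"
# FOODTYPE -> NATIONALITY FOODTYPE | food   (the base case "food" is an assumption)
FOODTYPE = rf"(?P<foodtype>(?:{NATIONALITY}\s+)*food)"
# InfoRequest -> USER want(s) to go to eat FOODTYPE
INFO_REQUEST = re.compile(
    rf"(?P<user>\w+)\s+wants?\s+to\s+go\s+to\s+eat\s+(?:some\s+)?{FOODTYPE}"
)


def parse_request(sentence):
    """Return the slots of an InfoRequest, or None if the sentence does not match."""
    match = INFO_REQUEST.search(sentence)
    if match is None:
        return None
    return {"user": match.group("user"), "foodtype": match.group("foodtype")}


print(parse_request("Mary wants to go to eat some Italian food"))
```

The rule names map directly onto the slots the application cares about, which is the appeal of the approach; the cost is that the patterns are tied to this one domain.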
Semantic grammar (cont)
Pros:
No need for syntactic parsing
Focus on relevant info
Semantic grammar helps to disambiguate
Cons:
The grammar is domain-specific.
Information extraction
The desired knowledge can be described by a relatively simple and
fixed template.
Only a small part of the info in the text is relevant for filling the
template.
No full parsing is needed: chunking, NE tagging, pattern matching,
…
IE is a big field: e.g., MUC, KnowItAll
Summary of semantic analysis
Goal: to form the formal structures from smaller pieces
Three approaches:
Syntax-driven semantic analysis
Semantic grammar
Information extraction
Lexical Semantics
Meaning
Traditionally, meaning in language has been studied from
three perspectives
The meaning of a text or discourse
The meanings of individual sentences or utterances
The meanings of individual words
We started in the middle, now we’ll look at the meanings of
individual words.
Word Meaning
We didn’t assume much about the meaning of words when
we talked about sentence meanings
Verbs provided a template-like predicate argument structure
Nouns were practically meaningless constants
There has to be more to it than that
The internal structure of words that determines
where they can go and what they can do (syntagmatic)
What’s a word?
Words?: Types, tokens, stems, roots, inflected forms?
Lexeme –
An entry in a lexicon consisting of a pairing of a form with
a single meaning representation
Lexicon - A collection of lexemes
Lexical Semantics
The linguistic study of the systematic, meaning-related structure of
lexemes is called Lexical Semantics.
A lexeme is an individual entry in the lexicon.
A lexicon is a meaning structure holding the meaning relations of
lexemes.
A lexeme may have different meanings. A lexeme’s meaning
component is known as one of its senses.
Different senses of the lexeme duck.
an animal, to lower the head, ...
Different senses of the lexeme yüz
face, to swim, to skin, the front of something, hundred, ...
Relations Among Lexemes and
Their Senses
Homonymy
Polysemy
Synonymy
Hyponymy
Hypernym
Homonymy
Homonymy is a relation that holds between words having the same form
(pronunciation, spelling) with unrelated meanings.
Bank -- financial institution, river bank
Bat -- (wooden stick-like thing) vs (flying scary mammal thing)
Fluke –
A fish, and a flatworm.
The end parts of an anchor.
The fins on a whale's tail.
A stroke of luck.
Homograph disambiguation is critically important in speech
synthesis, natural language processing and other fields.
Polysemy
Polysemy is the phenomenon of multiple related meanings in the same
lexeme.
Bank -- (1) financial institution, (2) blood bank, (3) a synonym for 'rely upon'
-- these senses are related.
"I'm your friend, you can bank on me"
While some banks furnish sperm only to married women, others are
less restrictive
However: a river bank is a homonym to 1 and 2, as they do not share
etymologies. It is a completely different meaning
Mole -- (1) a small burrowing mammal
There are several different entities called moles, which refer to different things, but their
names derive from 1.
e.g. A Mole (espionage) burrows for information hoping to go undetected.
Polysemy
Milk
The verb milk (e.g. "he's milking it for all he can get") derives from the
process of obtaining milk.
Lexicographers define polysemes within a single dictionary lemma,
numbering different meanings, while homonyms are treated in separate
lemmata.
Most non-rare words have multiple meanings
The number of meanings is related to its frequency
Verbs tend more to polysemy
Distinguishing polysemy from homonymy isn’t always easy
(or necessary)
Synonymy
Synonymy is the phenomenon of two different lexemes having
the same meaning.
Big and large
More precisely, it is one sense of each of the two lexemes that is the same.
There aren’t any true synonyms.
Two lexemes are synonyms if they can be successfully substituted
for each other in all situations
What does successfully mean?
Preserves the meaning
But may not preserve the acceptability based on notions of politeness, slang,
...
Example - Big and large?
That’s my big sister
That’s my large sister
a big plane
a large plane
Hyponymy and Hypernym
Hyponymy: one lexeme denotes a subclass of the other lexeme.
The more specific lexeme is a hyponym of the more general lexeme.
The more general lexeme is a hypernym of the more specific lexeme.
A hyponymy relation can be asserted between two lexemes when the
meanings of the lexemes entail a subset relation
Since dogs are canids
Dog is a hyponym of canid and
Canid is a hypernym of dog
Car is a hyponym of vehicle; vehicle is a hypernym of car.
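Hyponymy can be modelled as links in a hierarchy that are followed upward. The sketch below uses a toy table (`HYPERNYM`, with illustrative entries only) rather than a real lexical database; the function names are hypothetical.

```python
# Toy hypernym links (child -> parent); the entries are illustrative only.
HYPERNYM = {"dog": "canid", "canid": "mammal", "car": "vehicle"}


def hypernyms(lexeme):
    """Walk up the hierarchy and return all ancestors, nearest first."""
    chain = []
    while lexeme in HYPERNYM:
        lexeme = HYPERNYM[lexeme]
        chain.append(lexeme)
    return chain


def is_hyponym_of(specific, general):
    """True if `general` is reachable from `specific` by hypernym links."""
    return general in hypernyms(specific)


print(hypernyms("dog"))
print(is_hyponym_of("car", "vehicle"))
```

Following the chain makes the relation transitive: dog is a hyponym of mammal because dog is a hyponym of canid and canid is a hyponym of mammal.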
Ontology
The term ontology refers to a set of distinct objects resulting from
analysis of a domain.
A taxonomy is a particular arrangement of the elements of
an ontology into a tree-like class inclusion structure.
A lexicon holds different senses of lexemes together with other
relations among lexemes.
Lexical Resources
There are lots of lexical resources available
Word lists
On-line dictionaries
Corpora
The most ambitious one is WordNet
A database of lexical relations for English
Versions for other languages are under development
WordNet
WordNet is a widely used lexical database for English.
WebPage: http://www.cogsci.princeton.edu/~wn/
It holds:
The senses of the lexemes
Relations among nouns such as hypernym, hyponym, MemberOf, ...
Relations among verbs such as hypernym, …
Relations are held for each sense of a lexeme.
WordNet Relations
Some of WordNet Relations (for nouns)
WordNet Hierarchies
Hyponymy chains for the senses of the lexeme bass
WordNet - bass
The noun "bass" has 8 senses in WordNet.
1. bass -- (the lowest part of the musical range)
2. bass, bass part -- (the lowest part in polyphonic music)
3. bass, basso -- (an adult male singer with the lowest voice)
4. sea bass, bass -- (the lean flesh of a saltwater fish of the family Serranidae)
5. freshwater bass, bass -- (any of various North American freshwater fish with
lean flesh (especially of the genus Micropterus))
6. bass, bass voice, basso -- (the lowest adult male singing voice)
7. bass -- (the member with the lowest range of a family of musical instruments)
8. bass -- (nontechnical name for any of numerous edible marine and
freshwater spiny-finned fishes)
The adjective "bass" has 1 sense in WordNet.
1. bass, deep -- (having or denoting a low vocal or instrumental range; "a deep
voice"; "a bass voice is lower than a baritone voice"; "a bass clarinet")
WordNet –bass Hyponyms
Results for "Hyponyms (...is a kind of this), full" search of noun "bass"
6 of 8 senses of bass
Sense 2
bass, bass part -- (the lowest part in polyphonic music)
=> ground bass -- (a short melody in the bass that is constantly repeated)
=> thorough bass, basso continuo -- (a bass part written out in full and accompanied by figures for successive chords)
=> figured bass -- (a bass part in which the notes have numbers under them to indicate the chords to be played)
Sense 4
sea bass, bass -- (the lean flesh of a saltwater fish of the family Serranidae)
=> striped bass, striper -- (caught along the Atlantic coast of the United States)
Sense 5
freshwater bass, bass -- (any of various North American freshwater fish with lean flesh (especially of the genus Micropterus))
=> largemouth bass -- (flesh of largemouth bass)
=> smallmouth bass -- (flesh of smallmouth bass)
Sense 6
bass, bass voice, basso -- (the lowest adult male singing voice)
=> basso profundo -- (a very deep bass voice)
Sense 7
bass -- (the member with the lowest range of a family of musical instruments)
=> bass fiddle, bass viol, bull fiddle, double bass, contrabass, string bass -- (largest and lowest member of the violin family)
=> bass guitar -- (the lowest six-stringed guitar)
=> bass horn, sousaphone, tuba -- (the lowest brass wind instrument)
=> euphonium -- (a bass horn (brass wind instrument) that is the tenor of the tuba family)
=> helicon, bombardon -- (a tuba that coils over the shoulder of the musician)
=> bombardon, bombard -- (a large shawm; the bass member of the shawm family)
Sense 8
bass -- (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)
=> freshwater bass -- (North American food and game fish)
WordNet – bass Synonyms
Results for "Synonyms, ordered by estimated frequency" search of noun "bass"
8 senses of bass
Sense 1
bass -- (the lowest part of the musical range)
=> low pitch, low frequency -- (a pitch that is perceived as below other pitches)
Sense 2
bass, bass part -- (the lowest part in polyphonic music)
=> part, voice -- (the melody carried by a particular voice or instrument in polyphonic music; "he tried to sing the tenor part")
Sense 3
bass, basso -- (an adult male singer with the lowest voice)
=> singer, vocalist, vocalizer, vocaliser -- (a person who sings)
Sense 4
sea bass, bass -- (the lean flesh of a saltwater fish of the family Serranidae)
=> saltwater fish -- (flesh of fish from the sea used as food)
Sense 5
freshwater bass, bass -- (any of various North American freshwater fish with lean flesh (especially of the genus Micropterus))
=> freshwater fish -- (flesh of fish from fresh water used as food)
Sense 6
bass, bass voice, basso -- (the lowest adult male singing voice)
=> singing voice -- (the musical quality of the voice while singing)
Sense 7
bass -- (the member with the lowest range of a family of musical instruments)
=> musical instrument, instrument -- (any of various devices or contrivances that can be used to produce musical tones or sounds)
Sense 8
bass -- (nontechnical name for any of numerous edible marine and freshwater spiny-finned fishes)
Internal Structure of Words
Paradigmatic relations connect lexemes together in particular ways
but don’t say anything about what the meaning representation of
a particular lexeme should consist of.
Various approaches have been followed to describe
the semantics of lexemes.
Thematic roles in predicate-bearing lexemes
Selection restrictions on thematic roles
Decompositional semantics of predicates
Feature-structures for nouns
Thematic Roles
Thematic roles provide a shallow semantic language for characterizing
certain arguments of verbs.
For example:
Ali broke the glass.
Veli opened the door.
Ali is Breaker and the glass is BrokenThing of Breaking event;
Veli is Opener and the door is OpenedThing of Opening event.
These are deep roles of arguments of events.
Both of these events have actors that are the doers of a volitional event, and
things affected by this action.
A thematic role is a way of expressing this kind of commonality.
AGENT and THEME are thematic roles.
Some Thematic Roles
AGENT -- The volitional causer of an event -- She broke the door.
EXPERIENCER -- The experiencer of an event -- Ali has a headache.
FORCE -- The non-volitional causer of an event -- The wind blows it.
THEME -- The participant most directly affected by an event -- She broke the door.
INSTRUMENT -- An instrument used in an event -- He opened it with a knife.
BENEFICIARY -- A beneficiary of an event -- I bought it for her.
SOURCE -- The origin of the object of a transfer event -- I flew from Rome.
GOAL -- The destination of the object of a transfer event -- I flew to Ankara.
Thematic Roles (cont.)
Takes some of the work away from the verbs.
It’s not the case that every verb is unique and has to completely
specify how all of its arguments uniquely behave.
Provides a mechanism to organize semantic processing
It permits us to distinguish near surface-level semantics
from deeper semantics
Linking
Thematic roles, syntactic categories and their positions in larger syntactic
structures are all intertwined in complicated ways.
For example…
AGENTS are often subjects
In a VP->V NP NP rule, the first NP is often a GOAL
and the second a THEME
Deeper Semantics
He melted her reserve with a husky-voiced paean to her eyes.
If we label the constituents He and her reserve as the Melter and
Melted, then those labels lose any meaning they might have had.
If we make them Agent and Theme then we don’t have the same
problems
Selectional Restrictions
A selectional restriction augments thematic roles by allowing lexemes to
place certain semantic restrictions on the lexemes and phrases that can
accompany them in a sentence.
I want to eat someplace near Bilkent.
Now we can say that eat is a predicate that has an AGENT and a
THEME
And that the AGENT must be capable of eating and the THEME must
be capable of being eaten
Each sense of a verb can be associated with selectional restrictions.
THY serves New York. -- direct object (theme) is a place
THY serves breakfast. -- direct object (theme) is a meal.
We may use these selectional restrictions to disambiguate a sentence.
As Logical Statements
For eat…
Eating(e) ∧ Agent(e,x) ∧ Theme(e,y) ∧ Isa(y, Food)
(adding in all the right quantifiers and lambdas)
WordNet
Use WordNet hyponyms (type) to encode the selection
restrictions
Specificity of Restrictions
What can you say about THEME in each with respect to the verb?
Some will be high up in the WordNet hierarchy, others not so high…
PROBLEMS
Unfortunately, verbs are polysemous and language is creative…
… ate glass on an empty stomach accompanied only
by water and tea
you can’t eat gold for lunch if you’re hungry
… get it to try to eat Afghanistan
Discovering the Restrictions
Instead of hand-coding the restrictions for each verb,
can we discover a verb’s restrictions by using a corpus and WordNet?
1. Parse sentences and find heads
2. Label the thematic roles
3. Collect statistics on the co-occurrence of particular
headwords with particular thematic roles
4. Use the WordNet hypernym structure to find the most
meaningful level to use as a restriction
Motivation
Find the lowest (most specific) common ancestor that
covers a significant number of the examples
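The last step can be sketched as follows, assuming the headwords filling a verb's THEME role have already been collected. The toy hierarchy, the 0.75 coverage threshold, and the function names are all assumptions made for this illustration; a real system would walk the WordNet hypernym structure instead.

```python
from collections import Counter

# Toy hypernym links (child -> parent); illustrative entries only.
HYPERNYM = {
    "hamburger": "food", "pizza": "food", "soup": "food",
    "food": "substance", "glass": "artifact",
    "substance": "entity", "artifact": "entity",
}


def ancestors(word):
    """Return the word followed by its hypernyms, most specific first."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def depth(concept):
    return len(ancestors(concept)) - 1  # number of links up to the root


def best_restriction(headwords, coverage=0.75):
    """Most specific concept covering at least `coverage` of the observed heads."""
    counts = Counter()
    for head in headwords:
        counts.update(set(ancestors(head)))
    needed = coverage * len(headwords)
    candidates = [c for c, n in counts.items() if n >= needed]
    # Assumes all headwords share a root, so at least the root is a candidate.
    return max(candidates, key=depth)


themes_of_eat = ["hamburger", "pizza", "soup", "pizza", "glass"]
print(best_restriction(themes_of_eat))
```

The one occurrence of "glass" is tolerated because the threshold asks for a significant share of the examples, not all of them.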
Word-Sense Disambiguation
Word sense disambiguation refers to the process of selecting
the right sense for a word from among the senses that the word is
known to have
Semantic selection restrictions can be used to disambiguate
Ambiguous arguments to unambiguous predicates
Ambiguous predicates with unambiguous arguments
Ambiguity all around
Word-Sense Disambiguation
We can use selectional restrictions for disambiguation.
He cooked simple dishes.
He broke the dishes.
But sometimes, selectional restrictions will not be enough to disambiguate.
What kind of dishes do you recommend? -- we cannot know which
sense is used.
There can be two lexemes (or more) with multiple senses.
They serve vegetarian dishes.
Selectional restrictions may block the finding of meaning.
If you want to kill Turkey, eat its banks.
Kafayı yedim.
These situations leave the system with no possible meanings, and they
can indicate a metaphor.
WSD and Selection Restrictions
Ambiguous arguments
Prepare a dish
Wash a dish
Ambiguous predicates
Serve Denver
Serve breakfast
Both
Serves vegetarian dishes
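A small sketch of disambiguation by selectional restrictions, covering the three cases above. The sense labels, the toy hierarchy, and the restriction table are invented for illustration and do not come from WordNet.

```python
from itertools import product

# Toy sense inventory; every label below is an illustrative assumption.
HYPERNYM = {"dish/plate": "artifact", "dish/course": "food",
            "breakfast/meal": "food", "Denver/city": "place"}
NOUN_SENSES = {"dish": ["dish/plate", "dish/course"],
               "breakfast": ["breakfast/meal"], "Denver": ["Denver/city"]}
VERB_SENSES = {  # verb sense -> required category of its THEME
    "wash": {"wash/clean": "artifact"},
    "prepare": {"prepare/cook": "food"},
    "serve": {"serve/provide-food": "food", "serve/fly-to": "place"},
}


def isa(sense, category):
    """True if `category` is `sense` itself or one of its hypernyms."""
    while sense is not None:
        if sense == category:
            return True
        sense = HYPERNYM.get(sense)
    return False


def readings(verb, noun):
    """Keep only (verb sense, noun sense) pairs that satisfy the THEME restriction."""
    return [(v, n)
            for (v, required), n in product(VERB_SENSES[verb].items(), NOUN_SENSES[noun])
            if isa(n, required)]


print(readings("wash", "dish"))     # ambiguous argument
print(readings("serve", "Denver"))  # ambiguous predicate
print(readings("serve", "dish"))    # both ambiguous
```

An empty result would mean every sense combination violates a restriction, which is the situation the slides describe for metaphorical uses.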
WSD and Selection Restrictions
This approach is complementary to the compositional analysis
approach.
You need a parse tree and some form of
predicate-argument analysis derived from
The tree and its attachments
All the word senses coming up from the
lexemes at the leaves of the tree
Ill-formed analyses are eliminated by noting
any selection restriction violations
Problems
As we saw last time, selection restrictions are violated all
the time.
This doesn’t mean that the sentences are ill-formed or
preferred less than others.
This approach needs some way of categorizing and
dealing with the various ways that restrictions can be
violated
WSD Tags
What’s a tag?
A dictionary sense?
For example, for WordNet an instance of “bass” in a text
has 8 possible tags or labels (bass1 through bass8).
WordNet Bass
The noun ``bass'' has 8 senses in WordNet
1. bass - (the lowest part of the musical range)
2. bass, bass part - (the lowest part in polyphonic music)
3. bass, basso - (an adult male singer with the lowest voice)
4. sea bass, bass - (flesh of lean-fleshed saltwater fish of the family
Serranidae)
5. freshwater bass, bass - (any of various North American lean-fleshed
freshwater fishes especially of the genus Micropterus)
6. bass, bass voice, basso - (the lowest adult male singing voice)
7. bass - (the member with the lowest range of a family of musical
instruments)
8. bass - (nontechnical name for any of numerous edible marine and
freshwater spiny-finned fishes)
Representations
Most supervised ML approaches require a very simple
representation for the input training data.
Vectors of sets of feature/value pairs
I.e. files of comma-separated values
So our first task is to extract training data from a corpus with respect
to a particular instance of a target word
This typically consists of a characterization of
the window of text surrounding the target
Representations
This is where ML and NLP intersect
If you stick to trivial surface features that are
easy to extract from a text, then most of the
work is in the ML system
If you decide to use features that require more
analysis (say parse trees) then the ML part
may be doing less work (relatively) if these
features are truly informative
Surface Representations
Collocational and co-occurrence information
Collocational
Encode features about the words that appear in
specific positions to the right and left of the target
word
Often limited to the words themselves as
well as their part of speech
Co-occurrence
Features characterizing the words that occur
anywhere in the window regardless of position
Typically limited to frequency counts
Collocational
Position-specific information about the words in the
window
guitar and bass player stand
[guitar, NN, and, CJC, player, NN, stand, VVB]
In other words, a vector consisting of
[position n word, position n part-of-speech…]
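A possible way to build such a vector is sketched below. The function name, the window size of 2, the padding symbol, and the tag assigned to "bass" are assumptions for this illustration.

```python
def collocational_features(tagged_tokens, target_index, window=2):
    """Return [word, POS, ...] for `window` positions on each side of the target."""
    features = []
    for offset in list(range(-window, 0)) + list(range(1, window + 1)):
        position = target_index + offset
        if 0 <= position < len(tagged_tokens):
            word, pos = tagged_tokens[position]
        else:
            word, pos = "<PAD>", "<PAD>"  # padding symbol is an assumption
        features += [word, pos]
    return features


sentence = [("guitar", "NN"), ("and", "CJC"), ("bass", "NN"),
            ("player", "NN"), ("stand", "VVB")]
print(collocational_features(sentence, target_index=2))
```

The order of the entries matters here: each slot in the vector stands for a fixed position relative to the target word.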
Co-occurrence
Information about the words that occur within the
window.
First derive a set of terms to place in the vector.
Then note how often each of those terms occurs in a
given window.
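A co-occurrence vector can be sketched in a few lines. The vocabulary and the example window below are made up for illustration; in practice the term set would be derived from a corpus.

```python
from collections import Counter


def cooccurrence_vector(window_tokens, vocabulary):
    """Count how often each vocabulary term occurs in the window, ignoring position."""
    counts = Counter(window_tokens)
    return [counts[term] for term in vocabulary]


vocabulary = ["fishing", "guitar", "player", "river", "band"]  # hypothetical term set
window = "an electric guitar and bass player stand off to one side".split()
print(cooccurrence_vector(window, vocabulary))
```

Unlike the collocational vector, shuffling the words in the window leaves this vector unchanged.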
Classifiers
Once we cast the WSD problem as a classification
problem, then all sorts of techniques are possible
Naïve Bayes (the right thing to try first)
Decision lists
Decision trees
Neural nets
Support vector machines
Nearest neighbor methods…
Classifiers
The choice of technique, in part, depends on the set of
features that have been used
Some techniques work better/worse with features with
numerical values
Some techniques work better/worse with features that
have large numbers of possible values
For example, the feature "the word to the left" has a
fairly large number of possible values
Statistical Word-Sense
Disambiguation
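\[ \hat{s} = \arg\max_{s \in S} P(s \mid V) \]
where S is the set of senses of the target word and
V is the vector representation of the input.
By Bayes' rule:
\[ \hat{s} = \arg\max_{s \in S} \frac{P(V \mid s)\, P(s)}{P(V)} \]
By assuming the features are independent of each other given the sense
(and dropping P(V), which is the same for every sense):
\[ \hat{s} = \arg\max_{s \in S} P(s) \prod_{i=1}^{n} P(v_i \mid s) \]
That is, the likelihood is the product of the probabilities of the
individual features given the sense.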
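The decision rule can be sketched as a tiny Naive Bayes classifier. This is an illustration under stated assumptions: the training examples and sense labels are invented, add-one smoothing is added so unseen features do not zero out a sense, and log probabilities are summed instead of multiplying raw probabilities.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (sense, feature list). Returns the counts needed for NB."""
    sense_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for sense, features in examples:
        sense_counts[sense] += 1
        feature_counts[sense].update(features)
        vocabulary.update(features)
    return sense_counts, feature_counts, vocabulary


def classify(features, model):
    """Return argmax over senses of log P(s) + sum_i log P(v_i | s)."""
    sense_counts, feature_counts, vocabulary = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log P(s)
        denom = sum(feature_counts[sense].values()) + len(vocabulary)
        for f in features:               # add-one smoothed log P(v_i | s)
            score += math.log((feature_counts[sense][f] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


training = [  # invented toy data
    ("bass/fish", ["river", "caught", "fishing"]),
    ("bass/fish", ["lake", "fishing", "boat"]),
    ("bass/music", ["guitar", "player", "band"]),
    ("bass/music", ["player", "stage", "guitar"]),
]
model = train(training)
print(classify(["guitar", "band"], model))
```

P(V) never appears in the code because it is the same for every sense and so does not affect the argmax.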
Problems
Given these general ML approaches, how many
classifiers do I need to perform WSD robustly
One for each ambiguous word in the
language
How do you decide what set of tags/labels/senses to use
for a given word?
Depends on the application
END
Examples from Russell&Norvig (1)
7.2. p.213
Not all students take both History and Biology.
Only one student failed History.
Only one student failed both History and Biology.
The best score in History was better than the best score in Biology.
Every person who dislikes all vegetarians is smart.
No person likes a smart vegetarian.
There is a woman who likes all men who are vegetarian.
There is a barber who shaves all men in town who don't shave themselves.
No person likes a professor unless a professor is smart.
Politicians can fool some people all of the time or all people some of the time but they
cannot fool all people all of the time.
Categories & Events
Categories:
VegetarianRestaurant (Joe’s) – categories are relations and not objects
MostPopular(Joe’s,VegetarianRestaurant) – not FOPC!
ISA (Joe’s,VegetarianRestaurant) – reification (turn all concepts into
objects)
AKO (VegetarianRestaurant,Restaurant)
Events:
Reservation (Hearer,Joe’s,Today,8PM,2)
Problems:
Determining the correct number of roles
Representing facts about the roles associated with an event
Ensuring that all the correct inferences can be drawn
Ensuring that no incorrect inferences can be drawn
MUC-4 Example
On October 30, 1989, one civilian was killed in a
reported FMLN attack in El Salvador.
INCIDENT: DATE                    30 OCT 89
INCIDENT: LOCATION                EL SALVADOR
INCIDENT: TYPE                    ATTACK
INCIDENT: STAGE OF EXECUTION      ACCOMPLISHED
INCIDENT: INSTRUMENT ID           -
INCIDENT: INSTRUMENT TYPE         -
PERP: INCIDENT CATEGORY           TERRORIST ACT
PERP: INDIVIDUAL ID               "TERRORIST"
PERP: ORGANIZATION ID             "THE FMLN"
PERP: ORG. CONFIDENCE             REPORTED: "THE FMLN"
PHYS TGT: ID                      -
PHYS TGT: TYPE                    -
PHYS TGT: NUMBER                  -
PHYS TGT: FOREIGN NATION          -
PHYS TGT: EFFECT OF INCIDENT      -
PHYS TGT: TOTAL NUMBER            -
HUM TGT: NAME                     -
HUM TGT: DESCRIPTION              "1 CIVILIAN"
HUM TGT: TYPE                     CIVILIAN: "1 CIVILIAN"
HUM TGT: NUMBER                   1: "1 CIVILIAN"
HUM TGT: FOREIGN NATION           -
HUM TGT: EFFECT OF INCIDENT       DEATH: "1 CIVILIAN"
HUM TGT: TOTAL NUMBER             -
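Template filling of this kind can be roughly sketched with hand-written patterns, one per slot. The patterns below are assumptions written for this single sentence and fill only a few slots; they are not how MUC systems were actually built, and the extracted strings are not normalized to the template's value format.

```python
import re

text = ("On October 30, 1989, one civilian was killed in a "
        "reported FMLN attack in El Salvador.")

# Hand-written patterns, one per slot; real systems use many more.
PATTERNS = {
    "INCIDENT: DATE": r"On (\w+ \d{1,2}, \d{4})",
    "INCIDENT: TYPE": r"\b(attack|bombing|kidnapping)\b",
    "INCIDENT: LOCATION": r"\bin ((?:[A-Z][a-z]+ ?)+)\.",
    "PERP: ORGANIZATION ID": r"reported ([A-Z]{2,}) attack",
    "HUM TGT: DESCRIPTION": r"(\w+ civilians?) (?:was|were) killed",
}


def fill_template(text):
    """Return a slot -> value mapping; slots with no match are left as None."""
    template = {}
    for slot, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        template[slot] = match.group(1) if match else None
    return template


for slot, value in fill_template(text).items():
    print(f"{slot}: {value}")
```

No parse tree is built at any point: only the pieces of text relevant to a slot are examined.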
Subcategorization frames
1. I ate
2. I ate a turkey sandwich
3. I ate a turkey sandwich at my desk
4. I ate at my desk
5. I ate lunch
6. I ate a turkey sandwich for lunch
7. I ate a turkey sandwich for lunch at my desk
- no fixed “arity” (problem for FOPC)
One possible solution
1. Eating1 (Speaker)
2. Eating2 (Speaker, TurkeySandwich)
3. Eating3 (Speaker, TurkeySandwich, Desk)
4. Eating4 (Speaker, Desk)
5. Eating5 (Speaker, Lunch)
6. Eating6 (Speaker, TurkeySandwich, Lunch)
7. Eating7 (Speaker, TurkeySandwich, Lunch, Desk)
Meaning postulates are used to tie together the semantics of the predicates:
∀w,x,y,z: Eating7(w,x,y,z) ⇒ Eating6(w,x,y)
Scalability issues again!
Another solution
- Say that everything is a special case of Eating7 with some
arguments unspecified:
∃w,x,y Eating (Speaker,w,x,y)
- Two problems again:
Too many commitments (e.g., no eating except at meals: lunch, dinner, etc.)
No way to individuate events:
∃w,x Eating (Speaker,w,x,Desk)
∃w,y Eating (Speaker,w,Lunch,y) – cannot combine into
∃w Eating (Speaker,w,Lunch,Desk)
Reification
∃w: Isa(w,Eating) ∧ Eater(w,Speaker) ∧ Eaten(w,TurkeySandwich) –
equivalent to sentence 2.
Reification:
No need to specify fixed number of arguments for a given
surface predicate
No more roles are postulated than mentioned in the input
No need for meaning postulates to specify logical connections
among closely related examples
Representing time
1. I arrived in New York
2. I am arriving in New York
3. I will arrive in New York
∃w: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork)
Representing time
∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧
Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ EndPoint(i,e) ∧ Precedes
(e,Now)
∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧
Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ MemberOf(i,Now)
∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧
Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ StartPoint(i,s) ∧ Precedes
(Now,s)
Representing time
We fly from San Francisco to Boston at 10.
Flight 1390 will be at the gate an hour now.
Use of tenses
Flight 1902 arrived late.
Flight 1902 had arrived late.
“similar” tenses
When Mary’s flight departed, I ate lunch
When Mary’s flight departed, I had eaten lunch
reference point
Aspect
Stative: I know my departure gate
Activity: John is flying
no particular end point
Accomplishment: Sally booked her flight
natural end point and result in a particular state
Achievement: She found her gate
Figuring out statives:
* I am needing the cheapest fare.
* I am wanting to go today.
* Need the cheapest fare!
Representing beliefs
Want, believe, imagine, know - all introduce hypothetical worlds
I believe that Mary ate British food.
Reified example:
∃ u,v: Isa(u,Believing) ∧ Isa(v,Eating) ∧ Believer (u,Speaker) ∧
BelievedProp(u,v) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)
However this implies also:
∃ u,v: Isa(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood)
Modal operators:
Believing(Speaker,Eating(Mary,BritishFood))
- not FOPC! – predicates
in FOPC hold between objects, not between relations.
Believes(Speaker, ∃ v: ISA(v,Eating) ∧ Eater(v,Mary) ∧
Eaten(v,BritishFood))
Modal operators
Beliefs
Knowledge
Assertions
Issues:
If you are interested in baseball, the Red Sox are playing
tonight.
Examples from Russell&Norvig (2)
7.3. p.214
One more outburst like that and you'll be in contempt of court.
Annie Hall is on TV tonight if you are interested.
Either the Red Sox win or I am out ten dollars.
The special this morning is ham and eggs.
Maybe I will come to the party and maybe I won't.
Well, I like Sandy and I don't like Sandy.