Remarks on the Evolution of Language

CS544: Logic, Lecture 1
February 2, 2010
Jerry R. Hobbs
USC/ISI
Marina del Rey, CA
Outline of My Six Lectures
Feb 2: Logic
Feb 4: Syntax and Compositional Semantics of Clauses
Feb 11: Syntax and Compositional Semantics of NPs
Feb 16: Coordination and Comparison
Feb 18: Inference, Coreference, and Metonymy
Feb 23: Inference and Question-Answering
Aim: To give you experience with examining texts
closely from a computational point of view
Homework
No project, but ....
A homework assignment at the end of each lecture,
covering the issues discussed in the lecture,
Due at the beginning of the next class.
Homework will always involve analysis of the 3 sample
texts for the class (handed out today)
You can discuss issues abstractly with other students,
but the specific work on the homework assignments
must be done alone.
Logic and Information
Logic and natural language are the two best ways
we know of representing information / knowledge
Natural language is too variable to compute with
Logic has traditionally been too narrow in what can
be represented
A goal of knowledge representation research:
Develop logics with expressivity closer to
natural language
Propositional Logic
Propositional constants: P, Q, R, ...
These have values of either True or False.
Logical connectives:
and: & or ∧
or: v or ∨    (from the Latin word for “or”, “vel”)
not: ~ or ¬
imply: --> or →
equivalent, iff: <--> or ↔
Defined by truth tables:

   &  | Q: T  Q: F        v  | Q: T  Q: F
 -----+------------     -----+------------
 P: T |   T     F       P: T |   T     T
 P: F |   F     F       P: F |   T     F
Definitions of <--> and -->:
[P <-->Q] <--> [[P --> Q] & [Q --> P]]
Truth table for ~:
  P:  T  F
  ~P: F  T
“Material implication”:
either P is false or Q is true
[P --> Q] <--> [~P v Q]
Properties of Logical Connectives
Modus ponens: [P & [P-->Q]] --> Q
& and v are associative and commutative:
[[P & Q] & R] <--> [P & [Q & R]]
[P & Q] <--> [Q & P]
[[P v Q] v R] <--> [P v [Q v R]]
[P v Q] <--> [Q v P]
so we can write [P & Q & R & ...] and [P v Q v R v ...]
(What about [[P --> Q] --> R] <-?-> [P --> [Q --> R]]?)
Relating &, v and ~:
~[P & Q] <--> ~P v ~Q
~[P v Q] <--> ~P & ~Q
[P & [Q v R]] <--> [[P & Q] v [P & R]]
[P v [Q & R]] <--> [[P v Q] & [P v R]]
Double negation: ~~P <--> P
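Each of these properties is a tautology, so it can be verified mechanically by enumerating all truth assignments. Below is a minimal sketch in Python; the tuple encoding of formulas and the function names are my own, not from the lecture.

```python
from itertools import product

# Encoding (mine): a formula is a string like "P" (a propositional
# constant) or a tuple ("not", f), ("and", f, g), ("or", f, g),
# ("implies", f, g), ("iff", f, g).

def evaluate(f, env):
    """Truth value of formula f under the assignment env."""
    if isinstance(f, str):
        return env[f]
    op = f[0]
    if op == "not":
        return not evaluate(f[1], env)
    if op == "and":
        return evaluate(f[1], env) and evaluate(f[2], env)
    if op == "or":
        return evaluate(f[1], env) or evaluate(f[2], env)
    if op == "implies":   # material implication: ~P v Q
        return (not evaluate(f[1], env)) or evaluate(f[2], env)
    if op == "iff":
        return evaluate(f[1], env) == evaluate(f[2], env)
    raise ValueError(f"unknown connective: {op}")

def is_tautology(f, names=("P", "Q", "R")):
    """True if f holds under every assignment of truth values."""
    return all(evaluate(f, dict(zip(names, vals)))
               for vals in product([True, False], repeat=len(names)))

# De Morgan: ~[P & Q] <--> [~P v ~Q]
de_morgan = ("iff", ("not", ("and", "P", "Q")),
                    ("or", ("not", "P"), ("not", "Q")))
# [P --> Q] <--> [~P v Q]
mat_impl = ("iff", ("implies", "P", "Q"), ("or", ("not", "P"), "Q"))
# [P v [Q & R]] <--> [[P v Q] & [P v R]]
distrib = ("iff", ("or", "P", ("and", "Q", "R")),
                  ("and", ("or", "P", "Q"), ("or", "P", "R")))
# The parenthetical question: --> is NOT associative
impl_assoc = ("iff", ("implies", ("implies", "P", "Q"), "R"),
                     ("implies", "P", ("implies", "Q", "R")))

print(is_tautology(de_morgan), is_tautology(mat_impl),
      is_tautology(distrib), is_tautology(impl_assoc))
# True True True False  (take P false, R false for the last one)
```

The failure of the last check answers the parenthetical question: implication is not associative, which is why we cannot write [P --> Q --> R].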
Clause Form

Clause form (each disjunction is a clause):
[P1 v P2 v ~P3 v ...] & [Q1 v ~Q2 v ...]
Negation applies only to propositional constants, not larger expressions.
Disjunctions (v) appear at the mid-level, outscoping negation and outscoped
by conjunctions.
Conjunction appears at the highest level.
Eliminate --> and <--> with the rules:
[P <--> Q] ==> [[P --> Q] & [Q --> P]]
[P --> Q] ==> [~P v Q]
Push ~ all the way inside with the rules:
~[P & Q] ==> ~P v ~Q
~[P v Q] ==> ~P & ~Q
Eliminate double negations with the rule:
~~P ==> P
Push v inside & with the rule:
[P v [Q & R]] ==> [[P v Q] & [P v R]]
It is always possible to reduce an expression to clause form.
(Conjunctive Normal Form)
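The rewrite rules above can be sketched directly in code. This is an illustrative Python implementation of the propositional case; the encoding (strings for propositional constants, tuples for compound formulas) and all names are my own.

```python
# Sketch of the reduction to clause form (CNF) for propositional logic.
# Encoding (mine): a formula is a string such as "P" or a tuple
# ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g), ("iff", f, g).

def eliminate(f):
    """Step 1: eliminate --> and <-->."""
    if isinstance(f, str):
        return f
    op = f[0]
    if op == "iff":        # [P <--> Q] ==> [[P --> Q] & [Q --> P]]
        a, b = eliminate(f[1]), eliminate(f[2])
        return ("and", ("or", ("not", a), b), ("or", ("not", b), a))
    if op == "implies":    # [P --> Q] ==> [~P v Q]
        return ("or", ("not", eliminate(f[1])), eliminate(f[2]))
    if op == "not":
        return ("not", eliminate(f[1]))
    return (op, eliminate(f[1]), eliminate(f[2]))

def push_not(f):
    """Step 2: push ~ inward (De Morgan) and cancel double negations."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        g = f[1]
        if isinstance(g, str):
            return f                      # a literal: ~P
        if g[0] == "not":                 # ~~P ==> P
            return push_not(g[1])
        dual = "or" if g[0] == "and" else "and"
        return (dual, push_not(("not", g[1])), push_not(("not", g[2])))
    return (f[0], push_not(f[1]), push_not(f[2]))

def distribute(f):
    """Step 3: distribute v over & so conjunction ends up outermost.
    Assumes negations already apply only to constants."""
    if isinstance(f, str) or f[0] == "not":
        return f
    a, b = distribute(f[1]), distribute(f[2])
    if f[0] == "and":
        return ("and", a, b)
    if isinstance(a, tuple) and a[0] == "and":   # [Q & R] v b
        return distribute(("and", ("or", a[1], b), ("or", a[2], b)))
    if isinstance(b, tuple) and b[0] == "and":   # a v [Q & R]
        return distribute(("and", ("or", a, b[1]), ("or", a, b[2])))
    return ("or", a, b)

def to_cnf(f):
    return distribute(push_not(eliminate(f)))

# ~[P & [Q --> R]]  ==>  [~P v Q] & [~P v ~R]
f = ("not", ("and", "P", ("implies", "Q", "R")))
print(to_cnf(f))
# ('and', ('or', ('not', 'P'), 'Q'), ('or', ('not', 'P'), ('not', 'R')))
```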
Example

~[P & [Q --> R]]
~[P & [~Q v R]]           (eliminate -->)
~P v ~[~Q v R]            (move ~ inside)
~P v [~~Q & ~R]           (move ~ inside)
~P v [Q & ~R]             (cancel double negation)
[~P v Q] & [~P v ~R]      (distribute v through &)

Clause form with two clauses. We can rewrite this as two rules:
P --> Q  and  [P & R] --> False
Literals: P, ~Q, R, ...
Positive literals: P, R, ...
Horn clause: A clause with at most one positive literal.
~P v ~Q v R
equivalent to [P & Q] --> R
~P v ~Q
equivalent to [P & Q] --> False
Definite Horn clause: A clause with exactly one positive literal.
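Sets of definite Horn clauses support a particularly simple inference procedure, forward chaining by repeated modus ponens. A minimal propositional sketch; the pair encoding of a clause is my own.

```python
# Minimal sketch (encoding is mine): a propositional definite Horn clause
# [P1 & P2 & ...] --> Q is written as the pair ((P1, P2, ...), Q).

def forward_chain(facts, rules):
    """Repeatedly apply modus ponens until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in known and all(p in known for p in body):
                known.add(head)
                changed = True
    return known

rules = [(("P", "Q"), "R"),    # ~P v ~Q v R, i.e. [P & Q] --> R
         (("R",), "S")]        # ~R v S,      i.e. R --> S
print(sorted(forward_chain({"P", "Q"}, rules)))  # ['P', 'Q', 'R', 'S']
```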
First-Order Logic
Propositional logic: Don’t look inside propositions: P, Q, R, ...
First-order logic: Look inside propositions: p(x,y), like(J,M), ...
Constants: John1, Sam1, ..., Chair-46, ..., 0, 1, 2, ...
Variables: x, y, z, ....
Predicate symbols: p, q, r, ..., like, hate, ...
Function symbols: motherOf, sumOf, ...
All the logical connectives of propositional logic.
Predicates and functions apply to a fixed number of arguments:
Predicates: like(John1,Mary1), hate(Mary1,George1), tall(Sue3), ...
Functions: motherOf(Sam1) = Mary1, sumOf(2,3) = 5, ...
In the expression 3 + 2 > 4, “+” is a function and “>” is a predicate.
Predicates applied to arguments are propositions and yield True or False.
Functions applied to arguments yield entities in the domain.
Quantifiers
Two different roles for variables:
Recall from high-school algebra:
(x + y)(x - y) = x^2 - y^2
universal statement: (A x,y)[(x + y)(x - y) = x^2 - y^2]
x^2 - 7x + 12 = 0
existential statement: (E x)[x^2 - 7x + 12 = 0]
Universal quantifier: A or ∀: statement is true for all values of the variable
Existential quantifier: E or ∃: statement is true for some value of the variable
In (A x)[p(x) & q(y)] x is bound by the quantifier; y is not.
Both are in the scope of the quantifier.
We’ll only use variables that are bound by a quantifier.
The quantifier tells how the variable is being used.
Relation between A and E:
~(A x) p(x) <--> (E x)~p(x)        (negation can be moved inside)
(A x) p(x) <--> (A y) p(y)         (the variable doesn’t matter)
(A x)[p(x)] & Q <--> (A x)[p(x) & Q], where x does not occur in Q
                                   (no harm scoping over what doesn’t involve the variable)
Clause Form for 1st Order Logic

Eliminate --> and <-->
Move negation to the inside
Give differently quantified variables different names:
(A x)p(x) & (E x)q(x) ==> (A x)p(x) & (E y)q(y)
Eliminate existential quantifiers with Skolem constants and functions:
(E x)p(x) ==> p(A)                      (A is a Skolem constant)
(A x)(E y)p(x,y) ==> (A x)p(x,f(x))     (f is a Skolem function)
Move universal quantifiers to the outside, giving prenex form (prefix + matrix):
(A x)p(x) & (A y)[q(y) v r(y,f(y))] ==> (A x)(A y)[p(x) & [q(y) v r(y,f(y))]]
Put the matrix into clause form.
Example

(A x)[(E y)[p(x,y) --> q(x,y)] --> (E y)[r(x,y)]]
(A x)[~(E y)[~p(x,y) v q(x,y)] v (E y)[r(x,y)]]        (eliminate implications)
(A x)[(A y)~[~p(x,y) v q(x,y)] v (E y)[r(x,y)]]        (move negation inside)
(A x)[(A y)[p(x,y) & ~q(x,y)] v (E y)[r(x,y)]]         (move negation inside)
(A x)[(A y)[p(x,y) & ~q(x,y)] v (E z)[r(x,z)]]         (rename variables)
(A x)[(A y)[p(x,y) & ~q(x,y)] v r(x,f(x))]             (Skolem function)
(A x)(A y)[[p(x,y) & ~q(x,y)] v r(x,f(x))]             (prenex form)
(A x)(A y)[[p(x,y) v r(x,f(x))] & [~q(x,y) v r(x,f(x))]]  (distribute v through &)
Break into clauses:
p(x,y) v r(x,f(x))
~q(x,y) v r(x,f(x))
Horn Clauses

Horn clause: A clause with at most one positive literal.
~p(x,y) v ~q(x) v r(x,y)
is equivalent to
[p(x,y) & q(x)] --> r(x,y)
(the antecedent is the procedure body; the consequent is the procedure name)
The key idea in Prolog
Implicative normal form:
(A x,y)[[p1(x,y) & p2(x,y) & ...] --> (E z)[q1(x,z) & q2(x,z) & ...]]
Useful for commonsense knowledge:
(A x)[[car(x) & intact(x)] --> (E z)[engine(z) & in(z,x)]]
Every intact car has an engine in it.
Logical Theories and Rules of
Inference
Logical theory:
The logic as we have defined it so far
+ A set of logical expressions that are taken to be true (axioms)
Rules of inference:
Modus Ponens: From P, P --> Q infer Q
Universal instantiation: From (A x)p(x) infer p(A)
Theorems:
Expressions that can be derived from the axioms and the rules
of inference.
Models
What do the logical symbols mean? What do the axioms mean?
A logical theory is used to describe some domain.
We assign an individual or entity in the domain to each constant (the
denotation of that constant).
To each unary predicate we assign a set of entities in the domain, those
entities for which the predicate is true (the denotation or extension of p).
To each binary predicate we assign a set of ordered pairs of entities, etc.
~P: true when P is not true.
P & Q: true when P is true and Q is true
P v Q: true when P is true or when Q is true
p(A): true when the denotation of A is in the set assigned to p
(A x)p(x): true when, for every assignment of an entity to x, that entity is in the set assigned to p
If all the axioms of the logical theory are true, then the domain is a model of
the theory.
Examples
Logical theory:
Predicate: sum(x,y,z) (x is the “sum” of y and z)
Axiom 1: (A x,y,z,w)[(E u)[sum(u,x,y) & sum(w,u,z)]
<--> (E v)[sum(v,y,z) & sum(w,x,v)]]
(associativity)
Some models: addition of numbers, multiplication of numbers
concatenation of strings
Add Axiom 2: (A x,y,w)[sum(w,x,y) <--> sum(w,y,x)]
(commutativity)
Some models: addition of numbers, multiplication of numbers
(concatenation of strings is no longer a model, since it is not commutative)
In general, adding axioms eliminates models.
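That adding Axiom 2 eliminates a model can be checked over a small finite sample of each candidate domain. A sketch in Python (function names are mine); reading sum(w,x,y) as w == op(x,y), Axiom 2 reduces to op(x,y) == op(y,x), and Axiom 1 to associativity of op.

```python
from itertools import product

# Sketch: test a candidate model's operation against the axioms
# over a small finite sample of its domain.

def satisfies_commutativity(op, domain):
    """Axiom 2: (A x,y,w)[sum(w,x,y) <--> sum(w,y,x)]."""
    return all(op(x, y) == op(y, x) for x, y in product(domain, repeat=2))

def satisfies_associativity(op, domain):
    """Axiom 1, read through the operation: (x+y)+z == x+(y+z)."""
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(domain, repeat=3))

nums, strings = range(5), ["a", "b", "ab"]
add = lambda x, y: x + y    # addition of numbers / concatenation of strings

print(satisfies_associativity(add, nums), satisfies_commutativity(add, nums))
# True True: addition models both axioms
print(satisfies_associativity(add, strings), satisfies_commutativity(add, strings))
# True False: concatenation models Axiom 1 but is eliminated by Axiom 2
```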
Some Uses of Models
Consistency: A theory is consistent if you can’t conclude a contradiction.
If a logical theory has a model, it is consistent.
Independence: Two axioms are independent if you can’t prove one from
the other.
To show two axioms are independent, show that there is a model in
which one is true and the other isn’t true.
Soundness: All the theorems of the logical theory are true in the model.
Completeness: All the true statements in the model are theorems in
the logical theory.
The logical theory should tell the whole truth (complete: Recall = 100%)
and nothing but the truth (sound: Precision = 100%).
Extension vs. Intension
(not “intention”)
Extension of “president”: the set of entities the predicate is true of in
the actual world, e.g., { ..., Clinton, Bush }
Intension of “president”: a function from possible worlds to extensions:
W1: { ..., Clinton, Bush }
W2: { ..., Clinton, Gore }
W3: { ..., Clinton, Bush, Kerry }
Frees meaning of predicate from accidents of how the world is
Back to Language
Logic is about representing information.
Language conveys information.
Logic is a good way to represent the information conveyed by language.
A man builds a boat.
(E x,y)[man(x) & build(x,y) & boat(y)]
A tall man builds a small boat.
(E x,y)[tall(x) & man(x) & build(x,y) & small(y) & boat(y)]
Seems simple enough, but problems arise.
(e.g., the determiner “a”, the present tense, tall/small for what)
Two ways to deal with these problems:
Complicate the logic. (much computational semantics)
Complicate our conceptualization of the underlying domain. (my approach)
Reifying Events
Events can be modified: John ran slowly.
Events can be placed in space and time: On Tuesday, John ran in Chicago.
Events can be causes and effects: John ran, because Sam was chasing him.
Because John ran, he was tired.
Events can be objects of propositional attitudes: Sam believes John ran.
Events can be nominalized: John’s running tired him out.
Events can be referred to by pronouns: John ran, and Sam saw it.
To represent these, we need some kind of “handle” on the event.
We need constants and variables to be able to denote events.
We need to treat events as “things” -- reify events (from Latin “re(s)” - thing)
Let e1 be John’s running. Then
slow(e1)
onDay(e1, ...), in(e1, Chicago)
cause(..., e1), cause(e1, ...)
believe(Sam,e1)
tiredOut(e1, John)
see(Sam, e1)
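One way to see what reification buys us computationally: the facts above become a flat set of ground atoms that can be stored and queried uniformly, with the event handle e1 appearing as an ordinary argument. A sketch with an illustrative encoding of my own:

```python
# Sketch (encoding and names are mine): the reified facts about John's
# running, stored as a flat set of ground atoms (predicate, arg1, arg2, ...).

facts = {
    ("run'", "e1", "John"),      # e1 is the event of John's running
    ("slow", "e1"),              # John ran slowly
    ("in", "e1", "Chicago"),     # John ran in Chicago
    ("believe", "Sam", "e1"),    # Sam believes John ran
    ("see", "Sam", "e1"),        # John ran, and Sam saw it
}

def holds(pred, *args):
    """Is this atom asserted?"""
    return (pred, *args) in facts

def arguments_of(pred):
    """All argument tuples for which pred is asserted."""
    return [f[1:] for f in facts if f[0] == pred]

print(holds("slow", "e1"))    # True: the event itself can be modified
print(arguments_of("see"))    # [('Sam', 'e1')]
```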
Representing Reifications
Why not this?
slow( run(John) )
This evaluates to True or False
Then slow would describe not John’s running, but True or False
e1: run(John)
This is easily understood, but it takes us out of
logic.
run’(e1,John)
This means “e1 is the event of John’s running”
I’ll use this when I need to; run(John) otherwise.
Reifying Everything
Not just events, but states, conditions, properties:
John fell because the floor was slippery.
cause(e1,e2) & fall’(e2, j) & slippery’(e1, f)
The contract was invalid because John failed to sign it.
cause(e1,e2) & invalid’(e2,c) & fail’(e1,j, e3) & sign’(e3,j,c)
I will use the word “eventuality” to describe all
these things -- events, states, conditions, etc.
Controversial
Representing Case Relations
Jenny pushed the chair from the living room to the dining room for Sam yesterday
Case:
Agent
Theme
Source
Goal
Benefactor
Time
Could represent this like
push(Jenny, Chair1, LR, DR, Sam, 14Feb05, ...)
Or like
push’(e) & Agent(Jenny,e) & Theme(Chair1,e) & Source(LR,e) & Goal(DR,e)
& Benefactor(Sam,e) & atTime(e, 14Feb05)
Or like
push’(e, Jenny, Chair1)                                       (from complements)
& from(e, LR) & to(e, DR) & for(e, Sam) & yesterday(e, ...)   (from adjuncts)
Equivalence of these: (A e,x,y)[push’(e,x,y) --> Agent(x,e) & Theme(y,e)]
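The equivalence licenses deriving the case-role representation from the complement-style one by a forward rule. A small sketch for the one predicate shown (the atom encoding is mine):

```python
# Sketch (encoding is mine) of the equivalence
# (A e,x,y)[push'(e,x,y) --> Agent(x,e) & Theme(y,e)]
# applied as a forward rule over a set of ground atoms.

def expand_case_roles(facts):
    """Derive case-role atoms from complement-style push' atoms."""
    derived = set(facts)
    for fact in facts:
        if fact[0] == "push'":
            _, e, agent, theme = fact
            derived.add(("Agent", agent, e))
            derived.add(("Theme", theme, e))
    return derived

facts = {("push'", "e", "Jenny", "Chair1"),
         ("from", "e", "LR"), ("to", "e", "DR"), ("for", "e", "Sam")}
kb = expand_case_roles(facts)
print(("Agent", "Jenny", "e") in kb, ("Theme", "Chair1", "e") in kb)  # True True
```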
Space, Time, Tense, and Manner
John ran.
run’(e,J) & Past(e)
tense
John ran on Tuesday.
run’(e,J) & Past(e) & onDay(e,d) & Tuesday(d)
John ran in Chicago.
run’(e,J) & Past(e) & in(e,Chicago)
John ran slowly.
run’(e,J) & Past(e) & slow(e)
John ran reluctantly.
run’(e,J) & Past(e) & reluctant(J,e)
Attributives
Some attributive adjectives have an implicit comparison set or scale:
A small elephant is bigger than a big mosquito.
That mosquito is big.
mosquito(x) & big(x, s)
where s is the implicit comparison set or scale, which must be determined
from context
Proper Names
Proper names:
Could treat them as constants:
Springfield is the capital of Illinois. ==> capital(Springfield, Illinois)
But there are many Springfields; we could treat it as a predicate true
of any town named Springfield:
capital(x,y) & Springfield(x) & Illinois(y)
Or we could treat the name as a string, related to the entity by the
predicate name:
capital(x,y) & name(“Springfield”, x) & name(“Illinois”, y)
Indexicals
An indexical or deictic is a word or phrase that requires knowledge of
the situation of utterance for its interpretation.
“I”, “you”, “we”, “here”, “now”, some uses of “this”, “that”, ...
The property of being “I” is being the speaker of the current utterance
Indexicals require an argument for the utterance or the speech situation.
I(x,u): x is the speaker of utterance u
you(x,u): x is the intended hearer of utterance u
we(s,u): s is a set of people containing the speaker of utterance u
here(x,u): x is the place of utterance u
now(t,u): t is the time of utterance u
Chris said, “I see you now.”
==> say(Chris,u) & content(e,u) & see’(e,x,y) & I(x,u) & you(y,u)
& atTime(e,t) & now(t,u)
(the utterance u comes from the quotation marks)
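Interpreting these predicates then amounts to looking up roles of the speech situation u. A sketch; the record's field names and the hearer “Pat” are my own illustrative assumptions, not from the lecture.

```python
# Sketch: resolve indexical predicates against a record of the speech
# situation u. Field names and the hearer "Pat" are illustrative assumptions.

utterance = {               # u for: Chris said, "I see you now."
    "speaker": "Chris",     # I(x,u)
    "hearer": "Pat",        # you(y,u) -- assumed hearer
    "place": "LA",          # here(x,u) -- assumed place
    "time": "t0",           # now(t,u)
}

def resolve(indexical, u):
    """Map an indexical word to the entity it denotes relative to u."""
    return {"I": u["speaker"], "you": u["hearer"],
            "here": u["place"], "now": u["time"]}[indexical]

# see'(e,x,y) & I(x,u) & you(y,u) & atTime(e,t) & now(t,u)
print(resolve("I", utterance), resolve("you", utterance),
      resolve("now", utterance))  # Chris Pat t0
```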