Artificial Intelligence
14. Inductive Logic Programming
Course V231
Department of Computing
Imperial College, London
© Simon Colton
Inductive Logic Programming
Representation scheme used
– Logic Programs
Need to
– Recap logic programs
– Specify the learning problem
– Specify the operators
– Worry about search considerations
Also
– Go through a session with Progol
– Look at applications
Remember Logic Programs?
Subset of first order logic
All sentences are Horn clauses
– Implications where a conjunction of literals (the body) implies a single goal literal (the head)
– Single facts can also be Horn clauses, with no body
A logic program consists of:
– A set of Horn clauses
ILP theory and practice is highly formal
– Best way to progress and to show progress
Horn Clauses and Entailment
Writing Horn Clauses:
– h(X,Y) ← b1(X,Y) ∧ b2(X) ∧ ... ∧ bn(X,Y,Z)
– Assume lower case letters are single literals
– Also replace conjunctions with a capital letter:
  h(X,Y) ← b1, B
Entailment:
– When one logic program, L1, can be proved using another logic program, L2
– We write: L2 ⊨ L1
– Note that if L2 ⊭ L1, this does not mean that L2 entails that L1 is false
Logic Programs in ILP
Start with background information
– Represented as a logic program labelled B
Also start with a set of positive examples of the concept required to learn
– Represented as a logic program labelled E+
And a set of negative examples of the concept required to learn
– Represented as a logic program labelled E-
ILP system will learn a hypothesis
– Which is also a logic program, labelled H
Explaining Examples
A hypothesis H explains example e
– If logic program e is entailed by H
– So, we prove e is true
Example
– H: class(A, fish) :- has_gills(A)
– B: has_gills(trout)
– Positive example: class(trout, fish)
– Entailed by H ∧ B taken together
Note that negative examples can also be entailed
– By the hypothesis and background taken together
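This slide's example runs as plain Prolog (a sketch; the labels B and H are just comments, and Progol itself is not needed to check it):

  has_gills(trout).                   % B: background fact
  class(A, fish) :- has_gills(A).     % H: hypothesis clause

  % Query the positive example:
  % ?- class(trout, fish).
  % true.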
Prior Conditions on the Problem
Problem must be satisfiable:
– Prior satisfiability: ∀ e ∈ E-, B ⊭ e
– So, the background does not entail any negative example (if it did, no hypothesis could rectify this)
– This does not mean that B entails that e is false
Problem must not already be solved:
– Prior necessity: ∃ e ∈ E+ such that B ⊭ e
– If all the positive examples were entailed by the background, then we could take H = B.
Posterior Conditions on Hypothesis
Taken with B, H should entail all positives
– Posterior sufficiency: ∀ e ∈ E+, B ∧ H ⊨ e
Taken with B, H should entail no negatives
– Posterior satisfiability: ∀ e ∈ E-, B ∧ H ⊭ e
If the hypothesis meets these two conditions
– It will have perfectly solved the problem
Summary:
– All positives can be derived from B ∧ H
– But no negatives can be derived from B ∧ H
Problem Specification
Given logic programs E+, E-, B
– Which meet the prior satisfiability and necessity conditions
Learn a logic program H
– Such that B ∧ H meets the posterior satisfiability and sufficiency conditions
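As a hedged illustration (pos/1 and neg/1 are hypothetical predicates enumerating E+ and E-; this is not part of Progol), the posterior conditions can be checked directly once B and H are loaded into a Prolog system:

  % Posterior sufficiency: every positive example is provable from B ∧ H.
  posterior_sufficiency :- forall(pos(E), call(E)).

  % Posterior satisfiability: no negative example is provable from B ∧ H.
  posterior_satisfiability :- \+ (neg(E), call(E)).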
Moving in Logic Program Space
Can use rules of inference to find new LPs
Deductive rules of inference
– Modus ponens, resolution, etc.
– Map from the general to the specific
– i.e., from L1 to L2 such that L1 ⊨ L2
Look today at inductive rules of inference
– Will invert the resolution rule
– Four ways to do this
– Map from the specific to the general
– i.e., from L1 to L2 such that L2 ⊨ L1
Inductive inference rules are not sound
Inverting Deductive Rules
Man alternates 2 hats every day
– Whenever he wears hat X, he gets a pain; hat Y is OK
Knows that a hat having a pin in it causes pain
– Infers that his hat has a pin in it
Looks and finds that hat X does have a pin in it
Uses Modus Ponens to prove that
– His pain is caused by a pin in hat X
Original inference (pin in hat X) was unsound
– Could be many reasons for the pain in his head
– Was induced so that Modus Ponens could be used
Inverting Resolution
1. Absorption rule of inference
Rule written same as for deductive rules (a reconstruction is given below)
– Input above the line, and the inference below the line
Remember that q is a single literal
– And that A, B are conjunctions of literals
Can prove that the original clauses
– Follow from the hypothesised clause by resolution
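The rule itself is a diagram in the original slides; a reconstruction of the standard absorption operator (as in Muggleton's inverse resolution work), with the input clauses above the line and the inference below:

  q ← A        p ← A, B
  ---------------------
  q ← A        p ← q, B

Resolving p ← q, B against q ← A resolves q away and regenerates p ← A, B, which is why the original clauses follow from the hypothesised clause by resolution.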
Proving Given clauses
Exercise: translate into CNF
– Use the V diagram, and convince yourselves
– because we don't want to write this as a rule of deduction
Say that Absorption is a V-operator
Example of Absorption
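The slide's example is an image; here is an invented illustration in the same animal-classification spirit (these particular predicates are hypothetical, not from the lecture):

  Given:       mammal(X) ← has_milk(X)                    (q ← A)
               class(X, dog) ← has_milk(X), barks(X)      (p ← A, B)
  Absorption:  class(X, dog) ← mammal(X), barks(X)        (p ← q, B)

Resolving the hypothesised clause with mammal(X) ← has_milk(X) recovers the second given clause.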
Inverting Resolution
2. Identification
Rule of inference:
Resolution Proof:
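Both the rule and the proof are diagrams in the original; a reconstruction of the standard identification operator:

  p ← A, B      p ← A, q
  ----------------------
  q ← B         p ← A, q

Here the clause q ← B is hypothesised to account for B: resolving p ← A, q against q ← B recovers p ← A, B.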
Inverting Resolution
3. Intra Construction
Rule of inference:
Resolution Proof:
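Again the diagrams are images; a reconstruction of the standard intra-construction operator:

  p ← A, B      p ← A, C
  ----------------------
  q ← B    p ← A, q    q ← C

where q is a brand new predicate symbol. Resolving p ← A, q against q ← B and against q ← C recovers both input clauses.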
Predicate Invention
Say that Intra-construction is a W-operator
This has introduced the new symbol q
– q is a predicate which is resolved away in the resolution proof
ILP systems using intra-construction
– Perform predicate invention
Toy example (see the Prolog sketch below):
– When learning the insertion sort algorithm
– ILP system (Progol) invents concept of list insertion
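As a hedged illustration of that toy example (ordinary Prolog written by hand, not actual Progol output), here is insertion sort defined via an auxiliary insertion predicate of the kind that gets invented:

  % Sort a list by inserting the head into the sorted tail.
  isort([], []).
  isort([X|Xs], Sorted) :-
      isort(Xs, SortedTail),
      insert(X, SortedTail, Sorted).

  % The "invented" concept: inserting an element into a sorted list.
  insert(X, [], [X]).
  insert(X, [Y|Ys], [X,Y|Ys]) :- X =< Y.
  insert(X, [Y|Ys], [Y|Zs]) :- X > Y, insert(X, Ys, Zs).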
Inverting Resolution
4. Inter Construction
Rule of inference:
Resolution Proof:
Predicate Invention Again
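The rule and proof are again diagrams; a reconstruction of the standard inter-construction operator:

  p ← A, B      q ← A, C
  ----------------------
  p ← r, B    r ← A    q ← r, C

where r is a new predicate symbol capturing the shared condition A, hence predicate invention again.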
Generic Search Strategy
Assume this kind of search:
– A set of current hypotheses, QH, is maintained
– At each search step, a hypothesis H is chosen from QH
– H is expanded using inference rules
– Which adds more current hypotheses to QH
– Search stops when a termination condition is met by a hypothesis
Some (of many) questions (see the sketch below):
– Initialisation, choice of H, termination, how to expand…
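A minimal Prolog sketch of this generic loop (select_best/3, expand/2 and terminates/1 are hypothetical predicates standing in for exactly the design choices just listed):

  % search(+QH, -H): maintain the list of current hypotheses QH
  % until some hypothesis meets the termination condition.
  search(QH, H) :-
      member(H, QH),
      terminates(H), !.
  search(QH, Result) :-
      select_best(QH, H, Rest),      % choose a hypothesis from QH
      expand(H, NewHs),              % apply inference rules to H
      append(NewHs, Rest, NewQH),    % add expansions to QH
      search(NewQH, Result).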
Search (Extra Logical) Considerations
Generality and Speciality
There is a great deal of variation in
– Search strategies between ILP programs
Definition of generality/speciality
– A hypothesis G is more general than hypothesis S iff G ⊨ S. S is said to be more specific than G.
A deductive rule of inference maps a conjunction of clauses G onto a conjunction of clauses S, such that G ⊨ S.
– These are specialisation rules (Modus Ponens, resolution…)
An inductive rule of inference maps a conjunction of clauses S onto a conjunction of clauses G, such that G ⊨ S.
– These are generalisation rules (absorption, identification…)
Search Direction
ILP systems differ in their overall search strategy
From Specific to General
– Start with most specific hypothesis
– Which explains a small number (possibly 1) of positives
– Keep generalising to explain more positive examples
– Using generalisation rules (inductive) such as inverse resolution
– Are careful not to allow any negatives to be explained
From General to Specific
– Start with the empty clause as hypothesis
– Which explains everything
– Keep specialising to exclude more and more negative examples
– Using specialisation rules (deductive) such as resolution
– Are careful to make sure all positives are still explained
Pruning
Remember that:
– A set of current hypotheses, QH, is maintained
– And each hypothesis explains a set of pos/neg exs.
– If G is more general than S, then G will explain more (>=) examples than S
When searching from specific to general
– Can prune any hypothesis which explains a negative
– Because further generalisation will not rectify this situation
When searching from general to specific
– Can prune any hypothesis which doesn't explain all positives
– Because further specialisation will not rectify this situation
Ordering
There will be many current hypotheses in QH to choose from.
– Which is chosen first?
ILP systems use a probability distribution
– Which assigns a value P(H | B ∧ E) to each H
A Bayesian measure is defined, based on
– The number of positive/negative examples explained
When this is equal, ILP systems use
– A sophisticated Occam's Razor
– Defined by Algorithmic Complexity theory or something similar
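As a hedged gloss (the slides do not spell the formula out): by Bayes' theorem, P(H | B ∧ E) ∝ P(E | B ∧ H) · P(H), so among hypotheses that explain the examples equally well, the one with the higher prior P(H), i.e. the simpler one, is preferred; this is where the Occam's Razor enters.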
Language Restrictions
Another way to reduce the search
– Specify what format clauses in hypotheses are allowed to have
One possibility
– Restrict the number of existential variables allowed
Another possibility
– Be explicit about the nature of arguments in literals
– Which arguments in body literals are instantiated (ground) terms, variables given in the head literal, or new variables
– See Progol's mode declarations
Example Session with Progol
Animals dataset
– Learning task: learn rules which classify animals into fish, mammal, reptile, bird
– Rules based on attributes of the animals
– Physical attributes: number of legs, covering (fur, feathers, etc.)
– Other attributes: produce milk, lay eggs, etc.
– 16 animals are supplied
– 7 attributes are supplied
Input file: mode declarations
Mode declarations given at the top of the file
– These are language restrictions
Declaration about the head of hypothesis clauses
– :- modeh(1,class(+animal,#class))
– Means the hypothesis will be given an animal variable and will return a ground instantiation of class
Declaration about the body clauses
– :- modeb(1,has_legs(+animal,#nat))
– Means that it is OK to use the has_legs predicate in the body
– And that it will take the animal variable supplied in the head and return an instantiated natural number
Input file: type information
Next comes information about types of object
– Each ground term (word) must be typed:
  animal(dog), animal(dolphin), … etc.
  class(mammal), class(fish), … etc.
  covering(hair), covering(none), … etc.
  habitat(land), habitat(air), … etc.
Input file: background concepts
Next comes the logic program B, containing these predicates:
– has_covering/2, has_legs/2, has_milk/1, homeothermic/1, habitat/2, has_eggs/1, has_gills/1
E.g.,
– has_covering(dog, hair), has_milk(platypus),
  has_legs(penguin, 2), homeothermic(dog),
  habitat(eagle, air), habitat(eagle, land),
  has_eggs(eagle), has_gills(trout), etc.
Input file: Examples
Finally, E+ and E- are supplied
Positives:
class(lizard, reptile)
class(trout, fish)
class(bat, mammal), etc.
Negatives:
:- class(trout, mammal)
:- class(herring, mammal)
:- class(platypus, reptile), etc.
Output file: generalisations
We see Progol starting with the most specific hypothesis for the case when the animal is a reptile
– Starts with the lizard (reptile) example and finds the most specific clause:
  class(A, reptile) :- has_covering(A,scales), has_legs(A,4),
  has_eggs(A), habitat(A, land)
Then finds 12 generalisations of this
– Examples:
  class(A, reptile) :- has_covering(A, scales).
  class(A, reptile) :- has_eggs(A), has_legs(A, 4).
Then chooses the best one:
– class(A, reptile) :- has_covering(A, scales), has_legs(A, 4).
This process is repeated for fish, mammal and bird
Output file: Final Hypothesis
class(A, reptile) :- has_covering(A,scales), has_legs(A,4).
class(A, mammal) :- homeothermic(A), has_milk(A).
class(A, fish) :- has_legs(A,0), has_eggs(A).
class(A, reptile) :- has_covering(A,scales), habitat(A, land).
class(A, bird) :- has_covering(A,feathers).
Gets 100% predictive accuracy on the training set
Some Applications of ILP
(See notes for details)
Finite Element Mesh Design
Predictive Toxicology
Protein Structure Prediction
Generating Program Invariants