Case-based reasoning - Middlesex University


Case-based reasoning
What is case-based reasoning?

An approach to building KBS which is
radically different to the rule-based and
other knowledge-representation
approaches we have seen so far.
 The principle is to find a solution which
has been shown to solve problems like
your current problem in the past, and
adapt it so that it solves the current
problem.

This has a certain psychological
plausibility as a model of what the
expert-decision-maker actually does
when solving a problem.
The approach is based on research by Riesbeck &
Schank (1989). A good comprehensive
description is to be found in Kolodner
(1993).

Three quotes from Roger Schank:
• "Humans use cases because they don't know what they know - they don't know their own rules - they do things non-reflectively."
• "The key process in intelligence is the reminding process".
• "People don't ever reason from first principles. They always choose a matching case. It may be a bad match, but in that case they need more experience."

How a CBR system works:
the knowledge base

The knowledge base contains a
collection of representative cases (of
faults, say, if the system is concerned
with fault diagnosis), with their
• symptoms,
• causes, and
• treatments.
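
A minimal sketch of how one stored case might be represented, assuming a fault-diagnosis system. The class name `FaultCase`, its field names, and the two example faults are invented for illustration and do not come from any particular CBR tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FaultCase:
    """One representative case: what was observed, why, and what fixed it."""
    symptoms: Dict[str, str]   # feature name -> observed value
    causes: List[str]
    treatments: List[str]


# A tiny, invented knowledge base of two cases.
knowledge_base = [
    FaultCase(
        symptoms={"power light": "off", "fan": "silent"},
        causes=["blown fuse"],
        treatments=["replace fuse"],
    ),
    FaultCase(
        symptoms={"power light": "on", "fan": "noisy"},
        causes=["worn fan bearing"],
        treatments=["replace fan"],
    ),
]

for case in knowledge_base:
    print(case.symptoms, "->", case.causes, "->", case.treatments)
```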
How a CBR system works:
the process
The user is asked to provide the
(relevant) features of the current case.
The similarity between this set of features
and the features characteristic of each of the
stored cases is calculated, and the best
match is chosen.

[Figure, animated over six slides: the index values of the current case are compared in turn with those of the stored cases, Case 1 to Case 5. Index values shown: 284742, 134752, 135753, 134744, 284743, 144702.]

The features which have been identified as
important in the stored cases, and which the
user is asked about, are known as “indices”.
 Each has a value. In the example I just
showed you, each was represented by a
number.
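
The matching step can be sketched with the six values shown in the example. Three things here are assumptions rather than anything the slides state: that each digit is the value of one index, that the first value (284742) is the current case, and that similarity is simply the number of positions at which two cases agree.

```python
# Index values copied from the example. Treating each digit as one index,
# and the first value as the current case, are assumptions.
CURRENT_CASE = "284742"
STORED_CASES = ["134752", "135753", "134744", "284743", "144702"]


def matching_indices(a: str, b: str) -> int:
    """Count the positions at which two cases have the same index value."""
    return sum(x == y for x, y in zip(a, b))


def best_match(current: str, stored: list) -> str:
    """Return the stored case sharing the most index values with `current`."""
    return max(stored, key=lambda case: matching_indices(current, case))


scores = {case: matching_indices(CURRENT_CASE, case) for case in STORED_CASES}
best = best_match(CURRENT_CASE, STORED_CASES)
print(scores)
print("best match:", best)
```

Under these assumptions the stored case 284743 agrees with the current case on five of the six indices, more than any other stored case.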


If necessary, the best-matching case is adapted so that it is
a better match for the current circumstances.

The case is then presented as the solution,
with the opportunity to examine the
'precedent' case.
How a CBR system works

The sequence of operations, for a
simple CBR system:
1) assign indices
2) retrieve a similar case
[Flow chart for a simple CBR system: Input, 1. Assign indices (using indexing rules), 2. Retrieve (using case memory and similarity metrics), Output.]

The sequence of operations, for a
“full-blown” CBR system:
1) assign indices
2) retrieve a similar case
3) modify the past case
4) test the case
5a) assign indices to this new case,
and store as a working solution
OR
5b) explain failure, repair the solution,
and test again.
[Flow chart for a full-blown CBR system: Input, 1. Assign indices (indexing rules), 2. Retrieve (case memory, similarity metrics), 3. Modify (modification rules), 4. Test. A working solution goes on to 5a. Assign indices and 5b. Store; a failed solution goes on to 6a. Explain and 6b. Repair (repair rules).]
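
The full-blown sequence can be sketched as a short loop. This is an illustration under stated assumptions, not a description of any real CBR tool: the function `cbr_cycle`, its parameters, and the toy domain are all invented, and "explain failure" and "repair" are folded into a single `repair` callable.

```python
from typing import Any, Callable, Dict, List, Tuple

Indices = Tuple[float, ...]
Case = Dict[str, Any]  # hypothetical layout: {"indices": ..., "solution": ...}


def cbr_cycle(
    indices: Indices,
    case_memory: List[Case],
    similarity: Callable[[Indices, Indices], float],
    modify: Callable[[Case, Indices], Any],
    test: Callable[[Indices, Any], bool],
    repair: Callable[[Indices, Any], Any],
    max_repairs: int = 3,
) -> Any:
    """Run one pass of the cycle; return a working solution, or None."""
    # Step 1 (assign indices) is assumed to have produced `indices` already.
    # Step 2: retrieve the most similar past case.
    past = max(case_memory, key=lambda c: similarity(indices, c["indices"]))
    # Step 3: modify the past case to fit the new problem.
    solution = modify(past, indices)
    for _ in range(max_repairs + 1):
        # Step 4: test the proposed solution.
        if test(indices, solution):
            # Step 5a: index the new case and store it as a working solution.
            case_memory.append({"indices": indices, "solution": solution})
            return solution
        # Step 5b: explanation and repair are folded into one callable here.
        solution = repair(indices, solution)
    return None


# Toy domain (invented): a solution works if it is at least the load.
memory = [{"indices": (5,), "solution": 5}, {"indices": (10,), "solution": 10}]
result = cbr_cycle(
    indices=(7,),
    case_memory=memory,
    similarity=lambda a, b: -abs(a[0] - b[0]),
    modify=lambda case, idx: case["solution"],
    test=lambda idx, sol: sol >= idx[0],
    repair=lambda idx, sol: sol + 1,
)
print(result, len(memory))
```

In the toy run the retrieved solution (5) fails the test twice, is repaired to 7, passes, and is stored as a third case.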
Available techniques for
case memory organisation

Memory organisation by:
• linear ("flat") case memory
• case hierarchy
• nested cases
• decision-tree orientated memory
• knowledge-guided indexing
Available techniques for
case retrieval

Retrieval by:
• Nearest neighbour case matching
• Weighted nearest neighbour case matching
• Decision tree methods
• Knowledge-guided retrieval

The last four memory organisation
approaches, and the last two retrieval
approaches, might be thought of as
hybrid systems.
“Nearest neighbour”
algorithm: an example

Suppose that we have a sick soyabean
plant, and we wish to discover which of
a number of known specimens of sick
soyabean plants it is most like.

Choose (let’s say) three characteristics
of the leaves that can be represented as
numbers:
• Amount of the leaf that is covered by the discolouration
• Lightness of the discoloured parts of the leaf
• Lightness of the remaining parts of the leaf.

Suppose that the first two cases to be
matched are:
• case 1: coverage = 8, lightness-1 = 4, lightness-2 = 6
• case 2: coverage = 10, lightness-1 = 7, lightness-2 = 6

“Nearest neighbour”
algorithm
These can be treated as two points in
three-dimensional space:
• x, y, z coordinates of case 1: (8, 4, 6)
• x, y, z coordinates of case 2: (10, 7, 6)

[Figures, one per slide, on x, y and z axes each scaled from 0 to 10: a system of 3-dimensional co-ordinates; the 1st case represented as a point; the 2nd case represented as a point; the two cases represented as points; the distance between the two cases; adding a third case: (2, 3, 9).]

There is a simple formula that tells you
the distance between two points in 3-dimensional space.
To find out whether case 1 is more
similar to case 2 or to case 3, you simply
calculate the two distances, and pick the
smaller of the two.
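
The slides do not name the formula; assuming it is the usual Euclidean distance, the calculation for the three example cases, case 1 at (8, 4, 6), case 2 at (10, 7, 6) and case 3 at (2, 3, 9), would run as follows.

```latex
d(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2 + (z_p - z_q)^2}

d(\text{case 1}, \text{case 2}) = \sqrt{(8 - 10)^2 + (4 - 7)^2 + (6 - 6)^2} = \sqrt{13} \approx 3.61

d(\text{case 1}, \text{case 3}) = \sqrt{(8 - 2)^2 + (4 - 3)^2 + (6 - 9)^2} = \sqrt{46} \approx 6.78
```

On this measure case 1 is more similar to case 2 than to case 3.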


To find out which of a whole series of
cases case 1 is most similar to,
calculate the distance from case 1 to
each of them, and pick the smallest
figure.
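
A short sketch of that search, using the three soyabean cases from the example. It assumes Euclidean distance (the slides do not name the formula), and the function names `euclidean` and `nearest` are invented for illustration.

```python
import math

# Stored cases from the example: (coverage, lightness-1, lightness-2).
CASES = {"case 2": (10, 7, 6), "case 3": (2, 3, 9)}


def euclidean(p, q):
    """Straight-line distance between two points with the same number of features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest(target, cases):
    """Return the name of the stored case closest to `target`."""
    return min(cases, key=lambda name: euclidean(target, cases[name]))


case_1 = (8, 4, 6)
for name, point in CASES.items():
    print(name, round(euclidean(case_1, point), 2))
print("most similar:", nearest(case_1, CASES))
```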
Suppose there were 4 features, or 7, or 100?
Would you have to draw 4-dimensional
or 7-dimensional or 100-dimensional
graphs?
No, it’s simply necessary to have a
formula for calculating distances in 4, or
7, or 100-dimensional space, and such
formulae are readily available.
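
One such formula, sketched for any number of features. With no weights it is the ordinary Euclidean distance; the optional weights illustrate one common way of doing weighted nearest neighbour matching (scaling each squared difference), which is an assumption here and not something the slides specify. The function name and the two 4-feature cases are invented.

```python
import math
from typing import Optional, Sequence


def weighted_distance(
    p: Sequence[float],
    q: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Distance between two cases with any number of features.

    With no weights this is the ordinary Euclidean distance; with weights,
    each squared difference is scaled by the weight of its feature.
    """
    if len(p) != len(q):
        raise ValueError("cases must have the same number of features")
    if weights is None:
        weights = [1.0] * len(p)
    if len(weights) != len(p):
        raise ValueError("one weight is needed per feature")
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))


# Two invented 4-feature cases.
a = (1, 2, 3, 4)
b = (2, 4, 6, 8)
print(weighted_distance(a, b))                # square root of 1 + 4 + 9 + 16
print(weighted_distance(a, b, (1, 0, 0, 0)))  # only the first feature counts
```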

Case adaptation

"Fixing" inconsistencies between
diagnosis and symptoms.
Techniques:
• the end user does it
• knowledge-based (qualitative reasoning, etc)
• a fixed procedure.

Note that there is a problem about updating
the case-base with adapted cases.
• Since the new case isn’t exactly like any of the cases in the case-base, it can’t really be said to have been solved by the expert judgement that was used to build the case-base in the first place.
• There is a real chance that the conclusion that the system came to is wrong in this case.
• If wrongly concluded cases are added to the case-base, it becomes progressively degraded.

Typically, the procedure is to put fresh
cases into a special file, and have the
Domain Expert pass judgement on them
before they are added to the case-base.
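
A minimal sketch of that procedure, with the "special file" modelled as a pending list. The class and method names are invented, the `expert_accepts` callable stands in for the Domain Expert's judgement, and discarding rejected cases is an assumption (the slides do not say what happens to them).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ReviewedCaseBase:
    approved: List[Dict] = field(default_factory=list)  # used for retrieval
    pending: List[Dict] = field(default_factory=list)   # awaiting the expert

    def propose(self, case: Dict) -> None:
        """Hold a freshly adapted case for review; it is not retrievable yet."""
        self.pending.append(case)

    def review(self, expert_accepts: Callable[[Dict], bool]) -> None:
        """Move accepted cases into the case-base and drop the rest."""
        for case in self.pending:
            if expert_accepts(case):
                self.approved.append(case)
        self.pending.clear()


case_base = ReviewedCaseBase()
case_base.propose({"symptoms": "yellow spots", "diagnosis": "leaf rust"})
case_base.propose({"symptoms": "brown edges", "diagnosis": "unknown"})
# Stand-in for expert judgement: reject anything without a firm diagnosis.
case_base.review(lambda case: case["diagnosis"] != "unknown")
print(len(case_base.approved), len(case_base.pending))
```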
Appropriate domains
CBR is suitable:
• when the domain is broad but shallow.
• when experience rather than theory is the primary source of information.
• when the requirement is for the best available solution, rather than a guaranteed exact solution.
• when solutions are reusable, rather than unique to each situation.

Example of a successful
system
CBR is widely used for help-desk
applications, for instance the COMPAQ SMART
system.


The problem was that:
• Thousands of customers were calling Compaq directly every day, requesting support.
• Many of the staff were new; there was a major training problem.
• There was a need for consistent & accurate answers and responses.
• There was a need for retention of corporate knowledge.

The COMPAQ SMART system, once
developed and installed, succeeded in
solving 85-95% of calls.
Typical time to solve a problem was less
than 2 minutes.

Advantages of CBR

Case-based reasoning:
• tends to focus on the problem's essential features.
• can solve problems in domains that are only partially understood.
• can provide solutions when no algorithmic method is available.
• can interpret open-ended and ill-defined concepts.
Steps in building a case-based reasoning system
1. Obtain data for cases.
2. Design cases based on data.
3. Determine the case memory structure.
4. Decide the case retrieval method.
5. Decide whether a case adaptation
procedure is appropriate (and, if so,
implement it).
6. Develop the rest of the system (e.g. the
user interface).
Some currently-available CBR
tools (with vendors)

• Esteem (Esteem Software)
• CBR Express & CBR v.2.0 (Inference)
• ReMind (Intelligent Applications, Cognitive Systems)
• ReCall (ISoft)
• KATE-CBR (AcknoSoft)
Some of these are UK products, some
American, some French.
Example of a large CBR
project: the Cassiopée system

• Used a combination of inductive and CBR techniques.
• Written using KATE-CBR, by AcknoSoft of Paris, on behalf of an engineering firm owned by General Electric and SNECMA.
• A diagnostic system for aircraft engines: CFM 56-3 engines in Boeing 737s and Airbus A340s.

• The cases came from a legacy database of 23000 engine maintenance reports, built up over 8 years.
• Experienced engineers worked over the cases, eliminating duplicates and items where there was no diagnosis or a mis-diagnosis.
• This left 16000 cases, each with up to 100 features.

Case selection was by a decision tree,
generated from the cases.

The tree directed the questioning of the user,
eliciting a set of symptoms which was then
used to select cases.
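
A toy sketch of how a decision tree can direct the questioning. Everything in it is invented for illustration: the questions, answers and case numbers are hypothetical, and the tree is written by hand, whereas the real system generated its tree from the cases.

```python
# Hypothetical tree: inner nodes ask about one feature; leaves list case ids.
TREE = {
    "question": "vibration level",
    "branches": {
        "high": {
            "question": "oil pressure",
            "branches": {
                "low": {"cases": [101, 205]},
                "normal": {"cases": [317]},
            },
        },
        "normal": {"cases": [42]},
    },
}


def select_cases(node, ask):
    """Walk the tree, asking one question per level, until a leaf is reached."""
    asked = []
    while "cases" not in node:
        answer = ask(node["question"])
        asked.append((node["question"], answer))
        node = node["branches"][answer]
    return node["cases"], asked


# Canned answers stand in for the user.
answers = {"vibration level": "high", "oil pressure": "low"}
cases, asked = select_cases(TREE, answers.__getitem__)
print(asked)
print("selected cases:", cases)
```

Only the questions on the path actually taken are put to the user, which is what keeps the dialogue short.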

Extra features:
• Integrated with an Illustrated Part Catalogue
• Generates reports of reliability and maintainability using EXCEL
• Uses e-mail to collect maintenance reports world-wide.

Success
• Very fast diagnosis: reduced diagnosis time by 50%
• Won 1st prize for innovative software applications at the European XPS show, Germany, March 1995.
A note on knowledge
acquisition

In rule-based reasoning, knowledge
is extracted from experts and encoded
in rules. This is often difficult to do. In
case-based reasoning, most (but not all)
knowledge is in the form of cases. 

Case-based reasoners also need the
same semantic knowledge that rule-based reasoners need. In addition,
case-based reasoners need adaptation
rules and similarity metrics - more types
of knowledge, but perhaps knowledge
that is easier to acquire.

Several recent studies point to the
relative ease with which case-based
reasoners can be built as compared to
building the same rule-based systems. 
Kolodner (1993), p.94
Knowledge acquisition
In one study, the Digital Equipment
Corporation commissioned two systems
(for customer technical support), with
equivalent functionality.
 One, called CANASTA, was rule-based;
one, called CASCADE, was case-based.

• CANASTA took 960 days of development time.
• CASCADE required 105 days.
• However, the personnel required for the CANASTA development were more valuable than those required for CASCADE; if one takes account of this, the development of CANASTA took the equivalent of 1600 days, and CASCADE the equivalent of 193 days.
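
Put as ratios (simple arithmetic on the figures above, not a result quoted from the study), the rule-based system took roughly nine times the raw development time and roughly eight times the adjusted time:

```latex
\frac{960}{105} \approx 9.1 \qquad \frac{1600}{193} \approx 8.3
```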

The accuracy and efficiency of the two
systems were reckoned to be
equivalent.
 The continuing maintenance costs of
CANASTA were high, while those of
CASCADE were negligible. (Simoudis,
1991 & 1992).

A comparison between rule-based &
case-based reasoning

Criterion             | Rule-based reasoning          | Case-based reasoning
Knowledge unit        | Rule                          | Case
Granularity           | Fine                          | Coarse
Knowledge acquisition | Obtaining rules & hierarchies | Obtaining cases & hierarchies
Explanation mechanism | Backtrace of rules fired      | Precedent cases
Characteristic output | Answer + confidence measure   | Answer + precedent cases
Knowledge transfer    | Potentially high              | Low
Domain requirements   | Domain vocabulary, good set of inference rules, rules which hold throughout domain | Domain vocabulary, casebase of example cases, stability: modified cases still hold
Advantages            | Flexible use of knowledge, potentially optimal answers | Rapid knowledge acquisition, explanation by example
Disadvantages         | Computationally expensive, long development time, impenetrable explanations | Suboptimal solutions, redundancy in knowledge base