Transcript Document

Multi-agent architectures that facilitate
apprenticeship learning for
real-time decision making: Minerva and Gerona
David C. Wilkins
Center for Study of Language and Expertise
Stanford University
David Fried
Department of Computer Science
University of Illinois at Urbana-Champaign
November 5, 2005
Supported by ONR: N00014-00-1-0660, N00014-02-1-0731
Outline
• Goal
– Expert shells -> multi-agent capabilities
• Minerva – medical diagnosis (1992-1994)
– Apprentice program observes expert, improves agent
• Gerona – ship damage control (2002-2005)
– Apprentice program observes student, improves student
• Summary and conclusions
Expert Shells -> Multi-Agent Capabilities
• Traditional performance capabilities
– Correct solution, Efficient problem solving
• Multi-agent capabilities
– Critiquing
• Expert agent watches – finds errors of omission/commission
– Apprenticeship Learning
• Expert agent watches expert, improves expert agent
• Expert agent watches student, improves student
• Research philosophy
– Critiquing & apprenticeship should be natural artifact of shell architecture
– Same apprenticeship method should support both learning and tutoring
– A unified architecture for the dimensions of expertise is an approach to cognitive modeling
Apprenticeship Learning Paradigm
[Diagram: a Problem is presented to both the Human Problem Solver and the Expert Agent; the Learning Program compares their Actions and identifies KN (knowledge) differences.]
• Situated Learning: within context of problem solving
• Good for knowledge refinement of human or expert agent
Apprenticeship Learning Challenges
• Global credit assignment
– Does good explanation of human action exist?
– Challenge: some explanation usually exists
• Local credit assignment
– What KN difference creates good explanation?
– Challenge: Many repairs will create explanation
• Variance among human problem solvers
– How to distinguish allowable variations among human problem solvers
(who, among other things, often disagree) from variations that suggest knowledge errors?
• Solution
– Minerva shell architecture
Minerva-Based Apprenticeship Learning:
Domain of Neurology Diagnosis
1. Debra Arbed, a 39-year-old black female.
2. Chief complaint is headache, nausea, vomiting, stiff neck.
3. Headache duration? 6 hours.
4. Headache severity? 4 on scale of 0-4.
5. Fever? No.
6. Recent seizures? No.
7. Visual problems? No.
8. Headache onset? Abrupt.
30. Final diagnosis is subarachnoid hemorrhage.
31. Secondary dx is acute bacterial meningitis.
Evolution of Decision-Making Expert Shells:
Separation of Different Knowledge Types
[Diagram: timeline of expert shells — Mycin (1972), Guidon, Teiresias (1978), Neomycin (1982), Guidon2 (1987), Odysseus (1988), Minerva (1992), Odysseus2 (1994) — showing the progressive separation of inference, task, scheduling, and domain knowledge into distinct layers.]
Domain, Task, and Scheduling KN are Distinct
• Domain KN: vocabulary and predicates mention domain
• Task KN: no mention of domain (e.g., medicine):
strategy(differentiate-hypotheses(Hyp1, Hyp2)) :-
    active-hypothesis(Hyp1), active-hypothesis(Hyp2),
    different(Hyp1, Hyp2),
    evidence-for(Finding1, Hyp1, Rule1, Cf1),
    evidence-for(Finding1, Hyp2, Rule2, Cf2),
    same-sign-cfs(Cf1, Cf2),
    get-premise(Rule1, Finding1, Premise1),
    get-premise(Rule2, Finding1, Premise2),
    premises-contradicting(Premise1, Premise2),
    not rule-applied(Rule1),
    strategy(apply-rule(Rule1)).
• Scheduling KN: Chains (GSG…A) created by
unification. But which Action A is best?
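To make the three knowledge types concrete, here is a minimal sketch (not Minerva's implementation) in which task knowledge is written only against generic predicates supplied by a pluggable domain knowledge base, and scheduling knowledge ranks the candidate actions that the task rules propose; all class and function names are illustrative.

# Illustrative sketch of the three knowledge types; all names are hypothetical,
# not Minerva's actual interfaces.

class DomainKB:
    """Domain knowledge: the only layer whose vocabulary mentions medicine."""
    def __init__(self, hypotheses, evidence):
        self.hypotheses = hypotheses     # e.g. ["subarachnoid hemorrhage", ...]
        self.evidence = evidence         # hypothesis -> [(finding, rule, cf)]

    def active_hypotheses(self):
        return self.hypotheses

    def evidence_for(self, hypothesis):
        return self.evidence.get(hypothesis, [])

def differentiate_hypotheses(kb):
    """Task knowledge: a domain-independent strategy rule, simplified from the
    clause above (the full clause also checks contradicting premises and that
    Rule1 has not yet been applied)."""
    candidates = []
    for h1 in kb.active_hypotheses():
        for h2 in kb.active_hypotheses():
            if h1 == h2:
                continue
            for finding1, rule1, cf1 in kb.evidence_for(h1):
                for finding2, rule2, cf2 in kb.evidence_for(h2):
                    if finding1 == finding2 and (cf1 > 0) == (cf2 > 0):
                        candidates.append(("apply-rule", rule1))
    return candidates

def schedule(candidates, score):
    """Scheduling knowledge: of the chains created by unification, which action is best?"""
    return max(candidates, key=score) if candidates else None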
Recursive Classification: Use in Scheduler
– Minerva-Medicine: Strategy Level (Hypothesis-Directed); Inference Level (Domain Blackboard); Scheduler Level (Recursive Heuristic Classification); Domain Level (Medical knowledge)
– Minerva-Scheduler: Strategy Level (Exhaustive-Chaining); Inference Level (Scheduler Blackboard); Scheduler Level (FIFO); Domain Level (Scheduling knowledge)
Recursive Classification:
Induction of Embedded Knowledge Base of
Scheduler Rules
• Induction of Scheduling rules:
– 10-70 (39 avg.) classes, 42 features
– 286 scheduling rules
– Disjoint training and validation sets.
• Critiquing evaluation
– Expert’s action in top 10% of ranked candidates: 52.2%
– Expert’s action in top 20% of ranked candidates: 67.4%
– Expert’s action in top 50% of ranked candidates: 84.8%
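Read as a rank test, the critiquing numbers above measure how often the expert's chosen action falls in the upper X% of the scheduler's ranked candidates. A minimal sketch of that metric, assuming a hypothetical rank_candidates function standing in for the induced scheduling rules:

# Sketch of the critiquing evaluation metric (not the Minerva code).
# rank_candidates is a hypothetical stand-in for the induced scheduling rules.

def fraction_in_upper(cases, rank_candidates, fraction):
    """cases: list of (candidate_actions, expert_action) pairs."""
    hits = 0
    for candidates, expert_action in cases:
        ranked = rank_candidates(candidates)          # best candidate first
        cutoff = max(1, int(len(ranked) * fraction))  # size of the upper slice
        if expert_action in ranked[:cutoff]:
            hits += 1
    return hits / len(cases)

# e.g. fraction_in_upper(validation_cases, scheduler_rank, 0.10)  ->  0.522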
Minerva: Related Research
• Blackboard Architectures (BB1, Hearsay III)
– Opaque code or scheduler hardwired: not learnable.
• Classification Shells (Mole, Neomycin, Protos, Internist)
– Scheduler is mostly hard-wired.
• Advanced Classification Shells (Ask/Mu)
– Scheduler knowledge is specialized to a single expert.
• Critiquing Systems (Disciple, Oncocin/Protégé)
– Classification vs. task reduction vs. therapy plans
The Problem of Ship Damage Control
• Ship crises
– Fire, smoke, flooding, pipe rupture
– Primary and secondary damage
• Damage Control Assistant (DCA)
– Responsible for overall crisis management
– Makes damage control decisions
– Coordinates investigation and repair teams
“Damage Control Assistant” Expertise
How to get decision-making practice?!
• Expertise requires practice
– Time-critical decision-making
– High stress, information overload
– Uncertain and incomplete information
• “Whole task” practice difficult to acquire
– Actual ship crises infrequent
– Realistic practice expensive and dangerous
– Rotation cycle is 2-3 years
The DCA Decision-Making Task:
Fires, Smoke, Floods, Ruptures, etc
• Event to DCA: fire observed in compartment 1-174-0-L
• Event to DCA: pipe rupture observed in compartment 1-191-0-Q
• Action by DCA: send repair party to compartment 1-174-0-L
• Action by DCA: go to General Quarters (GQ)
• Action by DCA: start fire pump #3 on port side
• Critique to DCA: Error of omission: must request permission
of CO to turn on fire pump during GQ
• Action by DCA: Close firemain valve 3-274-2
• Critique to DCA: Error of commission: valve 3-274-2 does not
isolate pipe rupture
DC-Train 4.0 Simulation Capabilities
• Physical ship simulation
– Primary and secondary damage
– Fire, smoke, flooding, rupture, firemain
• Intelligent agent personnel simulation
– 67 ship personnel
• Commanding officer
• Engineering Officer of the Watch
• Investigator Teams, Repair Teams, etc.
DC-Train and SCoT-DC:
Post-Scenario Spoken Dialogue Tutoring
[Diagram: the DCA student solves a problem presented by the DC-Train 4.0 simulator with expert & critiquing modules (University of Illinois); the correct expert solution and the critique of student actions feed the tutoring and dialogue modules, which drive a spoken dialogue interface and an interactive visualization interface (Stanford University).]
Spoken Dialogue Tutoring
Whole-Task Simulation-Based Training
of Crisis Decision Making Skills
[Diagram: DC-Train's physical simulator and intelligent agents send Events, WorldState, and WorldInfo to the expert, critiquing, and explanation models, which are built from Graph Modification Operators (GMOs, Meta-GMOs) and maintain a Causal Story Graph (CSG); the DCA student issues Actions; text-based and spoken dialogue tutors work from the CSG; the Event Communication Language (ECL) is used along all arrows.]
Gerona Expert Agent Overview
• Goal:
– Agent architecture to support multiple uses:
• expert model, critiquing, question-answering,
explanations, spoken dialogue tutoring, etc.
• Solution
– Explicit Knowledge Representation
• ECL (vocabulary),
• GMOs, G-Clauses (expert and student critique models)
• Meta-GMOs (question-answering, explanations)
• CSGs (structured ECLs that represent all models)
– Good for knowledge acquisition from experts
– Gerona representation can be “executed” by an interpreter
Event Communication Language (ECL)
• Event Communication Language (ECL) statements
encode communication to and from the DCA, and
communication about state of world.
• Example
– English: Boundaries set: RL5 Talker to DCA: “DCA,
Repair 5 reports fire boundaries set for compartment
4-220-0-E, auxiliary machinery room #2.”
– ECL message 6310: Boundaries set
ECL-6310 ([to], [from], “reports”, [problem],
“boundaries set for compartment”, [compartment])
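A minimal sketch of how such a template might be instantiated into English; the slot names follow the slide (the [from] slot is renamed sender because from is a Python keyword), but the format string and render function are illustrative, not the Gerona API.

# Illustrative instantiation of the ECL-6310 "Boundaries set" template.
# Only the slot names come from the slide; the code itself is hypothetical.

ECL_6310_TEMPLATE = ("{to}, {sender} reports {problem} boundaries set "
                     "for compartment {compartment}.")

def render_ecl_6310(to, sender, problem, compartment):
    return ECL_6310_TEMPLATE.format(to=to, sender=sender,
                                    problem=problem, compartment=compartment)

print(render_ecl_6310("DCA", "Repair 5", "fire",
                      "4-220-0-E, auxiliary machinery room #2"))
# -> DCA, Repair 5 reports fire boundaries set for compartment 4-220-0-E,
#    auxiliary machinery room #2.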
Event Communication Language (ECL)
• ECL 2000 – WorldInfo (81)
– E.g., Contents of compartments, location of bulkheads
• ECL 3000 –WorldState Predicates (29)
– E.g., Boundaries contain compartment
• ECL 4000 – WorldState Functions (22)
– E.g., Compartment to Jurisdiction
• ECL 5000 – Actions from the DCA (48)
– E.g., Send firefighters, Start fire pump, Request permission
• ECL 6000 – Events reported to DCA (88)
– E.g., Fire alarm, firemain pressure low, desmoking space
• ECL 7000 – Goals (36)
– E.g., Identify fire, contain fire, patch pipe rupture
• ECL 8000 – Crises (7)
– E.g., Fire, hot magazines, flood, smoke, pipe rupture, low firemain pressure
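As a rough sketch, the ECL numbering bands above can be read as a simple lookup table; the dictionary below is illustrative, not part of Gerona (per-band counts from the slide appear as comments).

# ECL numbering bands from the slide (counts per band in comments).

ECL_CATEGORIES = {
    2000: "WorldInfo",                  # 81
    3000: "WorldState predicates",      # 29
    4000: "WorldState functions",       # 22
    5000: "Actions from the DCA",       # 48
    6000: "Events reported to the DCA", # 88
    7000: "Goals",                      # 36
    8000: "Crises",                     # 7
}

def ecl_category(ecl_number):
    """Map an ECL message number (e.g. 6310) to its numbering band."""
    return ECL_CATEGORIES.get((ecl_number // 1000) * 1000, "unknown")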
Causal Story Graph (CSG)
[Diagram: example Causal Story Graph for a fire crisis. Nodes — Crisis: Fire; Event: Fire Report; Satisfied Goal: Identify Fire; Active Goal: Control Fire; Addressed Goal: Contain Fire; Correct Action: Set Fire Boundaries; Event: Set Fire Boundaries in progress; Active Goal: Isolate Space; Active Goal: Extinguish Fire; Active Goal: Apply Fire Suppressant; Error of Omission: Electrically Isolate Space; Error of Commission: Fight Fire in Space; Justification: Why Error of Commission?]
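A minimal sketch of how such a graph might be represented as typed nodes with statuses; the node kinds and labels come from the example above, but the class and the particular parent/child links are illustrative, not Gerona's data structures.

# Illustrative CSG data structure; node kinds and labels follow the example above.
from dataclasses import dataclass, field

@dataclass
class CSGNode:
    kind: str       # "crisis", "event", "goal", "action", "error", "justification"
    status: str     # e.g. "active", "satisfied", "addressed", "correct", "omission"
    label: str
    children: list = field(default_factory=list)

fire = CSGNode("crisis", "active", "Fire", [
    CSGNode("goal", "satisfied", "Identify Fire",
            [CSGNode("event", "reported", "Fire Report")]),
    CSGNode("goal", "active", "Control Fire"),
    CSGNode("goal", "addressed", "Contain Fire",
            [CSGNode("action", "correct", "Set Fire Boundaries")]),
    CSGNode("error", "omission", "Electrically Isolate Space"),
])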
Graph Modification Operators (GMO)
GMO 5120
FOR ECL 5120 “Fight Fire”
compartment -> Compartment
target -> Station
RULE 5120.fight-fire.critique.1
IF
goal(find, unaddressed, 7118, “Apply fire suppressant”,
[compartment = Compartment], _, G)
AND action(find, pending, 5120, “Fight fire in space”,
[compartment = Compartment], _, A)
AND goal(find, satisfied, 7116, “Isolate compartment if necessary”,
[compartment = Compartment], _, _),
AND goal(find, satisfied, 7117, “Active desmoke if necessary”,
[compartment = Compartment], _, _),
AND ship-state(find, _, 4302, “Best repair locker for compartment”,
[compartment = Compartment, station = Station], _, _)
Graph Modification Operators (cont)
THEN
action(modify, correct, 5120, “Fight fire in space”,
[compartment <- Compartment, station <- Station], _, A)
goal(modify, addressed, 7118, “Apply fire suppressant”,
[compartment <- Compartment], _, G)
END RULE
…
END GMO
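At execution time a GMO rule like the one above can be read as: find the CSG nodes that match the IF patterns, bind their slot values, then create or modify nodes as directed by the THEN part. A minimal interpreter sketch under that reading; find_node, modify_node, and best_repair_locker are hypothetical helpers, not the Gerona interpreter.

# Minimal sketch of executing GMO rule 5120.fight-fire.critique.1 against the CSG.
# find_node / modify_node / best_repair_locker are hypothetical helpers.

def apply_gmo_5120(csg, ship_state):
    goal = csg.find_node(ecl=7118, status="unaddressed")    # "Apply fire suppressant"
    action = csg.find_node(ecl=5120, status="pending")      # "Fight fire in space"
    if not (goal and action):
        return False
    compartment = action.slots["compartment"]
    if not csg.find_node(ecl=7116, status="satisfied", compartment=compartment):
        return False   # compartment not yet isolated
    if not csg.find_node(ecl=7117, status="satisfied", compartment=compartment):
        return False   # active desmoking not yet handled
    station = ship_state.best_repair_locker(compartment)    # ship-state ECL 4302
    # THEN part: mark the action correct and the goal addressed.
    csg.modify_node(action, status="correct", station=station)
    csg.modify_node(goal, status="addressed")
    return True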
Meta-GMO Question Types
• About 100 templates cover all past instructor-student QAs
– “Why” questions for justifying CSG nodes (12)
• “Why should I have ordered firefighting?”
– “What” questions for retrieving expert recommendations (32)
• “What should I have done after I got the fire report?”
– “What if” questions to get critiques on hypothetical actions (4)
• “What if I ordered fire boundaries to be set?”
– “When/How” questions to explain domain rules (9)
• “How do you determine what repair locker has jurisdiction?”
– “When/What/Is” questions evaluate conditions and relations (26)
• “Is there a starboard fire pump on at 3:00?”
– More complex questions involving chaining and inference (14)
• “How can I satisfy the preconditions for dewatering?”
• “If I ordered smoke boundaries, what could I do then?”
Meta-GMO Example
• “When is it appropriate to order firefighting?”
• Question ECL 9300 “when action”
MGMO 9300
FOR ECL 9300 “When Action”
LET action-ecl-number -> ActionECL
IF
g-clause(find, action(create, pending, ActionECL, _, _, _, _), GClauses)
g-clause(justify, GClauses, Justification)
THEN
answer(create, _, 9300, “When Action”,
[action-ecl-number <- ActionECL, justification <- Justification],
miscellaneous-questions, JustificationNode)
END IF
END MGMO
In English
(direct translation)
“There are two conditions under which you should order
firefighting.
“First, when you receive a report that electrical and mechanical
isolation has completed, you still need to extinguish the fire in that
compartment, you have either active desmoked the compartment or do
not need to active desmoke the compartment, and either there is no
halon or halon has failed, find the best repair locker for that
compartment, and order that repair locker to fight the fire in the
compartment.
“Second, when you receive a report that halon has failed, you
have either isolated the compartment or the compartment cannot be
isolated, and you have either active desmoked the compartment or do
not need to active desmoke the compartment, find the best repair
locker for that compartment, and order that repair locker to fight the
fire in the compartment.”
In English
(intelligent translation)
“There are two things that might trigger ordering firefighting.
The first is a report of electrical and mechanical isolation achieved,
and the second is a report that halon has failed.
“The first case only applies when you need to extinguish a fire.
You also need to have active desmoked the compartment, if necessary,
and if the compartment has halon, it has to already have failed.
“In the second case, you must have active desmoked if necessary
and isolated the compartment if possible.
“In both cases, you should send the best repair locker for the
compartment to fight the fire.”
Meta-Graph Modification Operators
(M-GMOs)
MGMO 9002 FOR ECL 9002 "Why Sub-Optimal Action?"
LET action-node -> ActionNode
RULE 9002.1 "Explain why the action isn't correct."
IF g-clause( find, action([create, modify], correct,
ActionNode.ecl, _, _, _, _), _, CorrectGClauses)
AND roll-back(before, ActionNode, _)
AND g-clause(justify-and-evaluate, CorrectGClauses,
ActionNode, Justification)
THEN answer(create, _, 9002, "Why Sub-Optimal Action?",
[action-node <- ActionNode, justification <- Justification],
ActionNode, A)
END RULE
END MGMO
Power and Learnability
• A Gerona system responding to an
incoming message from an agent can do so
using an efficiently parallelizable algorithm.
• Total space complexity is O(n) and time
complexity is low-order polynomial.
• GMO rules are PAC-learnable using the “learning to take actions” paradigm, given certain constraints on rule length.
Current Research Direction
• Extend SCoT-DC/DC-Train Spoken Tutor to
allow user-initiated tutoring.
• Approach is to map user-initiated questions in
natural language to Gerona question classes
• QABLE for Story Comprehension Q/A (Grois and Wilkins,
IJCAI-05 and ICML-05)
• Use Gerona domain model to constrain interpretations
(Fried et al., 2003)
Summary
• Ability to critique and learn is facilitated by agent KR&I
– KN factorization, explicitness, modularity, being able to
reason over static and dynamic knowledge
• Two examples:
– Minerva: separation of domain, task, and scheduling
knowledge; use of Recursive Heuristic Classification
for scheduling.
– Gerona: graph operators construct a dynamic task-centered representation