A Comparison of Rule-Based versus Exemplar-Based Categorization in a Model of Sensemaking


A Comparison of Rule-Based versus Exemplar-Based Categorization Using the ACT-R Architecture

Matthew F. RUTLEDGE-TAYLOR, Christian LEBIERE, Robert THOMSON, James STASZEWSKI and John R. ANDERSON
Carnegie Mellon University, Pittsburgh, PA, USA
19th Annual ACT-R Workshop: Pittsburgh, PA, USA
Overview
• Categorization theories
• Facility Identification Task
– Study examples of four different facilities
– Categorize unseen facilities
• ACT-R Models
– Rule-based versus Exemplar-based
– Three different varieties of each, based on information attended
• Model Results
– Rule-based models are equivalent to exemplar-based models in terms of hit-rate performance
• Discussion
Categorization theories
• Rule-based theories (Goodman, Tenenbaum, Feldman & Griffiths, 2008)
– Exceptions, e.g. RULEX (Nosofsky & Palmeri, 1995)
– Probabilistic membership (Goodman et al., 2008)
• Prototype theories (Rosch, 1973)
– Multiple prototype theories
• Exemplar theories (Nosofsky, 1986)
– WTA vs. weighted similarity
• ACT-R has been used previously to compare and contrast exemplar-based and rule-based approaches to categorization (Anderson & Betz, 2001)
Facility Identification Task
• Four kinds of facilities
• Probabilistic feature composition
[Figure: notional simulated imagery; legend: Building (IMINT), Hardware, MASINT1, MASINT2, SIGINT]
Facility Identification Task
• Probabilistic occurrences of features

Feature     Facility A  Facility B  Facility C  Facility D
Building 1  High        Mid         High        Mid
Building 2  High        Mid         High        High
Building 3  High        Mid         Mid         High
Building 4  High        High        Mid         Mid
Building 5  Low         High        Mid         High
Building 6  Low         High        High        High
Building 7  Low         High        High        Mid
Building 8  Low         Mid         Mid         High
MASINT1     Few         Many        Few         Many
MASINT2     Few         Many        Many        Few
SIGINT      Many        Few         Many        Few
Hardware    Few         Few         Few         Many
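The table above can be read as a generative model: each facility category specifies a qualitative level per feature. A minimal sketch of sampling one facility instance, where the numeric probabilities and mean counts are assumptions (the slides specify only High/Mid/Low and Many/Few):

```python
import random

# Hypothetical numeric mapping for the qualitative levels in the table;
# the slides give only "High/Mid/Low" and "Many/Few".
BUILDING_P = {"High": 0.8, "Mid": 0.5, "Low": 0.2}
COUNT_MEAN = {"Many": 6, "Few": 2}

# Facility A's column of the table above
FACILITY_A = {
    "buildings": ["High", "High", "High", "High", "Low", "Low", "Low", "Low"],
    "MASINT1": "Few", "MASINT2": "Few", "SIGINT": "Many", "Hardware": "Few",
}

def sample_facility(spec):
    """Sample one concrete facility instance from its qualitative column."""
    # Each building appears independently with its level's probability
    present = [i + 1 for i, level in enumerate(spec["buildings"])
               if random.random() < BUILDING_P[level]]
    # Feature counts vary around an assumed mean for Many/Few
    counts = {f: max(0, round(random.gauss(COUNT_MEAN[spec[f]], 1.0)))
              for f in ("MASINT1", "MASINT2", "SIGINT", "Hardware")}
    return {"buildings": present, "counts": counts}
```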
Three comparisons
• Human data versus model data
– Hit-rate accuracy
• Exemplar model versus rule-based model
– Blended retrieval of a facility chunk, vs.
– Retrieval of one or more rules that manipulate a probability distribution
• Cognitive phenotypes: versions of both exemplar and rule-based models that attend to different data
– Feature counts
– Buildings that are present
– Both
Three participant phenotypes
• Phenotype #1: Assumes buildings are key
– Attentive to specific buildings in the image
– Ignores the MASINT, SIGINT, and Hardware
• Phenotype #2: Assumes the number of each feature type is key
– Attentive to counts of each facility feature
– Ignores the types of buildings (just counts them)
• Phenotype #3: Attends to both specific buildings and feature counts
Facility Identification
Phenotype #1 (SA model): specific buildings only
[Figure: target imagery with Building #2, Building #3, Building #6, and Building #7 identified]
Facility Identification
Phenotype #2 (PM model): feature type counts only
[Figure: target imagery with feature counts: Buildings 4, Hardware 1, MASINT1 6, MASINT2 2, SIGINT 5]
Facility Identification
Phenotype #3: SA and PM
[Figure: target imagery with Buildings #2, #3, #6, and #7 identified, plus feature counts: Hardware 1, MASINT1 6, MASINT2 2, SIGINT 5]
ACT-R Exemplar based model
• Implicit statistical learning
– Commits tokens of facilities to declarative memory
• Slots for facility type (A, B, C or D)
• Slots for sums of each feature type
• Slot for presence (or absence) of each building (IMINT)
• Categorization
– Retrieval request made to DM based on facility features in the target
– Category slot value of the retrieved chunk is used as the categorization decision of the model
[Figure: facility chunk]
ACT-R: Chunk activation
• Ai = Bi + Si + Pi + εi
• Ai is the net activation,
• Bi is the base-level activation,
• Si is the effect of spreading activation,
• Pi is the effect of the partial matching mismatch penalty, and
• εi is the magnitude of activation noise.
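The activation equation can be sketched directly. The noise scale below is an illustrative value, not the model's fitted parameter:

```python
import math
import random

def chunk_activation(base_level, spreading, mismatch, noise_s=0.25):
    """A_i = B_i + S_i + P_i + eps_i.

    noise_s is the scale of ACT-R's logistic activation noise (:ans);
    0.25 is an illustrative value, not the model's fitted setting.
    """
    u = random.random()
    eps = noise_s * math.log(u / (1.0 - u))  # logistic noise sample
    return base_level + spreading + mismatch + eps
```

With noise turned off, the result is just the sum of the three deterministic components.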
Spreading Activation
• All values in all included buffers spread activation to DM
• All facility features held in the visual buffer spread activation to all chunks in DM
• Primary retrieval factor for phenotype #1 (buildings)
Spreading Activation
[Figure: visual buffer spreading activation to facility chunks in declarative memory]

Slot      Chunk 1    Chunk 2    Chunk 3    Chunk 4
category  d          d          d          a
b1        nil        nil        nil        building1
b2        building2  building2  building2  nil
b3        building3  nil        building3  building3
b4        nil        nil        building4  nil
b5        nil        nil        nil        building5
b6        building6  building6  building6  nil
b7        building7  building7  nil        nil
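The spreading shown above, where each building value in the visual buffer sends activation to the DM chunks that contain it, can be sketched as follows. W (total source activation) and s_max are illustrative parameters, not the model's fitted values:

```python
import math

def spread_to_chunk(buffer_values, chunk_values, fan, W=1.0, s_max=2.0):
    """Activation spread from buffer slot values to one DM chunk (S_i).

    W and s_max are illustrative; in ACT-R the association strength
    S_ji is s_max - ln(fan_j), where fan_j is the number of DM chunks
    containing value j, and W is split across the buffer's slot values.
    """
    sources = [v for v in buffer_values if v is not None]
    if not sources:
        return 0.0
    w_j = W / len(sources)  # source activation split across buffer values
    s_i = 0.0
    for v in sources:
        if v in chunk_values:  # activation spreads only to chunks holding v
            s_i += w_j * (s_max - math.log(fan.get(v, 1)))
    return s_i
```

A chunk sharing more buildings with the buffer, or sharing rarer (low-fan) buildings, receives more spread.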
Partial Matching
• The partial match is computed on a slot-by-slot basis
• For each chunk in DM, the degree to which each slot mismatches the corresponding slot in the retrieval cue determines the mismatch penalty
• Primary retrieval factor for phenotype #2 (counts)
Partial Matching
[Figure: retrieval buffer cue matched against facility chunks in declarative memory]

Slot      Chunk 1  Chunk 2  Chunk 3  Chunk 4
category  d        d        d        c
buildings b4       b4       b5       b5
masint1   m6       m7       m4       m1
masint2   n2       n0       n1       n8
sigints   s5       s7       s5       s5
hardware  h1       h2       h2       h0

• Equal values = no penalty
• Similar values = low penalty
• Dissimilar values = high penalty
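The graded penalty scheme above can be sketched as a similarity-scaled sum over cue slots. The linear similarity function and the parameter values are assumptions, not the model's fitted settings:

```python
def mismatch_penalty(cue, chunk, mp=1.0, max_diff=10.0):
    """P_i: summed, similarity-scaled penalty over cue slots.

    Uses an assumed linear similarity over numeric feature counts;
    mp (mismatch scale) and max_diff are illustrative values.
    """
    p_i = 0.0
    for slot, cue_val in cue.items():
        chunk_val = chunk.get(slot, 0)
        if chunk_val == cue_val:
            continue  # equal values: no penalty
        # similarity in [-1, 0]: near-equal counts give a small penalty,
        # distant counts a large one, as on the slide
        sim = -min(abs(cue_val - chunk_val), max_diff) / max_diff
        p_i += mp * sim  # penalties lower activation (negative terms)
    return p_i
```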
Heat Map on Counts of Features
Results of Exemplar Based Model
• PM only: 0.462
• SA only: 0.655
• PM + SA: 0.720

Confusion matrices (columns = actual facility, rows = model response):

PM only     Facility A  Facility B  Facility C  Facility D
Facility A  0.559       0.077       0.356       0.108
Facility B  0.090       0.490       0.124       0.288
Facility C  0.274       0.116       0.375       0.180
Facility D  0.077       0.316       0.145       0.424

SA only     Facility A  Facility B  Facility C  Facility D
Facility A  0.585       0.006       0.065       0.025
Facility B  0.017       0.635       0.062       0.108
Facility C  0.247       0.061       0.585       0.054
Facility D  0.151       0.297       0.287       0.813

PM + SA     Facility A  Facility B  Facility C  Facility D
Facility A  0.719       0.006       0.065       0.035
Facility B  0.013       0.716       0.058       0.133
Facility C  0.177       0.074       0.691       0.079
Facility D  0.090       0.203       0.185       0.753
• Human Participant Accuracy: 0.535
– Performance and interviews suggest:
• A mix of phenotypes, with #2 (PM-like) most prevalent
• Employment of some explicit rules
ACT-R Rule Based Model
• Applied a set of rules to the unidentified target facility
• Accumulated a net probability distribution over the four possible facility categories
• The facility with the greatest probability is the model's forced-choice category response
ACT-R Rule Based Model
• Two kinds of rules
– SA-like: applies to presence of buildings
– PM-like: applies to feature counts
• Rules implemented as chunks in DM
• Sets of dedicated productions for retrieving relevant rules
• High confidence in choice of rules
– Based on analysis of probabilities of features
ACT-R Rule Based Model
• Example building rule
– If [building image] is present, then facility A is 1.38 times more likely (than if not present)
• Example count rule
– If there are 5 MASINT1, then facility A is 3 times more likely (than if more or fewer)
– Note: count rules apply if the count total in the target is within a threshold difference of the number in the rule
[Figure: rule chunks]
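The rule scheme described above, accumulating likelihood factors into a distribution over the four categories, can be sketched as follows. The rule list contains only the slide's example count rule (the building rule's referent is an image elided in the transcript); the uniform prior and data layout are assumptions:

```python
def apply_rules(target, rules, categories=("A", "B", "C", "D"), threshold=1):
    """Accumulate rule likelihood factors into a category distribution.

    Starts from an assumed uniform prior; count rules fire when the
    target's count is within `threshold` of the rule's count, per the
    note on the slide.
    """
    probs = {c: 1.0 for c in categories}  # uniform prior (unnormalized)
    for rule in rules:
        if rule["kind"] == "building":
            fires = rule["building"] in target["buildings"]
        else:  # count rule, with the threshold window from the slide
            fires = abs(target["counts"][rule["feature"]] - rule["count"]) <= threshold
        if fires:
            probs[rule["category"]] *= rule["factor"]
    total = sum(probs.values())
    return {c: p / total for c, p in probs.items()}

# The slide's example count rule: 5 MASINT1 makes facility A 3x more likely
example_rules = [
    {"kind": "count", "feature": "MASINT1", "count": 5, "factor": 3.0,
     "category": "A"},
]
```

The forced choice is then simply the category with the greatest resulting probability.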
ACT-R Rule Based Model
• Three versions of the rule-based model
– Only apply building rules: similar to SA exemplar model
– Only apply count rules: similar to PM exemplar model
– Apply both building and count rules: similar to combined exemplar model
ACT-R Rule Based Model Results
• Building rules only: 0.657
• Count rules only: 0.476
• Both building and count rules: 0.755
Strategy        Rule-based  Exemplar  % Difference
SA / Buildings  0.657       0.655     0.30
PM / Counts     0.476       0.462     2.94
Combined        0.755       0.720     4.64
Discussion
• Agreement between rule-based and exemplar models, implemented in ACT-R, supports the equivalence of these approaches
– They exploit the same available information
• The performance equivalence between the two establishes that functional Bayesian inference can be accomplished in ACT-R either through:
– explicit rule application, or
– implicit, subsymbolic processes of the activation calculus, which support the exemplar model
• The learning mechanisms of ACT-R's subsymbolic system are Bayesian in nature (Anderson, 1990; 1993)
• Blending allows ACT-R to implement importance sampling (Shi et al., 2010)
Acknowledgements
© 2010 HRL Laboratories, LLC. All Rights Reserved - HRL 313
This document may contain technology subject to U.S. Export controls.
• This work is supported by the Intelligence Advanced
Research Projects Activity (IARPA) via Department of
the Interior (DOI) contract number D10PC20021. The
U.S. Government is authorized to reproduce and
distribute reprints for Governmental purposes
notwithstanding any copyright annotation thereon. The
views and conclusions contained herein are those of
the authors and should not be interpreted as
necessarily representing the official policies or
endorsements, either expressed or implied, of IARPA,
DOI, or the U.S. Government.
Blended Retrieval
• Standard retrieval
– One previously existing chunk is retrieved
• Effectively, WTA (winner-take-all) closest exemplar
• Blending
– One new chunk, which is a blend of matching chunks, is retrieved (created)
– All slots not specified in the retrieval cue are assigned blended values
– The contribution each exemplar chunk makes to blended slot values is proportional to the activation of the chunk
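Blending can be sketched as an activation-weighted average over matching chunks. The exponential (Boltzmann) weighting and the temperature value are illustrative assumptions about how "proportional to activation" is realized:

```python
import math

def blend_slot(candidates, slot, temperature=0.5):
    """Blend one numeric slot value across matching chunks.

    candidates: list of (activation, slot_values) pairs from DM.
    Each chunk's weight is proportional to exp(A_i / t), so
    higher-activation exemplars dominate the blended value;
    the temperature t is an illustrative parameter.
    """
    weights = [math.exp(a / temperature) for a, _ in candidates]
    total = sum(weights)
    return sum(w * slots[slot]
               for w, (_, slots) in zip(weights, candidates)) / total
```

Two equally active chunks contribute equally, so their slot values are simply averaged; as one chunk's activation grows, the blend converges to WTA retrieval of that chunk.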