A Comparison of Rule-Based versus Exemplar-Based Categorization in a Model of Sensemaking
A Comparison of Rule-Based versus Exemplar-Based Categorization Using the ACT-R Architecture
Matthew F. Rutledge-Taylor, Christian Lebiere, Robert Thomson, James Staszewski and John R. Anderson
Carnegie Mellon University, Pittsburgh, PA, USA
19th Annual ACT-R Workshop: Pittsburgh, PA, USA

Overview
• Categorization theories
• Facility Identification Task
  – Study examples of four different facilities
  – Categorize unseen facilities
• ACT-R models
  – Rule-based versus exemplar-based
  – Three different varieties of each, based on the information attended to
• Model results
  – Rule-based models are equivalent to exemplar-based models in terms of hit-rate performance
• Discussion

Categorization theories
• Rule-based theories (Goodman, Tenenbaum, Feldman & Griffiths, 2008)
  – Exceptions, e.g. RULEX (Nosofsky & Palmeri, 1995)
  – Probabilistic membership (Goodman et al., 2008)
• Prototype theories (Rosch, 1973)
  – Multiple prototype theories
• Exemplar theories (Nosofsky, 1986)
  – Winner-take-all (WTA) vs. weighted similarity
• ACT-R has been used previously to compare and contrast exemplar-based and rule-based approaches to categorization (Anderson & Betz, 2001)

Facility Identification Task
• Notional simulated imagery
• Four kinds of facilities
• Probabilistic feature composition: buildings (IMINT), Hardware, MASINT1, MASINT2, SIGINT

Facility Identification Task
• Probabilistic occurrences of features

             Facility A   Facility B   Facility C   Facility D
Building 1   High         Mid          High         Mid
Building 2   High         Mid          High         High
Building 3   High         Mid          Mid          High
Building 4   High         High         Mid          Mid
Building 5   Low          High         Mid          High
Building 6   Low          High         High         High
Building 7   Low          High         High         Mid
Building 8   Low          Mid          Mid          High
MASINT1      Few          Many         Few          Many
MASINT2      Few          Many         Many         Few
SIGINT       Many         Few          Many         Few
Hardware     Few          Few          Few          Many

Three comparisons
• Human data versus model data
  – Hit-rate accuracy
• Exemplar model versus rule-based model
  – Blended retrieval of a facility chunk, versus
  – Retrieval of one or more rules that manipulate a probability distribution
• Cognitive
phenotypes: versions of both exemplar and rule-based models that attend to different data
  – Feature counts
  – Buildings that are present
  – Both

Three participant phenotypes
• Phenotype #1: Assumes buildings are key
  – Attentive to specific buildings in the image
  – Ignores the MASINT, SIGINT, and Hardware features
• Phenotype #2: Assumes the number of each feature type is key
  – Attentive to counts of each facility feature
  – Ignores the types of buildings (just counts them)
• Phenotype #3: Attends to both specific buildings and feature counts

Facility Identification: Phenotype #1
• Specific buildings only: SA model
• Attends to Building #2, Building #3, Building #6, Building #7

Facility Identification: Phenotype #2
• Feature type counts only: PM model
• Attends to counts: Buildings 4, Hardware 1, MASINT1 6, MASINT2 2, SIGINT 5

Facility Identification: Phenotype #3
• SA and PM
• Attends to Building #2, Building #3, Building #6, Building #7, and to counts: Hardware 1, MASINT1 6, MASINT2 2, SIGINT 5

ACT-R Exemplar-Based Model
• Implicit statistical learning
  – Commits tokens of facilities to declarative memory as facility chunks, with:
    • A slot for facility type (A, B, C or D)
    • Slots for the sums of each feature type
    • A slot for the presence (or absence) of each building (IMINT)
• Categorization
  – A retrieval request is made to declarative memory based on the facility features in the target
  – The category slot value of the retrieved chunk is used as the categorization decision of the model

ACT-R: Chunk activation

Ai = Bi + Si + Pi + εi

where
• Ai is the net activation,
• Bi is the base-level activation,
• Si is the effect of spreading activation,
• Pi is the effect of the partial-matching mismatch penalty, and
• εi is the magnitude of the activation noise.
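The activation equation above can be sketched in a few lines of Python. This is an illustrative sketch only: the mismatch-penalty scale and noise parameter below are hypothetical values, not those used in the reported models, and Gaussian noise stands in for ACT-R's logistic noise distribution.

```python
import random

# Hypothetical parameter values for illustration only.
MISMATCH_PENALTY = 1.0   # scales the partial-matching penalty P_i
NOISE_S = 0.25           # activation noise scale for eps_i

def activation(base_level, spread, mismatches):
    """Ai = Bi + Si + Pi + eps_i.

    base_level -- Bi, reflecting frequency and recency of the chunk
    spread     -- Si, summed spreading activation from buffer contents
    mismatches -- per-slot similarity deficits in [0, 1], combined
                  into the mismatch penalty Pi
    """
    p = -MISMATCH_PENALTY * sum(mismatches)  # Pi is a penalty, hence negative
    eps = random.gauss(0.0, NOISE_S)         # eps_i (Gaussian here for brevity)
    return base_level + spread + p + eps
```

During retrieval, the chunk with the highest net activation wins; the penalty term is what lets partially matching facility chunks compete, which is the basis of the PM phenotype.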
Spreading Activation
• All values in all included buffers spread activation to declarative memory
• All facility features held in the visual buffer spread activation to all chunks in DM
• Primary retrieval factor for phenotype #1 (buildings)

Spreading Activation (example): visual buffer contents spreading to four facility chunks in declarative memory

          Chunk 1     Chunk 2     Chunk 3     Chunk 4
category  d           d           d           a
b1        nil         nil         nil         building1
b2        building2   building2   building2   nil
b3        building3   nil         building3   building3
b4        nil         nil         building4   nil
b5        nil         nil         nil         building5
b6        building6   building6   building6   nil
b7        building7   building7   nil         nil

Partial Matching
• The partial match is computed on a slot-by-slot basis
• For each chunk in DM, the degree to which each slot mismatches the corresponding slot in the retrieval cue determines the mismatch penalty
• Primary retrieval factor for phenotype #2 (counts)

Partial Matching (example): retrieval cue matched against four facility chunks in declarative memory

           Chunk 1   Chunk 2   Chunk 3   Chunk 4
category   d         d         d         c
buildings  b4        b4        b5        b5
MASINT1    m6        m7        m4        m1
MASINT2    n2        n0        n1        n8
SIGINT     s5        s7        s5        s5
hardware   h1        h2        h2        h0

• Equal values = no penalty
• Similar values = low penalty
• Dissimilar values = high penalty
(Heat map on counts of features)

Results of Exemplar-Based Model
• PM only: 0.462
• SA only: 0.665
• PM + SA: 0.720

PM only:
             Facility A   Facility B   Facility C   Facility D
Facility A   0.559        0.077        0.356        0.108
Facility B   0.090        0.490        0.124        0.288
Facility C   0.274        0.116        0.375        0.180
Facility D   0.077        0.316        0.145        0.424

SA only:
             Facility A   Facility B   Facility C   Facility D
Facility A   0.585        0.006        0.065        0.025
Facility B   0.017        0.635        0.062        0.108
Facility C   0.247        0.061        0.585        0.054
Facility D   0.151        0.297        0.287        0.813

PM + SA:
             Facility A   Facility B   Facility C   Facility D
Facility A   0.719        0.006        0.065        0.035
Facility B   0.013        0.716        0.058        0.133
Facility C   0.177        0.074        0.691        0.079
Facility D   0.090        0.203        0.185        0.753

• Human participant accuracy: 0.535
  – Performance and interviews suggest:
    • A mix of phenotypes, with #2 (PM-like) most prevalent
    • Employment of some explicit rules

ACT-R Rule-Based Model
• Applies a set of rules to the unidentified target facility
• Accumulates a net probability distribution over the four possible facility categories
• The facility with the greatest probability is the forced-choice category response of the model

ACT-R Rule-Based Model
• Two kinds of rules
  – SA-like: applies to the presence of buildings
  – PM-like: applies to feature counts
• Rules are implemented as chunks in DM
• Sets of dedicated productions retrieve the relevant rules
• High confidence in the choice of rules
  – Based on an analysis of the probabilities of features

ACT-R Rule-Based Model
• Example building rule
  – If a given building is present, then facility A is 1.38 times more likely (than if it is not present)
• Example count rule
  – If there are 5 MASINT1, then facility A is 3 times more likely (than if there are more or fewer)
  – Note: count rules apply if the count total in the target is within a threshold difference of the number in the rule

ACT-R Rule-Based Model
• Three versions of the rule-based model
  – Only apply building rules: similar to the SA exemplar model
  – Only apply count rules: similar to the PM exemplar model
  – Apply both building and count rules: similar to the combined exemplar model

ACT-R Rule-Based Model Results
• Building rules only: 0.657
• Count rules only: 0.476
• Both building and count rules: 0.755

Strategy         Rule-based   Exemplar   % Difference
SA / Buildings   0.657        0.655      0.30
PM / Counts      0.476        0.462      2.94
Combined         0.755        0.720      4.64

Discussion
• Agreement between the rule-based and exemplar models, implemented in ACT-R, supports the equivalence of these approaches
  – They exploit the same available information
• The performance equivalence between the two establishes that functional Bayesian inferencing can be accomplished in ACT-R either through:
  – explicit,
rule application, or
  – implicit, subsymbolic processes of the activation calculus that support the exemplar model
• The learning mechanisms of ACT-R's subsymbolic system are Bayesian in nature (Anderson, 1990; 1993)
• Blending allows ACT-R to implement importance sampling (Shi et al., 2010)

Acknowledgements
© 2010 HRL Laboratories, LLC. All Rights Reserved – HRL 313. This document may contain technology subject to U.S. Export controls.
• This work is supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of the Interior (DOI) contract number D10PC20021. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI, or the U.S. Government.

Blended Retrieval
• Standard retrieval
  – One previously existing chunk is retrieved
    • Effectively, WTA: the closest exemplar
• Blending
  – One new chunk, which is a blend of the matching chunks, is retrieved (created)
  – All slots not specified in the retrieval cue are assigned blended values
  – The contribution each exemplar chunk makes to the blended slot values is proportional to the activation of the chunk
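The blending mechanism described above can be sketched as an activation-weighted average over matching exemplars. This is a simplified sketch for numeric slots only (for which ACT-R's error-minimizing blend reduces to a weighted mean); the chunks, slot names, and temperature value are hypothetical, not taken from the reported models.

```python
import math

TEMPERATURE = 0.5  # blending temperature (hypothetical value)

def blend(chunks, slot):
    """Return the activation-weighted average of a numeric slot.

    Each chunk is (activation, {slot: value, ...}); weights are a
    Boltzmann (softmax) function of activation, so higher-activation
    exemplars contribute more to the blended value.
    """
    weights = [math.exp(a / TEMPERATURE) for a, _ in chunks]
    total = sum(weights)
    return sum(w * slots[slot] for w, (_, slots) in zip(weights, chunks)) / total

# Usage: blend the MASINT1 count over three stored facility exemplars
# (activations and counts are made up for illustration).
exemplars = [(1.2, {"masint1": 6}), (0.8, {"masint1": 7}), (0.1, {"masint1": 4})]
blended = blend(exemplars, "masint1")  # lies between the stored counts, pulled toward the high-activation ones
```

Because every matching exemplar contributes in proportion to its activation, blending aggregates over the stored distribution of facilities rather than committing to the single closest exemplar, which is what links it to importance sampling.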