Symbolizing Quantity
Praveen Paritosh [[email protected]]
Department of Computer Science, Northwestern University, Evanston, IL 60201

Goals
1. Representation: What do people know about quantities?
2. Learning: How do people learn about quantities from experience?

Motivation
Quantities: Price, Height, Temperature, Intelligence, etc.

Examples
• Basketball players are tall.
• Life below the poverty line is hard.
• Canada is larger in area than the US.
• Kia makes cheap cars.

Knowledge representation
There is a disconnect between symbolic and numerical representations of quantity, e.g. CYC has the notion of large and knows the area of Brazil, but doesn't know that Brazil is a large country.

Theories/computational models of similarity, retrieval and generalization do not take quantities into account in a psychologically plausible manner.
Similarity
• How to compute similarity/difference along a dimension?
• How to combine similarity/differences across multiple dimensions?
Retrieval
• A bird with wingspan of 1m should remind me of other large birds as much as a red object reminds me of other red objects.
Generalization
• Generating qualitatively important distinctions and learning distributional information from experience.

Difficulties
Varied sources
• Personal experience: what is spicy?
• Science: phase transitions.
• Society: poverty line.
Context variability: What is expensive for me, or in this place, might not be expensive for someone else or somewhere else.
Vague: Sorites paradox [Varzi, 2003]
But people get along!

Representation
Representations don't arise in a vacuum. There are at least three sources of constraints on a cognitively-plausible representation: Reasoning, Ecological, and Psychological constraints.

Reasoning
Comparison: Is John taller than Chris? Semantic Congruity Effect [Flora and Banks, 1977]
Classification: Is John tall? Is the water boiling?
Estimation: How tall is John? Anchoring and adjustment [Tversky and Kahneman, 1974]
• Labels like large set up implicit ordinal relations, ease comparison.
• Must keep track of interesting points to classify and estimate.

Ecological
Quantities vary
• In range and distribution of values
• But in causally connected ways
Structural bundles: e.g., as the engine mass increases, BHP, Bore, Displacement increase; RPM decreases.
• Distributional information
• Causal relationships between quantities

Psychological
Landmark effects
Similarity across landmarks higher than on the same side of landmark [Goldman, 1972].
Asymmetry in comparing to/from landmarks [Rosch, 1975, Holyoak and Mah, 1984].
Distributional assimilation [Malmi and Samson, 1983]
Social psychology on stereotypes
Acquisition of dimensional adjectives [Ryalls and Smith, 2000]
• Landmarks
• Distributional information

Symbolizations of Quantity
Named points and intervals on the space of values:
• Freezing point/ Boiling Point
• Poverty line/ Lower class/ Middle class/ Upper class
• Short/ Average/ Tall
• Cheap/ Expensive

[Figure: example scales. Temperature of water (degree Celsius): Freezing Point, Boiling Point. Income of people ($): Poverty Line; Lower Class, Middle Class, Upper Class. Height: short, average, tall. Size of dictionaries (Number of Pages, Weight): Pocket Editions, Desktop Editions, Library Editions.]

Dimensional Partitions
Symbols like Large and Small, which arise from distributional information about how the quantity varies.

Structural Partitions
Symbols like Boiling Point and Poverty Line, which denote changes in quality, usually changes in the underlying causal story and structural aspects of the objects concerned.

Builds upon, and generalizes the ideas of:
• Limit points [Forbus 1984]
• Phase transitions [Sethna 1992]
• Attribute co-variation or Feature correlation [Malt and Smith, 1984]

CARVE: A computational model
[Figure: CARVE pipeline. Stage labels: "Dimensional partitioning for each quantity", "Add these facts to original cases", "Structural clustering using SEQL". Other labels: Ci, Cj, Ci*, Quantity 1, L1, L2, S1, S2, S3.]

Dimensional Partitions
• K-means clustering of values on each quantitative dimension.
• High/Medium/LowValueContextualizedFn
  (isa Algeria (HighValueContextualizedFn Area AfricanCountries))
• 74% agreement on the Countries data.

Structural Partitions
• Projection of structural clusters generated by SEQL [Skorstad et al, 1988; Kuehne et al, 2000] onto quantities.
• No interesting structural partitions were found, because of the lack of rich causal knowledge in the knowledge base.

Experiment: Dimensional Partitions
Country naming: subjects were asked to name each of the 54 countries on the map. Mean correctly named = 6/54, sd = 6.5
Size labeling: subjects were asked to label each country as SMALL/MEDIUM/LARGE. Agreement = 81.2% (p<0.01)

Related Work
Psychological Theories
• Quantitative Estimation: Peterson and Beach (1967), Tversky and Kahneman (1974), Brown and Siegler (1993), Linder (1999).
• Similarity: Spatial [Shepard, 1962], Set-theoretic [Tversky, 1977], and Structural [Gentner, 1983].
Computational Models
• Similarity and Generalization: MAC/FAC [Forbus, Gentner and Law, 1995], SME [Falkenhainer, Forbus and Gentner, 1989], and SEQL [Kuehne, Forbus, Gentner and Quinn, 2000].

Acknowledgements
This research is supported by the Computer Science Division of the Office of Naval Research. The authors would like to thank Ken Forbus, Dedre Gentner, Chris Kennedy, Lance Rips and Sven Kuehne for insightful comments and discussion on the work presented here.

Cognitive Science, 2004, Chicago
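
Sketch: dimensional partitioning
The dimensional-partition step of CARVE is described above only as "K-means clustering of values on each quantitative dimension", with the resulting symbols expressed through High/Medium/LowValueContextualizedFn. The Python sketch below is one way that step could look; it is not the poster's implementation. The initialization scheme, the iteration limit, the function names (kmeans_1d, dimensional_partition, to_fact) and the sample values are all assumptions made for illustration. Only the three labels and the shape of the isa fact are taken from the poster.

```python
from typing import Dict, List

# Three labels, mirroring Low/Medium/HighValueContextualizedFn on the poster.
LABELS = ["Low", "Medium", "High"]


def kmeans_1d(values: List[float], k: int = 3, max_iter: int = 100) -> List[int]:
    """Lloyd's algorithm on one dimension; returns a cluster rank per value.

    Rank 0 is the cluster with the smallest center, rank k-1 the largest.
    The evenly spaced initialization is an assumption, not from the poster.
    """
    if len(values) < k:
        raise ValueError("need at least k values")
    ordered = sorted(values)
    # Deterministic start: k evenly spaced order statistics of the data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assignment: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignment) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = sum(members) / len(members)
    # Rank clusters by center so that label order follows magnitude.
    order = sorted(range(k), key=lambda c: centers[c])
    rank = {c: r for r, c in enumerate(order)}
    return [rank[a] for a in assignment]


def dimensional_partition(quantity: Dict[str, float]) -> Dict[str, str]:
    """Map each entity to Low/Medium/High along a single quantity."""
    names = list(quantity)
    ranks = kmeans_1d([quantity[n] for n in names], k=len(LABELS))
    return {n: LABELS[r] for n, r in zip(names, ranks)}


def to_fact(entity: str, label: str, quantity: str, context: str) -> str:
    """Render a label in the style of the poster's isa example."""
    return f"(isa {entity} ({label}ValueContextualizedFn {quantity} {context}))"


if __name__ == "__main__":
    # Made-up values for hypothetical entities; not data from the experiment.
    sample = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 50.0, "E": 52.0,
              "F": 55.0, "G": 200.0, "H": 210.0, "I": 220.0}
    for name, label in dimensional_partition(sample).items():
        print(to_fact(name, label, "Area", "SampleContext"))
```

Because the labels are derived from the distribution of values within one context, the same raw value can come out as High in one set of cases and Low in another, which is the context sensitivity listed under Difficulties.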