Foundations; semiotics, library, cognitive and social science and information modeling Peter Fox Xinformatics – ITEC, CSCI, ERTH 4400/6400 Week 3, February 5, 2013
Contents
• Review of last class, reading
• Foundations; semiotics, library, cognitive and social science and information modeling, but still trying to stay clear of architectures
• Assignment 1
• Next classes

Reading Review
• Information entropy
• Information Is Not Entropy, Information Is Not Uncertainty!
• More on entropy
• Context
• Information retrieval
• Compression and encoding: D.A. Huffman
• Abductive reasoning

Semiotics
• Also called semiotic studies or semiology; the study of sign processes (semiosis), or signification and communication, signs and symbols

A sign (Peirce and Eco 1979)
1. “A sign stands for something to the idea which it produces or modifies....
2. That for which it stands is called its object, that which it conveys, its meaning; and the idea to which it gives rise, its interpretant
3. ....[the sign creates in the mind] an equivalent sign, or perhaps a more developed sign.” (Peirce)
1. “That sign which it creates I call the interpretant of the first sign.
2. This sign stands for something, its object.
3. It stands for that object, not in all respects, but in reference to a sort of idea which I have sometimes called the ground of that representation.” (Eco)

Examples

Icons (Meaning based on similarity of appearance)

Index
• A sign related to an object
• Signifier <-> Signified
• Meaning based on cause and effect relationships
• E.g. in a particular configuration, the letters "E", "D" and "R" will form the sequence "R", "E", "D".
• RED denotes a certain color, but neither the letters individually nor their formal combination into a word have anything to do with redness.

Index examples
• Smoke, thermometer, clock, spirit-level, foot or fingerprint, knock on door
• Signify what?
• Fire, temperature, time, alignment, identity, announcement
• Or?
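The two lists on the index-examples slide appear to pair up in the order given (smoke with fire, thermometer with temperature, and so on). A minimal Python sketch of that signifier-to-signified lookup, assuming positional pairing is what the slide intends; the variable names are illustrative and not from the course:

```python
# Index signs and what they signify, copied from the slide in the order given.
signifiers = ["smoke", "thermometer", "clock", "spirit-level",
              "foot or fingerprint", "knock on door"]
signifieds = ["fire", "temperature", "time", "alignment",
              "identity", "announcement"]

# Assumption: the i-th signifier points to the i-th signified.
index_signs = dict(zip(signifiers, signifieds))

for signifier, signified in index_signs.items():
    print(f"{signifier} -> {signified}")
```

The closing "Or?" on the slide can be read as a reminder that a one-to-one lookup like this is a simplification: what an index signifies may depend on who reads it and in what context.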
Symbol (meaning based on convention)

Semiotic model

Syntax
• Relation of signs to each other in formal structures
• … the term syntax is also used to refer directly to the rules and principles that govern the …
• But not the meaning or the use!

Semantics
• Relation between signs and the things to which they refer; their denotata
• Study of meaning of … (anything?)
• Mainly need to worry about failures

Pragmatics
• Relation of signs to their impacts on those who use them
• The ways in which context contributes to meaning, conveying and use

But in a digital world?
• Oh, and you thought I would answer all your questions and doubts ;-)

Library science
• Curates the artifacts of knowledge
• Has developed over centuries
• Separates principles (what they are) from how they have been implemented

Collections, Directories, …
• Organizes and manages them for consumers
  – Cataloging and classification
• Dictionaries, thesauri, encyclopaedias, maps, charts, ...
• Reference services (authority)
• Bibliographical organization and mapping
• Important for logical and physical models and how to manage and provide content

Indexing and abstracts
• To organize, find and summarize things
• To facilitate search via information retrieval mechanisms (F. van Harmelen – ‘we only need information retrieval because we perform information burial’, 2010)
• To facilitate precision in search via sufficient metainformation
• Dewey decimal, Library of Congress
• Search: Z39.50 (ISO 23950), Circulation Interchange Protocol (CIP), MAchine-Readable Cataloging (MARC)

Preservation
• ‘Maintaining or restoring access to artifacts, documents and records through the study, diagnosis, treatment and prevention of decay and damage’ (Wikipedia)
• Digital age
  – Curation and preservation
  – Translating the full life cycle (or the ecosystem of data and information)

Libraries also have taught us
• Access
  – Limited or open
• Rights and responsibilities
  – Attribution and citation
  – Proprietary and security
• Ethical and legal issues
  – Free publication of how to violate laws, build bombs
• Publishing
  – What is required to be published
  – Record and dissemination mechanisms

Cognitive Science
• Cognitive science is the interdisciplinary study of the mind and intelligence
• It operates at the intersection of psychology, philosophy, computer science, linguistics, anthropology, and neuroscience.

Mental Representation
• Thinking = representational structures + procedures that operate on those structures.
• Data structures + mental representations + algorithms + procedures = running programs = thinking
• Methodological consequence: study the mind by developing computer simulations of thinking.

What is an explanation of behavior?
  – Programs that simulate cognitive processes explain intelligent behavior by performing the tasks whose performance they explain.
  – Neurophysiological explanation is compatible with computational explanation, but operates at a different level.
  – At the neural level, cognitive processes are parallel, but at the symbolic level, the brain behaves like a serial system.
  – The human mind is an adaptive system, learning to improve its performance in accomplishing its goals.
Nature of Expertise
• Manifests as cognition
  – Refers to an information processing view of an individual's psychological functions
  – Process of thought as ‘knowing’
• Indicates a level of knowing and action that is above the non-expert
• Characterizing the expert versus the non-expert (or specialist vs. non-) is very important in information systems
• E.g. can a non-expert system be just as easily used and exploited by an expert?

Epistemology
• Theory of knowledge – and to do this effectively you need to be concerned with:
  – Truth, belief, and justification
  – Means of production of knowledge
  – Skepticism about different knowledge claims
• Recall the data-information-knowledge ecosystem?
• Understanding what part this plays in your modeling and architecture can be critical

Classical view of knowledge

Intuition
• This returns us to semiotics and to some extent heuristics and abduction: understanding without apparent effort
• Heuristics – experience-based techniques that help in problem solving, learning and discovery
• Abduction we’ve covered …
• So how do you eke out (technical term) intuition?
  – Use the cognitive process – drawing or mapping!

Metamodeling and Mindmaps

More mind maps

Quality & Bias from the Aerosol Parameter Ontology
• FreeMind allows capturing various relations between various aspects of aerosol measurements, algorithms, conditions, validation, etc.
• The “traditional” worksheets do not support the complex multi-dimensional nature of the task

Some tools
• For use case development – simple graphics tools, e.g. graffle
• Mindmaps, e.g. FreeMind
• For modeling (esp. UML):
  – http://en.wikipedia.org/wiki/List_of_Unified_Modeling_Language_tools
• For estimating information uncertainty, yes some algorithms and software exist
• Concept, topic, subject maps!! (try searching)
  – http://cmap.ihmc.us

Cultural norms
• Modes of what and how rewards are given
• Between those who produce and those who consume data and information
• How you collect, understand, model and design models and architectures is as much a social as a technical skill

Discipline norms
• Rewards
  – Computer science – conference proceedings before
  – Physical science – journal publication after
  – Engineering – patents
  – Humanities – journal and conference
• The line between producers and consumers is thought of as blurred – refer to our information fig – is it?
• Collecting, understanding, modeling and designing architectures is a social more than a technical skill (sorry!)

Sociology of groups, teams

Social Science
• Networks of information providers
• Reputation matters a lot

Understanding each other

Information Modeling
• Conceptual
• Logical
• Physical

Information models - bad
• It's very easy to tell when a Web site you're trying to navigate has no underlying Information Model. Here are the tell-tale characteristics:
  – You can't tell how to get from the home page to the information you're looking for.
  – You click on a promising link and are unpleasantly surprised at what turns up.
  – You keep drilling down into the information layer after layer until you realize you're getting farther away from your goal rather than closer.
  – Every time you try to start over from the home page, you end up in the same wrong place.
  – You scroll through a long alphabetic list of all the articles ever written on a particular subject with only the title to guide you.

Information models – good
• Oddly enough, you generally don't notice a well-conceived Information Model because it simply doesn't get in the way of your search.
  – On the home page, you notice promising links right away.
  – Two or three clicks get you to exactly what you wanted.
  – The information seems designed just for you because someone has anticipated your needs.
  – You can read a little or ask for more – the cross-references are in the right places.
  – Right away you feel that you're on familiar ground – similar types of information start looking the same.

Information Models
• Conceptual models, sometimes called domain models, are typically used to explore domain concepts and are often created
  – as part of initial requirements envisioning efforts, as they are used to explore the high-level static business or science or medicine structures and concepts
  – as the precursor to logical models or as alternatives to them
• Followed by logical and physical models

Logical models
• A logical entity-relationship model is provable in the mathematics of data science. Given the current predominance of relational databases, logical models generally conform to relational theory.
• Thus a logical model contains only fully normalized entities. Some of these may represent logical domains rather than potential physical tables.

Logical models
• For a logical data model to be normalized, it must include the full population of attributes to be implemented and those attributes must be defined in terms of their domains or logical data types (e.g., character, number, date, picture, etc.).
• A logical data model requires a complete scheme of identifiers or candidate keys for unique identification of each occurrence in every entity. Since there are choices of identifiers for many entities, the logical model indicates the current selection of identity. Propagation of identifiers as foreign keys may be explicit or implied.
• Since relational storage cannot support many-to-many concepts, a logical data model resolves all many-to-many relationships into associative entities which may acquire independent identifiers and possibly other attributes as well.

Physical models
• A physical model is a single logical model instantiated in a specific information system (e.g., relational database, RDF/XML document, etc.) in a specific installation.
• The physical model specifies implementation details which may be features of a particular product or version, as well as configuration choices for that instance.

Physical models
• E.g. for a database, these could include index construction, alternate key declarations, modes of referential integrity (declarative or procedural), constraints, views, and physical storage objects such as tablespaces.
• E.g. for RDF/XML, this would include namespaces, declarative relations, etc.

Conceptual model example
• Radiative process model after a volcanic eruption

Logical model example

For example for relational DBs

Feature               Conceptual  Logical  Physical
Entity Names          ✓           ✓
Entity Relationships  ✓           ✓
Attributes                        ✓
Primary Keys                      ✓        ✓
Foreign Keys                      ✓        ✓
Table Names                                ✓
Column Names                               ✓
Column Data Types                          ✓

Another simple example

Logical Model

Physical Information Model
• Used to
  – design the internal schema (e.g. of a database, or file),
  – depict the structured information (e.g. tables),
  – specify the layout of those structures (e.g. columns of those tables)
  – specify the relationships between the structures (e.g. primary keys).

Object-oriented design
• Object-oriented modeling is a formal way of representing something in the real world (draws from traditional set theory and classification theory). Some basics to keep in mind in object-oriented modeling are that:
  – Instances are things.
  – Properties are attributes.
  – Relationships are pairs of attributes.
  – Classes are types of things.
  – Subclasses are subtypes of things.

Object model
• Class: a means of grouping all the objects which share the same set of attributes and methods.
• An object must belong to only one class as an instance of that class (instance-of relationship).
• A class is similar to an abstract data type.
• Class hierarchy and inheritance: derive a new class (subclass) from an existing class (superclass)
  – subclass inherits all the attributes and methods of the existing class and may have additional attributes and methods
  – single inheritance (class hierarchy) vs. multiple inheritance (class lattice).

Core object models consist of:
• object and object identifier: Any real world entity is uniformly modeled as an object (associated with a unique id: used to pinpoint an object to retrieve).
• attributes and methods: every object has a state (the set of values for the attributes of the object) and a behavior (the set of methods - program code which operate on the state of the object).
• the state and behavior encapsulated in an object are accessed or invoked from outside the object.

Steps in modeling
• Identify objects (entity) and their types
• Identify attributes
• Apply naming conventions
• Identify relationships
• Apply model patterns (if known)
• Assign relationships
• Normalize to reduce redundancy (this is called refactoring in software engineering)
• Denormalize to improve performance

Not just an isolated set of models
• Most important for handling errors, evolution, extension, restriction, … where to do that:
  – To the physical model?
  – To the logical model?
  – To the conceptual model?
• Relating to and/or integrating with other information models?
  – General rule – integrate at the highest level you can (i.e. more abstract)
  – Remember the cognitive aspects!

Recall our example

Tools for modeling..
• Models
  – http://www.datamodel.org/
  – MSDN: http://msdn.microsoft.com/en-us/library/bb399249.aspx
• Schema – to rebuild logical models from physical!
  – Schematron differs in basic concept from other schema languages in that it is not based on grammars but on finding tree patterns in the parsed document. This approach allows many kinds of structures to be represented which are inconvenient and difficult in grammar-based schema languages. If you know XPath or the XSLT expression language, you can start to use Schematron immediately.
  – http://www.schematron.com/

Tools
• Concept mapping
  – http://cmap.ihmc.us/
• Mindmapping
  – http://en.wikipedia.org/wiki/List_of_mind_mapping_software
• White board
• Piece of paper … you get the idea?

Discussion
• About semiotics
• Library science
• Cognitive science
• Social science
• Modeling

Reading for this week
• Is retrospective but … relates to your assignment

Assignment 1
• Analysis of cognitive, collection and social/cultural aspects of information in signs, discussed and decomposed along the lines we have talked about today and last week, with some modeling thrown in, and then you present in class
• Due on Feb 26th – write up and presentations
• Assignment 2 available Feb 12th.

What is next
• February 12 – Week 4 – Capturing the problem: Use Case development and requirements analysis
• February 19 – no class, Tuesday follows Monday schedule
• February 26 – Week 5 – presentations (5 mins to present, including questions)
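
Looking back at the logical-model slides: they state that a logical data model resolves every many-to-many relationship into an associative entity, which may carry its own identifier and attributes. A minimal sketch of that step using Python's built-in sqlite3 module. The student, course and enrollment entities and the grade attribute are hypothetical, chosen only for illustration and not taken from the course material:

```python
import sqlite3

# Hypothetical entities: students and courses are many-to-many.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Associative entity: one row per (student, course) pair.
-- It propagates both identifiers as foreign keys and adds its own attribute.
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    grade      TEXT,
    PRIMARY KEY (student_id, course_id)
);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Ada"), (2, "Ben")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(10, "Semiotics"), (20, "Modeling")])
conn.executemany("INSERT INTO enrollment VALUES (?, ?, ?)",
                 [(1, 10, "A"), (1, 20, None), (2, 20, "B")])

# Walk the many-to-many relationship through the associative entity.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM enrollment e
    JOIN student s ON s.student_id = e.student_id
    JOIN course  c ON c.course_id  = e.course_id
    ORDER BY s.name, c.title
""").fetchall()
print(rows)
```

Here the associative entity is identified by the pair of foreign keys. Giving it an independent surrogate identifier instead, as the slide allows, would be an equally valid design choice.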