Foundations; semiotics, library, cognitive and social science and information modeling Peter Fox Xinformatics – ITEC, CSCI, ERTH 4400/6400 Week 3, February 5, 2013


Foundations; semiotics,
library, cognitive and social
science and information
modeling
Peter Fox
Xinformatics – ITEC, CSCI, ERTH 4400/6400
Week 3, February 5, 2013
1
Contents
• Review of last class, reading
• Foundations; semiotics, library, cognitive
and social science and information
modeling but trying to still stay clear of
architectures
• Assignment 1
• Next classes
2
Reading Review
• Information entropy
• Information Is Not Entropy, Information Is Not
Uncertainty!
• More on entropy
• Context
• Information retrieval
• Compression and encoding: D.A. Huffman
• Abductive reasoning
3
Semiotics
• Semiotics, also called semiotic studies or semiology,
is the study of sign processes (semiosis),
signification and communication, and signs and
symbols
4
A sign (Peirce and Eco 1979)
1. “A sign stands for something to the idea which it
produces or modifies....
2. That for which it stands is called its object, that
which it conveys, its meaning; and the idea to which it
gives rise, its interpretant
3. ....[the sign creates in the mind] an equivalent sign,
or perhaps a more developed sign.” (Peirce)
1. “That sign which it creates I call the interpretant of
the first sign.
2. This sign stands for something, its object.
3. It stands for that object, not in all respects, but in
reference to a sort of idea which I have sometimes
called the ground of that representation.” (Eco)
5
Examples
6
Icons
(Meaning based
on similarity of
appearance)
7
Index
• A sign related to an object
• Signifier <-> Signified
• Meaning based on cause and effect
relationships
• E.g. in a particular configuration, the letters
"E", "D" and "R" will form the sequence "R",
"E", "D".
• RED denotes a certain color, but neither the
letters individually nor their formal
combination into a word have anything to do
with redness.
8
Index examples
• Smoke, thermometer, clock, spirit-level, foot
or fingerprint, knock on door
• Signify what?
• Fire, temperature, time, alignment, identity,
announcement
• Or?
9
Symbol (meaning based on convention)
10
Semiotic model
11
Syntax
• Relation of signs to
each other in formal
structures
• … the term syntax is
also used to refer
directly to the rules and
principles that govern
the …
• But not the meaning or
the use!
12
Semantics
• Relation between
signs and the
things to which
they refer; their
denotata
• Study of meaning
of … (anything?)
• Mainly need to
worry about
failures
13
Pragmatics
• Relation of signs to their
impacts on those who use them
• the ways in which context
contributes to meaning,
conveying and use
14
But in a digital world?
• Oh, and you thought I would answer all your
questions and doubts ;-)
15
Library science
• Curates the artifacts of knowledge
• Has developed over centuries
• Separates principles (what they are) from
how they have been implemented
16
Collections, Directories, …
• Organizes and manages them for consumers
– Cataloging and classification
• Dictionaries, thesauri, encyclopaedias, maps,
charts, ...
• Reference services (authority)
• Bibliographical organization and mapping
• Important for logical and physical models and
how to manage and provide content
17
Indexing and abstracts
• To organize, find and summarize things
• To facilitate search via information retrieval
mechanisms (F. van Harmelen – ‘we only
need information retrieval because we
perform information burial’, 2010)
• To facilitate precision in search via sufficient
metainformation
• Dewey decimal, Library of Congress
• Search: Z39.50 (ISO 23950), NISO Circulation
Interchange Protocol (NCIP), MAchine-Readable
Cataloging (MARC)
18
Preservation
• ‘Maintaining or restoring access to artifacts,
documents and records through the study,
diagnosis, treatment and prevention of decay
and damage’ (Wikipedia)
• Digital age
– Curation and preservation
– Translating the full life cycle (or the ecosystem of
data and information)
19
Libraries also have taught us
• Access
– Limited or open
• Rights and responsibilities
– Attribution and citation
– Proprietary and security
• Ethical and legal issues
– Free publication of how to violate laws, build
bombs
• Publishing
– What is required to be published
– Record and dissemination mechanisms
20
Cognitive Science
• Cognitive science is the interdisciplinary study of
the mind and intelligence
• It operates at the intersection of psychology,
philosophy, computer science, linguistics,
anthropology, and neuroscience.
21
Mental Representation
• Thinking = representational structures +
procedures that operate on those structures.
• Data structures + algorithms = running programs;
mental representations + procedures = thinking
• Methodological consequence: study the mind
by developing computer simulations of
thinking.
22
What is an explanation of behavior?
– Programs that simulate cognitive processes
explain intelligent behavior by performing the
tasks whose performance they explain.
– Neurophysiological explanation is compatible
with computational explanation, but operates at a
different level.
– At the neural level, cognitive processes are
parallel, but at the symbolic level, the brain
behaves like a serial system.
– The human mind is an adaptive system, learning
to improve its performance in accomplishing its
goals.
23
Nature of Expertise
• Manifests as cognition
– refers to an information processing view of an
individual's psychological functions
– Process of thought as ‘knowing’
• Indicates a level of knowing and action that is
above the non-expert
• Characterizing the expert versus the non-expert
(or specialist vs. non-specialist) is very
important in information systems
• E.g. can a non-expert system be just as
easily used and exploited by an expert?
24
Epistemology
• Theory of knowledge – and to do this
effectively you need to be concerned with:
– Truth, belief, and justification
– Means of production of knowledge
– Skepticism about different knowledge claims
• Recall the data-information-knowledge
ecosystem?
• Understanding what part this plays in your
modeling and architecture can be critical
25
Classical view of knowledge
26
Intuition
• This returns us to semiotics and, to some
extent, heuristics and abduction: understanding without apparent effort
• Heuristics - experience-based techniques that
help in problem solving, learning and
discovery
• Abduction we’ve covered …
• So how do you eke out (technical term)
intuition?
– Use the cognitive process – drawing or mapping!
27
Metamodeling and Mindmaps
28
More mind maps
29
Quality & Bias from the Aerosol Parameter Ontology
FreeMind allows capturing various relations between various
aspects of aerosol measurements, algorithms, conditions,
validation, etc. The “traditional” worksheets do not
support the complex multi-dimensional nature of the task
30
Some tools
• For use case development – simple graphics
tools, e.g. OmniGraffle
• Mindmaps, e.g. FreeMind
• For modeling (esp. UML):
– http://en.wikipedia.org/wiki/List_of_Unified_Modeling_Language_tools
• For estimating information uncertainty, yes
some algorithms and software exist
• Concept, topic, subject maps!! (try searching)
– http://cmap.ihmc.us
31
Cultural norms
• Modes of what and how rewards are given
• Between those who produce and those who
consume data and information
• How you collect, understand, model and
design models and architectures is as much
social as technical skill
32
Discipline norms
• Rewards
– Computer science – conference proceedings
before
– Physical science – journal publication after
– Engineering – patents
– Humanities – journal and conference
• The line between producers and consumers
is thought of as blurred – refer to our
information figure – is it?
• Collecting, understanding, modeling and
designing architectures is more a social than a
technical skill (sorry!)
33
Sociology of groups, teams
34
Social Science
• Networks of information providers
• Reputation matters a lot
35
Understanding each other
36
Information Modeling
• Conceptual
• Logical
• Physical
37
Information models - bad
• It's very easy to tell when a Web site you're trying to
navigate has no underlying Information Model. Here
are the tell-tale characteristics:
– You can't tell how to get from the home page to the
information you're looking for.
– You click on a promising link and are unpleasantly
surprised at what turns up.
– You keep drilling down into the information layer after
layer until you realize you're getting farther away from
your goal rather than closer.
– Every time you try to start over from the home page, you
end up in the same wrong place.
– You scroll through a long alphabetic list of all the articles
ever written on a particular subject with only the title to
guide you.
38
Information models – good
• Oddly enough, you generally don't notice a well-conceived
Information Model because it simply
doesn't get in the way of your search.
– On the home page, you notice promising links right away.
– Two or three clicks get you to exactly what you wanted.
– The information seems designed just for you because
someone has anticipated your needs.
– You can read a little or ask for more - the cross-references
are in the right places.
– Right away you feel that you're on familiar ground - similar
types of information start looking the same.
39
Information Models
• Conceptual models, sometimes called domain
models, are typically used to explore domain
concepts and often created
– as part of initial requirements envisioning efforts as they
are used to explore the high-level static business or
science or medicine structures and concepts
– as the precursor to logical models or as alternatives to
them
• Followed by logical and physical models
40
Logical models
• A logical entity-relationship model is provable
in the mathematics of data science. Given the
current predominance of relational
databases, logical models generally conform
to relational theory.
• Thus a logical model contains only fully
normalized entities. Some of these may
represent logical domains rather than
potential physical tables.
41
Logical models
• For a logical data model to be normalized, it must include the
full population of attributes to be implemented and those
attributes must be defined in terms of their domains or
logical data types (e.g., character, number, date, picture,
etc.).
• A logical data model requires a complete scheme of
identifiers or candidate keys for unique identification of
each occurrence in every entity. Since there are choices of
identifiers for many entities, the logical model indicates the
current selection of identity. Propagation of identifiers as
foreign keys may be explicit or implied.
• Since relational storage cannot support many-to-many
concepts, a logical data model resolves all many-to-many
relationships into associative entities which may acquire
independent identifiers and possibly other attributes as well.
42
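A minimal sketch of how a many-to-many relationship can be resolved into an associative entity, as described on the slide above. The entities (Author, Paper, Authorship), their attributes and the sample values are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
from dataclasses import dataclass

# Hypothetical entities: one author can write many papers,
# and one paper can have many authors (many-to-many).
@dataclass(frozen=True)
class Author:
    author_id: int   # identifier (candidate key)
    name: str

@dataclass(frozen=True)
class Paper:
    paper_id: int    # identifier (candidate key)
    title: str

@dataclass(frozen=True)
class Authorship:
    """Associative entity that resolves Author <-> Paper."""
    author_id: int   # foreign key to Author
    paper_id: int    # foreign key to Paper
    position: int    # extra attribute acquired by the association

authors = [Author(1, "A. Writer"), Author(2, "B. Scribe")]
papers = [Paper(10, "On Signs"), Paper(11, "On Models")]
authorships = [
    Authorship(1, 10, 1),
    Authorship(2, 10, 2),
    Authorship(1, 11, 1),
]

def papers_by(author_id):
    """Titles of all papers linked to one author, sorted alphabetically."""
    ids = {a.paper_id for a in authorships if a.author_id == author_id}
    return sorted(p.title for p in papers if p.paper_id in ids)

def authors_of(paper_id):
    """Author names for one paper, in author-list order."""
    links = sorted(
        (a for a in authorships if a.paper_id == paper_id),
        key=lambda a: a.position,
    )
    names = {a.author_id: a.name for a in authors}
    return [names[link.author_id] for link in links]
```

Each side of the relationship becomes a one-to-many link to Authorship, which relational storage can represent directly.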
Physical models
• A physical model is a single logical model
instantiated in a specific information system
(e.g., relational database, RDF/XML
document, etc.) in a specific installation.
• The physical model specifies implementation
details which may be features of a particular
product or version, as well as configuration
choices for that instance.
43
Physical models
• E.g. for a database, these could include index
construction, alternate key declarations,
modes of referential integrity (declarative or
procedural), constraints, views, and physical
storage objects such as tablespaces.
• E.g. for RDF/XML, this would include
namespaces, declarative relations, etc.
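As a sketch of what the physical level adds for a relational database, the example below instantiates a small model in SQLite through Python's standard sqlite3 module. The table names, columns and values are assumptions made up for illustration. The index, the alternate key (UNIQUE), declarative referential integrity, the CHECK constraint and the view correspond to the kinds of details listed above. The PRAGMA is a SQLite-specific configuration choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Product-specific detail: SQLite enforces foreign keys only when enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE instrument (
    instrument_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL UNIQUE      -- alternate key declaration
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    instrument_id  INTEGER NOT NULL
                   REFERENCES instrument(instrument_id),  -- declarative integrity
    taken_on       TEXT NOT NULL,
    value          REAL NOT NULL CHECK (value >= 0)       -- constraint
);
CREATE INDEX idx_measurement_instrument ON measurement(instrument_id);
CREATE VIEW measurement_named AS
    SELECT i.name, m.taken_on, m.value
    FROM measurement m JOIN instrument i USING (instrument_id);
""")

conn.execute("INSERT INTO instrument VALUES (1, 'photometer')")
conn.execute("INSERT INTO measurement VALUES (1, 1, '2013-02-05', 0.12)")

def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A measurement that points at a non-existent instrument is rejected.
orphan_ok = try_insert("INSERT INTO measurement VALUES (2, 99, '2013-02-05', 0.3)")
rows = conn.execute("SELECT name, value FROM measurement_named").fetchall()
```

The same logical model could be instantiated differently in another product; only these physical choices would change.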
44
Conceptual model example
• Radiative process model after a volcanic
eruption
45
Logical model example
46
For example for relational DBs
Feature               Conceptual  Logical  Physical
Entity Names              ✓          ✓
Entity Relationships      ✓          ✓
Attributes                           ✓
Primary Keys                         ✓        ✓
Foreign Keys                         ✓        ✓
Table Names                                   ✓
Column Names                                  ✓
Column Data Types                             ✓
47
Another simple example
48
Logical Model
49
Physical Information Model
• Used to
– design the internal schema (e.g. of a database, or file),
– depict the structured information (e.g. tables),
– specify the layout of those structures (e.g. columns of
those tables)
– specify the relationships between the structures (e.g.
primary keys).
50
Object oriented design
• Object-oriented modeling is a formal way of
representing something in the real world
(draws from traditional set theory and
classification theory). Some basics to keep in
mind in object-oriented modeling are that:
– Instances are things.
– Properties are attributes.
– Relationships are pairs of attributes.
– Classes are types of things.
– Subclasses are subtypes of things.
51
Object model
• Class: a means of grouping all the objects which
share the same set of attributes and methods.
• An object must belong to only one class as an
instance of that class (instance-of relationship).
• A class is similar to an abstract data type.
• Class hierarchy and inheritance: derive a new class
(subclass) from an existing class (superclass)
– subclass inherits all the attributes and methods of the
existing class and may have additional attributes and
methods
– single inheritance (class hierarchy) vs. multiple
inheritance (class lattice).
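A short sketch of the class, instance-of and inheritance ideas above. All class names are hypothetical. Python is used only as one convenient object-oriented notation; it permits multiple inheritance, so both the hierarchy and the lattice case can be shown.

```python
class InformationResource:
    """Superclass: attributes and methods shared by all resources."""

    def __init__(self, title):
        self.title = title

    def describe(self):
        return f"{type(self).__name__}: {self.title}"


class Map(InformationResource):
    """Single inheritance: inherits title/describe and adds an attribute."""

    def __init__(self, title, scale):
        super().__init__(title)
        self.scale = scale


class Digital:
    """An independent class, used as a second parent below."""

    def media_type(self):
        return "digital"


class DigitalMap(Map, Digital):
    """Multiple inheritance: the parents form a lattice, not a tree."""


# m is an instance of exactly one class (DigitalMap), which in turn
# inherits from its superclasses.
m = DigitalMap("Aerosol optical depth", "1:1000000")
lineage = [c.__name__ for c in DigitalMap.__mro__]
```

Here `type(m)` is the single class the object belongs to, while `lineage` lists the order in which Python searches the superclasses for inherited attributes and methods.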
52
Core object models consist of:
• object and object identifier: Any real world entity is
uniformly modeled as an object (associated with a
unique id: used to pinpoint an object to retrieve).
• attributes and methods: every object has a state
(the set of values for the attributes of the object) and
a behavior (the set of methods, i.e. program code, which
operate on the state of the object).
• the state and behavior encapsulated in an object
are accessed or invoked from outside the object.
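The three ingredients above (identifier, state, behavior) can be sketched as follows. The Thermometer class and its counter-based identifier scheme are assumptions for illustration only. Python marks internal state by naming convention (a leading underscore) rather than enforcing encapsulation.

```python
import itertools


class Thermometer:
    # Hypothetical identifier scheme: a simple shared counter.
    _next_id = itertools.count(1)

    def __init__(self, celsius):
        self.oid = next(Thermometer._next_id)  # unique object identifier
        self._celsius = celsius                # state, internal by convention

    # Behavior: methods that operate on the state of the object.
    def read(self):
        return self._celsius

    def warm(self, delta):
        self._celsius += delta


a = Thermometer(20.0)
b = Thermometer(20.0)
b.warm(1.5)  # changes the state of b only
```

Two objects that start with equal state remain distinct objects: they have different identifiers, and invoking a method on one does not affect the other.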
53
Steps in modeling
• Identify objects (entities) and their types
• Identify attributes
• Apply naming conventions
• Identify relationships
• Apply model patterns (if known)
• Assign relationships
• Normalize to reduce redundancy (this is
called refactoring in software engineering)
• Denormalize to improve performance
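The last two steps can be sketched with plain Python dictionaries. The records below are invented placeholder values, not real measurements, and the field names are assumptions. Normalizing stores each station's country once; denormalizing joins it back in, so reads need no lookup at the cost of repeating data.

```python
# Flat (denormalized) records: the country is repeated for every observation.
flat = [
    {"station": "station_a", "country": "X", "date": "2013-02-01", "value": 0.05},
    {"station": "station_a", "country": "X", "date": "2013-02-02", "value": 0.06},
    {"station": "station_b", "country": "Y", "date": "2013-02-01", "value": 0.08},
]


def normalize(rows):
    """Split rows into one station table and one observation table."""
    stations = {}
    observations = []
    for r in rows:
        key = r["station"]
        # Each station's own attributes are stored exactly once.
        stations.setdefault(key, {"station": key, "country": r["country"]})
        # Observations keep only a reference (foreign key) to the station.
        observations.append(
            {"station": key, "date": r["date"], "value": r["value"]}
        )
    return stations, observations


def denormalize(stations, observations):
    """Join the two tables back into flat rows."""
    return [{**stations[o["station"]], **o} for o in observations]


stations, observations = normalize(flat)
roundtrip = denormalize(stations, observations)
```

In this sketch the round trip reproduces the original rows, which is a useful check that normalization lost no information.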
54
Not just an isolated set of models
• Most important for handling errors, evolution,
extension, restriction, … where to do that:
– To the physical model?
– To the logical model?
– To the conceptual model?
• Relating to and/ or integrating with other
information models?
– General rule – integrate at the highest level you
can (i.e. more abstract)
– Remember the cognitive aspects!
55
Recall our example
56
57
58
Tools for modeling..
• Models
– http://www.datamodel.org/
– MSDN: http://msdn.microsoft.com/en-us/library/bb399249.aspx
• Schema – to rebuild logical models from physical!
– Schematron differs in basic concept from other schema
languages in that it is not based on grammars but on finding
tree patterns in the parsed document. This approach
allows many kinds of structures to be represented which
are inconvenient and difficult in grammar-based schema
languages. If you know XPath or the XSLT expression
language, you can start to use Schematron immediately.
– http://www.schematron.com/
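To give a feel for validation by tree patterns rather than by grammar, here is a rough analogue in plain Python. This is not Schematron and does not use a Schematron processor; it only mimics the idea of a rule (a context path) plus an assertion on each matched node, using the limited XPath subset in the standard xml.etree.ElementTree module. The catalog document and the rules are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document to check.
doc = ET.fromstring("""
<catalog>
  <record id="r1"><title>Smoke</title><subject>fire</subject></record>
  <record><title>Clock</title></record>
</catalog>
""")

# Each rule: (context path, test applied to every matched element, message).
rules = [
    (".//record", lambda e: e.get("id") is not None,
     "record must have an id attribute"),
    (".//record", lambda e: e.find("title") is not None,
     "record must contain a title"),
    (".//record", lambda e: e.find("subject") is not None,
     "record should contain a subject"),
]


def validate(root, rules):
    """Return (title, message) for every element that fails a rule."""
    failures = []
    for path, test, message in rules:
        for elem in root.findall(path):
            if not test(elem):
                failures.append((elem.findtext("title"), message))
    return failures


failures = validate(doc, rules)
```

Each rule is independent of the others and says nothing about element order or nesting depth, which is the kind of constraint that can be awkward to state in a grammar-based schema.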
59
Tools
• Concept mapping
– http://cmap.ihmc.us/
• Mindmapping
– http://en.wikipedia.org/wiki/List_of_mind_mapping_software
• White board
• Piece of paper … you get the idea?
60
Discussion
• About semiotics
• Library science
• Cognitive science
• Social science
• Modeling
61
Reading for this week
• Is retrospective but … relates to your
assignment
62
Assignment 1
• Analysis of cognitive, collection and
social/cultural aspects of information in signs,
discussed and decomposed along the lines
we have talked about today and last week
with some modeling thrown in and then you
present in class
• Due on Feb 26th – write up and presentations
• Assignment 2 available Feb 12th.
63
What is next
• February 12 – Week 4 – Capturing the
problem: Use Case development and
requirements analysis
• February 19 – no class, Tuesday follows
Monday schedule
• February 26 – Week 5 – presentations (5
mins to present, including questions)
64