XInformatics course summary Peter Fox Xinformatics 4400/6400 Week 13, April 30, 2013 Contents • Summary of this course • What you needed to learn/ objectives • Questions •

XInformatics course summary
Peter Fox
Xinformatics 4400/6400
Week 13, April 30, 2013
1
Contents
• Summary of this course
• What you needed to learn/
objectives
• Questions
• Discussion
2
3
The key is:
• As the volume, complexity and heterogeneity of
information increases…
– Suddenly information looks more like a continuum
– Not always in the ‘right’ structure
– Known methods and algorithms do not scale (except for very
simple operations)
– And because it is information, humans are part of the loop,
and you’ve all seen how modern information systems are
more or less usable depending on a number of factors
• Thus: understand and apply theoretical foundations
– To date these have been developed in an analog world, not a
digital one!!
4
Intersecting disciplines:
• Library Science
– Organizes
– Cataloging and classification
– Preservation: ‘maintaining or restoring access to artifacts’
• Cognitive Science
– Mental representation, the nature of expertise, and intuition
• Social Science
– Collaboration
– Cultural norms
– Rewards
5
A Use Case
• … is a collection of possible sequences of
interactions between the information
system under discussion/ design and its
actors, relating to a particular goal
• … consists of a prose description of an
information system's behavior when
interacting with the actors
• … is a technique for capturing functional
requirements of an information system
• … captures non-functional requirements
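As a rough illustration of the definition above, a use case can be held as a small data structure: a goal, its actors, and one or more sequences of interactions. The class and field names (`UseCase`, `Interaction`, `normal_flow`, `alternative_flows`) and the dataset-search example are assumptions made for this sketch, not part of the course material.

```python
from dataclasses import dataclass, field


@dataclass
class Interaction:
    """One step: an actor does something and the system responds."""
    actor: str
    action: str
    system_response: str


@dataclass
class UseCase:
    """A goal plus the possible interaction sequences that pursue it."""
    goal: str
    actors: list[str]
    normal_flow: list[Interaction]
    alternative_flows: dict[str, list[Interaction]] = field(default_factory=dict)

    def all_flows(self) -> dict[str, list[Interaction]]:
        """Return every sequence of interactions, keyed by flow name."""
        return {"normal": self.normal_flow, **self.alternative_flows}

    def unknown_actors(self) -> set[str]:
        """Actors that appear in a flow but were never declared."""
        used = {step.actor for flow in self.all_flows().values() for step in flow}
        return used - set(self.actors)


# Hypothetical example: a scientist looking for a dataset.
find_dataset = UseCase(
    goal="Find a dataset covering a given region and time range",
    actors=["scientist"],
    normal_flow=[
        Interaction("scientist", "enters region and dates", "lists matching datasets"),
        Interaction("scientist", "selects a dataset", "shows its metadata"),
    ],
    alternative_flows={
        "no results": [
            Interaction("scientist", "enters region and dates", "suggests a wider search"),
        ]
    },
)
```

Checking `find_dataset.unknown_actors()` is one cheap way to notice an actor who takes part in a flow but was left out of the actor list.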
Ultimately: Wetware
• ‘Before you make the software
interoperable, you need to make the
people interoperable’: Ian Jackson,
7
E.g. Table of Contents
==Plain Language Description==
===Short Definition===
===Purpose===
===Describe a scenario of expected use===
===Definition of Success===
==Formal Use Case Description==
===Use Case Identification===
===Revision Information===
===Definition===
===Successful Outcomes===
===Failure Outcomes===
==General Diagrams==
===Schematic of Use case===
==Use Case Elaboration==
===Actors===
====Primary Actors====
====Other Actors====
===Preconditions===
===Postconditions===
===Normal Flow (Process Model)===
===Alternative Flows===
===Special Functional Requirements===
===Extension Points===
==Diagrams==
===Use Case Diagram===
===State Diagram===
===Activity Diagram===
===Other Diagrams===
==Non-Functional Requirements==
===Performance===
===Reliability===
===Scalability===
===Usability===
===Security===
===Other Non-functional Requirements===
==Selected Technology==
===Overall Technical Approach===
===Architecture===
===Technology A===
====Description====
====Benefits====
====Limitations====
===Technology B===
====Description====
====Benefits====
====Limitations====
==References==
Developed for NASA TIWG
THE PHYSICS OF INFORMATION
© 2005 EvREsearch LTD
Physics of information = entropy
= uncertainty/ integrity
• The information (entropy) of a random variable is defined as
the negative of the sum of p × log p over its possible values,
where p = probability of each value. It represents the
uncertainty of the variable
• Mutual information of two variables = how
much information one variable contains about
the other
– i.e. the decrease of the uncertainty of one
variable by knowing the other
• In probabilistic terms, the entropy of one variable decreases
(or at least does not increase) when conditioning on the other
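The definitions above can be tried out on small samples. The sketch below estimates entropy and mutual information from observed values, using base-2 logarithms so the results are in bits. The function names and the weather/umbrella/coin data are made up for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(X) = -sum(p * log2(p)), in bits."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conditional_entropy(xs, ys):
    """H(X|Y) = H(X, Y) - H(Y): uncertainty left in X once Y is known."""
    return entropy(list(zip(xs, ys))) - entropy(ys)


def mutual_information(xs, ys):
    """I(X;Y) = H(X) - H(X|Y): decrease in uncertainty about X from knowing Y."""
    return entropy(xs) - conditional_entropy(xs, ys)


# Hypothetical paired observations (made up for illustration).
weather = ["rain", "rain", "sun", "sun"]
umbrella = ["yes", "yes", "no", "no"]   # fully determined by the weather
coin = ["H", "T", "H", "T"]             # unrelated to the weather
```

With these samples, knowing `umbrella` removes all uncertainty about `weather`, while knowing `coin` removes none, so `mutual_information(weather, umbrella)` equals the full entropy of `weather` and `mutual_information(weather, coin)` is zero.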
10
Information theory
• Semiotics: the study of sign processes, or
signification and communication, of signs and
symbols. It is divided into three branches:
– Syntax: Relation of signs to each other in formal
structures
– Semantics: Relation between signs and the
things to which they refer - meaning
– Pragmatics: Relation of signs to their impacts on
those who use them
11
Abduction
• method of logical inference (Peirce)
• prior to induction and deduction, i.e. a “hunch”
• starts with a set of (seemingly unrelated) facts plus an
intuition that there is some connection; these are brought
together via abductive reasoning
• abduction is the process of inference that produces a
hypothesis as its end result
12
Mode of noise introduction
From Shannon and Weaver (1949)
[Figure: Shannon and Weaver communication diagram applied to the Web. Labels: Information source, Msg?, Signal?, Web content, structure, Recvd? Msg?, Web browser?, Noise source, HTML page, user.]
13
Noise
• Uncertainty, especially any that is introduced,
is a source of noise or, more accurately, of
bias in the use or interpretation of the
information
– Is context and structure dependent
– Noise/bias contamination is rampant in
information systems
• Quality assessment, control and verification are
less developed for information sources
14
Presentation
• Separation of content from presentation!!
• The theory here is empirical or semi-empirical
• It is developed from an understanding of how to
minimize information uncertainty, beginning
with content, context and structural
considerations and then cognitive and social
factors
• Physiology for humans, color, …
15
Organization
• Organizations - producers v/s consumers
• Organization of information presentation, e.g.
layout on a web page
• Yes - content, context and structure
• How to organize:
– What have you seen?
– Needed?
– Not had resolved?
16
Mental Representation
• Thinking = representational structures +
procedures that operate on those
structures
• Did you make progress?
• Methodological consequence: what
have you learned about the study of how
we think about information systems?
17
Behind this: Information Models
• Conceptual models, domain models, explore
domain concepts
• High-level conceptual models are created as part of
initial requirements envisioning efforts - to explore
the high-level static business or science or medicine
structures and concepts and relations among them
• Conceptual models are created as the precursor to
logical models or as alternatives to them
– To build something they must be followed by logical and
physical models
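One way to picture the progression from conceptual to logical to physical models is the small sketch below. The Instrument/Dataset domain, the attribute names and the choice of SQLite as the storage technology are all assumptions made for illustration.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual model: domain concepts and relations only, no attributes or types.
CONCEPTUAL = [("Instrument", "produces", "Dataset")]


# Logical model: entities gain attributes, identifiers and explicit relationships.
@dataclass
class Instrument:
    instrument_id: int
    name: str


@dataclass
class Dataset:
    dataset_id: int
    title: str
    instrument_id: int  # relationship: which Instrument produced this Dataset


# Physical model: the same structure bound to one storage technology (SQLite here).
PHYSICAL_DDL = """
CREATE TABLE instrument (
    instrument_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE dataset (
    dataset_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    instrument_id INTEGER NOT NULL REFERENCES instrument(instrument_id)
);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database from the physical model."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_DDL)
    return conn


def datasets_for(conn: sqlite3.Connection, instrument_name: str) -> list[str]:
    """Follow the 'produces' relation from an instrument to its datasets."""
    rows = conn.execute(
        "SELECT d.title FROM dataset d "
        "JOIN instrument i ON i.instrument_id = d.instrument_id "
        "WHERE i.name = ? ORDER BY d.title",
        (instrument_name,),
    )
    return [title for (title,) in rows]
```

Only the last layer names a technology; the conceptual and logical layers would stay the same if the physical model were swapped for a different store.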
18
(Information) Architectures
• Definition:
– “is the art of expressing a model or
concept of information used in activities
that require explicit details of complex
systems” (wikipedia)
– “… as in the creating of systemic,
structural, and orderly principles to make
something work - the thoughtful making of
either artifact, or idea, or policy that informs
because it is clear.” Wurman
19
Architectures
• Building on content,
context, and structure,
think of information
architectures as “in front of
the interface” and
“behind the interface”
• What’s the proportion – is
it just like an iceberg? I.e.
the majority of information
architecture work is out of
sight, “below the water.”
20
Reference architectures
• “provides a proven template solution for an
architecture for a particular domain. It also provides
a common vocabulary with which to discuss
implementations, often with the aim to stress
commonality.
• A reference architecture often consists of a list of
functions and some indication of their interfaces (or
APIs) and interactions with each other and with
functions located outside of the scope of the
reference architecture.” (wikipedia)
• At this stage of the course, have you seen a
reference architecture? Did you like it?
21
Design?
• In the context of information systems design,
information architecture refers to the analysis
and design of the data stored by information
systems, concentrating on entities, their
attributes, and their interrelationships.
• It refers to the modeling of information for an
individual source …
22
Design theory
• Elements
– Form
– Value
– Texture
– Lines
– Shapes
– Direction
– Size
– Color
• Relation to signs and relations
between/ among them
23
Principles of design
• Balance
• Gradation
• Repetition
• Contrast
• Harmony
• Dominance
• Unity
24
Broad life-cycle elements
• Acquisition: Process of recording or
generating a concrete artefact from the
concept (the act of transduction)
• Curation: The activity of managing the use of
data from its point of creation to ensure it is
available for discovery and re-use in the
future
• Preservation: Process of retaining usability of
data in some source form for intended and
unintended use
• Stewardship: Process of maintaining integrity
across acquisition, curation and preservation
25
Acquisition
• What do you know about the
developer of the means of
acquisition?
– Documents: not easy to find/
read/ understand
– Remember: unclear use cases and
information models all lead to
uncertainty and bias!!!
• Have a checklist (the
Management list) and review
it often
26
Curation
• Activity that takes information from Producers
to Consumers!
• Organization and presentation may need to
change
• Document what is done and why, track the
provenance!
• How do you remain as technology-neutral as
possible and why would you want to?
• Add metainformation
27
Preservation
• Archiving is but one component
• Intent is that ‘you can open it any time in the
future’ and that ‘it will be there’
• Involves steps that may not be conventionally thought
of
• Think far into the future …. history gives
some guide to future considerations
28
Information Management
• Creation of logical collections
• Physical handling
• Interoperability support
• Security support
• Ownership
• Metadata collection, management and
access
• Persistence
• Knowledge and information discovery
• Dissemination and publication
29
Information Workflow
• Series of tasks performed to produce a final
outcome (you know, like the steps in a use
case!)
• Information workflow = “analysis pipeline”
– Automate tedious jobs that users traditionally
performed by hand for each dataset
– Process large volumes of data/ information faster
than one could do by hand
– Document what is done
– Collect provenance, enable an audit, etc.
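A minimal sketch of such an analysis pipeline follows: each step is an ordinary function, and the runner records what was done so the run can be audited later. The `run_workflow` helper, the step functions and the temperature readings are hypothetical, and real workflow systems capture far richer provenance than this.

```python
from datetime import datetime, timezone


def run_workflow(data, steps):
    """Apply each step in order and record what was done to the data."""
    provenance = []
    for step in steps:
        result = step(data)
        provenance.append({
            "step": step.__name__,
            "doc": (step.__doc__ or "").strip(),
            "items_in": len(data),
            "items_out": len(result),
            "finished_at": datetime.now(timezone.utc).isoformat(),
        })
        data = result
    return data, provenance


# Hypothetical steps for a small temperature dataset.
def drop_missing(readings):
    """Remove readings that were never recorded."""
    return [r for r in readings if r is not None]


def to_celsius(readings):
    """Convert Fahrenheit readings to Celsius."""
    return [round((r - 32) * 5 / 9, 1) for r in readings]


cleaned, audit_trail = run_workflow([32.0, None, 212.0], [drop_missing, to_celsius])
```

Here `audit_trail` holds one record per step, in order, with the step name, its description, and the item counts before and after, which is enough to reconstruct what happened to the data.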
30
Information integration
• Involves: combining information residing in
different sources and providing users with a
unified view of them
• Getting the ‘unified view’: lots of informatics
here. Recall unity from design?
• Recall the domain examples:
– Geo?
– Medical/ health?
– Others?
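As a small sketch of what a unified view can mean in practice, the example below maps two sources with different field names onto one shared vocabulary and merges records that describe the same thing. The station records, field names and mappings are invented for illustration; real integration also has to deal with units, conflicting values and semantics.

```python
# Two hypothetical sources describing the same stations with different field names.
SOURCE_A = [{"station": "ALB", "temp_c": 12.5}, {"station": "BOS", "temp_c": 10.0}]
SOURCE_B = [{"site_id": "ALB", "elevation_m": 89}, {"site_id": "NYC", "elevation_m": 10}]

# Mappings from each source's vocabulary to one shared (unified) vocabulary.
MAP_A = {"station": "station_id", "temp_c": "temperature_c"}
MAP_B = {"site_id": "station_id", "elevation_m": "elevation_m"}


def to_unified(record, mapping, source_name):
    """Rename fields to the shared vocabulary and remember where they came from."""
    unified = {mapping[key]: value for key, value in record.items()}
    unified["sources"] = [source_name]
    return unified


def integrate(*sources):
    """Merge records that share a station_id into a single unified view."""
    view = {}
    for name, records, mapping in sources:
        for record in records:
            unified = to_unified(record, mapping, name)
            merged = view.setdefault(unified["station_id"], {"sources": []})
            merged["sources"] += unified.pop("sources")
            merged.update(unified)
    return view


unified_view = integrate(("A", SOURCE_A, MAP_A), ("B", SOURCE_B, MAP_B))
```

Keeping the `sources` list on each merged record preserves a little provenance, so a user of the unified view can still tell which source contributed to it.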
31
Discovery?
• Discussion
– What is the reality? Did any of you find the
recording of the sound of an (African) swallow?
• Finding media types
– Information retrieval and information architecture
considerations – when a usual search engine
cannot find what you want
– Content-based discovery, context-based, and
yes, structure-based…
32
Visualization?
• Reducing the amount of data, quantization
– Patterns
– Features
– Events
– Trends
– Irregularities
– Exit points for analysis
• Also presentation of “data”
• Cognitive science and the mental
representation
33
Information audit
• Analysis and evaluation of a
firm's information system
(whether manual or
computerized) to detect and
rectify blockages,
duplication, and leakage of
information.
34
Objective of an audit?
• The objectives of an audit are to improve
accuracy, relevance, security, and timeliness
of the recorded information
• It is a process that effectively determines
the current information environment
within an organization by identifying and
mapping:
– What information is currently available?
– Where does the information live?
– Etc.
35
“Unstructured Information”
• If a structured representation of
fundamentally unstructured information is
useless, how do we respond?
– Remember – USE!
• What role does visual representation play in
structuring information? Remember this?
36
Data<->Information<->Knowledge
• What’s in your future?
• Data Science
• Semantic eScience
• Job!
[Figure: diagram relating Data, Information and Knowledge. Labels: Creation, Gathering, Presentation, Organization, Experience, Integration, Conversation, Context.]
37
In one slide?
• Use case – you have to know the goal (+more)
• Conceptual and logical models -> information
models
• Understand information flows and uncertainties
(sign systems), the life cycle and manage them
• Apply information, library, cognitive, social science,
and design elements to developing a design of an
architecture
• Think the design through (e.g. get closer to the
physical model (workflow?)) and assess the
presentation, organization, content, context,
structure, syntax, semantics and pragmatics
38
What would your slide include?
39
Objectives
• To instruct future information architects how to
sustainably generate information models, designs
and architectures
• To instruct future technologists how to understand
and support essential data and information needs of
a wide variety of producers and consumers
• For both to know the tools and requirements needed to
properly handle data and information
• Students will learn and be evaluated on the underpinnings of
informatics, including theoretical methods,
technologies and best practices.
40
Learning Outcomes
• Develop and demonstrate skill in development and
conduct of multi-skilled teams in the application of
informatics
• Develop conceptual and logical information models
and explain them to non-experts
• Demonstrate the application of information theory and
design principles to information systems
• Demonstrate knowledge and application of
informatics standards
• Develop and demonstrate skill in informatics tool
use and evaluation
41
Discussion
• All of the material?
• Please fill out the course evaluation
42
What is next
Time today for project team meetings
• Today: write-ups are due
• May 7: final project presentations (BE ON
TIME, i.e. 5-10 mins BEFORE 9AM)
• Make sure your team members are on time
• And be prepared to be asked (and answer)
questions
43