Information modeling, Information architectures theory and practice (Internet, Web, Grid, Cloud) Peter Fox Xinformatics – ITEC/CSCI/ERTH-4400/6400 Week 5, February 25, 2014
Contents
• Review of last class/reading
• Class exercise in Information Modeling (intro slides first)
• Information architectures theory and practice (Reference, Internet, Web, Grid, Cloud) and class project definitions
• Next classes

Information Models
• Conceptual models, sometimes called domain models, are typically used to explore domain concepts. They are often created as part of initial requirements envisioning efforts, where they are used to explore the high-level static business, science, or medicine structures and concepts
• Followed by logical and physical models

Logical models
• A logical entity-relationship model is provable in the mathematics of data science. Given the current predominance of relational databases, logical models generally conform to relational theory.
• Thus a logical model contains only fully normalized entities. Some of these may represent logical domains rather than potential physical tables.

Physical models
• A physical model is a single logical model instantiated in a specific information system (e.g., relational database, RDF/XML document, etc.) in a specific installation.
• The physical model specifies implementation details which may be features of a particular product or version, as well as configuration choices for that instance.

Physical models
• E.g. for a database, these could include index construction, alternate key declarations, modes of referential integrity (declarative or procedural), constraints, views, and physical storage objects such as tablespaces.
• E.g. for RDF/XML, this would include namespaces, declarative relations, etc.

Object oriented design
• Object-oriented modeling is a formal way of representing something in the real world (draws from traditional set theory and classification theory).
• Some basics to keep in mind in object-oriented modeling are that:
  – Instances are things.
  – Properties are attributes.
  – Relationships are pairs of attributes.
  – Classes are types of things.
  – Subclasses are subtypes of things.

Object model
• Class: a means of grouping all the objects which share the same set of attributes and methods.
• An object must belong to only one class as an instance of that class (instance-of relationship).
• A class is similar to an abstract data type.
• Class hierarchy and inheritance: derive a new class (subclass) from an existing class (superclass)
  – subclass inherits all the attributes and methods of the existing class and may have additional attributes and methods
  – single inheritance (class hierarchy) vs. multiple inheritance (class lattice).

Core object models consist of:
• object and object identifier: Any real world entity is uniformly modeled as an object (associated with a unique id: used to pinpoint an object to retrieve).
• attributes and methods: every object has a state (the set of values for the attributes of the object) and a behavior (the set of methods, i.e. program code which operates on the state of the object).
• the state and behavior encapsulated in an object are accessed or invoked from outside the object.

Information Modeling
• Conceptual
• Logical
• Physical

For example for relational DBs
Feature                 Conceptual   Logical   Physical
Entity Names                ✓           ✓
Entity Relationships        ✓           ✓
Attributes                              ✓
Primary Keys                            ✓          ✓
Foreign Keys                            ✓          ✓
Table Names                                        ✓
Column Names                                       ✓
Column Data Types                                  ✓

Steps in modeling
• Identify objects (entity) and their types
• Identify attributes
• Apply naming conventions
• Identify relationships
• Apply model patterns (if known)
• Assign relationships
• Normalize to reduce redundancy (this is called refactoring in software engineering)

Exercise!

Not just an isolated set of models
• Most important for handling errors, evolution, extension, restriction, … where to do that:
  – To the physical model?
    NO
  – To the logical model? MAYBE
  – To the conceptual model? YES IF POSSIBLE

Not just an isolated set of models
• To relate to and/or integrate with other information models:
  – General rule: integrate at the highest level you can (i.e. more abstract)
  – Remember the cognitive aspects! Less detail is easier to understand

(Information) Architecture
• Definition:
  – “is the art of expressing a model or concept of information used in activities that require explicit details of complex systems” (wikipedia)
  – “… I mean architect as in the creating of systemic, structural, and orderly principles to make something work - the thoughtful making of either artifact, or idea, or policy that informs because it is clear.” (Wurman)

More detail to connect us
• “The term information architecture describes a specialized skill set which relates to the interpretation of information and expression of distinctions between signs and systems of signs.” (wikipedia, emphasis added)

Meaning not deep thought
• “Information architecture is the categorization of information into a coherent structure, preferably one that the most people can understand quickly, if not inherently.
• It's usually hierarchical, but can have other structures, such as concentric or even chaotic.” (wikipedia)

Familiar example – learning portal

And relation to design?
• “In the context of information systems design, information architecture refers to the analysis and design of the data stored by information systems, concentrating on entities, their attributes, and their interrelationships.
• It refers to the modeling of data for an individual database and to the corporate data models an enterprise uses to coordinate the definition of data in several (perhaps scores or hundreds) of distinct databases.
• The "canonical data model" is applied to integration technologies as a definition for specific data passed between the systems of an enterprise.
• At a higher level of abstraction it may also refer to the definition of data stores.” (wikipedia)

Art or skill?
• Form follows function (Sullivan) – who put this into effect in building structures, homes?
• Based on two previous foundations classes, information theory and signs, it should be clear that the answer is ‘yes’ (both).

Design theory
• Elements
  – Form
  – Value
  – Texture
  – Lines
  – Shapes
  – Direction
  – Size
  – Color
• Relate these to previous class, signs and relations between them

Examples

Remember this one?

Principles of design
• Balance
  – Balance in design is similar to balance in physics
• Gradation
  – of size and direction produces linear perspective.
  – of color from warm to cool and tone from dark to light produces aerial perspective.
  – can add interest and movement to a shape.
  – from dark to light will cause the eye to move along a shape.
• Repetition
  – with variation is interesting; without variation repetition can become monotonous.

Balance, gradation, repetition

Principles of design
• Contrast
  – is the juxtaposition of opposing elements e.g. opposite colors on the color wheel - red / green, blue / orange etc.
  – in tone or value - light / dark.
  – in direction - horizontal / vertical.
  – The major contrast in a painting should be located at the center of interest.
  – Too much contrast scattered throughout a painting can destroy unity and make a work difficult to look at.
  – Unless a feeling of chaos and confusion is what you are seeking, it is a good idea to carefully consider where to place your areas of maximum contrast.

Contrast

Principles of design
• Harmony
  – in painting is the visually satisfying effect of combining similar, related elements, e.g. adjacent colors on the color wheel, similar shapes etc.
• Dominance
  – gives a scene interest, counteracting confusion and monotony
  – can be applied to one or more of the elements to give emphasis

Harmony, Dominance

Principles of design
• Unity
  – Relating the design elements to the idea being expressed in a rendering reinforces the principle of unity.
  – E.g. a scene with an active aggressive subject would work better with a dominant oblique direction, coarse, rough texture, angular lines etc. whereas a quiet passive subject would benefit from horizontal lines, soft texture and less tonal contrast.
  – in a painting also refers to the visual linking of various elements of the work.

Unity

Color
• Primary Colors - Red, Yellow, Blue - these colors cannot be made by mixing other colors; they must be bought in some form
• Secondary Colors - Orange, Violet, Green; these colors are created by mixing two primaries.
• Intermediate Colors - Red Orange, Yellow Green, Blue Violet, etc.; mixing a primary with a secondary creates these colors.
• Complementary Colors - are colors that are opposite each other on the color wheel. When placed next to each other they look bright and when mixed together they neutralize each other.

Wheels

Color applied
• Harmony is when an artist uses certain combinations of colors that create different looks or feelings
• Analogous Colors are colors that are next to each other on the color wheel; for example, red, red orange, and orange are analogous colors.
• Triadic Harmony is where three equally spaced colors on the color wheel are used; for example, Yellow, Red, Blue is a triadic harmony color scheme.
• Monochromatic is where one color is used but in different values and intensity.

Color applied
• Warm colors are on one side of the color wheel and they give the feeling of warmth; for example, red, orange and yellow are the colors of fire and feel warm.
• Cool colors are on the other side of the color wheel and they give the feeling of coolness; for example, blue and violet are the colors of water, and green is the color of cool grass.

Reference architectures
• “provides a proven template solution for an architecture for a particular domain. It also provides a common vocabulary with which to discuss implementations, often with the aim to stress commonality.
• A reference architecture often consists of a list of functions and some indication of their interfaces (or APIs) and interactions with each other and with functions located outside of the scope of the reference architecture.” (wikipedia)

U.S. Federal Enterprise Arch
• E.g. The Federal Enterprise Architecture Reference Model Ontology (FEA-RMO) is a domain specific ontology of the Federal Enterprise Architecture reference models.
• FEA-RMO directly translates the Performance, Business, Service Component, and Technical reference models into their executable representation in OWL-DL.
  – http://web-services.gov/fea-rmo.html

FEA Domain model

Figure 6-1 DRM Abstract Model

Data Description

Data Sharing

Data Context

IA=IM?
• Did we just examine an enterprise reference architecture that was actually a domain (conceptual) information model along with its logical model?
• How about THAT!

MVC
• Model
• View
• Controller

[Figure: Web Application Architecture]
Internet/Intranet
• Communications versus information architecture?
• http://www.slideshare.net/postwait/scalableinternet-architecture
• See the reading for this week, the role of the Internet Engineering Task Force (IETF) and architecture

… backend systems. We have implemented several other system components in addition, to enhance the system capabilities; for example, caching servers for improving the response time for all the web pages including map gallery and the interactive visualization tool. A complete system architecture showing the various components and data flow is illustrated in Figure [?].
E.g.

[Figure: System Architecture Supporting Website Development]
WWW
• Design for the web (Tim Berners-Lee)
• “Principles such as simplicity and modularity are the stuff of software engineering; decentralization and tolerance are the life and breath of Internet. To these we might add the principles of least powerful language, and the test of independent invention when considering evolvable Web technology.”

Original design issues
• See http://www.w3.org/DesignIssues/Overview.html
• Here are the criteria and features to be considered:
  – Intended uses of the system.
  – Availability on which platforms?
  – Navigational techniques and tools: browsing, indexing, maps, resource discovery, etc.
  – Keeping track of previous versions of nodes and their relationships
  – Multiuser access: protection, editing and locking, annotation.
  – Notifying readers of new material available
  – The topology of the web of links
  – The types of links which can express different relationships between nodes

Original design issues
• These are the three important issues which require agreement between systems which can work together
  – Naming and Addressing of documents
  – Protocols
  – The format in which node content is stored and transferred
• Implementation and optimization
  – Caching, smart browsers, knowbots etc., format conversion, gateways

Web architectural elements
URI, HTML, HTTP

Common Gateway Interface

Client – Server and multi tier

Web page/site architecture
• Hierarchies, we call them levels:
  – Top level (the main page)
  – Second (and further) level (via navigation)
  – Balancing the levels
• Remember your use case, the actors, the resources, the information model, information entropy, the signs, ...

CEDAR 1.0 circa 1990

CEDAR 2.0 circa 1994

2000

Multi-tiered Interoperability
[Figure: layered diagram, listed top to bottom]
• People: Agency, Policy Makers, System Scientists, Politicians
  – Decision-level semantic mediation: high-level vocabularies that facilitate policy-level decision-making
• Integrated Applications: Inter-disciplinary Data Visualization Apps, Integration Frameworks & Methodologies, Eco & other system Assessment Apps
  – Semantic interoperability
  – Application-level semantic mediation: mid-level vocabularies that facilitate the interoperability of system models and data products
• Software, Tools & Apps: Discipline-specific model(s), Data product Generator, Information/Science Apps
  – Semantic interoperability
  – Semantic query, hypothesis and inference
  – Query, access and use of data
  – Data-level semantic mediation: lower-level vocabularies applied to each data source for a specific science domain of interest
• Data Repositories: Federal Repository, Commercial Database, Researcher Private Database, Other Data Sources
  – Metadata, schema, data
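
The three agreement points from the original design issues (naming and addressing, protocol, content format) line up with the URI, HTTP, and HTML elements named in the web architecture slides. The following is a minimal Python sketch of that correspondence, using only the standard library. `describe_request` is a hypothetical helper, not something from the slides: it splits a URI into its addressing parts and builds the text of the HTTP/1.1 GET request a client could send, with the `Accept: text/html` header standing in for the format agreement. Nothing is sent over the network, and query strings are ignored for brevity.

```python
from urllib.parse import urlsplit


def describe_request(uri: str) -> dict:
    """Split a URI into addressing parts and build a matching HTTP GET request text.

    Hypothetical helper for illustration only; it does not open a connection.
    """
    parts = urlsplit(uri)
    path = parts.path or "/"  # an empty path addresses the top-level resource
    request = (
        f"GET {path} HTTP/1.1\r\n"   # protocol: method, path, version
        f"Host: {parts.netloc}\r\n"  # naming and addressing: which server
        "Accept: text/html\r\n"      # format: the content type the client wants
        "\r\n"                       # blank line ends the request headers
    )
    return {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": path,
        "request": request,
    }


# The URI is the one cited on the "Original design issues" slide.
info = describe_request("http://www.w3.org/DesignIssues/Overview.html")
print(info["request"])
```

Note that the request itself carries no memory of earlier requests, which is the stateless behavior contrasted with Grid environments later in the lecture.
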

Grid
• “One of the main strategies of Grid computing is to use middleware to divide and apportion pieces of a program among several computers, sometimes up to many thousands.
• Grid computing involves computation in a distributed fashion, which may also involve the aggregation of large-scale cluster computing based systems.” (wikipedia)

“What is the Grid?”
• A Three Point Checklist, Ian Foster lists these primary attributes:
  – Computing resources are not administered centrally
  – Open standards are used.
  – Nontrivial quality of service is achieved.

Open Grid Services Architecture

Stateful versus stateless
• A key distinction between Grids and Web environments is state, i.e. the knowledge of ‘who’ knows and remembers ‘what’
• Increasingly there is a need for maintaining some form of state, i.e. reducing information entropy in web and internet-based architectures
• Thus, enter the need for ‘state for a defined purpose’…

Cloud
• “a computing capability that provides an abstraction between the computing resource and its underlying technical architecture (e.g., servers, storage, networks), enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.” (wikipedia)
• Logical extension of virtualization
• Often tied to the cost model

Primary Benefits of Cloud Computing
• To deliver a future state architecture that captures the promise of Cloud Computing, architects need to understand the primary benefits of Cloud computing
• Decoupling and separation of the business service from the infrastructure needed to run it (virtualization)
• Flexibility to choose multiple vendors that provide reliable and scalable business services, development environments, and infrastructure that can be leveraged out of the box and billed on a metered basis, with no long term contracts
• Elastic nature of the infrastructure to rapidly
  allocate and de-allocate massively scalable resources to business services on a demand basis
• Cost allocation flexibility for customers wanting to move CapEx into OpEx
• Reduced costs due to operational efficiencies, and more rapid deployment of new business services

Software as a service (SaaS)
• A SaaS provider typically hosts and manages a given application in their own data center and makes it available to multiple tenants and users over the Web.
• Some SaaS providers run on another cloud provider’s PaaS or IaaS service offerings.
• Oracle CRM On Demand, Salesforce.com, and NetSuite are some of the well known SaaS offerings.

Infrastructure as a service (IaaS)
• is the delivery of hardware (server, storage and network), and associated software (operating systems virtualization technology, file system), as a service. It is an evolution of traditional hosting that does not require any long term commitment and allows users to provision resources on demand.
• Unlike PaaS services, the IaaS provider does very little management other than keep the data center operational, and users must deploy and manage the software services themselves, just the way they would in their own data center. Amazon Web Services Elastic Compute Cloud (EC2) and Simple Storage Service (S3) are examples of IaaS offerings.

Platform as a service (PaaS)
• is an application development and deployment platform delivered as a service to developers over the Web.
• facilitates development and deployment of applications without the cost and complexity of buying and managing the underlying infrastructure, providing all of the facilities required to support the complete life cycle of building and delivering web applications and services entirely available from the Internet.
• consists of infrastructure software, and typically includes a database, middleware and development tools.
• A virtualized and clustered grid computing architecture is often the basis for this infrastructure software.
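
The SaaS, IaaS, and PaaS definitions above differ mainly in how much of the stack the provider delivers. The sketch below, in Python, makes that comparison explicit under a deliberately simplified assumption: the stack is reduced to four layers taken from the wording of the slides, and each service model is assumed to deliver every layer up to a cut-off point. The names `STACK`, `DELIVERED_UP_TO`, and `split_responsibility` are illustrative and not from the slides, and real offerings draw these boundaries less cleanly.

```python
# Simplified four-layer stack, bottom to top (an assumption for illustration).
STACK = [
    "hardware (server, storage, network)",
    "operating system and virtualization",
    "database, middleware, development tools",
    "application",
]

# How many layers, counted from the bottom, the provider delivers in each model.
DELIVERED_UP_TO = {
    "IaaS": 2,  # hardware plus associated software, delivered as a service
    "PaaS": 3,  # adds infrastructure software for building applications
    "SaaS": 4,  # the provider hosts and manages the application itself
}


def split_responsibility(model: str) -> dict:
    """Return which layers the provider delivers and which are left to the customer."""
    cutoff = DELIVERED_UP_TO[model]
    return {"provider": STACK[:cutoff], "customer": STACK[cutoff:]}


for model in ("IaaS", "PaaS", "SaaS"):
    split = split_responsibility(model)
    print(model, "customer still manages:", split["customer"] or "nothing")
```

Read this way, moving from IaaS to SaaS shifts layers from the customer to the provider, which is one way to frame the decoupling of the business service from the infrastructure.
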

Platform as a service (PaaS)
• Some PaaS offerings have a specific programming language or API.
• For example, Google AppEngine is a PaaS offering where developers write in Python or Java.
• EngineYard is Ruby on Rails.
• Sometimes PaaS providers have proprietary languages like force.com from Salesforce.com and Coghead, now owned by SAP

Simple cloud architectures

More complex clouds

More details…

Cloud domain decomposition
• By functional domain

Towards a reference architecture?

Back to IMs as IAs
• What would an information model architecture of cloud (X-as-a-service) look like?

Discussion
• About architecture in general?
• Design?
• Internet, web, grid, cloud?

What is next
• Reading for this week
  – Architectures
  – Design
  – Color
• Week 6
  – Information Integration, Life-cycle and Visualization and your first view of the group projects
• Spring break
• Then … presentations

Data-Information-Knowledge Ecosystem
[Figure labels: Experience, Data, Creation, Gathering, Information, Presentation, Organization, Knowledge, Integration, Conversation, Context]
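
As a closing illustration of the conceptual, logical, and physical levels from the information modeling slides, here is a minimal Python sketch that carries one tiny model through all three. The `Instrument` and `Observation` entities, their attributes, and the index name are hypothetical examples chosen for illustration; SQLite (via the standard `sqlite3` module) is assumed as the specific product for the physical level. Each level adds only the features the relational DB table assigns to it.

```python
import sqlite3

# Conceptual level: entity names and relationships only.
CONCEPTUAL = {
    "entities": ["Instrument", "Observation"],
    "relationships": [("Instrument", "records", "Observation")],
}

# Logical level: attributes plus primary and foreign keys, product independent.
LOGICAL = {
    "Instrument": {
        "attributes": ["instrument_id", "name"],
        "primary_key": "instrument_id",
        "foreign_keys": {},
    },
    "Observation": {
        "attributes": ["observation_id", "instrument_id", "value"],
        "primary_key": "observation_id",
        "foreign_keys": {"instrument_id": "Instrument"},
    },
}

# Physical level: table names, column names, data types, and an index,
# written for one specific product (SQLite).
PHYSICAL_DDL = """
CREATE TABLE instrument (
    instrument_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE observation (
    observation_id INTEGER PRIMARY KEY,
    instrument_id  INTEGER NOT NULL REFERENCES instrument(instrument_id),
    value          REAL
);
CREATE INDEX idx_observation_instrument ON observation(instrument_id);
"""


def build_physical_model() -> sqlite3.Connection:
    """Instantiate the physical model in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_DDL)
    return conn


conn = build_physical_model()
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
indexes = [
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
print(tables, indexes)
```

A change such as adding a new entity would ideally be made in `CONCEPTUAL` first and then carried down, in line with the advice to handle evolution at the most abstract level possible.
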