Web Information Systems Engineering
Flavius Frasincar
[email protected]
TU/e eindhoven university of technology, /department of mathematics and computer science
April 17, 2003, ISA

Contents
• What is a Web Information System (WIS)?
• WIS Features
• Problem: Data Management in WIS
• Solution: Model-Driven Methodology (with Tasks Separation)
• Methodologies for WIS:
  – Strudel Methodology
  – Hera Methodology
• Summary

World Wide Web
• 1990: Tim Berners-Lee invents the World Wide Web
• The Web's success is based on:
  – hypermedia (link) nature: links allow natural and flexible access to information, matching the associative nature of the human mind
  – global availability
  – interoperability
  – simplicity
  – being free, etc.

Web Information Systems (WISs)
• 1998: Tomas Isakowitz et al. coined the term Web Information Systems for “information systems that are based on Web technology”
• WISs differ from traditional information systems in that they “have the potential of reaching a wider audience” through different platforms
• The need to integrate data is even greater, because the data sources are distributed over the Web and are possibly heterogeneous

Three Generations of WISs
• First generation: based on hand-crafted HTML
  – Difficult to maintain (update)
• Second generation: generate HTML on demand by automatically filling templates
  – Data is machine readable/transformable
  – Difficult to make the data machine understandable
• Third generation: Semantic Web Information Systems (SWISs) are WISs based on Semantic Web technology (RDF, OWL etc.)
  – Data is machine understandable

Present the Deep Web
Deep Web vs. Surface Web:
• 500 times larger
• 1000 times better quality

WIS Features
• Data-intensive: integrate data from multiple heterogeneous sources
• Pervasive: support different platforms, e.g. network (T1, 128K, 56K), display (PC, Palm, WAP Phone)
• User adaptable: consider the user’s preferences and the user’s state of mind while interacting with the system
• Flexible: support semistructured data
• Automatic: need little or no human intervention
• User interactive: e.g. online shops (Amazon)

Problem: Data Management
• WISs are hard to specify and implement
• Methodologies exist for manual WIS design, but few of them target automation
• Difficult tasks to perform:
  – Multiplatform support
  – Automatic updates
  – Automatic site reconstruction (WIS Adaptation)
  – Optimize WIS performance (WIS Optimization)
  – Enforce WIS integrity constraints (WIS Analysis)
  – Achieve flexibility, extensibility etc.

Semistructured Data
• It is characterized by:
  – Irregular structure: missing or additional attributes, multiple attributes
  – Few type constraints: attributes with different types in different objects, heterogeneous collections
  – Rapidly evolving schema or missing schema
• It is typically modeled by a DLG (Directed Labeled Graph)
• Examples: HTML, XML, RDF, LaTeX Bib etc.
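The directed labeled graph model can be sketched in a few lines of Python. This is only an illustration, not something taken from the slides: the class name `LabeledGraph`, the node identifiers, and the sample values are assumptions chosen to show the three characteristics listed above (irregular structure, few type constraints, no fixed schema).

```python
from collections import defaultdict


class LabeledGraph:
    """Directed labeled graph: nodes joined by labeled edges, no fixed schema."""

    def __init__(self):
        # source node -> list of (label, target) pairs
        self.edges = defaultdict(list)

    def add_edge(self, source, label, target):
        self.edges[source].append((label, target))

    def targets(self, source, label):
        """All targets reachable from `source` over edges carrying `label`."""
        return [t for (l, t) in self.edges.get(source, []) if l == label]

    def labels(self, source):
        """Sorted set of edge labels leaving `source`."""
        return sorted({l for (l, _) in self.edges.get(source, [])})


# Hypothetical sample data, not from the slides.
g = LabeledGraph()
g.add_edge("root", "item", "item1")
g.add_edge("root", "item", "item2")
# item1 carries the same attribute twice (multiple attributes)
g.add_edge("item1", "author", "A. Author")
g.add_edge("item1", "author", "B. Author")
g.add_edge("item1", "year", 2000)      # year stored as an integer
# item2 has no author, an extra attribute, and year stored as a string
g.add_edge("item2", "year", "1998")
g.add_edge("item2", "note", "draft")
```

Because attributes are just edges, a missing attribute is an empty result (`g.targets("item2", "author")`) rather than a schema violation, and nothing prevents `year` from being an integer on one object and a string on another.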

Solution: Tasks Separation
• Isolate and automate common tasks for WIS design:
  – Choose and access the data (data integration and retrieval) to be presented
  – Design the navigational structure for this data
  – Design the visual aspects of the presentation
• Use a model-driven approach for task specification (the fairy says it brings “wisdom” [theory], “richness” [money], and “beauty” [judge it yourself]; attributed to Stefano Ceri)

WIS Presentation Generation Strategies
• Static (eager approach): presentations are materialized completely, each page is precomputed
• Dynamic or on-demand (lazy approach): after each link “click” the next page to be presented is computed

Methodologies
• Dexter-based: HDM (Hypermedia Design Method)
• ER-based: RMM (Relationship Management Methodology)
• OMT-based: OOHDM
• UML-based: OO-H (Conallen), UWE (UML-based Web Engineering), W2000 (HDM extension)
• RDF-based: XWMF (eXtensible Web Modeling Framework), Hera
• Other: Strudel, Araneus, WebML (Web Modeling Language), Autoweb, Trellis, XAHM (XML-based Adaptive Hypermedia Model), WSDM, W3DT etc.
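
The two presentation generation strategies (static/eager and dynamic/lazy) can be contrasted with a small Python sketch. It is a toy under stated assumptions: the `PAGES` dictionary, `render`, `generate_static`, and `DynamicSite` are hypothetical names invented for this example and do not come from any of the methodologies listed.

```python
# Hypothetical two-page site; page ids and titles are illustrative only.
PAGES = {
    "home": {"title": "Home", "links": ["about"]},
    "about": {"title": "About", "links": ["home"]},
}


def render(page_id):
    """Produce the HTML for one page, including anchors to linked pages."""
    page = PAGES[page_id]
    links = "".join(
        f'<a href="{target}.html">{PAGES[target]["title"]}</a>'
        for target in page["links"]
    )
    return f"<h1>{page['title']}</h1>{links}"


def generate_static():
    """Static (eager): materialize every page before any request arrives."""
    return {page_id: render(page_id) for page_id in PAGES}


class DynamicSite:
    """Dynamic (lazy): compute a page only when a link to it is clicked."""

    def __init__(self):
        self.computed = []  # records which pages were actually generated

    def click(self, page_id):
        self.computed.append(page_id)
        return render(page_id)


static_site = generate_static()
dynamic_site = DynamicSite()
first_page = dynamic_site.click("home")
```

Both strategies yield the same HTML for a given page; they differ in when the work happens. The eager variant pays for all pages up front, while the lazy variant pays per click and can take the current user or device into account at that moment.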

Strudel Methodology
http://www.research.att.com/~mff/strudel
AT&T

Strudel Architecture
[Figure: Strudel architecture. Data sources (object-oriented database, relational database, XML database, …), uniform data model, STRUQL, site graph, HTML templates, HTML presentation.]

Input Data
    <publications>
      <pub id="pub1">
        <title>Declarative spec…</title>
        <author>Mary Fernandez</author>
        <author>Dan Suciu</author>
        <year>2000</year>
        <journal>VLDB</journal>
        <abstract>The Strudel …</abstract>
        <category>WIS</category>
        <category>Methods</category>
        …
      </pub>
      …
      <pub id="pub2">
        <title>Catching the …</title>
        <author>Mary Fernandez</author>
        <author>Daniela Florescu</author>
        <year>1998</year>
        <booktitle>SIGMOD</booktitle>
        <abstract>Strudel is a …</abstract>
        <category>Languages</category>
        …
      </pub>
      …
    </publications>

Semistructured Data Model
[Figure: Directed Labeled Graph (DLG). Root has a "publications" edge; "pub" edges lead to the nodes pub1 and pub2; each pub node has a "year" edge (2000 for pub1, 1998 for pub2), "author" edges (M. Fernandez, …), and further edges (…).]

STRUQL (Site TRansformation Und Query Language)
    where Root -> "publications" -> r, r -> "pub" -> x, x -> l -> v
    { where l = "year"
      link YearPage(v) -> "year" -> v,
           YearPage(v) -> "paperPage" -> x,
           RootPage() -> "yearPage" -> YearPage(v)
      collect RootPage{RootPage()}, YearPage{YearPage(v)}
    } …

Site Graph
[Figure: site graph. RootPage() has "yearPage" edges to YearPage(2000) and YearPage(1998). YearPage(2000) has a "year" edge to 2000 and "paperPage" edges to PaperPage(pub1), …. YearPage(1998) has a "year" edge to 1998 and "paperPage" edges to PaperPage(pub2), ….]

STRUDEL Template Language
• RootPage collection:
    <html>
    <sfor p in yearPage order=ascend key=year>
      <sfmt @p [email protected]>
    </sfor>
    </html>
• YearPage collection:
    <h1><sfmt year></h1>
    <ul>
    <sfor p in paperPage>
      <li><sfmt @p></li>
    </sfor>
    </ul>
• PaperPage collection:
    <i>
    <sif booktitle> <sfmt booktitle>
    <selse> <sfmt journal>
    </sif>
    </i><br>
    <sfor p in author>
      <sfmt @p>,
    </sfor><br>
    <sfmt year><br>

STRUDEL +/-
+ :
• Tasks separation (content and presentation)
• Declarative specifications (enables presentation content adaptation)
• Verification of integrity constraints (e.g. “All paper pages are reachable from RootPage”)
- :
• Intermixes schema and content definition in the data graph
• Does not separate navigation from visual details of the presentation
• Does not use standard technologies

Hera Methodology
http://wwwis.win.tue.nl/~hera
TU/e

Hera Architecture
[Figure: Hera architecture. Components shown: object-oriented database with ODB-XML wrapper, relational database with RDB-XML wrapper, XML database, …; mediator/integrator; information retrieval; query; user/platform adaptation; logical presentation; hypermedia presentation; logical-HTML, logical-WML, and logical-SMIL presentations; HTML, WML, and SMIL presentations, ….]

Hera Presentation Methodology
[Figure: design phases and the models they produce. Conceptual design: conceptual model. Application design: application model. Adaptation design. Presentation design: presentation model. Transformations connect the models.]

Conceptual Model (CM)
• Provides a uniform semantic view over the different data sources that are integrated within a given Web application
• Consists of hierarchies of concepts relevant within the given domain
• Concept relationships are:
  – Attribute relationships: refer to literal values that characterize a concept
  – Reference relationships: refer to other concepts

Example: CM
[Figure: conceptual model graph. Concepts: Technique, Artifact, Creator, Painting, Painter. Attributes: name, description, year, biography, with String and Integer literals, and picture (Image). Relationships: exemplifies / exemplified_by, creates / created_by, paints / painted_by. Legend: property, subClassOf, subPropertyOf.]

Example: CM in RDF/XML
    <rdfs:Class rdf:ID="Creator"/>
    <rdfs:Class rdf:ID="Painter">
      <rdfs:subClassOf rdf:resource="#Creator"/>
    </rdfs:Class>

    <rdfs:Class rdf:ID="Artifact"/>
    <rdfs:Class rdf:ID="Painting">
      <rdfs:subClassOf rdf:resource="#Artifact"/>
    </rdfs:Class>

    <rdf:Property rdf:ID="year">
      <rdfs:domain rdf:resource="#Artifact"/>
      <rdfs:range rdf:resource="#Integer"/>
    </rdf:Property>

    <rdf:Property rdf:ID="picture">
      <rdfs:domain rdf:resource="#Painting"/>
      <rdfs:range rdf:resource="#Image"/>
    </rdf:Property>

    <rdf:Property rdf:ID="creates"
        sys:cardinality="multiple"
        sys:inverse="created_by">
      <rdfs:domain rdf:resource="#Creator"/>
      <rdfs:range rdf:resource="#Artifact"/>
    </rdf:Property>

Application Model (AM)
• Captures the logical (navigational) aspects of the presentation
• Based on the concept of a slice, which contains attributes and possibly other slices
  – A slice is a meaningful presentation unit
  – A slice is associated with a concept from the CM
• Slice relationships are:
  – Aggregation relationships: embed a set of slices (abstraction for index, tour, indexed guided tour etc.)
  – Reference relationships: link abstraction with an anchor specified

Example: AM
[Figure: application model. The technique slice (main) contains name, description, and a Set of painting picture slices reached through exemplified_by. The painting slice (main) contains name, year, picture, and the painter name reached through painted_by. The painting picture slice links to the painting main slice.]

Example: AM in RDF/XML
    <rdfs:Class rdf:ID="Slice.technique.main"
        slice:owner="CM#Technique"
        slice:main="Yes">
      <rdfs:subClassOf rdf:resource="#Slice"/>
    </rdfs:Class>

    <rdfs:Class rdf:ID="Slice.painting.main"
        slice:owner="CM#Painting">
      <rdfs:subClassOf rdf:resource="#Slice"/>
    </rdfs:Class>

    <rdfs:Class rdf:ID="S.painting.picture"
        slice:owner="CM#Painting"
        slice:attr-ref="CM#picture">
      <rdfs:subClassOf rdf:resource="#Slice"/>
    </rdfs:Class>

    <rdf:Property rdf:ID="slice-ref">
      <slice:prop-ref rdf:resource="CM#ex_by"/>
      <rdfs:domain rdf:resource="#S.t.main"/>
      <rdfs:range rdf:resource="#S.p.picture"/>
    </rdf:Property>

    <rdf:Property rdf:ID="link_1">
      <rdfs:subPropertyOf rdf:resource="#link"/>
      <rdfs:domain rdf:resource="#S.p.picture"/>
      <rdfs:range rdf:resource="#S.p.main"/>
    </rdf:Property>

    <rdf:Property rdf:ID="media">
      <rdfs:domain rdf:resource="#S.p.picture"/>
      <rdfs:range rdf:resource="#Image"/>
    </rdf:Property>

Adaptation
• Captures two kinds of adaptation:
  – Adaptability takes into account the device capabilities and user preferences (UAProf = User Agent Profile)
  – Adaptivity means that the presentation changes itself according to the “state of the user’s mind” while being browsed (UM = User Model)
• Adaptation is based on conditioning the appearance of slices using UAProf and/or UM
• Adaptivity uses AHAM (Adaptive Hypermedia Application Model) update rules for updating the UM

Adapted Application Model
[Figure: the application model from the AM example, annotated with slice appearance conditions: prf:ImageCapable = Yes, um:Technique < 10, um:Painting < 10.]

Presentation Model
• Defines the physical appearance of the presentation
• Based on the concept of a region, which contains attributes and possibly other regions:
  – Each region has an associated rectangular area
  – Slices are translated to regions; one slice can be mapped to several regions
• Slice relationships are materialized with:
  – Navigational relationships
  – Spatial relationships
  – Temporal relationships

Presentation Model
[Figure: presentation model and its screen rendering. Regions: bookcase, shelf, painting. Attributes P.picture and P.name are associated with a certain painting P. Spatial relationships (right, below, xy) carry priorities 0, 1, 2, …; priority 0 is always fulfilled. A navigational relationship leaves the painting region. The rendering shows bookcase regions holding paintings P1 to P7, … and a painting page for P1, ‘Stone Bridge’, 1638.]

Presentation in Browsers
• HTML: HyperText Markup Language
• SMIL: Synchronized Multimedia Integration Language
• WML: Wireless Markup Language

Implementation
• Models are represented in RDF and serialized in RDF/XML
• User Agent Profile (UAProf): a Composite Capability/Preference Profiles (CC/PP) vocabulary to model device capabilities and user preferences
• XSLT processor for transforming between different model instances (stylesheet-based transformation)
  – Xalan (XSLT 1.0)
  – Saxon (XSLT 2.0): multiple output files support

Data Transformations
• Step 0: Preparation
  – Substep 0.1: Application Model Unfolding creates the skeleton of an AM instance
  – Substep 0.2: Application Model Adaptation adds slice visibility conditions to the previous skeleton
  – Substep 0.3: Main Transformation Specification Generation builds the specification for the next step
• Step 1: Main Transformation populates the AM with the input CM instance
• Step 2: Presentation Generation produces code for different browsers (HTML, WML, SMIL)

Data Transformations
[Figure: data flow of the transformations, grouped into application independent, application dependent, and input dependent parts. Vocabularies (rdfs): CC/PP, UAProf, user/platform, user profile, conceptual model, application model, system media. Inputs: conceptual model (rdfs), conceptual model instance (rdf), application model (rdfs), user/platform profile (rdf). Steps: (0.1) reference instantiation produces the unfolded application model (rdf); (0.2) adaptation produces the unfolded, adapted application model (rdf); (0.3) rdf2xsl (xsl) produces the XSLT transformation (xsl); (1) cmi2ami (xsl) produces the application model instance (rdf); (2) ami2html, ami2wml, and ami2smil (xsl) produce HTML, WML, and SMIL. The label RT also appears among the outputs.]

Hera +/-
+ :
• Tasks separation (content, navigation, and presentation)
• Model-based specifications (enables presentation content adaptation)
• Uses standard technology: RDF, RDF/XML, XSLT
- (Future Work):
• Specifications are semi-formal (difficult to check integrity constraints)
• Does not (yet) support user interaction

Summary
• What is a Web Information System (WIS)
• Features of WIS: data intensive, pervasive etc.
• Design methodologies for WIS:
  – Strudel (from industry)
  – Hera (from university)
• Model-based approach for WIS design
• WIS design tasks separation:
  – Data Selection
  – Navigation
  – Presentation
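
The separation of the three design tasks (data selection, navigation, presentation) can be illustrated with a toy Python pipeline. This is a sketch under stated assumptions, not Strudel or Hera code: the function names, the `PUBLICATIONS` records, and the year-page layout are hypothetical and only loosely echo the publications example used for Strudel.

```python
from html import escape

# Hypothetical input records, invented for this sketch.
PUBLICATIONS = [
    {"id": "pub1", "title": "First paper", "year": 2000},
    {"id": "pub2", "title": "Second paper", "year": 1998},
    {"id": "pub3", "title": "Third paper", "year": 2000},
]


def select_data(records, min_year):
    """Task 1, data selection: choose the data to be presented."""
    return [r for r in records if r["year"] >= min_year]


def build_navigation(records):
    """Task 2, navigation: a root page links to year pages, which link to papers."""
    by_year = {}
    for record in records:
        by_year.setdefault(record["year"], []).append(record["id"])
    return {"root": sorted(by_year), "years": by_year}


def render_year_page(year, navigation, records):
    """Task 3, presentation: turn one navigation node into HTML."""
    titles = {r["id"]: r["title"] for r in records}
    items = "".join(
        f"<li>{escape(titles[paper_id])}</li>"
        for paper_id in navigation["years"][year]
    )
    return f"<h1>{year}</h1><ul>{items}</ul>"


selected = select_data(PUBLICATIONS, min_year=1999)
navigation = build_navigation(selected)
page = render_year_page(2000, navigation, selected)
```

Each function touches only its own concern, so the selection criterion, the link structure, or the markup can be changed without editing the other two stages. A second renderer for another output format could reuse `select_data` and `build_navigation` unchanged.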