Summary of NSF Databases and the


Recap @ Science on the Semantic Web, Rutgers, October 2002

Invitational Workshop on Database and Information Systems Research for Semantic Web and Enterprises: Amit Sheth & Robert Meersman

NSF Information & Data Management PI’s Workshop: Amit Sheth & Isabel Cruz

“Ask not what the Semantic Web Can do for you, ask what you can do for the Semantic Web”

Hans-Georg Stork, European Union

http://lsdis.cs.uga.edu/SemNSF

Context for Amicalola workshop

• Series of workshops and upcoming conferences: Lisbon (9/00), Hong Kong (5/01), Palo Alto (7/01), Amsterdam (12/01); since then WWW2002/ISWC
  – Observation: visible lack of DB/IS involvement
• “Semantic Web – The Road Ahead” [Decker, Hans-Georg Stork, Sheth, … SemWeb’2001 at WWW10, Hong Kong, May 1, 2001]
• Semantic Web: Rehash or Research Goldmine [Fensel, Mylopoulos, Meersman, Sheth (Chair), CoopIS’01]
• At Castel Pergine, Italy

Semantics & IDM – Brief History

(partial)

• Semantic Data Modeling: M. Hammer and D. McLeod, "The Semantic Data Model: A Modelling Mechanism for Data Base Applications", Proc. ACM SIGMOD, 1978.

• Conceptual Modeling Michael Brodie, John Mylopoulos, and Joachim W. Schmidt. On Conceptual Modeling . Springer Verlag, New York, NY, 1984. A series of preceding workshops.

• Data Semantics: What, Where and How?
  – "Database Semantics", R.A. Meersman and T.B. Steel (eds), Proceedings of the IFIP DS-1 Conference, North-Holland (1985)
  – So Far (Schematically) yet So Near (Semantically) – Sheth, Keynote at DS-5
  – Meersman, Navathe, Rosenthal, Sheth (Chair); IFIP DS-6 Panel
• Semantic interoperability on the Web: many projects in the 90s
  – 1994 CIKM paper on Semantic Information Brokering discussed query processing in a multi-ontology environment
• Domain Modeling, Metadata, Context, Ontologies, Semantic Interoperability, Semantics in Schema Integration, Semantic Information Brokering, spatio-temporal-geographic-image-video multimodal semantics
• All of these involved semantics, databases, IS and even the Web, before the term “Semantic Web” was coined

Challenges – unique role of IDM

SCALE and PERFORMANCE

Acceptable computation (query/analysis) time when you have millions and billions of instances (documents, digital content) and metadata (annotation)

• locking for sharing/storage management
• Semantic similarity, mappings, interoperability (schema transformation/integration aka ontology mismatch)
• indexing for expediting computations
• workflow for Web Services-based processes

Organization/Output

• 20+ senior researchers/practitioners
• 2.5 days in Georgia Mountains
• Proceedings of position papers (also talks)
• Three workgroups: Application Pull (Brodie/Dayal), Ontology (Decker/Kashyap) and Web Services (Fensel/Singh)
• Review at OntoWeb3 Panel
• Final Report
• SIGMOD Record special issue December 2002

Everything is at lsdis.cs.uga.edu/SemNSF/

Participants

Karl Aberer, LSIR, EPFL, Switzerland
Mike Brodie, Verizon
Isabel Cruz, The University of Illinois at Chicago
Umeshwar Dayal, Hewlett-Packard Labs
Stefan Decker, Stanford University
Max Egenhofer, University of Maine
Dieter Fensel, Vrije Universiteit Amsterdam
William Grosky, University of Michigan-Dearborn
Michael Huhns, University of South Carolina
Ramesh Jain, UC-San Diego, and Praja
Yahiko Kambayashi, Kyoto University
Vipul Kashyap, National Library of Medicine
Ling Liu, Georgia Institute of Technology
Frank Manola, The MITRE Corporation
Robert Meersman, Vrije Universiteit Brussel (VUB)
Amit Sheth, University of Georgia and Voquette
Munindar Singh, North Carolina State University
George Stork, EU
Rudi Studer, AIFB Universität Karlsruhe
Bhavani Thuraisingham, NSF-CISE-IIS
Michael Uschold, The Boeing Company

Medical metaphor

• Ontologies: anatomy
• Processes: physiology
• Applications: pathology

Application Pull …Agenda

• Premises
  – Every resource meaningfully available
  – Current & Planned Web Services
  – Beneficiaries and Requirements
• Potential Semantic Services
  – B2B, C2C, Intra-Enterprise
  – Example Semantic Web Services
• Challenges / Questions / Concepts
• What the Semantic Web Will Look Like

Application Pull …Scenarios

• Scenarios
  – Tax preparation (Individual)
  – Supply Chain (B2B)
  – Scientific Research
• Semantics will be added at three different levels in successive phases
  – Information
  – Transactions
  – Collaborations

Application Pull …Benefits / Requirements

• Lowering barriers to entry
  – Costs
  – Entrants
    • Consumers
    • Service providers
• Dynamic
  – Ability to adjust to rapidly changing circumstances
• Continuous
  – Continuous activity (e.g., taxes, financial monitoring activity)
  – Event Detection
  – Do taxes anytime, anywhere
• X-Internet
  – Executable
  – Extended
• Improved
  – Transparency
  – Timeliness
  – Accuracy
  – Optimization
  – Eliminate mundane tasks
• Additional services
• Reliability and trust
• Archiving
  – Data
  – Meta-data
  – Transaction histories

Application Pull …Challenges

• Upper ontologies
  – Entities
    • Personal
    • Organizations
  – Activities / Events
  – Processes
• Ontologies
  – Products
  – Services
  – Financial contracts
  – Business objects
  – Tax laws (all agencies)
  – Financial activities
  – Service providers
  – Financial planning
  – Supply chain processes
  – Activities (to be monitored)
• Ontology activities
  – Search
  – Select
  – Create, refine
  – Maintain, version
    • Local
    • Shared
    • Global
  – Mapping
• Ontology-based activities
  – Accountability
    • Arbitration
    • Trust
    • Tracing
• Engineering
  – Managing ontologies and mappings
  – Scalability, robustness,

[Figure: ontology lifecycle. Stages shown: Requirements/Analysis; Ontology Learning; Consistency Checking; Ontology Search; Compare/Similarity; Merge/Refine/Assemble; Evaluation; Maintenance; Versioning; Creation/Change; Deployment (e.g., Hypothesis Generation, Query)]

DB Research in the Ontology LifeCycle

• Operations to compare Models/Ontologies
• Scalability/Storage Indexing of Ontologies
  – DB approaches are data model specific
  – Need to support graph based data models
• Temporal Query Languages

Lots of work in Schema Integration/translation
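As a rough illustration of what "storage/indexing" for a graph based data model could mean, the sketch below keeps ontology statements as (subject, predicate, object) triples with two lookup indexes. The class name `TripleIndex`, its methods, and the sample triples are assumptions made for this example only; the workshop material does not prescribe any particular design.

```python
from collections import defaultdict


class TripleIndex:
    """Toy in-memory store for (subject, predicate, object) triples.

    Hypothetical sketch: two dictionaries act as indexes so that both
    "describe this node" and "who has this property value" avoid a full scan.
    """

    def __init__(self):
        self.by_subject = defaultdict(set)            # s -> {(p, o)}
        self.by_predicate_object = defaultdict(set)   # (p, o) -> {s}

    def add(self, s, p, o):
        """Record one edge of the graph in both indexes."""
        self.by_subject[s].add((p, o))
        self.by_predicate_object[(p, o)].add(s)

    def describe(self, s):
        """All outgoing edges of a node, sorted for stable output."""
        return sorted(self.by_subject.get(s, set()))

    def subjects(self, p, o):
        """All nodes that have edge p pointing at o."""
        return sorted(self.by_predicate_object.get((p, o), set()))


# Illustrative data only.
store = TripleIndex()
store.add("RichPerson", "subClassOf", "Person")
store.add("Employee", "subClassOf", "Person")
store.add("alice", "type", "Employee")

print(store.subjects("subClassOf", "Person"))
print(store.describe("alice"))
```

A real ontology store would need persistence, more index permutations, and transitive queries; the point here is only that the index is organized around graph edges rather than a fixed relational schema.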

Ontology WG: DB Research in the Ontology LifeCycle II

• Schema Mapping
  – Meta Model specific
  – Representation of exceptions, e.g., Tweety
  – Specification of Inexact Schema Correspondences
    • E.g., 40% of animals are 30% of humans
• Meta Model Transformations/Mappings (e.g., UML to RDF Schema)
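One way to picture "representation of exceptions" is a class hierarchy with default property values that a more specific class may override. The sketch below assumes the slide's "Tweety" refers to the familiar textbook case of a bird that cannot fly; the dictionaries `PARENT`, `DEFAULTS`, `INSTANCE_OF` and the function `lookup` are hypothetical names introduced for this example.

```python
# Hypothetical toy taxonomy: class -> parent class.
PARENT = {"Penguin": "Bird", "Bird": "Animal"}

# Default property values asserted at class level; a subclass may override.
DEFAULTS = {
    "Bird": {"can_fly": True},
    "Penguin": {"can_fly": False},   # the exception to the Bird default
}

# Which class each individual belongs to (illustrative data).
INSTANCE_OF = {"tweety": "Penguin", "polly": "Bird"}


def lookup(individual, prop):
    """Return the most specific default for prop, walking up the hierarchy.

    Returns None when no class on the path asserts the property.
    """
    cls = INSTANCE_OF[individual]
    while cls is not None:
        if prop in DEFAULTS.get(cls, {}):
            return DEFAULTS[cls][prop]
        cls = PARENT.get(cls)
    return None


print(lookup("tweety", "can_fly"))
print(lookup("polly", "can_fly"))
```

The most-specific-class-wins rule is just one possible policy; a schema mapping that must carry such exceptions across meta models would also need to say how conflicting defaults are resolved.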

Ontology WG: DB Research in the Ontology LifeCycle III

• Ontology Versioning
  – Collaborative editing
  – Meta Model specific versioning
  – Version of Schema/Meta Model Transformations

Ontology WG: DB Research & Semantic Interoperation

• Inference vs. Query Rewriting/Processing for Semantic Integration:
  – E.g., RichPerson = (AND Person (> Salary 100))
  – Can Query Processing/Concept Rewriting provide the same functionality as inferences? More efficiently?
• Distributed Inferences and Loss of Information
• Query Languages for combining metadata and data queries
• Graph-based data models and query languages
• Schema Correspondences/Mappings
• Intensional Answers (answers are descriptions, e.g., (AND Person (> Salary 100)), instead of a list of all rich people)
• Semantic Associations (identification of meaningful relationships between different documents and entities)
• Semantic Index
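The RichPerson example can be read as follows: rather than inferring class membership for every instance ahead of time, the concept definition is rewritten into a filter that runs at query time, and the definition itself can be returned as an intensional answer. The sketch below is a minimal illustration under that reading; the `Concept` class, the row layout, and the sample data are assumptions, not something specified in the workshop report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Hypothetical concept of the form (AND <base> (> <attribute> <threshold>))."""
    base: str
    attribute: str
    threshold: float

    def describe(self):
        """The definition itself, usable as an intensional answer."""
        return f"(AND {self.base} (> {self.attribute} {self.threshold:g}))"

    def to_predicate(self):
        """Rewrite the concept into a row filter evaluated at query time."""
        def matches(row):
            return row["type"] == self.base and row[self.attribute] > self.threshold
        return matches


RICH_PERSON = Concept("Person", "Salary", 100)

# Illustrative instance data only.
ROWS = [
    {"name": "ann", "type": "Person", "Salary": 150},
    {"name": "bob", "type": "Person", "Salary": 80},
    {"name": "acme", "type": "Company", "Salary": 0},
]


def extensional_answer(concept, rows):
    """List the instances: the conventional query result."""
    keep = concept.to_predicate()
    return [r["name"] for r in rows if keep(r)]


def intensional_answer(concept):
    """Describe the answer set instead of enumerating it."""
    return concept.describe()


print(extensional_answer(RICH_PERSON, ROWS))
print(intensional_answer(RICH_PERSON))
```

Whether such rewriting can match full inference in general is exactly the open question the slide raises; the sketch only covers a single conjunctive definition with one comparison.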

Semantic WS Scope

[Figure: scope of Semantic Web Services. Labels: All; Std; Worth pursuing; Formally self-described; currency.com; Self-described; Program; Amazon; Hard code html; People]

Mike’s Humor

• Services vs. Ontologies

“Well done is better than well said.”

Ben Franklin

Research Issues

• Environment: scalable, openness, autonomy, heterogeneity, evolving
• Representation: self-description, conversation, contracts, commitments, QoS
• Programming: compose & customize, workflow, negotiation
• Interaction (system): trust, security, compliance
• Architecture: P2P, privacy
• Utilities: discovery, binding, trust service

SWS – Fitting in and expanding IS/DB/DM: Or why Bhavani & George should care?

Data => services, similar yet more challenging:

– Modeling
– Organizing collections
– Discovery and comparison (reputation)
– Distribution and replication
– Access and fuse (composition)
– Fulfillment
  • Contracts, coordination versus transactions
  • Quality: more general than correctness or precision
  • Compliance
– Dynamic, flexible information security and trust

Research Issues

• Conversational (state-based, event-based, history based)
• Interoperability of conversational services
  – compose, translate,
• Representations for services: programmatic self description
• Commitments, contracts, negotiation, compliance, cooperation
• Discovery, location, binding
• Transactional workflow: rollback, roll-forward, semantic exception handling, recovery
• Trustworthy service (discovery, provisioning, composition, description)
• Security; privacy vs. personalization
• Quality-of-Service, w.r.t. various aspects, negotiable

Table: DB/IS subcommunity; how it is relevant to research on the SW; how the SW may stimulate research in this community

DB theory
– Relevance to SW research: Type theory, complexity, theory of concurrency
– SW may stimulate: Ontology axiomatics and theory; formal semantics; semantics for incomplete, inconsistent and evolving representations

Data(base) semantics
– Relevance to SW research: Everything; in particular ontology language development; constraints; data structures
– SW may stimulate: Ontology modeling; formal semantics of web services

Normalization/design
– Relevance to SW research: Not specifically as such; some work on Non-First Normal Form
– SW may stimulate: Requirement for formal properties for ontology organization; perhaps ontology design guidelines or “semantic normal forms”; conflict resolution; redundancy checks

Data modeling
– Relevance to SW research: reuse/extend/map DM formalisms, techniques and methods, e.g. EER, ORM, UML, for ontology (content) specification and design
– SW may stimulate: in general semantic data modeling; ontology content creation techniques and methods; complex ontological relationships; domain models

View integration
– Relevance to SW research: Ontology alignment, translation, object identities, updateable views…; model mappings
– SW may stimulate: see Federated DBs; ontology support for view and application integration; ontology composition and update

Schema integration
– Relevance to SW research: apply to autonomously designed schemas; global schemas as ontologies? conflict detection
– SW may stimulate: Ontology alignment; new kinds of models will pose new kinds of problems

Deductive DB/Datalog
– Relevance to SW research: Learn from its failure; query preprocessing and F-logic
– SW may stimulate: how to handle different complexity levels efficiently

Multimedia DB
– Relevance to SW research: Image ontologies; semantic indexing; similarity-based search
– SW may stimulate: Image-based ontologies?

Temporal/Spatial DB
– Relevance to SW research: GIS semantics and data management; archiving; histories
– SW may stimulate: requirement to model temporal knowledge as first class citizen in ontologies; spatial, temporal modeling in upper ontologies; versioning of GIS becomes critical issue

Document DB
– Relevance to SW research: Digital libraries, unstructured data; standards for digital library resource descriptions to be used on the SW
– SW may stimulate: Lack of a priori global model presents a research challenge

OO DB
– Relevance to SW research: Object-oriented models for databases and ontologies; object-based modeling of behavior; extensible; build OODB into Java object
– SW may stimulate: management of large collections of object-, behavior- and resource identifiers

Visual DB
– Relevance to SW research: Visualization for the SW, visual queries; ontology visualization
– SW may stimulate: semantic upgrades of image databases to be used as visual ontologies

XML/Web DB
– Relevance to SW research: Most relevant
– SW may stimulate: Size and semantics; XML shortcomings for semantics definition

Distributed DB
– Relevance to SW research: everything, caching
– SW may stimulate: trust/privacy/compliance issues in distributed services; design/dynamic tailoring of DDBMS; DBMS underlying web services

Constraint DB
– Relevance to SW research: Constraint enforcement as semantics mechanism; semantics-based query processing
– SW may stimulate: Non-closed world assumption issues

Transaction modeling
– Relevance to SW research: loosening of ACID properties
– SW may stimulate: Extended distributed transaction models; non-CWA issues; smart user profiling

Transaction processing
– Relevance to SW research: limits of what can/must be transactional
– SW may stimulate: ACID properties of Web services; semantic support for very long transactions

Mobile DB
– Relevance to SW research: not directly; “mobile” is a platform issue
– SW may stimulate: context-aware (Semantic) Web computing; location-independent device semantics; mobility issues raised/enabled by the

Main memory DB
– Relevance to SW research: Semantic caching
– SW may stimulate: possibly semantic caching, i.e. using application semantics or context

Parallel DB / DB machines
– Relevance to SW research: unclear at present; straightforward reuse/apply (e.g. parallel queries, transactions, …) in certain niches
– SW may stimulate: Not clear at present; parallel architectures for ontology servers?

DB security
– Relevance to SW research: A lot, e.g., access control
– SW may stimulate: Web SoA trust and privacy, QoS; dynamically changing and conflicting security requirements

Federated DB
– Relevance to SW research: Autonomy; integrating heterogeneous information sources, in particular web data sources; approaches for mediator/wrapper-based architectures
– SW may stimulate: www = huge federated DB; develop more powerful (scalable) approaches for ontology alignment and integration; heterogeneous sources may have different credibility; service composition

Query processing
– Relevance to SW research: high applicability; e.g. “smart” query enhancement

Query optimization
– Relevance to SW research: high applicability; e.g. use domain knowledge to optimize query execution and rewriting

Information retrieval
– Relevance to SW research: broad applicability of techniques and theory

DB interoperability
– Relevance to SW research: Everything; esp. see federated DBs; see schema integration
– SW may stimulate: Semantic aspects of interoperability; see federated DBs; quality of interoperation

DB versioning / Metadata / Mediation/Middleware
– Relevance to SW research: Link maintenance; ontology versioning; Web services will benefit
– SW may stimulate: Annotations; ontology versioning; versioning of instance data; ontology modeling; P2P, collaboration, mediating components; new market for

DB warehousing
– Relevance to SW research: DW architectures for decision support; improve e.g. web service efficiency; see the (S)Web as a giant DW
– SW may stimulate: Smart data warehousing; share/compose application semantics; ontology behind “real” data

Data(base) mining
– Relevance to SW research: web mining; clustering; learning; information extraction; profiles
– SW may stimulate: mining from text; exploit semantics in mining; derive semantics inductively from query results on “real” data including exceptions; machine learning

Database architectures and DBMS
– Relevance to SW research: DBMS (components) as web service(s); add semantics to every function/module in a DBMS’s architecture
– SW may stimulate: Ontology support in data dictionaries; new, more flexible DB architectures for better SW support and processing on the web

Web-IS architectures
– Relevance to SW research: fitting enterprise IS (components) into the SW; Web IS; also see DBMS architectures
– SW may stimulate: New architectures and design principles for Web IS

Functional modeling
– Relevance to SW research: design of web services; functional modeling that deals explicitly with a domain’s semantics
– SW may stimulate: Decomposition and composition of web services; event modeling

IS in organizations
– Relevance to SW research: looser coupling required, provide potential for organizations to morph
– SW may stimulate: serving new organizations of business, community and government with emergent SW-based IS technology

Web-IS applications
– SW may stimulate: smart (ontology-driven) SW portals and search engines (“Google++”-type); SW based “direct marketing”-style systems; smart user profiling

IS workflow modeling
– Relevance to SW research: workflow modeling into the SW; see also exception handling in long (business) transactions; workflows as “the” paradigm for “programming” the SW
– SW may stimulate: unreliability of components; unavailability of services

IS methodologies
– Relevance to SW research: ontology lifecycle issues; as IS components become more intelligent, work shifts to self organization
– SW may stimulate: New thinking required! E.g. Web IS user community unknown a priori; how must business processes in enterprises change to deal with existence of the SW

CASE tools
– Relevance to SW research: ontology management systems
– SW may stimulate: develop/maintain SW-based systems

User interfaces; DB application architectures; AI-and-DB; new applications: principles of design for GUIs; knowledge representation, inference; new and complex requirements and methods, immersive environments; Web application service; Uncharted territory 1; Uncharted territory 2

In general, most algorithms in DM perform poorly when they are applied to tasks such as accessing and reporting on data on the web.

Domain semantics in such requests need to be exploited; however “centralized” solutions (where resources need to notify potential requestors) will not be scalable.

Sensor input management and stream data