CURRENT RESEARCH INFORMATION SYSTEMS
&
TECHNOLOGIES
© Geert Van Grootel & euroCRIS
CRIS Technologies
18/09/2003
Introduction
• Who am I?
– Geert Van Grootel
• Senior researcher: Science Division, Ministry of the Flemish Community.
– IWETO: Flemish CRIS
• CERIF taskgroup member
• euroCRIS treasurer.
Structure of presentation
• Introduction: Terminology
• Past & present technologies
• Examples of implementations
• CERIF & Technology
Context & Integration of CRISs
– Different organisational levels & geography
• a scientific discipline
• intra-institutional
• between institutions
• different levels of government
• regional, national, international, global.
– Different levels of system integration
• integrated system (ERP)
• intra process data capture & collection
• extra process
CRIS
• Current Research Information System
– Technologies behind CRISs
• Document stores
• Relational Database Management Systems (RDBMS)
• Object Oriented Database Management Systems (OODBMS)
• Information Retrieval systems (IR)
Document stores
• Document systems
– Based on Markup Languages (SGML, XML)
• in existence since the 1980s
– Rise in popularity, with XML behind it, as semi-structured databases.
– Querying is usually poor
• query language is procedural and navigational as opposed to declarative predicates
– Difficult to maintain
• updating is slow when changes affect several entity instances but fast when only one document is involved.
– Variable report capabilities: group, sum, average,...
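The navigational style of querying mentioned above can be illustrated with a minimal Python sketch using the standard-library `xml.etree.ElementTree`. The document and its element names (`projects`, `project`, `title`, `budget`) are made up for illustration and do not come from any CRIS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are illustrative only.
DOC = """
<projects>
  <project><title>Alpha</title><budget>1000</budget></project>
  <project><title>Beta</title><budget>250</budget></project>
</projects>
"""

def titles_with_budget_over(xml_text, threshold):
    """Walk the tree step by step (navigational) and collect matching titles."""
    root = ET.fromstring(xml_text)
    result = []
    for project in root.findall("project"):       # navigate to each child
        budget = int(project.findtext("budget"))   # navigate one level deeper
        if budget > threshold:
            result.append(project.findtext("title"))
    return result

print(titles_with_budget_over(DOC, 500))
```

The caller spells out how to reach the data (loop, descend, compare) rather than stating what is wanted, which is the contrast with declarative predicates drawn on the slide.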
Information Retrieval Systems
• Advantages for databases with many textual attributes
– via full inverted index
• very fast retrieval
• very slow update
• little or no structural capability (relations between entities)
• little or no reporting capability
– group, sum, average,...
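As a rough illustration of the inverted index behind these trade-offs, here is a small Python sketch. The sample texts and the function names `build_inverted_index` and `search` are assumptions made for this example, not part of any particular IR product.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, *words):
    """Return the ids of documents that contain all the given words."""
    postings = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*postings) if postings else set()

# Illustrative documents only.
docs = {
    1: "research information systems",
    2: "information retrieval",
    3: "research funding",
}
index = build_inverted_index(docs)
print(search(index, "research", "information"))
```

Retrieval is a set lookup per word, while adding or changing a document touches the entry of every word it contains, which hints at why updates are the slow side.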
OODBMS
• Crucial to OODBMS is the concept of objects
– Data (structure view)
– Methods (process view)
– Messages (event view)
• Any process has to be coded specifically for each object
– the solution is inheritance, which helps reduce coding effort
• Disadvantages
– performance, worse than RDBMS
– poorer data representational capabilities
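How inheritance reduces coding effort can be sketched in plain Python. This is not an OODBMS, only an illustration of the object concept; the class names `ResearchEntity`, `Project` and `Person` are hypothetical.

```python
class ResearchEntity:
    """Base class: data (attributes) plus a method shared by all subclasses."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        # Written once here, inherited by every subclass.
        return f"{type(self).__name__}: {self.name}"


class Project(ResearchEntity):
    """Adds its own data but reuses describe() unchanged."""

    def __init__(self, name, budget):
        super().__init__(name)
        self.budget = budget


class Person(ResearchEntity):
    """Needs no extra code at all."""


print(Project("Alpha", 1000).describe())
print(Person("Ann").describe())
```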
RDBMS
• Pros
– Mathematically formal
– easy to understand
– standard query language (SQL)
– mature technology
• Cons
– hard to represent complex objects
– high performance needs expert knowledge
Flexible linking relations between business objects
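One way to picture flexible linking relations is a link table between two entity tables. The sketch below uses Python's built-in `sqlite3` module; the table and column names (`person`, `project`, `person_project`, `role`) and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person  (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE project (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Link table: one row per (person, project) relation, qualified by a role.
CREATE TABLE person_project (
    person_id  INTEGER REFERENCES person(id),
    project_id INTEGER REFERENCES project(id),
    role       TEXT,
    PRIMARY KEY (person_id, project_id)
);
INSERT INTO person  VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO project VALUES (10, 'Alpha'), (20, 'Beta');
INSERT INTO person_project VALUES
    (1, 10, 'leader'), (2, 10, 'member'), (1, 20, 'member');
""")

# Declarative query: state what is wanted, not how to navigate to it.
rows = conn.execute("""
    SELECT p.title, COUNT(*) AS participants
    FROM project p
    JOIN person_project pp ON pp.project_id = p.id
    GROUP BY p.title
    ORDER BY p.title
""").fetchall()
print(rows)
```

New kinds of relation can be added as rows (or further link tables) without changing the entity tables, and the same SQL also covers reporting such as grouping and counting.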
Technology for CRISs
• Essential Building blocks
– Metadata
– Dictionaries, Thesauri & Ontologies
– Keys & Binary Relations
Data & Metadata
• Incredible amount of data, but much of this data is inaccessible
• What we need:
– Find relevant data as information
– Understand it : syntax, semantics
– Understand any restrictions on its use
• The key to this is METADATA
Metadata
• Importance
– Integrity control
– Access control
– Support of data
• Classification, valid terms
– Interoperability
• other CRISs
• other Systems
• Data exchange
• Data access
• Benefits
– Data quality
– Access
– Understanding answers
– Improving queries
– Interoperability
• MIS, RMS
• Bibliographic systems
• Scientific data
Three Kinds of Metadata
[Diagram: the data (document) surrounded by three kinds of metadata: SCHEMA (constrain it), NAVIGATIONAL, and ASSOCIATIVE (view to users)]
SCHEMA METADATA
[Diagram: the data (document) surrounded by three kinds of metadata: SCHEMA (constrain it), NAVIGATIONAL, and ASSOCIATIVE (view to users)]
Metadata Kinds: Schema
• intensional description of extensional instances
– database:
• name
• size
• security authorisations
– attributes:
• name
• type
• constraints
• formal logic relationship to data instances
[Diagram label: SCHEMA (constrain it)]
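Schema metadata of this kind (attribute name, type, constraints) can be read back from a database itself. The following sketch uses Python's `sqlite3` module and SQLite's `PRAGMA table_info`; the `project` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project (id INTEGER PRIMARY KEY, title TEXT NOT NULL, budget REAL)"
)

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
schema = [
    {"name": name, "type": col_type, "not_null": bool(notnull)}
    for _cid, name, col_type, notnull, _default, _pk in conn.execute(
        "PRAGMA table_info(project)"
    )
]

for column in schema:
    print(column)
```

The output describes the table intensionally (what any row must look like) without touching a single data instance.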
ASSOCIATIVE METADATA
[Diagram: the data (document) surrounded by three kinds of metadata: SCHEMA (constrain it), NAVIGATIONAL, and ASSOCIATIVE (view to users)]
Associative Metadata
• information for application assistance
– catalog record (e.g. Dublin Core): descriptive
– content rating (e.g. PICS): restrictive
– security, privacy (cryptography, digital signatures): restrictive
– information from dictionaries, thesauri, hyperglossaries, domain ontologies: supportive
• no formal logic relationship to data instances
[Diagram label: ASSOCIATIVE (view to users)]
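A descriptive catalog record of the Dublin Core kind can be sketched as a simple mapping. The element names below are the fifteen elements of the Dublin Core Metadata Element Set; the record values are taken from this presentation's own title slide, and the helper `unknown_elements` is a hypothetical name introduced for this example.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Descriptive record for this presentation (values from its title slide).
record = {
    "title": "Current Research Information Systems & Technologies",
    "creator": "Geert Van Grootel",
    "date": "2003-09-18",
    "rights": "© Geert Van Grootel & euroCRIS",
}

def unknown_elements(rec):
    """Return the keys of a record that are not Dublin Core elements."""
    return sorted(set(rec) - DC_ELEMENTS)

print(unknown_elements(record))
```

The record describes the resource for users and applications, but nothing in it constrains the content of the resource itself, which matches the last bullet above.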
NAVIGATIONAL METADATA
[Diagram: the data (document) surrounded by three kinds of metadata: SCHEMA (constrain it), NAVIGATIONAL, and ASSOCIATIVE (view to users)]
NAVIGATIONAL METADATA
• How to get to the information resource directly
– filename
– DB name + navigational algorithm
– DB name + predicate (query)
– URL
– URL + predicate (query)
• or any of the above via
– web indexing system (e.g. AltaVista, Excite…)
– local indexing system (bookmarks or proxy server)
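The "URL + predicate (query)" form can be sketched with Python's standard `urllib.parse`. The base URL `https://cris.example.org/projects`, the parameter names, and the helper `locator` are placeholders invented for this example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def locator(base_url, **predicate):
    """Combine a resource URL with a query predicate into one locator."""
    return f"{base_url}?{urlencode(predicate)}" if predicate else base_url

# Hypothetical endpoint and parameters.
url = locator("https://cris.example.org/projects", discipline="physics", year=2003)
print(url)

# The predicate can be recovered from the locator again.
print(parse_qs(urlsplit(url).query))
```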
Metadata
• Collecting observed facts: DATA
• Structuring in context: INFORMATION
• Inducing commonly accepted belief: KNOWLEDGE
• INSIGHT
Technology for CRISs
• Essential Building blocks
– Metadata
– Dictionaries, Thesauri & Ontologies
– Keys & Binary Relations
ONTOLOGY
• What is an Ontology?
– A specification of a conceptualization.
– A formal description of the concepts and relationships
that can exist for an agent or a community of agents
– The knowledge of a domain defined in a formal
declarative language
– The collection of semantic definitions for a domain.
– In practice a resource of terms, their definitions and
their logical inter-relationships.
DOMAIN ONTOLOGY
• Domain Ontology
– An ontology covering a specific subject area of interest
(a domain).
– The set of objects that can be represented can be called the
“universe of discourse”.
– E.g. For a project to exist it must have a startdate, a
subject, a goal, a promotor and a budget
– Project <- [startdate AND subject AND goal AND
promotor AND budget > 0]
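The rule above can be written as a small executable check. This is a minimal Python sketch, assuming a candidate is given as a dictionary keyed by the attribute names from the rule; the function name `is_project` and the sample values are hypothetical.

```python
# Attributes that must be present, as named in the rule.
REQUIRED = ("startdate", "subject", "goal", "promotor")

def is_project(candidate):
    """True if all required attributes are present and budget > 0."""
    has_all = all(candidate.get(key) for key in REQUIRED)
    return has_all and candidate.get("budget", 0) > 0

# Illustrative candidate only.
candidate = {
    "startdate": "2003-09-18",
    "subject": "CRIS technologies",
    "goal": "compare storage technologies",
    "promotor": "A. Researcher",
    "budget": 1000,
}
print(is_project(candidate))
```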
DOMAIN ONTOLOGY
• Domain Ontologies in IT
– A representation in first-order logic allowing
• Facts to be expressed
• Relationships to be expressed
• Constraints to be expressed
• New facts and relationships to be deduced or induced
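Deducing new facts from existing ones can be sketched with a single hand-written rule applied until nothing new appears. The relation names (`worksOn`, `fundedBy`, `supportedBy`) and the sample facts are invented for this example; a real ontology system would use a logic language or reasoner rather than this Python loop.

```python
def deduce(facts):
    """Apply one rule until no new facts appear:
    worksOn(p, proj) and fundedBy(proj, f)  =>  supportedBy(p, f)."""
    facts = set(facts)
    while True:
        new = {
            ("supportedBy", person, funder)
            for (rel1, person, proj) in facts if rel1 == "worksOn"
            for (rel2, proj2, funder) in facts
            if rel2 == "fundedBy" and proj2 == proj
        }
        if new <= facts:      # nothing new was derived: fixed point reached
            return facts
        facts |= new

# Illustrative facts as (relation, subject, object) triples.
facts = {
    ("worksOn", "ann", "alpha"),
    ("fundedBy", "alpha", "funder_x"),
    ("worksOn", "bob", "beta"),
}
print(sorted(deduce(facts)))
```

Here facts and relationships are plain tuples, and the rule both expresses a constraint on how they combine and produces a fact that was never stated explicitly.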
And so….
• Metadata is the key to
– GRIDs
– SEMANTIC WEB