Transcript Slide 1

WG 2 Workshop on
New Metadata
Standards
May 23, 2008
Sydney, Australia
Leader: Denise Warzel
Many organizations have recognized the need to develop and share programmatically
accessible terminologies, services and data collection forms. This workshop will focus on
building a consensus and plan for advancing the development of new metadata registry
standards for these types of objects: Services, Terminologies and Forms.
To enable data processing systems to be written to discover and use these types of
resources, accurate, unambiguous and verifiable metadata is a prerequisite. With
appropriate metadata standards for these types of objects, systems could be designed to
allow owners to register and share electronically the essential characteristics necessary for
a common understanding of what these resources are and how they are intended to be used. While
ISO/IEC 11179 addresses unambiguous representation and registration of data elements,
no such standard exists for these important objects, which are essential for developing
interoperable information systems. A standard metadata framework to describe the
essential characteristics of these types of objects in ways that can be compared and
interpreted is needed.
The envisioned metadata standards would be similar to the ISO/IEC 11179 Metadata
Registries standard Part 3 in which the registry and basic attributes of each type of object
are defined and can be specified and registered in a metadata registry.
This session will facilitate developing a common understanding of some of these
characteristics for each type of object. Participants will share existing metamodels and
emerging specifications, discuss common needs, approaches and priorities and possible
next steps that could lead to the development of new ISO standards in these areas.
This session would be of interest to information developers, information managers, data
administrators, standards developers and others who are responsible for designing
systems in which these types of metadata objects are understandable and shareable.
Type of Metadata Standard: (Service, Terminology, Form or Other):
Existing/Recommended metamodel:
Benefits:
Use Cases/Justification if this standard existed:
Known initiatives/individuals that could be pooled to create a Functional Specification:
Plans for Implementation:
Suggested next steps if known:
Goals for Today
• Discuss new areas for development of metadata
standards
• Specifically: Terminology, Forms, Services
• Reach common understanding about what each of these
metamodels would describe (agree what we mean by
“Terminology”, “Forms”, “Services”)
• Reach consensus on priorities/areas of common
interests
• Learn how to progress something into a new standard
• Form a strategy for progressing these new standards
MDR Topology
[Figure: MDR topology diagram. Recoverable labels: Global or Parent node; Local or Child node; Metadata Management (Submission, Harmonization, Review, Registration & Management); Tools & Sources (UML, Source Code, Documents); Authors, Curators; Metadata Authoring; Registries and Repositories (Terminology, Service Metadata, CDE Metadata, Form, Concept, Schema, Image, Terminology Metadata); Service Interface; Common Registry Services (Service Discovery, CDE Search, Model Discovery, Metadata Exchange); Enterprise Vocabulary Services; Search; Browser(s); Transformation (XML…); Service, Schema, Model; Validations; Run time Operational Systems; DE; VD; Etc…]
Areas of Interest
• Registry Metamodel
– Profiles that would allow registry operators to ‘declare’
themselves and provide attributes that would enable others to
understand certain characteristics of the registry, such as
business rules, naming conventions, contact information, etc.
– Issue 127: 2005 Global Attributes for a Registry to be used with
an ROR
– (ISO 29002)
– What is contained in the Registry (e.g. What type of registry is it?
Image Registry, Service Registry, 11179 Metadata Registry, etc)
– Establish “Trust”
• Interest in a registry for CWM mappings (Baba)
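The registry 'profile' idea above can be sketched as a small self-description record. This is only an illustration: the names RegistryProfile, RegistryType and every field name are hypothetical and are not taken from ISO/IEC 11179, ISO 29002 or Issue 127.

```python
from dataclasses import dataclass, field
from enum import Enum


class RegistryType(Enum):
    # Registry kinds named on the slide; the enum itself is illustrative.
    IMAGE = "Image Registry"
    SERVICE = "Service Registry"
    METADATA_11179 = "11179 Metadata Registry"


@dataclass
class RegistryProfile:
    """Hypothetical self-description a registry operator might publish."""
    name: str
    registry_type: RegistryType
    contact: str
    naming_conventions: list[str] = field(default_factory=list)
    business_rules: list[str] = field(default_factory=list)

    def missing_attributes(self) -> list[str]:
        """Return the names of profile attributes that are still empty."""
        return [k for k, v in vars(self).items() if v in ("", [], None)]


# Example data is invented for illustration.
profile = RegistryProfile(
    name="Example Registry",
    registry_type=RegistryType.METADATA_11179,
    contact="registrar@example.org",
    naming_conventions=["Object Class + Property + Representation term"],
)
print(profile.missing_attributes())
```

A registry of registries (ROR) could use a check like missing_attributes() as one input when deciding whether a declared registry has supplied enough information to be trusted.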
‘Terminology’
• There are different types of terminologies
• This would cover all types of
terminologies:
• Taxonomies, Formal Ontologies,
Standardized list of terms, Vocabularies,
Dictionaries, etc.
• The metadata would describe the
Terminology as a WHOLE – (NOT the
model for the terms in the terminology)
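As a rough sketch of metadata that describes a terminology as a whole, the record below carries no terms at all, only attributes of the terminology itself. The class and field names (TerminologyDescription, steward, access_url) and the sample entries are assumptions made for illustration, not part of any existing metamodel.

```python
from dataclasses import dataclass
from enum import Enum


class TerminologyKind(Enum):
    # Kinds of terminology listed on the slide.
    TAXONOMY = "taxonomy"
    FORMAL_ONTOLOGY = "formal ontology"
    TERM_LIST = "standardized list of terms"
    VOCABULARY = "vocabulary"
    DICTIONARY = "dictionary"


@dataclass(frozen=True)
class TerminologyDescription:
    """Describes a terminology as a whole, not its individual terms."""
    name: str
    kind: TerminologyKind
    version: str
    steward: str
    access_url: str


def find_by_kind(entries, kind):
    """Return the registered terminologies of the given kind."""
    return [e for e in entries if e.kind is kind]


# Invented registry content, for illustration only.
registry = [
    TerminologyDescription("Example Anatomy Ontology", TerminologyKind.FORMAL_ONTOLOGY,
                           "1.0", "Example Org", "https://example.org/anatomy"),
    TerminologyDescription("Example Country Codes", TerminologyKind.TERM_LIST,
                           "2008", "Example Org", "https://example.org/countries"),
]
print([t.name for t in find_by_kind(registry, TerminologyKind.TERM_LIST)])
```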
Terminology Issues
• Agree on a common definition of ‘Terminology’
– Look at TC 37 work (Sue Ellen Wright)
• Add proper Provenance class
• Related Work:
– Registry for Knowledge Organization Systems (NKOS) – Gail
Hodge
– ‘Rights’ – UMLS has both public and restricted content
– UK Joint Information Systems Committee (JISC) – Dennis
Nicolson(?), Doug Tudhope
– NCBO – BioPortal – resource quality, attributes
– Government Terminology Services – each country has some
terminologies they are serving and have an implicit metamodel
that could be harmonized
• (JoAnne Evans – Monash University)
• Open Data Linking
Model - Overview
Areas of Interest
• Services Metamodel
– Profiles that would allow Service Registry
operators to ‘declare’ themselves and provide
attributes that would enable others to
understand certain characteristics of the
registry, such as business rules, naming
conventions, contact information, etc.
– Establish “Trust”
‘Service’ definition
• OASIS (organization) defines service as "a
mechanism to enable access to one or
more capabilities, where the access is
provided using a prescribed interface and
is exercised consistent with constraints
and policies as specified by the service
description."
Service Metadata Issues
• myGrid Service Model
• caGrid Service Metamodel
• Need a way to describe a “Registry
Service”? Should use WSDL – service
metadata would provide a way to
standardize the description of the service
described in a WSDL
• IBM has introduced Service Science as a
new curriculum – may have some
information that would be compatible
caGrid Service Metamodel
11179:Administered Item
caGrid Service Metamodel
Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J.
Bioinformatics Research and Applications, Vol. x, No. x, 200x
Not sure whether ‘format’ is similar to ‘dimensionality’
caGrid Service Metamodel
Not in caGrid Service Domain model:
•Operation.task
•Operation.method
•Operation.resource
•Operation.relationship (shim services)
Diagram labels: match, new, missing
Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J.
Bioinformatics Research and Applications, Vol. x, No. x, 200x
Service Types
• Domain: Performs scientific function
• Shim: Does not perform scientific function,
but is needed to make one service work
with another
– E.g.
Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J.
Bioinformatics Research and Applications, Vol. x, No. x, 200x
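The distinction between domain and shim services can be illustrated with a small sketch. The three functions below are invented stand-ins (not myGrid or caGrid services), and the sequence data is hard-coded; the point is only that the shim changes representation so that two domain services can be chained.

```python
from enum import Enum


class ServiceType(Enum):
    DOMAIN = "domain"  # performs a scientific function
    SHIM = "shim"      # only adapts one service to another


def fetch_sequence(accession: str) -> dict:
    """Stand-in domain service: returns a sequence record (hard-coded here)."""
    return {"id": accession, "residues": "MKTAYIAKQR"}


def to_fasta(record: dict) -> str:
    """Shim: reformats a record as FASTA text; no scientific work is done."""
    return f">{record['id']}\n{record['residues']}\n"


def sequence_length(fasta: str) -> int:
    """Stand-in domain service: counts residues in a single FASTA entry."""
    lines = fasta.strip().splitlines()
    return sum(len(line) for line in lines[1:])


# Hypothetical registry annotation: which service is of which type.
SERVICE_TYPES = {
    fetch_sequence: ServiceType.DOMAIN,
    to_fasta: ServiceType.SHIM,
    sequence_length: ServiceType.DOMAIN,
}

# The shim sits between the two domain services.
length = sequence_length(to_fasta(fetch_sequence("EXAMPLE01")))
print(length)
```

Recording the service type as metadata would let a discovery tool hide shims from scientists browsing for domain functions while still offering them when two services need to be connected.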
myGRID Service Operation:
•Operation.resource
•Operation.method (algorithm)
•Operation.task
These concepts could come from
myGrid ontology
The concepts for each “Input” could come from
the semantics registered for the UML class
myGrid portions courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service
discovery, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, 200x
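A minimal sketch of how an operation annotated with task, method and resource concepts, plus a concept on each input, might support discovery. The classes Operation and Parameter, the function find_operations and the catalog entries are hypothetical; they are not the myGrid or caGrid models, and the concept strings stand in for ontology terms.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    name: str
    concept: str  # assumed to be a concept registered for a UML class


@dataclass
class Operation:
    name: str
    task: str      # generic task, e.g. "retrieving" or "aligning"
    method: str    # algorithm, e.g. "clustalw"
    resource: str  # data resource the operation works against
    inputs: list[Parameter] = field(default_factory=list)


def find_operations(operations, task=None, input_concept=None):
    """Return operations matching every criterion that was supplied."""
    matches = []
    for op in operations:
        if task is not None and op.task != task:
            continue
        if input_concept is not None and all(
            p.concept != input_concept for p in op.inputs
        ):
            continue
        matches.append(op)
    return matches


# Invented catalog entries, for illustration only.
catalog = [
    Operation("alignProteins", "aligning", "clustalw", "none",
              [Parameter("sequences", "protein sequence")]),
    Operation("getSequence", "retrieving", "lookup", "example sequence database",
              [Parameter("accession", "accession number")]),
]
found = find_operations(catalog, task="aligning", input_concept="protein sequence")
print([op.name for op in found])
```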
Comparison of myGrid semantics and caGrid Semantics
myGRID semantic categories
• Informatics. Captures the key concepts of data, data
structures, databases and metadata. The data and metadata
hierarchies in the ontology contain this information.
• Bioinformatics. This builds on informatics. As well as data
and metadata, there are domain-specific data sources
(e.g., the model organism sequencing databases), and
domain-specific algorithms for searching and analysing
data (e.g., the sequence alignment algorithm, clustalw). The
algorithm and data_resource hierarchies contain this
information.
• Molecular biology. This includes the higher level concepts
used to describe the bioinformatics data types used as
inputs and outputs in services. Examples include
protein sequence and nucleic acid sequence.
• Tasks. A hierarchy describing the generic tasks a service
operation can perform. Examples include retrieving,
displaying and aligning.
• Formats. A hierarchy describing bioinformatics file formats.
For example, fasta format for sequence data, or phylip
format for phylogenetic data.
caGrid semantics (comparison notes)
• Informatics. Not sure whether we have a counterpart or need this.
• Bioinformatics. We could probably use theirs for attaching a
concept to Operation algorithm.
• Molecular biology. We have this already; it is the concepts
associated with the UML Class level.
• Tasks. We could probably use theirs for Operation task.
• Formats. Probably useful; this is at the Service level. Not
sure whether this is similar to the ‘dimensionality’ attribute we have at
the input/output level.
Courtesy: K. Wolstencroft et al The myGrid ontology: bioinformatics service discovery, Int. J.
Bioinformatics Research and Applications, Vol. x, No. x, 200x
“Forms” Description
• A structured set of metadata for the collection of information and for
the ‘form’ itself
• There are various types of forms
– Statistical forms
– Surveys, Questionnaires
– Population Science Measures
• E.g. quality of life
• Link to Rules for Form Completion
• Rights – can the form be reused? By whom? What is the process for
authorization, etc.
• Includes a standard way to describe minimal rendering information
• Describes how the form looks
• ‘Behavior’ – skip pattern
– Form Structure
• Standard way to associate the semantics of the Question and
Answer Set (e.g. to associate an item on the form with an 11179 Data
Element, DEC or Value Domain)
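The form characteristics above (structure, skip-pattern behavior, and a link from each question to registered semantics) can be sketched as follows. This is an assumption-laden illustration: the Question class, the data_element_id field and its "DE-…" values are invented, and no existing forms metamodel is implied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    id: str
    text: str
    data_element_id: Optional[str] = None     # hypothetical 11179 data element identifier
    valid_values: tuple[str, ...] = ()        # answer set; empty means free text
    skip_to: Optional[dict[str, str]] = None  # answer -> id of the next question


def next_question(questions, current_id, answer):
    """Apply the skip pattern; return the next question id, or None at the end."""
    ids = [q.id for q in questions]
    pos = ids.index(current_id)
    current = questions[pos]
    if current.skip_to and answer in current.skip_to:
        return current.skip_to[answer]
    return ids[pos + 1] if pos + 1 < len(ids) else None


# Invented three-question form, for illustration only.
form = [
    Question("q1", "Do you currently smoke?", "DE-0001", ("Yes", "No"), {"No": "q3"}),
    Question("q2", "How many cigarettes per day?", "DE-0002"),
    Question("q3", "What is your age in years?", "DE-0003"),
]
print(next_question(form, "q1", "No"))
print(next_question(form, "q1", "Yes"))
```

Keeping the skip pattern as data rather than as rendering code is one way the 'behavior' of a form could be registered and shared independently of how the form looks.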
Forms MetaModel
Related Projects
• OASIS
• UN/CEFACT – Forms Metamodel
– UN eDocs – Cross Border Trade ‘form templates’ –
may have an implicit form metamodel
– ISO 15000-5 ebXML
– ISO 7372
– Adobe pdf form metamodel
– MS InfoPath – has a metamodel for their forms
– XForms - language for representing forms
Forms Metadata
• caDSR Protocol Forms Metadata
• cancerGrid Protocol Forms
• HL7 Clinical Document Architecture
• Westat Survey/Instrument Metamodel
• ??
[Figure: UML class diagram "ProtocolCaseReportForms", full view including domain::AdministeredComponent. Classes shown: domain::AdministeredComponent, domain::AdministeredComponentClassSchemeItem, domain::Instruction, domain::Protocol, DataElement, domain::FormElement, domain::ConditionMessage, domain::Question, domain::TriggerAction, domain::QuestionCondition, domain::QuestionConditionComponents, domain::ValidValue, domain::QuestionRepetition, domain::Module, domain::Form, domain::Function.]
domain::AdministeredComponent attributes: beginDate: Date, changeNote: String, createdBy: String, dateCreated: Date, dateModified: Date, deletedIndicator: String, endDate: Date, id: String, latestVersionIndicator: String, longName: String, modifiedBy: String, origin: String, preferredDefinition: String, preferredName: String, publicID: Long, registrationStatus: String, unresolvedIssue: String, version: Float, workflowStatusDescription: String, workflowStatusName: String
[Figure: caDSR Protocol Forms Metamodel, December 2007. UML class diagram "ProtocolCaseReportForms".]
Classes shown: domain::AdministeredComponentClassSchemeItem, domain::Instruction, domain::Protocol, DataElement, domain::FormElement, domain::ConditionMessage, domain::Question, domain::TriggerAction, domain::QuestionCondition, domain::QuestionConditionComponents, domain::ValidValue, domain::QuestionRepetition, domain::Module, domain::Form, domain::Function
Attributes recoverable from the diagram text (where class boxes were interleaved in extraction, the classes and their attributes are listed together):
• AdministeredComponentClassSchemeItem, Instruction, Protocol, DataElement: createdBy: String, dateCreated: Date, dateModified: Date, approvedBy: String, id: String, approvedDate: Date, modifiedBy: String, changeNumber: String, changeType: String, leadOrganizationName: String, phase: String, protocolID: String, reviewedBy: String, reviewedDate: Date, type: String (listed twice)
• ConditionMessage, Question: id: String, defaultValidValueId: String, message: String, defaultValue: String, messageType: String, displayOrder: Integer, isEditable: String, isMandatory: String
• TriggerAction: action: String, createdBy: String, criterionValue: String, dateCreated: Date, dateModified: Date, forcedValue: String, id: String, instruction: String, modifiedBy: String, triggerRelationship: String
• QuestionCondition, QuestionConditionComponents, ValidValue: id: String, description: String, displayOrder: Integer, meaningText: String (private), constantValue: String, displayOrder: Integer, id: String, logicalOperand: String, operand: String
• QuestionRepetition: defaultValue: String, isEditable: String, repeatSequenceNumber: Integer
• Module, Form: displayName: String, type: String, displayOrder: Integer, maximumQuestionRepeat: Integer
• Function: createdBy: String, dateCreated: Date, dateModified: Date, id: String, modifiedBy: String, name: String, symbol: String
Association role names: protocolCollection, instruction, dataElement, administeredComponentClassSchemeItemCollection, formElement, sourceFormElement, targetFormElement, questionCollection, condtionMessage, triggerActionCollection, question, questionCondition, validValueCollection, forcedCondition, triggeredActionCollection, validValue, enforcedCondition, parentQuestionCondition, questionComponentCollection, defaultValidValue, conditionComponent, conditionComponentCollection, questionRepetitionCollection, formCollection, module, function, form, moduleCollection
HL7 Clinical Document Architecture (CDA) Metamodel
Metadata Standards Development
• SC 32 and WG 2 Options for progressing these
– Service Metadata, Forms Metadata, Terminology Metadata and Global Registry
Metadata (RoR - Issue 127)
– Also need a Service Specification for Registries
– Work with Issue 127, or create another issue to put into 11179
– WG 2 ‘Study Period’
• Look at this and decide what approach we should take
• New part of 11179? New Standard?
• ‘Study Period’ involves more open participation, not as formal as a WG 2
meeting
• Could be proposed as a new Study Period next week as a ‘Resolution’
– Needs a Leader, provides an opportunity to circulate something for people to
“sign-up” - Could Report back at the next WG 2 meeting in the Winter 2008 (5-8
months)
• ISO 19115 (TC 211) Geographic Information
• TC 184/SC 4 and ISO 8000 Data Quality – Quality of Services? Gerald Radack and
Peter Benson
• OMG Evan Wallace/Elisa Kendall
• IT 4 Australian, New Zealand