Transcript Slide 1
WG 2 Workshop on New Metadata Standards
May 23, 2008, Sydney, Australia
Leader: Denise Warzel

Many organizations have recognized the need to develop and share programmatically accessible terminologies, services and data collection forms. This workshop will focus on building a consensus and a plan for advancing the development of new metadata registry standards for these types of objects: Services, Terminologies and Forms. For data processing systems to be written to discover and use these types of resources, accurate, unambiguous and verifiable metadata is a prerequisite. With appropriate metadata standards for these types of objects, systems could be designed to allow owners to register and share electronically the essential characteristics necessary for a common understanding of what these resources are and how they are intended to be used. While ISO/IEC 11179 addresses unambiguous representation and registration of data elements, no such standard exists for these important objects, which are essential for developing interoperable information systems. A standard metadata framework is needed to describe the essential characteristics of these types of objects in ways that can be compared and interpreted. The envisioned metadata standards would be similar to Part 3 of the ISO/IEC 11179 Metadata Registries standard, in which the registry and basic attributes of each type of object are defined and can be specified and registered in a metadata registry. This session will facilitate developing a common understanding of some of these characteristics for each type of object. Participants will share existing metamodels and emerging specifications, discuss common needs, approaches and priorities, and identify possible next steps that could lead to the development of new ISO standards in these areas.
This session would be of interest to information developers, information managers, data administrators, standards developers and others who are responsible for designing systems in which these types of metadata objects are understandable and shareable.

Type of Metadata Standard (Service, Terminology, Form or Other):
Existing/Recommended metamodel:
Benefits:
Use Cases/Justification if this standard existed:
Known initiatives/individuals that could be pooled to create a Functional Specification:
Plans for Implementation:
Suggested next steps if known:

Goals for Today
• Discuss new areas for development of metadata standards
 – Specifically: Terminology, Forms, Services
• Reach common understanding about what each of these metamodels would describe (agree what we mean by “Terminology”, “Forms”, “Services”)
• Reach consensus on priorities/areas of common interest
• Learn how to progress something into a new standard
• Form a strategy for progressing these new standards

MDR Topology
[Figure: MDR topology diagram. Recoverable labels: Global or Parent node; Local or Child node; Tools & Sources; UML; Source Code; Documents; Authors, Curators; Metadata Management (Submission, Harmonization, Review, Registration & Management); Metadata Authoring; Registry; Repository; Terminology; Image; Service Interface; Schema; Form; Concept; Service Metadata; CDE Metadata; Terminology Metadata; Common Registry Services; Service Discovery; CDE Search; Model Discovery; Metadata Exchange; Search; Run time Operational Systems; Browser(s); Transformation; XML; Enterprise Vocabulary Services; Service, Schema, Model Validations.]

Areas of Interest
• Registry Metamodel
 – Profiles that would allow registry operators to ‘declare’ themselves and provide attributes that would enable others to understand certain characteristics about the registry, such as business rules, naming conventions, contact information, etc.
 – Issue 127: 2005 Global Attributes for a Registry to be used with an ROR
 – (ISO 29002)
 – What is contained in the Registry (e.g. what type of registry is it? Image Registry, Service Registry, 11179 Metadata Registry, etc.)
 – Establish “Trust”
• Interest in a registry for CWM mappings (Baba)

‘Terminology’
• There are different types of terminologies
• This would cover all types of terminologies:
 – Taxonomies, Formal Ontologies, Standardized lists of terms, Vocabularies, Dictionaries, etc.
• The metadata would describe the Terminology as a WHOLE
 – (NOT the model for the terms in the terminology)

Terminology Issues
• Agree on a common definition of ‘Terminology’
 – Look at TC 37 work (Sue Ellen Wright)
• Add proper Provenance class
• Related Work:
 – Registry for Knowledge Organization Systems (NKOS) – Gail Hodge
 – ‘Rights’ – UMLS has both public and restricted content
 – UK Joint Information Systems Committee (JISC) – Dennis Nicolson(?), Doug Tudhope
 – NCBO – BioPortal – resource quality, attributes
 – Government Terminology Services – each country serves some terminologies and has an implicit metamodel that could be harmonized
  • (JoAnne Evans – Monash University)
• Open Data Linking Model - Overview

Areas of Interest
• Services Metamodel
 – Profiles that would allow Service Registry operators to ‘declare’ themselves and provide attributes that would enable others to understand certain characteristics about the registry, such as business rules, naming conventions, contact information, etc.
 – Establish “Trust”

‘Service’ definition
• OASIS (organization) defines service as "a mechanism to enable access to one or more capabilities, where the access is provided using a prescribed interface and is exercised consistent with constraints and policies as specified by the service description."

Service Metadata Issues
• myGrid Service Model
• caGrid Service Metamodel
• Need a way to describe a “Registry Service”?? Should use WSDL – service metadata would provide a way to standardize a description of the service described in a WSDL
• IBM has introduced Service Science as a new curriculum – may have some information that would be compatible

caGrid Service Metamodel
[Figure: caGrid Service Metamodel diagram. Recoverable labels: 11179:Administered Item; annotation legend: match, new, missing.]
Courtesy: K. Wolstencroft et al., The myGrid ontology: bioinformatics service discovery, Int. J. Bioinformatics Research and Applications, Vol. x, No. x, 200x
• Not sure if ‘format’ is similar to ‘dimensionality’

caGrid Service Metamodel
Not in caGrid Service Domain model:
• Operation.task
• Operation.method
• Operation.resource
• Operation.relationship (shim services)

Bioinformatics Service Types
• Domain: Performs scientific function
• Shim: Does not perform scientific function, but is needed to make one service work with another
 – E.g.

myGRID Service Operation:
• Operation.resource
• Operation.method (algorithm)
• Operation.task
These concepts could come from the myGrid ontology.
The concepts for each “Input” could come from the semantics registered for the UML class.

Comparison of myGrid semantics and caGrid Semantics
myGRID semantic categories
• Informatics. Captures the key concepts of data, data structures, databases and metadata. The data and metadata hierarchies in the ontology contain this information.
• Bioinformatics. This builds on informatics: as well as data and metadata, there are domain-specific data sources (e.g., the model organism sequencing databases) and domain-specific algorithms for searching and analysing data (e.g., the sequence alignment algorithm, clustalw). The algorithm and data_resource hierarchies contain this information.
• Molecular biology. This includes the higher level concepts used to describe the bioinformatics data types used as inputs and outputs in services. These concepts include examples such as protein sequence and nucleic acid sequence.
• Tasks. A hierarchy describing the generic tasks a service operation can perform. Examples include retrieving, displaying and aligning.
• Formats. A hierarchy describing bioinformatics file formats. For example, fasta format for sequence data, or phylip format for phylogenetic data.
Comparison with caGrid semantics
• Informatics. Not sure if we have a corollary or need this.
• Bioinformatics. We could probably use theirs for attaching a concept to Operation algorithm.
• Molecular biology. We have this already; it's the concepts associated with the UML Class level.
• Tasks. We could probably use theirs for Operation task.
• Formats. Probably useful; this is at the Service level – not sure if this is similar to the ‘dimensionality’ attribute we have at the input/output level?

“Forms” Description
• A structured set of metadata for the collection of information and for the ‘form’ itself
• There are various types of forms
 – Statistical forms
 – Surveys, Questionnaires
 – Population Science Measures
  • E.g. quality of life
• Link to Rules for Form Completion
• Rights – can the form be reused? By whom?
What is the process for authorization, etc.
• Includes a standard way to describe minimal rendering information
• To describe how the form looks
• ‘Behavior’ – skip pattern
 – Form Structure
• Standard way to associate the semantics of the Question and Answer Set (e.g. to associate an item on the form with an 11179 Data Element, DEC or Value Domain)

Forms MetaModel Related Projects
• OASIS
• UN/CEFACT – Forms Metamodel
 – UN eDocs – Cross Border Trade ‘form templates’ – may have an implicit form metamodel
 – ISO 15000-5 ebXML
 – ISO 7372
 – Adobe PDF form metamodel
 – MS InfoPath – has a metamodel for their forms
 – XForms - language for representing forms

Forms Metadata
• caDSR Protocol Forms Metadata
• cancerGrid Protocol Forms
• HL7 Clinical Document Architecture
• Westat Survey/Instrument Metamodel
• ??

caDSR Protocol Forms Metamodel December 2007
[Figure: UML class diagram “ProtocolCaseReportForms”. Classes shown (package domain): AdministeredComponent, AdministeredComponentClassSchemeItem, Instruction, Protocol, DataElement, FormElement, ConditionMessage, Question, TriggerAction, QuestionCondition, QuestionConditionComponents, ValidValue, QuestionRepetition, Module, Form, Function.
Attributes recoverable per class:
AdministeredComponent: beginDate: Date, changeNote: String, createdBy: String, dateCreated: Date, dateModified: Date, deletedIndicator: String, endDate: Date, id: String, latestVersionIndicator: String, longName: String, modifiedBy: String, origin: String, preferredDefinition: String, preferredName: String, publicID: Long, registrationStatus: String, unresolvedIssue: String, version: Float, workflowStatusDescription: String, workflowStatusName: String.
TriggerAction: action: String, createdBy: String, criterionValue: String, dateCreated: Date, dateModified: Date, forcedValue: String, id: String, instruction: String, modifiedBy: String, triggerRelationship: String.
QuestionRepetition: defaultValue: String, isEditable: String, repeatSequenceNumber: Integer.
Function: createdBy: String, dateCreated: Date, dateModified: Date, id: String, modifiedBy: String, name: String, symbol: String.]

HL7 Clinical Document Architecture (CDA) Metamodel

Metadata Standards Development
• SC 32 and WG 2 options for progressing these
 – Service Metadata, Forms Metadata, Terminology Metadata and Global Registry Metadata (RoR - Issue 127)
 – Also need a Service Specification for Registries
 – Work with Issue 127, or create another issue to put into 11179
 – WG 2 ‘Study Period’
  • Look at this and decide what approach we should take
  • New part of 11179? New Standard?
  • ‘Study Period’ involves more open participation, not as formal as a WG 2 meeting
  • Could be proposed as a new Study Period next week as a ‘Resolution’
 – Needs a Leader, provides an opportunity to circulate something for people to “sign-up”
 – Could report back at the next WG 2 meeting in the Winter 2008 (5-8 months)
• ISO 19115 (TC 211) Geographic Information
• TC 184/SC 4 and ISO 8000 Data Quality – Quality of Services? Gerald Radack and Peter Benson
• OMG Evan Wallace/Elisa Kendall
• IT 4 Australian, New Zealand
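
The “Forms” description earlier in this transcript asks for a standard way to describe form structure, ‘behavior’ (skip patterns), and the link between a question and an 11179 Data Element. A minimal Python sketch of that idea follows. It borrows class names that appear in the caDSR Protocol Forms metamodel (Form, Module, Question, ValidValue, TriggerAction), but the choice of attributes, the `data_element_public_id` and `target_question_id` fields, the `next_question` helper and the sample questions are all illustrative assumptions, not part of caDSR or of any ISO standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValidValue:
    """One permitted answer; meaning_text carries its human-readable meaning."""
    value: str
    meaning_text: str
    display_order: int


@dataclass
class TriggerAction:
    """Skip pattern: if the answer equals criterion_value, jump to the target question."""
    criterion_value: str
    target_question_id: str  # hypothetical field, for illustration only
    instruction: str = ""


@dataclass
class Question:
    id: str
    long_name: str
    display_order: int
    is_mandatory: bool = False
    # Hypothetical link to a registered ISO/IEC 11179 data element (assumption).
    data_element_public_id: Optional[int] = None
    valid_values: List[ValidValue] = field(default_factory=list)
    trigger_actions: List[TriggerAction] = field(default_factory=list)


@dataclass
class Module:
    display_order: int
    long_name: str
    questions: List[Question] = field(default_factory=list)


@dataclass
class Form:
    display_name: str
    type: str
    modules: List[Module] = field(default_factory=list)

    def ordered_questions(self) -> List[Question]:
        """Flatten modules and their questions in display order."""
        ordered: List[Question] = []
        for module in sorted(self.modules, key=lambda m: m.display_order):
            ordered.extend(sorted(module.questions, key=lambda q: q.display_order))
        return ordered


def next_question(form: Form, current_id: str, answer: str) -> Optional[Question]:
    """Return the question to show next, honoring any matching skip pattern."""
    questions = form.ordered_questions()
    by_id = {q.id: q for q in questions}
    current = by_id[current_id]
    for action in current.trigger_actions:
        if action.criterion_value == answer:
            return by_id.get(action.target_question_id)
    index = questions.index(current)
    return questions[index + 1] if index + 1 < len(questions) else None


# Made-up example form: answering "N" to Q1 skips Q2 and jumps to Q3.
q1 = Question("Q1", "Do you smoke?", 1, is_mandatory=True,
              valid_values=[ValidValue("Y", "Yes", 1), ValidValue("N", "No", 2)],
              trigger_actions=[TriggerAction("N", "Q3", "Skip to Q3")])
q2 = Question("Q2", "Cigarettes per day", 2)
q3 = Question("Q3", "Date of last visit", 3)
form = Form("Example intake form", "CRF", [Module(1, "Lifestyle", [q1, q2, q3])])
```

Keeping the skip pattern as data attached to the question, rather than as code in a rendering tool, is what would let a registry describe form ‘behavior’ independently of how any particular system displays the form.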