ANSI X3.285 Metamodel for the Management of Sharable Data

Metadata Registry Standards: A Key to Information Integration
Judith Newton, NIST
DAMA NCR, May 1999

Agenda

• Specification and Standardization of Data Elements: ISO 11179, Parts 1-6
• Metamodel for the Management of Sharable Data, ANS X3.285
• Specification of Data Value Domains, ISO TR 15452
• NWI (new work item) for Content Issues

Parts of 11179

Part                                          Status
Part 1: Framework                             DIS
Part 2: Classification                        DIS
Part 3: Basic Attributes                      IS
Part 4: Formulation of Definitions            IS
Part 5: Naming and Identification             IS
Part 6: Registration                          IS

DIS = Draft International Standard; IS = International Standard.

Part 1 - Framework

Organization of Framework Document

• Definitions
• Fundamental Concepts
• Other parts
• Informative Annexes

Definition: Data Element

• A unit of data for which the definition, identification, representation, and permissible values are specified by means of a set of attributes.

[Figure: the data element in context. Slide labels: Database, File, etc.; Transaction, Exchange Unit, etc.; Record, Segment, Class, Tuple, etc.; Field, Column, etc.; Data Element (Identifier, Definition, Name, Value Domain, etc.); Character, Image, Sound, etc.]

Fundamental Model

• Taken From Data Modeling
• 3 Components
  – object class
  – property
  – representation

Definition: Object Class

• Things for which to Store Data
• Entities in E-R Models
• Classes in O-O Models
• Employers, Persons, Automobiles, Orders, etc.

Definition: Property

• A peculiarity common to all members of an object class.

• Distinguishes or Describes Objects
• Attributes or Data Members in Models
• Identifier, Age, Address, etc.

Definition: Representation

• The combination of a representation class, value domain, datatype, and, if necessary, a unit of measure or a character set.
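The three components of the fundamental model can be sketched as plain data structures. This is only an illustration: the class and field names, and the `DataElement` container that ties the parts together, are assumptions made for this example and are not taken from the standard's text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectClass:
    """A thing for which data is stored, e.g. Person or Country."""
    name: str


@dataclass(frozen=True)
class Property:
    """A peculiarity common to all members of an object class."""
    name: str


@dataclass(frozen=True)
class Representation:
    """Representation class, value domain, datatype, plus optional extras."""
    representation_class: str
    value_domain: str
    datatype: str
    unit_of_measure: Optional[str] = None   # only if necessary
    character_set: Optional[str] = None     # only if necessary


@dataclass(frozen=True)
class DataElement:
    """Hypothetical container combining the three components."""
    object_class: ObjectClass
    prop: Property
    representation: Representation

    def describe(self) -> str:
        return (f"{self.object_class.name} / {self.prop.name} / "
                f"{self.representation.representation_class}")


# Illustrative values only.
country_code = DataElement(
    ObjectClass("Country"),
    Property("Identifier"),
    Representation("code", "2-alpha codes in ISO 3166", "character string",
                   character_set="alphabetic"),
)
print(country_code.describe())
```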

Part 2 - Classification

Classification Structures

• What forms can classification take?
  – Keywords
  – Controlled word lists
  – Terms from models
  – Thesaurus
  – Taxonomy
  – Ontology
    • Acyclic directed graph, lattice
    • Multiple inheritance

Classification Fundamental Notions

• Each node in a classification structure is a taxon (plural: taxa).
  – Given a classification structure, any taxa relating to a data element can be recorded
  – The taxa can be recorded in a separate “classification” attribute
  – With adequate software, users could access and navigate the classification structure
  – A nonintelligent identifier for each taxon helps to deal with change
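The notions above can be illustrated with a small sketch: taxa keyed by opaque ("nonintelligent") identifiers, parent links forming an acyclic directed graph with multiple inheritance, and a separate classification attribute per data element. The taxon identifiers, labels, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical taxa: opaque identifier -> display label.
taxa = {
    "T1": "geopolitical entity",
    "T2": "trade",
    "T3": "country",
}

# Acyclic directed graph: child taxon -> parent taxa (multiple inheritance).
parents = {
    "T3": ["T1", "T2"],
}

# The separate "classification" attribute: data element name -> taxon ids.
classification = defaultdict(set)
classification["Country code"].add("T3")


def ancestors(taxon_id):
    """Return every taxon reachable by following parent links."""
    seen = set()
    stack = list(parents.get(taxon_id, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def elements_under(taxon_id):
    """Find data elements classified at a taxon or at any of its descendants."""
    return sorted(
        name for name, ids in classification.items()
        if any(t == taxon_id or taxon_id in ancestors(t) for t in ids)
    )


# Relabeling a taxon changes only its label; recorded identifiers stay valid.
taxa["T1"] = "geopolitical area"
print(elements_under("T1"))
```

Because data elements record only the identifier, relabeling or restructuring the scheme does not require touching each classified element.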

Status

• ISO – Draft International Standard
• Continuing R&D
  – Concept is evolving
    • Search engines
    • Middleware - agents, mediators, request brokers
    • XML tags
  – New project - terminology for registries

Part 3 - Basic Attributes

Scope of Part 3

• Specifies a set of “basic attributes” of data elements
  – Independent of their usage in application systems, databases, or data interchange messages.
  – Recognizes the need for additional attributes.
  – Implies no logical or physical structure of the data.

Categories of Basic Attributes

• Identifying – identification of a data element
• Definitional – description of semantic aspects of data element
• Relational – associations among and/or between data elements
• Representational – representational aspects
• Administrative – management and control

Example Data Element

ATTRIBUTE NAME              EXAMPLE                         OBLIG
Name                        Country code                    M
Identifier                  3166                            C
Version                     1990                            C
Registration Authority      ISO                             C
Synonymous Name                                             O
Context                                                     C
Definition                  code for country names          M
Classification Scheme                                       O
Keywords                    geopolitical entity, country    O
Related Data Reference                                      O
Type of Relationship                                        C
Representation Category     character string                M
Form of Representation      code                            M
Datatype of DE values       alphabetic character            M
Max Size of DE values       2                               M
Min Size of DE values       2                               M
Layout of Representation                                    C
Permissible DE values       All 2-alpha codes in 3166       M
Responsible Organization    ISO Maintenance Agency          O
Registration Status         Standard                        C
Submitting Organization                                     O
Comments                                                    O

OBLIG: M = mandatory, C = conditional, O = optional.

Summary

• Part 3 is a good start to establishing an unambiguous set of specifics documenting data elements.
• However,
  – Further work on the other 11179 parts and beyond has resulted in many refinements and advances addressing a variety of data-related concepts.
  – A new work item involves replacing Part 3 with X3.285.

Part 4 - Data Definitions

Data Definition Rules

• A data definition shall:
  – Be unique (within a data dictionary)
  – Be stated in the singular
  – State what the concept is, rather than what it is not
  – Be stated as a descriptive phrase or sentence(s)
  – Contain only commonly understood abbreviations
  – Be expressed without embedding definitions of other data elements or underlying concepts

Data Definition Guidelines

• A data definition should:
  – State the essential meaning of the concept
  – Be precise and unambiguous
  – Be concise
  – Be able to stand alone
  – Be expressed without embedding rationale, functional usage, domain information or procedural information
  – Avoid circular reasoning
  – Use consistent terminology and structure for related definitions

Part 5 - Naming and Identification

Identification of Data Elements

Five attributes serve to identify a data element:

• name
• context
• registration authority identifier
• data identifier
• version identifier

Name and context always occur in pairs. The other three attributes compose the International Registration Data Identifier (IRDI).

Principles for Registration Identification of Data

Each data element has a unique identifier within the register of a Registration Authority.

The combination of:
• Registration authority identifier
• Data identifier and
• Version identifier
uniquely identifies a data element.

To be assigned an identifier, the element must be derived, attributed, defined, named, and registered according to ISO/IEC 11179.

A data element shall have at least one name within a context.
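The identification principles above can be sketched as a composite key plus a register that enforces uniqueness. This is a minimal illustration, not a registry design: the class names, the in-memory `Register`, and every identifier value below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IRDI:
    """Registration authority id + data id + version id."""
    registration_authority_id: str
    data_id: str
    version_id: str


@dataclass
class RegisteredElement:
    irdi: IRDI
    # Name and context always occur in pairs: a set of (name, context) tuples.
    names: set = field(default_factory=set)

    def __post_init__(self):
        if not self.names:
            raise ValueError("a data element needs at least one name within a context")


class Register:
    """Hypothetical in-memory register of one Registration Authority."""

    def __init__(self):
        self._elements = {}

    def add(self, element):
        # The three-part identifier must be unique within the register.
        if element.irdi in self._elements:
            raise KeyError(f"identifier already registered: {element.irdi}")
        self._elements[element.irdi] = element

    def get(self, irdi):
        return self._elements[irdi]


register = Register()
# All identifier values here are made up for illustration.
element = RegisteredElement(
    IRDI("RA-1", "4488", "1"),
    {("Trading partner country code", "fictional")},
)
register.add(element)
print(register.get(IRDI("RA-1", "4488", "1")).names)
```

Making `IRDI` a frozen dataclass lets the three-part identifier serve directly as a dictionary key, so a new version of the same data identifier registers as a distinct entry.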

Rule Derivation for Data Elements

Naming principles are described in general terms, with examples furnished.

Rules are derived from the principles by which standard names are developed.

These rules form a naming convention.

Because syntactic, semantic, and lexical rules vary by organization, such as corporations or standards-setting bodies for business areas, no specific naming convention rules are prescribed in the International Standard.

The naming principles described in the standard can be applied to other entities, such as attributes and objects.

Rule Types

Data element names are formed of components.

Each is assigned meaning (semantics) and relative or absolute position (syntax) within a name.

They are subject to lexical rules.

The components are: object class terms, property terms, representation terms, and qualifier terms. The first three components have counterparts in the Metamodel.

Naming Component Example

NAME: Trading partner country name
OBJECT CLASS TERM: Country
PROPERTY TERM: Identifier
REPRESENTATION TERM: Name
QUALIFIER TERMS: Trading partner
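A naming convention of this kind can be sketched as a syntax rule (the order of component types) plus a lexical rule (spacing and capitalization). Since the standard prescribes no specific rules, the convention below is purely an assumption for illustration: it places the qualifier first, then the object class term, then the representation term, and leaves the property term out of the name, which happens to reproduce the example name above.

```python
# Hypothetical naming convention; the ordering and the lexical rule are
# assumptions, not rules taken from ISO/IEC 11179-5.
SYNTAX = ("qualifier", "object_class", "representation")  # relative positions


def build_name(terms, syntax=SYNTAX):
    """Assemble a data element name from its component terms."""
    missing = [kind for kind in syntax if kind not in terms]
    if missing:
        raise ValueError(f"missing component terms: {missing}")
    words = " ".join(terms[kind] for kind in syntax).lower().split()
    # Assumed lexical rule: single spaces, only the first word capitalized.
    words[0] = words[0].capitalize()
    return " ".join(words)


terms = {
    "qualifier": "Trading partner",
    "object_class": "Country",
    "representation": "Name",
}
print(build_name(terms))
```

An organization with a different convention would change only `SYNTAX` and the lexical step, which is the sense in which rules are derived from the general principles.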

Part 6 - Registration

Meta Data Registration Principles

• Non-exclusive registration: Every organization may be a Registration Authority.
• Data sharing registration: Data may be shared intra- or inter-organizationally.
• Economically enforced registration: Utility determines longevity and usefulness.
• Flexible registration: Meta data may be registered at different levels of quality.

Registration Status

[Figure: registration status levels shown on the slide: Certified, Standardized, Retired, Recorded, Incomplete]

X3.285 - Metamodel

Metamodel Purpose

• Promote sharing of metadata for
  – understanding (meaning, representation, identification)
  – discovery
  – harmonization
  – reuse
  – analysis
• Provide a common base for metadata registries
  – management structure
  – components for interchange

Metamodel Regions

[Figure: metamodel regions. Region labels: Stewardship; Data Element Administration; Data Element Concept Administration; Conceptual & Value Domain Administration; Naming & Identification; Classification. Entity labels: Data Element, Data Element Concept, Object Class, Property, Conceptual Domain, Value Meaning, Permissible Values, Data Value Domain, Representation Class, Data Element Representation.]

Data Element Model

[Figure: data element model diagram. Recoverable content follows.]

Classes and attributes
• Data Element Concept
• Conceptual Domain: conceptual domain identifier
• enumerated conceptual domain
• Value Meaning: value meaning identifier (VMID), value meaning descriptor, value meaning begin date, value meaning end date
• Representation Class: representation class name
• Data Value Domain: value domain name, value domain character set name, value domain minimum character quantity, value domain maximum character quantity, value domain dependency description, value domain format
• enumerated value domain
• Permissible Value: permissible value label, permissible value begin date, permissible value end date

Association labels: +contains, +contained in, +means, +represents; multiplicities 1 and 0..*

KEY
• A contains B
• a D may have any number of Es; an E must have only one D
• an F must have one or more Gs; a G may have zero or one F

Future Extensions & Work

• Promotion of X3.285 to an ISO standard
• Completion of TR 15452 - Data Value Domains
• XML Tags
• Content consistency
• Extended classification/terminology support
• Object extensions

DTR 15452 - Specification of Data Value Domains

Definition: Value Domain

• A set of permissible values.

• Enumerated:
  – Countries of the world
• Non-Enumerated:
  – All Real Numbers Between 0 & 1, 17 Char Alpha-Num, YYYYMMDD
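The distinction can be sketched in code: an enumerated domain is checked by membership in an explicit list, while a non-enumerated domain is checked by a rule. The function names are hypothetical, the country set is a tiny illustrative subset, and reading "between 0 & 1" as inclusive is an assumption.

```python
from datetime import datetime

# Enumerated: an explicit set of permissible values (illustrative subset only).
COUNTRY_NAMES = {"France", "Japan", "Brazil"}


def in_enumerated_domain(value):
    return value in COUNTRY_NAMES


# Non-enumerated: membership is decided by a rule instead of a list.
def is_real_between_0_and_1(value):
    # Bounds are treated as inclusive here; that is an assumption.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return 0 <= value <= 1


def is_17_char_alphanumeric(value):
    return (isinstance(value, str) and len(value) == 17
            and value.isascii() and value.isalnum())


def is_yyyymmdd(value):
    if not (isinstance(value, str) and len(value) == 8
            and value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")  # rejects impossible dates
    except ValueError:
        return False
    return True


print(in_enumerated_domain("France"), is_real_between_0_and_1(0.25),
      is_17_char_alphanumeric("A1" * 8 + "Z"), is_yyyymmdd("19940101"))
```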

Value Domains - Examples

– Geographic Codes
– Chemical Names
– Biological Classification

The Problem

How can data values be mapped among representations so that the equivalent semantic meaning is determined, even if the language, format or character set of the representations differ?

The Benefits

The sharing and reuse of data through equivalent data values will allow information to be exchanged faster and more efficiently.

Sets of reusable domain values, with unique identifiers assigned, eliminate the need for exact representation matches.

Scope of the TR

Attributes for identification, specification, development and reuse of data value domains for data elements.

Assigning a unique identifier to each value within a domain. Defining a data element conceptual domain and describing mappings between the values of a conceptual domain and the values of each representational data value domain.

Defining reuse of value domains among data elements.
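The mapping idea in the scope can be sketched as follows: each value meaning carries a unique identifier (VMID), and each value domain maps those identifiers to its own permissible values, so two representations are linked through the shared meaning instead of by matching strings. The VMID assignments, descriptors, and function name below are hypothetical.

```python
# Hypothetical VMID assignments; a real registry would assign its own.
value_meanings = {
    "002": "the country France",
    "004": "the country Japan",
}

# Each value domain maps VMID -> permissible value in that representation.
value_domains = {
    "short name in English": {"002": "France", "004": "Japan"},
    "2-alpha code": {"002": "FR", "004": "JP"},
}


def translate(value, source_domain, target_domain):
    """Map a value between representations through its shared value meaning."""
    source = value_domains[source_domain]
    vmid = next((k for k, v in source.items() if v == value), None)
    if vmid is None:
        raise KeyError(f"{value!r} is not permissible in {source_domain!r}")
    return value_domains[target_domain][vmid]


print(translate("FR", "2-alpha code", "short name in English"))
```

Adding a further representation, such as names in another language, only requires one more dictionary keyed by the same VMIDs.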

[Figure: the data element model diagram, repeated with level annotations]

Conceptual Level: Object class and Property
Logical Level: Representation with addition of qualifier, Application Level

Conceptual Level

Object Class
  Name:        Country
  Context:     ISO 3166
  Definition:  All separate territories of the Earth.

Property
  Name:        Identifier
  Context:
  Definition:  Means of distinguishing among objects.

Data Element Concepts
  Conceptual domain id:             Country identifier
  Value meaning identifier (VMID):  001-220
  Value meaning description:        Identifiers for all the countries of the world.
  V.m. date in:                     19940101
  V.m. date out:

  Conceptual domain id:             Country identifier subset
  Value meaning identifier (VMID):  002 004 005 . . .
  Value meaning description:        Identifiers for some of the countries of the world.
  V.m. date in:                     19950603
  V.m. date out:

Representation Class
  Name:        name
  Context:     NIST SP 500-149
  Definition:  A designation for an object.

  Name:        code
  Context:     NIST SP 500-149
  Definition:  A system of valid symbols which substitute for longer values.

Logical Level

Logical Data Element (Generic Data Element) (Reference Data Element)
  Name:        Country name
  Context:     ANSI/ISO/NISO
  IRDI:        3166
  Definition:  The names of all the countries of the world.

  Data concept attributes
  Note: In a registry implementation, the listing of these attributes could be replaced by a link to the DEC.
    Conceptual domain id:             Country identifier
    Value meaning identifier (VMID):  001-220
    Value meaning description:        Identifiers for all the countries of the world.
    V.m. date in:                     19940101
    V.m. date out:

  Data element representation attributes
    Value domain name:         short name in English
    Rep. class name:           name
    V.d. character set name:   alphabetic character
    V.d. min. char. quantity:  4
    V.d. max. char. quantity:  40
    Permissible value label:   All names of countries in short English form listed in ISO 3166
    P.v. date in:              19940101
    P.v. date out:
    V.d. dependency desc:      none
    V.d. format:               text

Application Level

Application Data Element
  Name:        Trading partner country code
  Context:     fictional
  IRDI:        4488
  Definition:  A subset of 3166, selected by trading treaty.

  Data concept attributes
    Conceptual domain id:             Country identifier subset
    Value meaning identifier (VMID):  002 004 005 . . .
    Value meaning description:        Identifiers for some of the countries of the world.
    V.m. date in:                     19950603
    V.m. date out:

  Data element representation attributes
    Value domain name:         2-alpha code
    Rep. class name:           code
    V.d. character set name:   numeric character
    V.d. min. char. quantity:  2
    V.d. max. char. quantity:  2
    Permissible value label:   some 2-alpha codes for countries listed in ISO 3166, modified as described in trade treaty xx.
    P.v. date in:              19950603
    P.v. date out:
    V.d. dependency desc:      This value domain is a subset of ISO 3166.
    V.d. format:               AA
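The representation attributes listed for the two example value domains (minimum and maximum character quantity, format) lend themselves to a simple validity check. The sketch below is an assumption-laden illustration: the `ValueDomain` class is hypothetical, and so is the reading of the format notation, where "A" is taken to mean one letter, "N" one digit, and "text" no constraint.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    name: str
    min_chars: int
    max_chars: int
    fmt: str  # e.g. "AA"; "text" is treated as unconstrained

    def pattern(self):
        # Assumed picture notation: "A" = one letter, "N" = one digit.
        if self.fmt == "text":
            return None
        mapping = {"A": "[A-Za-z]", "N": "[0-9]"}
        return re.compile("".join(mapping[ch] for ch in self.fmt))

    def accepts(self, value):
        """Check the length bounds first, then the format picture."""
        if not self.min_chars <= len(value) <= self.max_chars:
            return False
        pattern = self.pattern()
        return pattern is None or pattern.fullmatch(value) is not None


two_alpha = ValueDomain("2-alpha code", 2, 2, "AA")
short_name = ValueDomain("short name in English", 4, 40, "text")

print(two_alpha.accepts("FR"), two_alpha.accepts("F1"), short_name.accepts("Chad"))
```

A check like this covers only the structural attributes; whether a well-formed value is actually permissible still depends on the enumerated list and on the permissible value begin and end dates.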

Conclusions

Application of all principles of the ISO 11179 family to the development of meta data registries allows easy and effective exchange of data and meta data nationally and internationally.

References

Applicable documents, all available at: ftp://www.sdct.itl.nist.gov/L8

ISO/IEC 11179, Specification and Standardization of Data Elements
  Part 1: Framework for the Specification and Standardization of Data Elements
  Part 2: Classification for Data Elements
  Part 3: Basic Attributes of Data Elements
  Part 4: Rules and Guidelines for the Formulation of Data Definitions
  Part 5: Naming and Identification Principles for Data Elements
  Part 6: Registration of Data Elements
ANS X3.285, Metamodel for the Management of Sharable Data
ISO DTR 15452, Specification of Data Value Domains

Web-enabled registries:

National Health Information Knowledgebase (NHIK) from the Australian Institute of Health and Welfare (AIHW)
http://www.aihw.gov.au

Environmental Data Registry (EDR) from the U.S. Environmental Protection Agency (EPA)
http://www.epa.gov/edr