SIM/APC - DAMA-NCR Data Management Association
On the Authoritative Data Sources:
One Data Element at a Time
DAMA National Capital Region
Chapter Meeting
March 9, 2010
Washington, DC
Richard Wang, Ph.D.
Deputy Chief Data Officer
Chief Data Quality Officer
Office of the U.S. Army CIO/G-6
Director, MIT Information Quality Program (on leave)
Massachusetts Institute of Technology
Visiting University Professor of Information Quality
University of Arkansas at Little Rock
Data Quality Books
by MIT Information Quality Program
http://mitiq.mit.edu/Publications.htm
Book covers shown, by year: 2006, 2006, 2005, 2000, 1999
© 1988-2010 MIT IQ Program, Army
For DAMA-NCR March 9, 2010 meeting
MIT’s role in the foundations for IQ education (2007, Madnick)
Diagram labels: Lots of time & energy; IQ; Rich Wang (our Harry Potter)
Journals
- 2007 ACM Journal on Data and Information Quality (JDIQ)
UALR: MSIQ and IQ PhD Degree Programs
Conferences and Certification Programs
- 1996 International Conference on Information Quality (ICIQ)
- 2002 MIT-IQ program for Executives
- 2003 IQ-1: Principles and Foundations
- 2007 IQ Industry Symposium
Books
- Journey to Data Quality (2006)
- … and many others
Articles
- 1990 Polygen Data Quality Model (VLDB + ICIS)
- 1996 Beyond Accuracy
- 1998 Managing Information as a Product
Research Projects
- 1988 Total Data Quality Management Program (TDQM)
- 2002 MIT Information Quality (MITIQ) Program
* Not complete list
One Data Element At a Time:
Federal Agency Case
Stakeholders Meeting
Data Element Identification
$1M+ impact per data element
90-day progress
Private Sector Case
Data Element Selection Criteria
√ Critical to Business
√ Recognized Pain Point
√ $1M+ impact
√ Practical to model
√ Practical to Implement
√ Owner identified
√ Commitment by the Stakeholders: 3 C’s + Management
Army Chief Data Quality Officer FY10 Priorities
1. 300-500 critical Army Data Elements in FY10, 5000 by FY13
2. Army Staffing of Data Elements from Bronze to Silver and Gold
3. Vertical integration up with semantics, business logic, objects (U-Core, C2-Core ontology)
Authoritative Data Sources
Designated Data Sources
Authoritative Data Elements
Single Element Approach
Challenge:
Establish a Total Data Quality Management (TDQM) Program in the Army while utilizing limited resources.
TDQM Cycle (figure)
Solution:
1. Address one data element at a time, using priority data elements within priority projects.
2. Take the first few data elements through the entire TDQM cycle to educate and illustrate value.
3. Establish and populate a catalog of data element quality specifications (the “Define” of TDQM) containing priority data elements for broad use.
Early Success
Project: Suicide Mitigation - NIMH Study feed
Elements: UIC, SSN
Developed Data Quality Specification to define data quality rules.
Constructed Information Product Map (IP-Map) that shows the flow of the data element and its quality checks from data providers to NIMH Study consumer.
ADCF implemented quality checks and reported results.
Captured DQ Process metrics and DQ element metrics in a Dashboard.
Preparing DQ element metric details to feed back to data providers.
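The quality checks and element metrics mentioned on this slide could look roughly like the sketch below. It is an illustration only, not the ADCF implementation: the format rules (a nine-digit SSN, a six-character alphanumeric UIC), the function name `element_metrics`, and the sample records are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical format rules; the actual Army specifications are not in the slides.
RULES = {
    "SSN": re.compile(r"\d{9}"),        # assumed: nine digits, no separators
    "UIC": re.compile(r"[A-Z0-9]{6}"),  # assumed: six uppercase alphanumerics
}


def element_metrics(records):
    """Return, per data element, the share of records that pass its format rule."""
    passed = Counter()
    total = Counter()
    for record in records:
        for element, pattern in RULES.items():
            total[element] += 1
            value = record.get(element) or ""  # treat missing/None as empty
            if pattern.fullmatch(value):
                passed[element] += 1
    return {element: passed[element] / total[element] for element in total}


# Made-up sample feed, for illustration only.
sample = [
    {"SSN": "123456789", "UIC": "WABCD0"},
    {"SSN": "12345678", "UIC": "WABCD0"},  # SSN too short
    {"SSN": "123456789", "UIC": "wab"},    # UIC malformed
    {"SSN": None, "UIC": "W1A2B3"},        # SSN missing
]
metrics = element_metrics(sample)
print(metrics)
```

A dashboard would display the per-element pass rates; the failing records themselves are the "metric details" that would be fed back to data providers.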
(Army) Data Element Yellow Pages
A. Army Data Element specifications are developed thru the Data Element Quality Definition Process and entered in the Data Element Yellow Pages.
B. IP Producers utilize the Data Element Yellow Pages to discover Data Element specifications and integrate them into their Information Products.
C. IP Consumers access the Data Element Yellow Pages to find Data Element specifications for understanding and correctly using the data.
IP = Information Product
http://architecture.army.mil/data/DEYP
Diagram labels: Data Element Quality Definition Process, IP Producer, IP, IP Consumer
Data Element Yellow Pages Content
Data Element Quality Specification:
• Element Name
• Definition
• Data Quality Rules
• Approval Level
• Examples
• Data Element Owner (Steward?)
• Authoritative References
• Usage Notes
• more…
Data Quality Rules:
Supports “fit for use”
Segmented into Three Levels
1. Container (conceptual format)
2. Content (correct in itself)
3. Context (correct in context)
Approval Level:
1. Gold – ADB Approved
2. Silver – ADC Approved
3. Bronze – CDQO Approved
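One way to picture a specification record and the three rule levels is the sketch below. It is not the Yellow Pages schema: the class and field names are assumptions, and `END_DATE` is a hypothetical element chosen because it separates the levels cleanly (a format check, a real-calendar-date check, and a comparison against another field in the same record).

```python
import re
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


class RuleLevel(Enum):
    CONTAINER = 1  # conceptual format
    CONTENT = 2    # correct in itself
    CONTEXT = 3    # correct in context


class ApprovalLevel(Enum):
    BRONZE = "CDQO Approved"
    SILVER = "ADC Approved"
    GOLD = "ADB Approved"


@dataclass
class QualityRule:
    level: RuleLevel
    description: str
    check: Callable[[str, Record], bool]  # (value, whole record) -> passes?


@dataclass
class DataElementSpec:
    name: str
    definition: str
    owner: str
    approval: ApprovalLevel
    rules: List[QualityRule] = field(default_factory=list)

    def first_failure(self, record: Record) -> Optional[RuleLevel]:
        """Apply rules from Container to Context; return the first level that fails."""
        value = record.get(self.name, "")
        for rule in sorted(self.rules, key=lambda r: r.level.value):
            if not rule.check(value, record):
                return rule.level
        return None


def _is_real_date(value: str) -> bool:
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


# Hypothetical element; assumes START_DATE in the record is already a valid ISO date.
end_date = DataElementSpec(
    name="END_DATE",
    definition="Last day of an assignment (illustrative only)",
    owner="Example steward",
    approval=ApprovalLevel.BRONZE,
    rules=[
        QualityRule(RuleLevel.CONTAINER, "Formatted as YYYY-MM-DD",
                    lambda v, rec: re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None),
        QualityRule(RuleLevel.CONTENT, "Is a real calendar date",
                    lambda v, rec: _is_real_date(v)),
        QualityRule(RuleLevel.CONTEXT, "Not earlier than START_DATE",
                    lambda v, rec: date.fromisoformat(v) >= date.fromisoformat(rec["START_DATE"])),
    ],
)
```

Checking the levels in order means a value is only tested "in context" once it is known to be well formed and valid on its own, which keeps each reported failure attributable to a single level.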
Data Element Quality Specification Process
1. Research Using Publications and SMEs
2. Determine Quality Rules
3. Produce Specification
4. Submit to CDQO for Review
5. CDQO Approval (BRONZE)
6. CDQO Submit to Army Data Council for Review
7. Army Data Council Review and Corrections
8. Army Data Council Approval (SILVER)
9. Army Data Board Approval (GOLD)
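The Bronze, Silver, and Gold approvals in this process behave like a one-way ladder, which can be sketched as a small state progression. This is an assumed model rather than anything shown in the slides: the `DRAFT` state, the `promote` function, and the rule that levels cannot be skipped are illustrative choices.

```python
from enum import IntEnum


class Approval(IntEnum):
    DRAFT = 0   # assumed name for a specification not yet approved
    BRONZE = 1  # CDQO approval
    SILVER = 2  # Army Data Council approval
    GOLD = 3    # Army Data Board approval


# Which body grants each level.
APPROVER = {
    Approval.BRONZE: "CDQO",
    Approval.SILVER: "Army Data Council",
    Approval.GOLD: "Army Data Board",
}


def promote(current: Approval, approver: str) -> Approval:
    """Advance exactly one level, and only when the right body signs off."""
    if current is Approval.GOLD:
        raise ValueError("specification is already Gold")
    target = Approval(current + 1)
    if APPROVER[target] != approver:
        raise PermissionError(
            f"{target.name} requires approval by {APPROVER[target]}"
        )
    return target


level = promote(Approval.DRAFT, "CDQO")
level = promote(level, "Army Data Council")
level = promote(level, "Army Data Board")
print(level.name)
```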
ADC Review and Comment Process (proposed)
1. Review DE Specifications with your SMEs
Note: some documents cover the entire project; others have only the definition and quality sections completed. Review the definition and quality sections.
2. Gather and submit your comments to CDQO
All comments are welcome (positive, corrections, content, format, unaddressed). No comment [silence] is concurrence. Send your comments to the CDQO Office.
3. Suspense Date: one week before the next ADC meeting, for readout at the monthly ADC meeting.
ADS Defined
Authoritative Data Source:
A recognized or official data production source with a designated mission statement or source/product to publish reliable and accurate data for subsequent use by customers. An authoritative data source may be the functional combination of multiple, separate data sources.
To assure data quality…
A data source is a mechanism through which the publication, storage, or retrieval of data is possible. Within the scope of the Information Technology domain, a data source consists of digitized data, such as a database, a machine readable file, or a data stream. Data sources contain or provide information and fulfill specific data needs within an identified mission context.
A data element is an attribute in a database, a field in a machine readable file, or a basic unit in a data stream.
The association of a data need and a given mission characterizes a data source’s intended use.
A data source is referred to as a Designated Data Source if the mission and the needed data elements from the data source for this mission are clearly specified.
An authoritative body responsible for fulfilling a particular data need designates a data source as a Designated Data Source.
A designated data source is referred to as an Authoritative Data Source if the underlying data of the data elements needed in the specified mission is certified as accurate, timely, and fit for subsequent use by data consumers.
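The three definitions nest: every Authoritative Data Source is a Designated Data Source, and every Designated Data Source is a data source. A minimal sketch of that classification follows, under the assumption that "clearly specified" can be reduced to a named mission plus a non-empty set of needed elements, and that certification is tracked per element. The class, its field names, and the example systems are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class DataSource:
    name: str
    mission: Optional[str] = None                     # mission the source serves, if specified
    needed_elements: FrozenSet[str] = frozenset()     # elements needed for that mission
    certified_elements: FrozenSet[str] = frozenset()  # elements certified accurate, timely, fit for use


def classify(source: DataSource) -> str:
    """Apply the definitions in order: data source, designated, authoritative."""
    designated = source.mission is not None and len(source.needed_elements) > 0
    if not designated:
        return "data source"
    # Authoritative only if every needed element has been certified.
    if source.needed_elements <= source.certified_elements:
        return "authoritative data source"
    return "designated data source"


# Hypothetical systems, for illustration only.
raw = DataSource("Legacy extract")
designated = DataSource(
    "Personnel DB", mission="Study feed",
    needed_elements=frozenset({"UIC", "SSN"}),
    certified_elements=frozenset({"UIC"}),
)
authoritative = DataSource(
    "Personnel DB", mission="Study feed",
    needed_elements=frozenset({"UIC", "SSN"}),
    certified_elements=frozenset({"UIC", "SSN"}),
)
print(classify(raw), classify(designated), classify(authoritative), sep=" | ")
```

Because the test is scoped to the elements a mission needs, the same source can be authoritative for one mission and merely designated for another, which matches the one-data-element-at-a-time approach.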
Thank you!
Q&A