Presentation Title - Order PC Magazine!

data relationship management
Data Integration to Data Governance
Data In the News:
Data slipups
Rick Whiting, 10-May-2006
Inaccurate business data lead to botched
marketing campaigns, failed CRM
projects--and angry customers.
A home valued at US$121,900 somehow wound up recorded in
Porter County's computer system as being worth a whopping US$400
million. Naturally, the figure ended up on documents used to calculate
tax rates. By the time the blunder was uncovered in February, the
damage was done.
2
Market Forces Affecting the Use of Data
• Privacy Regulations: HIPAA, GLBA, PIPEDA, EU DPD
• Competitive Edge: straight-through processing, customer service
• Customer Pressure: consumer confidence, data inaccuracies, over-billing
• Business Governance: SEC/NAD rule, SARBOX, legal liability, mergers
3
What are Companies Doing in
Response?
Credit Card Company:
Where is the Sensitive Data?
Business Problem:
• Risk of a security breach exposes
potential regulatory fines, negative
PR and customer backlash
Proposed Solution:
• Identify sensitive data flows in
structured databases so critical data
can be consolidated and properly
secured
Roadblock:
• An estimate of 50 data analysts over 5 years
makes the project appear to be
unbounded and infeasible
Status:
• Project put on hold
5
Health Insurance Company:
Outsourcing Development
Business Problem:
• Data must be sent to India for offshore
application development.
• Sensitive data must be masked for
HIPAA compliance
Proposed Solution:
• Mask sensitive data before sending it
outside the company
Roadblock:
• Sensitive data, where is it?
• Can two sets of data that individually
contain no sensitive data become
sensitive when combined?
Status:
• Manual discovery of sensitive data
slows outsourcing to a crawl
6
Wall Street Firm: Data Consistency will
Increase Profitability
Business Problem:
• Transaction errors are expensive and the
risk of regulatory fines due to inconsistent
reference data is unacceptable
Proposed Solution:
• Deploy a master data management solution
Roadblock:
• 5 years to determine the business rules that
relate the master data system to legacy
systems
• Unable to map two tables to each other after 6
weeks of work (70 tables total to map)
Status:
• Project on hold
7
Auto Insurance: Migrating Fragile Legacy
Integration Code to Modern Tools
Business Problem:
• Business changes force expensive,
difficult-to-implement changes in
hand-written legacy integration code
Proposed Solution:
• Migrate legacy code to a modern ETL
(extract, transform, load) tool. Cost of
maintenance of ETL is a fraction of
legacy code
Roadblock:
• No one knows the code. The cost of
migration is unpredictable.
Status:
• Company continues to manually
change hand-written code ad hoc as the
business demands
8
The Common “?” in the Project Schedule
• Data Relationship Discovery: You have to know where your data is, and how it flows and relates across systems, if you hope to secure it, move it, consolidate it, or integrate it.
[Figure: project timeline beginning at T = 0, with a “?” marking Data Relationship Discovery; labels: Consistency/Master Data, Internal Security, Integration]
9
Don’t We Know Our Own
Data?
Myth #1: “We know our data”
“I’m a professional. Of course I know my data!”
• Subject matter experts (SMEs) only know their own systems
• But they can’t tell you how data changes and is transformed as it moves from system to system
• Relationships between systems are complex:
“But, once it leaves my hands, it is someone else’s problem!”
“Wow, that transformation is complex. Are you sure that is in my data?”
• SMEs sometimes change jobs!
“I’m going to start my own consulting firm”
11
Myth #2: “We know our data”
“All of my data follows the business rules for this system!”
• Business rules are broken all the time as data crosses business and system boundaries:
• An 83-year-old man in system A is a “youthful driver” in system B
• Bond yield is listed as 5% in system X and 5.3% in system Y
• Exceptions result in lost revenue, customer dissatisfaction, and regulatory fines
12
Myth #3: “We know our data”
• Business rules change as organizations change
• Mergers and Acquisitions
• New products or services
• Products/services are retired
• Reorganizations
• New IT systems are added
“I can’t keep up with all the acquisitions and reorganizations. They mess up the way systems work together. It is very inconvenient.”
13
The Reality
Companies lack a global view of their corporate data map
14
Current Trend: Data Governance
What is it?
• The latest over-hyped term
• Data Integration is to Tactical as Data Governance is to Strategic
Definition
• Data Governance encompasses the people, processes and
procedures to create a consistent, enterprise view of your data
in order to:
• Improve data security
• Increase consistency & confidence in decision making
• Decrease the risk of regulatory fines
15
The Problem with Data Governance
• How do you do it?
• Where is the sensitive data?
• What are the business rules and data relationships?
• Where are the exceptions?
• How do you ensure a consistent, repeatable process?
16
Traditional Proposed Approach: Metadata
What is it?
• Another over-hyped term
• Data about data: datatype (character, integer, number, date, etc.), column width, frequency, cardinality, etc.
The Problem
• Single-system metadata only:
• Profiling
• Traditional data integration tools do not discover metadata:
• Cleansing, ETL, EAI and EII
The Reality
• Data analysts manually examine data values to figure out the data map
• The most sophisticated tool generally used today is:
[Image caption: Traditional Data Relationship Discovery Tool]
17
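To make the single-system metadata from the previous slide concrete (datatype, column width, frequency, cardinality), here is a minimal Python sketch of column profiling. The function name `profile_column` and the type-inference order are assumptions for illustration, not taken from any tool named in this deck. Note that it describes one column in one system only, which is exactly the limitation the slide points out.

```python
from collections import Counter


def profile_column(values):
    """Return basic single-column metadata for a list of raw string values.

    Hypothetical helper: the name and the returned keys are illustrative.
    """
    non_empty = [v for v in values if v != ""]

    def infer_type(v):
        # Try the narrowest type first, then fall back to character data.
        try:
            int(v)
            return "integer"
        except ValueError:
            pass
        try:
            float(v)
            return "number"
        except ValueError:
            return "character"

    types = Counter(infer_type(v) for v in non_empty)
    freq = Counter(non_empty)
    return {
        "datatype": types.most_common(1)[0][0] if types else None,
        "column_width": max((len(v) for v in non_empty), default=0),
        "cardinality": len(freq),          # number of distinct values
        "most_frequent": freq.most_common(1)[0] if freq else None,
    }


# AGE values as they appear in the example table later in this deck.
ages = ["17", "24", "55", "28", "40", "33", "83", "29", "36", "42"]
profile = profile_column(ages)
print(profile)
```

A profile like this says nothing about how AGE relates to a column in another application, which is why profiling alone does not produce the data map.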
There is a Better Way
The Solution:
Data-Driven Relationship Discovery
• New approach to a 40-year-old problem
• Sophisticated heuristics and algorithms
analyze actual data values
• Automates the discovery
and validation of:
• Sensitive data flows
• Business rules
• Complex transformations
between structured data sets in a
consistent and repeatable manner
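The deck does not describe the heuristics themselves, so the following is only a rough Python sketch of the general idea of analyzing actual data values. It assumes the simplest possible case, a constant scale factor between two already-paired columns, and lets the observed ratios vote on the factor. The function name `discover_scale` is hypothetical, and the sample pairs are the bond yield values from the example slide in this deck.

```python
from collections import Counter
from fractions import Fraction


def discover_scale(pairs):
    """Guess k in `target = source * k` by voting over observed ratios.

    Returns (k, hit_rate). Illustrative stand-in, not a vendor algorithm.
    """
    # Fractions built from the decimal text avoid float rounding noise.
    ratios = Counter(
        Fraction(str(tgt)) / Fraction(str(src))
        for src, tgt in pairs
        if src != 0
    )
    if not ratios:
        return None, 0.0
    factor, hits = ratios.most_common(1)[0]
    return factor, hits / len(pairs)


# (Application A value, Application B value) pairs from the example slide.
pairs = [
    (0.053, 530), (0.062, 620), (0.071, 710), (0.034, 340), (0.055, 550),
    (0.072, 720), (0.055, 550), (0.067, 670), (0.056, 580), (0.06, 600),
]
factor, hit_rate = discover_scale(pairs)
print(factor, hit_rate)
```

A real discovery tool would also have to find which columns and rows correspond in the first place; this sketch assumes that pairing is already known.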
19
Solution:
Data-Driven Exception & Discrepancy Discovery
• Identify exceptions to avoid:
• Regulatory fines
• Lost revenue
• Customer dissatisfaction

Transformation: ApplicationB.Youthful_Driver = CASE WHEN ApplicationA.AGE <= 25 THEN 'Y' ELSE 'N' END
Hit Rate = 90%
Application A    Application B
AGE              Youthful_Driver
17               Y
24               Y
55               N
28               N
40               N
33               N
83               Y                  (Exception)
29               N
36               N
42               N

Transformation: ApplicationA.B_Y * 10000 = ApplicationB.Bond_Yield
Hit Rate = 90%
Application A    Application B
B_Y              Bond_Yield
0.053            530
0.062            620
0.071            710
0.034            340
0.055            550
0.072            720
0.055            550
0.067            670
0.056            580                (Exception)
0.06             600
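The hit rates and exceptions on this slide can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper `check_rule` and the two rule functions are hypothetical names, the rows are copied from the slide, and the rounding in the bond yield rule is an added guard against floating-point noise rather than part of the stated transformation.

```python
def youthful_driver(age):
    # Rule from the slide: AGE <= 25 maps to 'Y', otherwise 'N'.
    return "Y" if age <= 25 else "N"


def scaled_yield(b_y):
    # Rule from the slide: B_Y * 10000 = Bond_Yield (rounded, see lead-in).
    return round(b_y * 10000)


def check_rule(rows, rule):
    """Apply `rule` to each source value and compare with the target value.

    Returns (hit_rate, exceptions), where exceptions are the failing rows.
    """
    exceptions = [(src, tgt) for src, tgt in rows if rule(src) != tgt]
    hit_rate = (len(rows) - len(exceptions)) / len(rows)
    return hit_rate, exceptions


age_rows = [
    (17, "Y"), (24, "Y"), (55, "N"), (28, "N"), (40, "N"),
    (33, "N"), (83, "Y"), (29, "N"), (36, "N"), (42, "N"),
]
yield_rows = [
    (0.053, 530), (0.062, 620), (0.071, 710), (0.034, 340), (0.055, 550),
    (0.072, 720), (0.055, 550), (0.067, 670), (0.056, 580), (0.06, 600),
]

age_rate, age_exceptions = check_rule(age_rows, youthful_driver)
yield_rate, yield_exceptions = check_rule(yield_rows, scaled_yield)
print(age_rate, age_exceptions)      # 0.9 [(83, 'Y')]
print(yield_rate, yield_exceptions)  # 0.9 [(0.056, 580)]
```

The flagged rows are the 83-year-old "youthful driver" and the bond yield recorded as 580 where the rule predicts 560.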
20
Data-Driven Discovery Results
Credit Card Company
Status: Project moving forward again
• Reduced estimated effort from 250 engineering years to 25 eng. years
• Eliminated project feasibility risk
Wall Street Firm
Status: Back on track
• Over 5x (2 days vs 6 weeks manually) improvement in discovery of business rules made MDM project possible
• Found bond yield discrepancies
Health Insurance Company
Status: Outsourcing rollout accelerated
• Now confident in sensitive data discovery accuracy and speed
• Launching new data masking service companywide
Auto Insurance Company
Status: Predictable & affordable migration
• 80% reduction in effort required to migrate hand-code to ETL tool
• Mapping process discovered potentially costly business rule errors
21
Summary:
Data Governance = Strategic Data Integration
• Companies are implementing data governance projects to:
• Improve Security
• Increase Consistency
• Decrease Regulatory Risk
• First step of data governance… Discovery
• Automated data-driven discovery is a consistent, repeatable and
proven approach to identify:
• Sensitive Data
• Business Rules
• Data Exceptions
22
Key Contacts
Bob Shannon: U.S. East Coast Sales
Phone: (203) 878-8472
Email: [email protected]
Brian Smogard: U.S. Central Sales
Phone: (612) 605-9236
Email: [email protected]
Clive Harrison: U.S. West Coast and International Sales
Phone: 415-608-4632
Email: [email protected]
If you have any other follow up questions, contact me:
Todd Goldman
Phone: (408) 919-0191 ext 1115
Email: [email protected]
23