Enterprise Data Management for Siebel
Optim™
Essentials of Test Data Management
Jonathan L. Karp
Team Lead - LDR
[email protected]
© 2008 IBM Corporation
Optim™
Section 1: Effective Test Data
Management (TDM) for Improving
Costs & Efficiency
Disclaimer
This presentation is intended to provide general
background information, not regulatory, legal or
other advice. IBM cannot and does not provide
such advice. Readers are advised to seek
competent assistance from qualified professionals
in the applicable jurisdictions for the types of
services needed, including regulatory, legal or
other advice.
Outline
TDM – What is it? Why is it important?
Current approaches
Key requirements for an effective approach for
TDM
Enterprise Application: Snapshot in Time
[Diagram: Development and QA Environments alongside Production; labels in slide order: Development V3, QA V2, User Acceptance Testing, Production Version 1, V2]
Multiple “Consumers” For Test Environments
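[Table: test levels (Unit, Integration, Component, System, System integration, UAT, Implementation) against consumers (Developers, Testers on project teams, Testers in central group, IT operations, Customers); each cell gives the consumer's role: Sole, Primary, or Secondary]
Developers: Unit (Sole), Integration (Sole), Component (Primary), System (Secondary)
Remaining role entries in slide order: Secondary, Primary, Secondary, Secondary, Primary; Primary, Secondary, Primary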
Internal and External “consumers” such as off-shore teams and partners
Multiple Requirements for Test Environments
Functionality
– Features and capabilities
Performance
– Speed, availability, tolerance for load
Usability
– Ease with which the software can be employed
Security
– Vulnerability to unauthorized usage
Compliance
– Conformance to internal standards or external regulations
Key Business Goals
Reduce Business Downtime
Get to Market Faster
Maximize Process Efficiencies
Improve Quality
Some Key Considerations
Infrastructure Costs – higher HW storage costs
Development Labor - higher costs
Defects – Can be expensive
• Cost to resolve defects in the production environment can be 10 to 100
times greater than for those caught in the development environment
Data Privacy/Compliance
• Data breaches can put you out of business
Test Data Management
Strategy and approach to creating and managing
test environments to meet the needs of various
stakeholders and business requirements.
Current Approaches
#1 - Clone Production
– Request for copy, then wait
– Copy production database
– Extract, make changes, extract after changes
– Manual examination:
• Right data?
• What changed?
• Correct results?
• Unintended result?
• Someone else modify?
– Share test database with everyone else
#2 - Write SQL
– Complex
– Subject to change
– RI accuracy?
– Right data?
– Expensive, dedicated staff, ongoing responsibility
Cloning And Data Multiplier Effect
1. Production: 500 GB
2. Training: 500 GB
3. QA: 500 GB
4. Development: 500 GB
5. UAT: 500 GB
6. Integration: 500 GB
Total: 3,000 GB
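The multiplier effect is plain arithmetic: every full clone adds another copy of production. A minimal sketch in Python using the sizes from the slide; the names `ENVIRONMENTS_GB`, `total_storage_gb`, and `multiplier` are illustrative and not part of any product.

```python
# Environment sizes as listed on the slide; each clone is a full copy of production.
ENVIRONMENTS_GB = {
    "Production": 500,
    "Training": 500,
    "QA": 500,
    "Development": 500,
    "UAT": 500,
    "Integration": 500,
}


def total_storage_gb(environments):
    """Sum the storage consumed by every environment."""
    return sum(environments.values())


def multiplier(environments, source="Production"):
    """How many times the source database's size is stored overall."""
    return total_storage_gb(environments) / environments[source]


print(total_storage_gb(ENVIRONMENTS_GB))  # 3000
print(multiplier(ENVIRONMENTS_GB))        # 6.0
```

Replacing any of the 500 GB clones with a smaller subset lowers the total directly, which is the motivation for subsetting.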
Some Key Issues With Current Approaches
Cloning can create duplicate copies of large
databases
– Large storage requirements and associated
expenses
– Time consuming to create
– Difficult to manage on an on-going basis
Data privacy not addressed
Internally developed approaches not cost effective
– Lengthy development cycles
– Dedicated staff
– On-going maintenance
Using Subsetting For Effective TDM
[Diagram: PROD is cloned to CLONED PROD; Extract & Load builds a REDUCED CLONE (database resized* and re-indexed), labeled GOLD, which feeds TRAINING, DEV, and TEST]
The cloning is performed only once!
Effective Test Data Management Solution
Subsetting capabilities to create realistic and
manageable test databases
De-identify (mask) data to protect privacy
Quickly and easily refresh test environments
Edit data to create targeted test cases
Audit/Compare ‘before’ and ‘after’ images of the
test data
Key Aspects of an Effective TDM Approach
[Diagram: Production Environment (production databases) → Subset → De-Identify? → Test Environment (test databases), with Refresh and Analyze applied to the test databases]
Subset: Key Capabilities
Precise subsets to build realistic “right-sized” test databases
– Application Aware
– Flexible criteria for determining record sets
– Business Logic Driven
– Complete Business Object: Referentially intact subsets
– Across heterogeneous environments
[Diagram: DB2 Order Entry, Oracle ERP, Legacy CRM]
Subset: Complete Business Object
Cust_ID is Primary Key
CUSTOMERS
19101 Joe Pitt
02134 John Jones
27645 Karen Smith
ORDERS
27645 80-2382 20 June 2006
27645 86-4538 10 October 2006
DETAILS
86-4538 DR1001 System Outage
86-4538 CL2010 Broken Cup Holder
• Referentially-intact subset of data
• Example: All Open –DN Call Back related to Cust_ID 27645 (Karen Smith)
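A "complete business object" can be sketched as a parent row plus every child row reachable through its keys. The Python example below loads the slide's sample rows into an in-memory SQLite database and pulls everything related to Cust_ID 27645. The table layout, column names, and the `extract_business_object` helper are assumptions made for illustration; this is not how Optim itself is implemented.

```python
import sqlite3


def build_source():
    """Create an in-memory database holding the sample rows from the slide."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (cust_id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            order_id TEXT PRIMARY KEY,
            cust_id TEXT REFERENCES customers(cust_id),
            order_date TEXT
        );
        CREATE TABLE details (
            order_id TEXT REFERENCES orders(order_id),
            item_code TEXT,
            description TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("19101", "Joe Pitt"), ("02134", "John Jones"), ("27645", "Karen Smith")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [("80-2382", "27645", "20 June 2006"), ("86-4538", "27645", "10 October 2006")],
    )
    conn.executemany(
        "INSERT INTO details VALUES (?, ?, ?)",
        [
            ("86-4538", "DR1001", "System Outage"),
            ("86-4538", "CL2010", "Broken Cup Holder"),
        ],
    )
    return conn


def extract_business_object(conn, cust_id):
    """Return the customer row and all order/detail rows that hang off it."""
    customers = conn.execute(
        "SELECT cust_id, name FROM customers WHERE cust_id = ?", (cust_id,)
    ).fetchall()
    orders = conn.execute(
        "SELECT order_id, cust_id, order_date FROM orders "
        "WHERE cust_id = ? ORDER BY order_id",
        (cust_id,),
    ).fetchall()
    # Details are selected through their parent orders, so no orphan rows appear.
    details = conn.execute(
        "SELECT d.order_id, d.item_code, d.description "
        "FROM details d JOIN orders o ON o.order_id = d.order_id "
        "WHERE o.cust_id = ? ORDER BY d.item_code",
        (cust_id,),
    ).fetchall()
    return {"customers": customers, "orders": orders, "details": details}


subset = extract_business_object(build_source(), "27645")
for table, rows in subset.items():
    print(table, rows)
```

Because child rows are always selected through their parents, every detail row in the subset has a matching order, and every order has a matching customer.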
Data De-Identification
• De-identify for privacy protection
• Deploy multiple masking algorithms
• Substitute real data with fictionalized yet contextually accurate data
• Provide consistency across environments and iterations
• No value to hackers
• Enable off-shore testing
[Diagram: Production → Subset, Mask, Propagate → Test, followed by Validate and Compare; applications shown: Application X (Oracle), Application Y (Oracle, SQLServer), Application Z (DB2)]
Ensure Data Privacy Across Non-Production Environments!
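One way to meet the "consistency across environments and iterations" point is deterministic masking, where the same real value always maps to the same fictional value. The Python sketch below is an assumption about how this could be done, not a description of the product's algorithm; `SECRET_KEY`, `SUBSTITUTE_NAMES`, and `mask_name` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key; in practice it would be kept outside the test environment.
SECRET_KEY = b"example-only-key"

# Hypothetical lookup list of fictional replacement names.
SUBSTITUTE_NAMES = ["Alex", "Blake", "Casey", "Dana", "Evan", "Finley"]


def mask_name(real_name, key=SECRET_KEY):
    """Map a real name to a fictional one, identically on every run."""
    digest = hmac.new(key, real_name.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(SUBSTITUTE_NAMES)
    return SUBSTITUTE_NAMES[index]


# The same input yields the same masked value in every environment
# that shares the key, so joins on masked columns still line up.
print(mask_name("Karen") == mask_name("Karen"))  # True
```

With a short lookup list, different real names will often share a substitute; a larger list, or a scheme that guarantees unique outputs, would be needed where uniqueness matters.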
Refresh
Load test environment with precise set of data
– Subset further as required
Load utility for large volumes of data
Easily refresh environments
• Insert
• Update
• Load
[Diagram: TESTDB and QADB, each holding AP_INVOICES, INVOICE DIST, and ACCT EVENTS tables, connected by a Subset step]
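The Insert and Update options of a refresh can be pictured as an "insert or overwrite" against the target table. A minimal Python sketch using SQLite follows; the `ap_invoices` columns, the sample rows, and the `refresh_table` helper are hypothetical and only illustrate the idea.

```python
import sqlite3


def refresh_table(target, rows):
    """Insert new rows and overwrite rows whose invoice_id already exists."""
    target.executemany(
        "INSERT OR REPLACE INTO ap_invoices (invoice_id, amount) VALUES (?, ?)",
        rows,
    )
    target.commit()


# Hypothetical QA database with one stale row.
qadb = sqlite3.connect(":memory:")
qadb.execute("CREATE TABLE ap_invoices (invoice_id TEXT PRIMARY KEY, amount REAL)")
qadb.execute("INSERT INTO ap_invoices VALUES ('INV-1', 10.0)")

# Refresh with a subset: INV-1 is updated, INV-2 is inserted.
refresh_table(qadb, [("INV-1", 80.0), ("INV-2", 20.0)])

print(qadb.execute(
    "SELECT invoice_id, amount FROM ap_invoices ORDER BY invoice_id"
).fetchall())  # [('INV-1', 80.0), ('INV-2', 20.0)]
```

For large volumes, a bulk load utility is usually faster than row-by-row statements, which is the point of the Load option on the slide.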
Analyzing Test Data
Version 1 INVOICES
27645 86-4538 Widget#1 $80.00
27645 86-4538 Widget#PG13 $20.00
Invoice Total $100.00
Version 2 INVOICES
27645 86-4538 Widget#1 $50.00
27645 86-4538 Widget#PG13 $50.00
Invoice Total $100.00
Both Invoices total $100
Composition is different
Could an error have been missed?
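The invoice example shows why checking totals alone is not enough. A short Python sketch of a row-level compare, assuming Widget#1 and Widget#PG13 carry the amounts in the order listed on the slide; `compare_rows` is a hypothetical helper, not a product function.

```python
from decimal import Decimal

# Rows keyed by (Cust_ID, invoice number, item); amounts as read from the slide.
VERSION_1 = {
    ("27645", "86-4538", "Widget#1"): Decimal("80.00"),
    ("27645", "86-4538", "Widget#PG13"): Decimal("20.00"),
}
VERSION_2 = {
    ("27645", "86-4538", "Widget#1"): Decimal("50.00"),
    ("27645", "86-4538", "Widget#PG13"): Decimal("50.00"),
}


def compare_rows(before, after):
    """Return {key: (old, new)} for every row that was added, removed, or changed."""
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# The totals agree, so a total-only check would pass.
print(sum(VERSION_1.values()) == sum(VERSION_2.values()))  # True

# The row-level compare still reports both line items as changed.
for key, (old, new) in compare_rows(VERSION_1, VERSION_2).items():
    print(key, old, "->", new)
```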
Analyze Test Data
Compare the "before" and "after" data from an application test
Compare results after running modified application during regression testing
Identify differences between separate databases
Audit changes to a database
Compare should analyze complete sets of data, finding changes in rows in tables
– Single-table or multi-table compare
– Compare file of results
[Diagram: SOURCE 1 and SOURCE 2 feed the COMPARE PROCESS, which writes a COMPARE FILE]
Edit Data to Create Test Cases
Effective TDM: Example ROI Benefits
Projected ROI = 504% (3 years), Payback Period = 13 months
Summary: An Effective TDM Solution
Ability to extract precise subsets of related data to
build realistic, “right-sized” test databases
– Complete business object
– Create referentially intact subsets
– Flexible criteria for determining record sets
De-identify sensitive data in the test environment to
ensure compliance with regulatory requirements
for data privacy
Easily refresh test environments
Analyze test data.
Optim™
Section 2: Data Privacy... Closing the Gap
Agenda
The Latest on Data Privacy
The Easiest Way to Expose Private Data
Understanding the Insider Threat
Considerations for a Privacy Project
Success Stories
No part of this presentation may be reproduced or transmitted in any form by any means,
electronic or mechanical, including photocopying and recording, for any purpose without the
express written permission of IBM
The Latest on Data Privacy
2007 statistics
– $197
• Cost to companies per
compromised record
– $6.3 Million
• Average cost per data breach
“incident”
– 40%
• % of breaches where the
responsibility was with
Outsourcers, contractors,
consultants and business
partners
– 217 Million
• TOTAL number of records
containing sensitive personal
information involved in security
breaches in the U.S. since 2005
* Sources: Ponemon Institute, Privacy Rights Clearinghouse, 2007
Did You Hear?
UK gov’t suffered a massive data
breach in Nov. 07
– HMRC (Her Majesty's Revenue
& Customs) UK equivalent to
IRS
Lost 2 disks containing personal
information on 25 million people
(ALMOST ½ of UK population!)
Information has a criminal value
of $3.1 Billion
No reported criminal activity to
date
How much is personal data worth?
Credit Card Number With PIN - $500
Driver's License - $150
Birth Certificate - $150
Social Security Card - $100
Credit Card Number with Security
Code and Expiration Date - $7-$25
PayPal account Log-on and Password - $7
Representative asking prices found recently on cybercrime forums.
Source: USA TODAY research 10/06
Cost to Company per Missing Record: $197
[Pie chart: Loss of Customers $98; Incident Response $54; Lost Productivity $30; further slices of $7, $13, $4, $3, $1, and $24 with legend entries Free/Discounted Services, Notifications, Legal, Audit/Accounting Fees, Call Center, and Other]
Over 100 million records lost at a cost of $16 Billion.
Source: Ponemon Institute
Where is Confidential Data Stored?
[1] ESG Research Report: Protecting Confidential Data, March, 2006.
The Easiest Way to Expose Private Data …
Internally with the Test Environment
70% of data breaches occur internally
(Gartner)
Test environments use personally
identifiable data
Standard Non-Disclosure Agreements
may not deter a disgruntled employee
What about test data stored on laptops?
What about test data sent to
outsourced/overseas consultants?
How about Healthcare/Marketing Analysis
of data?
Payment Card Industry Data Security Standard
(PCI DSS) Req. 6.3.4 states, “Production data (real
credit card numbers) cannot be used for
testing or development”
* The Solution is Data De-Identification *
The Latest Research on Test Data Usage
Overall application testing/development
– 62% of companies surveyed use actual customer data instead
of disguised data to test applications during the development
process
– 50% of respondents have no way of knowing if the data used
in testing had been compromised.
Outsourcing
– 52% of respondents outsourced application testing
– 49% shared live data!!!
Responsibility
– 26% of respondents said they did
not know who was responsible for
securing test data
Source: The Ponemon Institute. The Insecurity of Test Data: The Unseen Crisis
Failure Story – A Real Life Insider Threat
28 yr. old Software Development Consultant
Employed by a large Insurance Company in Michigan
Needed to pay off Gambling debts
Decided to sell Social Security Numbers and other identity
information pilfered from company databases on 110,000
Customers
Attempted to sell data via the Internet
– Names/Addresses/SS#s/birth dates
– 36,000 people for $25,000
Flew to Nashville to make the deal with…..
The United States Secret Service (Ooops)
Results:
Sentenced to 5 Years in Jail
Ordered to pay Sentry $520,000
How is Risk of Exposure being Mitigated?
No laptops allowed in the building
Development and test devices
– Do not have USB
– No write devices (CD, DVD, etc.)
Employees sign documents
Off-shore development does not do the testing
The use of live data is ‘kept quiet’
Encryption is not Enough
DBMS encryption protects against DBMS theft and
hackers
Data decryption occurs as data is retrieved from
the DBMS
Application testing displays data
– Web screens under development
– Reports
– Data entry/update client/server devices
If data can be seen it can be copied
– Download
– Screen captures
– Simple picture of a screen
What is Data De-Identification?
AKA data masking, depersonalization,
desensitization, obfuscation or data scrubbing
Technology that helps conceal real data
Scrambles data to create new, legible data
Retains the data's properties, such as its width,
type, and format
Common data masking algorithms include
random, substring, concatenation, date aging
Used in Non-Production environments as a Best
Practice to protect sensitive data
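The algorithms named above (random, substring, concatenation, date aging) can each be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the product's implementation; all function names are hypothetical. Each function keeps the width and format of its input where that applies.

```python
import random
from datetime import date, timedelta


def mask_digits(value, rng):
    """Random: replace every digit with a random digit, keeping punctuation and width."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)


def mask_keep_last(value, keep=4, fill="X"):
    """Substring: keep the last `keep` characters, blank out earlier letters and digits."""
    cut = len(value) - keep
    return "".join(
        fill if (i < cut and ch.isalnum()) else ch for i, ch in enumerate(value)
    )


def mask_concat(prefix, sequence):
    """Concatenation: build a fictional identifier from a prefix and a counter."""
    return f"{prefix}{sequence:05d}"


def age_date(original, days):
    """Date aging: shift a date by a fixed number of days."""
    return original + timedelta(days=days)


rng = random.Random(2008)  # seeded only so that repeated runs give the same output
print(mask_digits("123-45-6789", rng))        # same dash positions, different digits
print(mask_keep_last("4111-1111-1111-1111"))  # XXXX-XXXX-XXXX-1111
print(mask_concat("CUST", 42))                # CUST00042
print(age_date(date(2006, 6, 20), 30))        # 2006-07-20
```

Random replacement alone does not give the same output for the same input across runs; where masked values must stay consistent between environments, a keyed or lookup-based scheme is the usual complement.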
How does Data De-Identification Protect Privacy?
Comprehensive enterprise data masking provides the
fundamental components of test data management
and enables organizations to de-identify, mask and
transform sensitive data across the enterprise
Companies can apply a range of transformation
techniques to substitute customer data with
contextually-accurate but fictionalized data to produce
accurate test results
By masking personally-identifying information,
comprehensive enterprise data masking protects the
privacy and security of confidential customer data, and
supports compliance with local, state, national,
international and industry-based privacy regulations
Success with Data Masking
– “Today we don’t care if we lose a laptop”
- Large Midwest Financial Company
– “The cost of a data breach is exponentially more expensive
than the cost of masking data”
- Large East Coast Insurer
Success: Data Privacy
About the Client:
$300 Billion Retailer
Largest Company in the World
Largest Informix installation in the world
Application:
– Multiple interrelated retail transaction
processing applications
Challenges:
– Comply with Payment Card Industry (PCI)
regulations that required credit card data to be
masked in the testing environment
– Implement a strategy where Personally
Identifiable Information (PII) is de-identified
when being utilized in the application
development process
– Obtain a masking solution that could mask
data across the enterprise in both Mainframe
and Open Systems environments
Client Value:
– Satisfied PCI requirements by giving
this retailer the capability to mask
credit data with fictitious data
– Masked other PII, such as customer
first and last names, to ensure that
“real data” cannot be extracted from
the development environment
– Adopted an enterprise focus for
protecting privacy by deploying a
consistent data masking methodology
across applications, databases and
operating environments
Solution:
– IBM Optim Data Privacy Solution™
Concluding Thought #1
“It costs much less to protect sensitive data than it
does to replace lost customers and incur damage
to the image of the organization and its brand—an
irreplaceable asset in most cases.”
IT Compliance Group Benchmark Study 2/07
Concluding Thought #2
“We're not going to solve this by making data
hard to steal. The way we're going to solve it is by
making the data hard to use.”
Bruce Schneier, author of "Beyond Fear: Thinking Sensibly
About Security in an Uncertain World"