Enterprise Data Management for Siebel


Optim™
Essentials of Test Data Management
Jonathan L. Karp
Team Lead - LDR
© 2008 IBM Corporation
Section 1: Effective Test Data Management (TDM) for Improving Costs & Efficiency
Disclaimer
This presentation is intended to provide general
background information, not regulatory, legal or
other advice. IBM cannot and does not provide
such advice. Readers are advised to seek
competent assistance from qualified professionals
in the applicable jurisdictions for the types of
services needed, including regulatory, legal or
other advice.
Outline
 TDM – What is it? Why is it important?
 Current approaches
 Key requirements for an effective approach for
TDM
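Enterprise Application: Snapshot in Time
[Diagram: the application version deployed in each environment at one point in time. Production: Version 1; User Acceptance Testing: V2; QA: V2; Development: V3. Development and QA are grouped as the Development and QA Environments.]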
Multiple “Consumers” For Test Environments
[Table: test phases (Unit, Integration, Component, System, System integration, UAT, Implementation) by consumer (Developers, Testers on project teams, Testers in central group, IT operations, Customers). Each cell marks the consumer's role in that phase as Sole, Primary or Secondary. Developers: Unit (Sole), Integration (Sole), Component (Primary), System (Secondary).]
Internal and External “consumers” such as off-shore teams and partners
Multiple Requirements for Test Environments
 Functionality
– Features and capabilities
 Performance
– Speed, availability, tolerance for load
 Usability
– Ease with which the software can be employed
 Security
– Vulnerability to unauthorized usage
 Compliance
– Conformance to internal standards or external regulations
Key Business Goals
 Reduce Business Downtime
 Get to Market Faster
 Maximize Process Efficiencies
 Improve Quality
Some Key Considerations
 Infrastructure Costs – higher HW storage costs
 Development Labor - higher costs
 Defects – Can be expensive
• Cost to resolve defects in the production environment can be 10 to 100
times greater than for those caught in the development environment
 Data Privacy/Compliance
• Data breaches can put you out of business
Test Data Management
Strategy and approach to creating and managing
test environments to meet the needs of various
stakeholders and business requirements.
Current Approaches
#1 - Clone Production
• Request for copy, then wait
• Copy the production database
• After changes, manual examination: Right data? What changed? Correct results? Unintended result? Someone else modify?
• Share test database with everyone else
#2 - Write SQL
• Complex
• Subject to change
• Extract: RI accuracy? Right data?
• Expensive, dedicated staff, ongoing responsibility
Cloning And Data Multiplier Effect
1. Production: 500 GB
2. Training: 500 GB
3. QA: 500 GB
4. Development: 500 GB
5. UAT: 500 GB
6. Integration: 500 GB
Total: 3,000 GB
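The multiplier effect on this slide is simple arithmetic: each full clone adds another copy of the production footprint. A minimal sketch in Python, using the environment names and the 500 GB figure from the slide. The 10% subset ratio in the second calculation is a hypothetical value chosen only for illustration, not a figure from the presentation.

```python
# Sizes in GB, taken from the slide: every environment is a full clone.
PRODUCTION_GB = 500
ENVIRONMENTS = ["Production", "Training", "QA", "Development", "UAT", "Integration"]


def total_storage_gb(production_gb, environments, subset_percent=100):
    """Return total storage when each non-production environment holds
    `subset_percent` of the production data (100 means a full clone)."""
    non_production = len(environments) - 1
    return production_gb + non_production * production_gb * subset_percent // 100


# Full clones everywhere, as on the slide.
cloned_total = total_storage_gb(PRODUCTION_GB, ENVIRONMENTS)
print(f"Full clones: {cloned_total:,} GB")  # prints "Full clones: 3,000 GB"

# HYPOTHETICAL: each non-production environment holds a 10% subset instead.
subset_total = total_storage_gb(PRODUCTION_GB, ENVIRONMENTS, subset_percent=10)
print(f"10% subsets: {subset_total:,} GB")  # prints "10% subsets: 750 GB"
```

The function only models raw data volume; index overhead, backups and growth over time are left out.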
Some Key Issues With Current Approaches
 Cloning can create duplicate copies of large
databases
– Large storage requirements and associated
expenses
– Time consuming to create
– Difficult to manage on an on-going basis
 Data privacy not addressed
 Internally developed approaches not cost effective
– Lengthy development cycles
– Dedicated staff
– On-going maintenance
Using Subsetting For Effective TDM
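[Diagram: PROD is cloned once (CLONED PROD). An extract and load step produces a reduced clone (GOLD), with the database resized* and re-indexed, which then feeds the TRAINING, DEV and TEST environments.]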
The cloning is performed only once!
Effective Test Data Management Solution
 Subsetting capabilities to create realistic and
manageable test databases
 De-identify (mask) data to protect privacy
 Quickly and easily refresh test environments
 Edit data to create targeted test cases
 Audit/Compare ‘before’ and ‘after’ images of the
test data
Key Aspects of an Effective TDM Approach
[Diagram: production databases in the production environment are subset, optionally de-identified, and loaded into test databases in one or more test environments, where the data is refreshed and analyzed.]
Subset: Key Capabilities
 Precise subsets to build realistic “right-sized” test databases
– Application Aware
– Flexible criteria for determining record sets
– Business Logic Driven
– Complete Business Object: Referentially intact subsets
– Across heterogeneous environments
[Diagram labels: DB2 Order Entry, Oracle ERP, Legacy, CRM]
Subset: Complete Business Object
• Referentially-intact subset of data
• Example: All Open –DN Call Back related to Cust_ID 27645 (Karen Smith)

CUSTOMERS (Cust_ID is Primary Key)
19101  Joe Pitt
02134  John Jones
27645  Karen Smith

ORDERS
27645  80-2382  20 June 2006
27645  86-4538  10 October 2006

DETAILS
86-4538  DR1001  System Outage
86-4538  CL2010  Broken Cup Holder
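A referentially intact subset follows the keys from a parent row down to every related child row, so that no extracted row points at a missing parent. The sketch below is one way to illustrate the idea with Python's built-in sqlite3 module and the sample rows from this slide. The table layout, column names and the `extract_business_object` helper are assumptions made for the example, not a description of how Optim works.

```python
import sqlite3


def build_source():
    """Create an in-memory database holding the sample rows from the slide."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (cust_id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            order_id TEXT PRIMARY KEY,
            cust_id TEXT REFERENCES customers(cust_id),
            order_date TEXT
        );
        CREATE TABLE details (
            order_id TEXT REFERENCES orders(order_id),
            item_code TEXT,
            description TEXT
        );
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("19101", "Joe Pitt"), ("02134", "John Jones"), ("27645", "Karen Smith")],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [("80-2382", "27645", "20 June 2006"), ("86-4538", "27645", "10 October 2006")],
    )
    conn.executemany(
        "INSERT INTO details VALUES (?, ?, ?)",
        [("86-4538", "DR1001", "System Outage"), ("86-4538", "CL2010", "Broken Cup Holder")],
    )
    return conn


def extract_business_object(conn, cust_id):
    """Return the customer row plus all orders and order details that hang off it."""
    customers = conn.execute(
        "SELECT cust_id, name FROM customers WHERE cust_id = ?", (cust_id,)
    ).fetchall()
    orders = conn.execute(
        "SELECT order_id, cust_id, order_date FROM orders "
        "WHERE cust_id = ? ORDER BY order_id",
        (cust_id,),
    ).fetchall()
    details = conn.execute(
        "SELECT d.order_id, d.item_code, d.description FROM details d "
        "JOIN orders o ON o.order_id = d.order_id "
        "WHERE o.cust_id = ? ORDER BY d.item_code",
        (cust_id,),
    ).fetchall()
    return {"customers": customers, "orders": orders, "details": details}


subset = extract_business_object(build_source(), "27645")
for table, rows in subset.items():
    print(table, rows)
```

Customer IDs are stored as text so that a value such as 02134 keeps its leading zero.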
Data De-Identification
• De-identify for privacy protection
• Deploy multiple masking algorithms
• Substitute real data with fictionalized yet contextually accurate data
• Provide consistency across environments and iterations
• No value to hackers
• Enable off-shore testing
[Diagram: data moves from Production to Test through Subset, Mask, Propagate, and Validate and Compare steps, across Application X (Oracle), Application Y (Oracle, SQLServer) and Application Z (DB2).]
Ensure Data Privacy Across Non-Production Environments!
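Consistency across environments and iterations means the same real value should always be replaced by the same fictional value, so that separately masked databases still join correctly. One common way to get that property is a keyed, deterministic substitution. The sketch below uses Python's standard hmac module. The key, the replacement list and the `mask_name` function are hypothetical and are not taken from the presentation or from any product.

```python
import hashlib
import hmac

# HYPOTHETICAL values for illustration only: a real key would be kept secret,
# and a real replacement list would be much larger.
MASKING_KEY = b"example-masking-key"
FICTIONAL_NAMES = ["Alex Reed", "Dana Cole", "Morgan Hale", "Robin West", "Jamie Stone"]


def mask_name(real_name, key=MASKING_KEY):
    """Map a real name to a fictional one, deterministically for a given key."""
    digest = hmac.new(key, real_name.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(FICTIONAL_NAMES)
    return FICTIONAL_NAMES[index]


# The same input masks to the same output wherever the function runs with the
# same key, so rows masked in different applications still line up.
masked_in_app_x = mask_name("Karen Smith")
masked_in_app_y = mask_name("Karen Smith")
print(masked_in_app_x == masked_in_app_y)  # prints True
```

With a list this small, different real names can map to the same fictional name; a larger list reduces such collisions but does not rule them out.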
Refresh
 Load test environment with precise set of data
– Subset further as required
 Load utility for large volumes of data
 Easily refresh environments
[Diagram: a Subset is applied to TESTDB and QADB (tables AP_INVOICES, INVOICE DIST, ACCT EVENTS) by Insert, Update or Load.]
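The Insert and Update options on this slide can be pictured as an update-or-insert pass over the subset: rows whose key already exists in the test table are updated, and the rest are inserted. A minimal sketch with Python's sqlite3 module follows. The `ap_invoices` columns, the invoice IDs and the `refresh_table` helper are hypothetical, and a bulk load utility for large volumes would work differently.

```python
import sqlite3


def refresh_table(test_conn, rows):
    """Apply subset rows to ap_invoices: update rows whose key already
    exists, insert the others. Returns (inserted, updated) counts."""
    inserted = updated = 0
    for invoice_id, amount_cents in rows:
        cur = test_conn.execute(
            "UPDATE ap_invoices SET amount_cents = ? WHERE invoice_id = ?",
            (amount_cents, invoice_id),
        )
        if cur.rowcount == 0:
            # No existing row matched the key, so this is a new row.
            test_conn.execute(
                "INSERT INTO ap_invoices (invoice_id, amount_cents) VALUES (?, ?)",
                (invoice_id, amount_cents),
            )
            inserted += 1
        else:
            updated += 1
    test_conn.commit()
    return inserted, updated


# HYPOTHETICAL test database containing one stale row.
test_db = sqlite3.connect(":memory:")
test_db.execute(
    "CREATE TABLE ap_invoices (invoice_id TEXT PRIMARY KEY, amount_cents INTEGER)"
)
test_db.execute("INSERT INTO ap_invoices VALUES ('INV-1', 1000)")

# HYPOTHETICAL subset extracted from the source system.
subset_rows = [("INV-1", 8000), ("INV-2", 2000)]
refresh_result = refresh_table(test_db, subset_rows)
refreshed = test_db.execute(
    "SELECT invoice_id, amount_cents FROM ap_invoices ORDER BY invoice_id"
).fetchall()
print(refresh_result, refreshed)
```

Amounts are held as integer cents to avoid floating-point rounding in the comparison of monetary values.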
Analyzing Test Data
Version 1: INVOICES
27645  86-4538  Widget#1     $80.00
27645  86-4538  Widget#PG13  $20.00
Invoice Total                $100.00

Version 2: INVOICES
27645  86-4538  Widget#1     $50.00
27645  86-4538  Widget#PG13  $50.00
Invoice Total                $100.00

 Both invoices total $100
 Composition is different
 Could an error have been missed?
Analyze Test Data
 Compare the "before" and "after" data from an application test
 Compare results after running modified application during regression testing
 Identify differences between separate databases
 Audit changes to a database
 Compare should analyze complete sets of data, finding changes in rows in tables
– Single-table or multi-table compare
– Compare file of results
 Edit Data to Create Test Cases
[Diagram: SOURCE 1 and SOURCE 2 feed a COMPARE PROCESS, which produces a COMPARE FILE.]
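A row-level compare catches differences that a summary check misses. In the invoice example from the preceding slide, both versions total $100, yet every line item changed. The sketch below keys each row by order number and item and reports added, removed and changed rows. The `compare_rows` helper and the dictionary layout are assumptions made for the example; amounts are held in cents.

```python
def compare_rows(before, after):
    """Compare two {key: value} mappings and return the keys that were
    added, removed, or changed between them."""
    added = sorted(k for k in after if k not in before)
    removed = sorted(k for k in before if k not in after)
    changed = sorted(k for k in before if k in after and before[k] != after[k])
    return {"added": added, "removed": removed, "changed": changed}


# Line items from the slide, keyed by (order number, item); amounts in cents.
version_1 = {("86-4538", "Widget#1"): 8000, ("86-4538", "Widget#PG13"): 2000}
version_2 = {("86-4538", "Widget#1"): 5000, ("86-4538", "Widget#PG13"): 5000}

# A total-only check passes even though the composition differs.
totals_match = sum(version_1.values()) == sum(version_2.values())
diff = compare_rows(version_1, version_2)

print("Totals match:", totals_match)  # prints "Totals match: True"
print("Changed rows:", diff["changed"])
```

A multi-table compare would apply the same idea per table, using each table's key columns.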
Effective TDM: Example ROI Benefits
Projected ROI = 504% (3 years), Payback Period = 13 months
Summary: An Effective TDM Solution
 Ability to extract precise subsets of related data to
build realistic, “right-sized” test databases
– Complete business object
– Create referentially intact subsets
– Flexible criteria for determining record sets
 De-identify sensitive data in the test environment to
ensure compliance with regulatory requirements
for data privacy
 Easily refresh test environments
 Analyze test data.
Section 2: Data Privacy... Closing the Gap
Agenda
 The Latest on Data Privacy
 The Easiest Way to Expose Private Data
 Understanding the Insider Threat
 Considerations for a Privacy Project
 Success Stories
No part of this presentation may be reproduced or transmitted in any form by any means,
electronic or mechanical, including photocopying and recording, for any purpose without the
express written permission of IBM
The Latest on Data Privacy
 2007 statistics
– $197: cost to companies per compromised record
– $6.3 Million: average cost per data breach “incident”
– 40%: share of breaches where the responsibility was with outsourcers, contractors, consultants and business partners
– 217 Million: total number of records containing sensitive personal information involved in security breaches in the U.S. since 2005
* Sources: Ponemon Institute, Privacy Rights Clearinghouse, 2007
Did You Hear?
 UK gov’t suffered a massive data
breach in Nov. 07
– HMRC (Her Majesty's Revenue & Customs), the UK equivalent of the IRS
 Lost 2 disks containing personal
information on 25 million people
(ALMOST ½ of UK population!)
 Information has a criminal value
of $3.1 Billion
 No reported criminal activity to
date
How much is personal data worth?
 Credit Card Number With PIN - $500
 Driver's License - $150
 Birth Certificate - $150
 Social Security Card - $100
 Credit Card Number with Security Code and Expiration Date - $7-$25
 PayPal Account Log-on and Password - $7
Representative asking prices found recently on cybercrime forums.
Source: USA TODAY research 10/06
Cost to Company per Missing Record: $197
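[Pie chart: components of the $197 cost per missing record. Labeled slices: Loss of Customers, $98; Incident Response, $54; Lost Productivity, $30. Other legend entries: Free/Discounted Services, Notifications, Legal, Audit/Accounting Fees, Call Center, Other. Other slice values: $7, $13, $4, $3, $1, $24 (pairing with legend entries not recoverable).]
Over 100 million records lost at a cost of $16 Billion.
Source: Ponemon Institute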
Where is Confidential Data Stored?
[1] ESG Research Report: Protecting Confidential Data, March, 2006.
The Easiest Way to Expose Private Data …
Internally with the Test Environment
 70% of data breaches occur internally
(Gartner)
 Test environments use personally
identifiable data
 Standard Non-Disclosure Agreements
may not deter a disgruntled employee
 What about test data stored on laptops?
 What about test data sent to
outsourced/overseas consultants?
 How about Healthcare/Marketing Analysis
of data?
 Payment Card Industry Data Security Standard (PCI DSS)
Req. 6.3.4 states, “Production data (real
credit card numbers) cannot be used for
testing or development”
* The Solution is Data De-Identification *
The Latest Research on Test Data Usage
 Overall application testing/development
– 62% of companies surveyed use actual customer data instead
of disguised data to test applications during the development
process
– 50% of respondents have no way of knowing if the data used
in testing had been compromised.
 Outsourcing
– 52% of respondents outsourced application testing
– 49% shared live data!!!
 Responsibility
– 26% of respondents said they did
not know who was responsible for
securing test data
Source: The Ponemon Institute. The Insecurity of Test Data: The Unseen Crisis
Failure Story – A Real Life Insider Threat
 28 yr. old Software Development Consultant
 Employed by a large Insurance Company in Michigan
 Needed to pay off Gambling debts
 Decided to sell Social Security Numbers and other identity
information pilfered from company databases on 110,000
Customers
 Attempted to sell data via the Internet
– Names/Addresses/SS#s/birth dates
– 36,000 people for $25,000
 Flew to Nashville to make the deal with…
 The United States Secret Service (Oops)
Results:
 Sentenced to 5 Years in Jail
 Ordered to pay Sentry $520,000
How is Risk of Exposure being Mitigated?
 No laptops allowed in the building
 Development and test devices
– Do not have USB
– No write devices (CD, DVD, etc.)
 Employees sign documents
 Off-shore development does not do the testing
 The use of live data is ‘kept quiet’
Encryption is not Enough
 DBMS encryption protects against DBMS theft and
hackers
 Data decryption occurs as data is retrieved from
the DBMS
 Application testing displays data
– Web screens under development
– Reports
– Data entry/update client/server devices
 If data can be seen it can be copied
– Download
– Screen captures
– Simple picture of a screen
What is Data De-Identification?
 AKA data masking, depersonalization,
desensitization, obfuscation or data scrubbing
 Technology that helps conceal real data
 Scrambles data to create new, legible data
 Retains the data's properties, such as its width,
type, and format
 Common data masking algorithms include
random, substring, concatenation, date aging
 Used in Non-Production environments as a Best
Practice to protect sensitive data
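The algorithm families named on this slide can be sketched in a few lines of Python. These are simplified interpretations written for illustration, not the algorithms of any particular product: the function names, the `CUST` prefix and the sample values are all hypothetical. Each function returns data that keeps a plausible width, type and format.

```python
import random
import string
from datetime import date, timedelta


def mask_random(value, rng):
    """Random: replace each letter or digit with a random one of the same
    kind, leaving length and punctuation unchanged."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            letters = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(letters))
        else:
            out.append(ch)
    return "".join(out)


def mask_substring(value, keep=4, fill="X"):
    """Substring: keep only the last `keep` characters and fill the rest."""
    if keep <= 0:
        return fill * len(value)
    return fill * max(len(value) - keep, 0) + value[-keep:]


def mask_concatenate(prefix, sequence_number, width=5):
    """Concatenation: build a value from a fixed prefix and a padded number."""
    return f"{prefix}{sequence_number:0{width}d}"


def age_date(original, days):
    """Date aging: shift a date by a fixed number of days."""
    return original + timedelta(days=days)


# HYPOTHETICAL sample values; the seed makes the random mask repeatable.
masked_id = mask_random("123-45-6789", random.Random(42))
masked_card = mask_substring("4111111111111111")
masked_customer = mask_concatenate("CUST", 42)
aged = age_date(date(2006, 6, 20), 30)

print(masked_id, masked_card, masked_customer, aged)
```

Purely random masking does not give the same output for the same input across runs unless the random source is seeded or replaced by a keyed function, which matters when masked values must stay consistent between databases.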
How does Data De-Identification Protect Privacy?
 Comprehensive enterprise data masking provides the
fundamental components of test data management
and enables organizations to de-identify, mask and
transform sensitive data across the enterprise
 Companies can apply a range of transformation
techniques to substitute customer data with
contextually-accurate but fictionalized data to produce
accurate test results
 By masking personally-identifying information,
comprehensive enterprise data masking protects the
privacy and security of confidential customer data, and
supports compliance with local, state, national,
international and industry-based privacy regulations
Success with Data Masking
– “ Today we don’t care if we lose a laptop”
- Large Midwest Financial Company
– “ The cost of a data breach is exponentially more expensive
than the cost of masking data”
- Large East Coast Insurer
Success: Data Privacy
About the Client:
$300 Billion Retailer
Largest Company in the World
Largest Informix installation in the world
 Application:
– Multiple interrelated retail transaction
processing applications
 Challenges:
– Comply with Payment Card Industry (PCI)
regulations that required credit card data to be
masked in the testing environment
– Implement a strategy where Personally
Identifiable Information (PII) is de-identified
when being utilized in the application
development process
– Obtain a masking solution that could mask
data across the enterprise in both Mainframe
and Open Systems environments
 Client Value:
– Satisfied PCI requirements by giving
this retailer the capability to mask
credit data with fictitious data
– Masked other PII, such as customer
first and last names, to ensure that
“real data” cannot be extracted from
the development environment
– Adopted an enterprise focus for
protecting privacy by deploying a
consistent data masking methodology
across applications, databases and
operating environments
 Solution:
– IBM Optim Data Privacy Solution™
Concluding Thought #1
“It costs much less to protect sensitive data than it
does to replace lost customers and incur damage
to the image of the organization and its brand—an
irreplaceable asset in most cases.”
IT Compliance Group Benchmark Study 2/07
Concluding Thought #2
“We're not going to solve this by making data
hard to steal. The way we're going to solve it is by
making the data hard to use.”
Bruce Schneier, author of "Beyond Fear: Thinking Sensibly
About Security in an Uncertain World"