Getting Data Ready for WebFOCUS
Lucius McInnis, Systems Engineer – Client Services Group
Kam Wong, Solutions Architect – iWay Software
March 22, 2012
Data Quality/Business Intelligence Lexicon
• GO GO: 1960’s Dance Craze (Image: target.com)
• GI GI: 1958 Romantic Musical (Image: imdb.com)
• GI GO: Garbage-In-Garbage-Out
Get Rid Of The Garbage…
• Access
• Cleanse
• Standardize
• Monitor
• Manage
• Accurate data promotes accurate information and decisions…
When Business Data Is Not Managed
• ERRORS
• DUPLICATION
• CONFUSION
AGENDA
• The Path from Data to Information
• Access to Data
• Data Quality
• Master Data Management/Data Synchronization
• Demonstration
Information
• Revenue Generation
• Quality of Care/Service
• Operations and Financial Mgmt.
• Fraud, Waste, and Abuse
• Risk, Compliance, and Governance
Path from Data to Information
Infrastructure
• Allow for access to data
• Real-Time and Batch Information Movement
• Reusability
Data Quality
• Allow for Real-Time Data Quality
• Correct Data Quality issues before they propagate
Master Data Management
• Centralize the management of information
• Control the information throughout the organization
Path from Data to Information
#1 Infrastructure
• Allow for access to data
• Real-Time and Batch Information Movement
• Reusability
Integration Approach – Start with an Integrated Infrastructure
Pre-packaged Integration Components
• ERP/Financials: Ariba, I2, JD Edwards, Lawson, Manugistics, Microsoft, Oracle, SAP
• SFA/CRM: Amdocs/Clarify, BMC/Remedy, MS Dynamics, Oracle/Siebel, Salesforce.com, SAP
• Legacy Systems: CICS, IMS, VSAM, .NET, Java, TUXEDO, MUMPS
• Data Warehouse: DB2, ETL, Oracle/Essbase, MS SSAS/OLAP, Netezza, SAP BW, Teradata
• Industry: ACORD, CIDX, HL7, RNIF, SWIFT, 1Sync
• B2B: Internet EDI, Legacy EDI, MFT, Online B2B, XML
Enterprise Data Integration Scenario
(Diagram labels: Data Sources, Data Integration, Data Quality, Reports, Dashboards)
Path from Data to Business Intelligence
#2 Data Quality
• Allow for Real-Time Data Quality
• Correct Data Quality issues before they propagate
The Business Value of Data Quality
• Improves customer-facing processes:
  Promotes accurate client address and household information
• Enables advanced analysis:
  Facilitates the use of data-mining, market predictions, fraud detection, and future client value
• Credit and behavioral scoring:
  Helps financial institutions improve risk management - Basel Capital Accord III (2010)
• Assists healthcare organizations:
  Develop an Enterprise Master Patient Index (EMPI) leveraging connectivity to legacy systems and databases
Data Quality Center – Profiling
• Profiling – Technical (Pre-built)
  • Basic Analysis
    • Minimums
    • Maximums
    • Averages
    • Counts
    • Etc.
  • Patterns / Masking
  • Extremes
  • Quantities
  • Frequency Analysis
  • Foreign Key Analysis
• Profiling – All
  • Charting
  • Grouping / Aggregate
  • Drilldown / Interactive Displays
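The profiling checks listed above (basic analysis, pattern masking, frequency analysis, foreign key analysis) can be illustrated with a small Python sketch. This is not how the Data Quality Center implements profiling; the function names and sample values are assumptions chosen only to show the idea.

```python
import re
from collections import Counter
from statistics import mean


def mask(value: str) -> str:
    """Replace letters with 'A' and digits with '9' to expose a value's pattern."""
    masked = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", masked)


def profile_numeric(values):
    """Basic analysis of a numeric column: counts, minimum, maximum, average."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "nulls": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "avg": mean(present),
    }


def profile_patterns(values):
    """Frequency analysis of the masked patterns found in a text column."""
    return Counter(mask(v) for v in values)


def orphan_keys(child_keys, parent_keys):
    """Foreign key analysis: child values that have no matching parent key."""
    parents = set(parent_keys)
    return sorted({k for k in child_keys if k not in parents})


# Hypothetical sample data
amounts = [120.0, 75.5, None, 310.25]
zips = ["10121", "10121-0001", "1O121", "94105"]  # third value has a letter O

stats = profile_numeric(amounts)
patterns = profile_patterns(zips)
orphans = orphan_keys(["C1", "C2", "C9"], ["C1", "C2", "C3"])
```

In this sample the pattern `9A999` appears once among the ZIP codes, which is the kind of outlier that masking is meant to surface.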
Data Quality – Cleansing
• Parsing
  • data parsed into components (pattern based)
• Standardization
  • transformation into standard format (Jim Smith -> James Smith)
  • standard and nonstandard abbreviations (Str. -> Street)
  • language-specific replacements
• Large number of domain oriented algorithms
  • Address
  • Party
  • Vehicle
  • Name
  • Identification number
  • Credit Card number
  • Bank account number
• Data quality validation
  • validation against rules
  • validation against reference tables
• Extension by custom validation steps
  • using complex functions and rules including
    • Levenshtein distance
    • SoundEx
    • internal (Java-based) functions
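A minimal Python sketch of the standardization and validation steps on this slide follows. The lookup tables, field names, and the ZIP code rule are hypothetical stand-ins for real reference data and business rules; the product's own algorithms are not shown in the deck.

```python
import re

# Hypothetical lookup tables; a real deployment would load these from reference data.
NICKNAMES = {"jim": "James", "bill": "William", "bob": "Robert"}
ABBREVIATIONS = {"str.": "Street", "st.": "Street", "ave.": "Avenue"}
VALID_STATES = {"NY", "CA", "TX"}  # stand-in for a reference table


def standardize_name(full_name: str) -> str:
    """Replace a known nickname in the first-name position with its formal form."""
    parts = full_name.split()
    if parts:
        parts[0] = NICKNAMES.get(parts[0].lower(), parts[0])
    return " ".join(parts)


def standardize_address(address: str) -> str:
    """Expand standard and nonstandard abbreviations token by token."""
    return " ".join(ABBREVIATIONS.get(tok.lower(), tok) for tok in address.split())


def validate(record: dict) -> list:
    """Return a list of rule violations; an empty list means the record passed."""
    issues = []
    # Validation against a rule (assumed: 5-digit ZIP or ZIP+4)
    if not re.fullmatch(r"\d{5}(-\d{4})?", record.get("zip", "")):
        issues.append("zip: does not match 5-digit or ZIP+4 pattern")
    # Validation against a reference table
    if record.get("state") not in VALID_STATES:
        issues.append("state: not found in reference table")
    return issues


print(standardize_name("Jim Smith"))       # James Smith
print(standardize_address("2 Penn Str."))  # 2 Penn Street
print(validate({"zip": "1012", "state": "ZZ"}))
```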
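The two named functions available to custom validation steps, Levenshtein distance and SoundEx, can be sketched in Python as follows. The slide refers to Java-based internal functions; this is only an illustration of the textbook algorithms (the Soundex variant is the common American one), not the product's code.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}


def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits, zero-padded."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    last_code = SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        code = SOUNDEX_CODES.get(c, "")
        if code and code != last_code:
            result += code
        if c not in "HW":  # H and W do not separate letters with the same code
            last_code = code
    return (result + "000")[:4]


print(levenshtein("kitten", "sitting"))      # 3
print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

A custom rule might, for example, flag two surnames as candidate duplicates when their Soundex codes agree and their Levenshtein distance is small; the thresholds for such a rule would be a business decision.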
Data Quality – Match & Merge
• Unification
  • identification of the candidate groups
    • company
    • address
    • person
    • product
    • …etc.
• Fuzzy logic and scoring
  • Same name + same address
  • Same name + similar address
  • Similar name + same address
  • Similar name + similar address
• Deduplication
  • best representation of the identified subject
  • golden record creation
• Identification
  • new data entries – to identify subject (person, address, etc.) to which the new record is connected (matched)
• Complex business rules
  • using sophisticated algorithms and functions including
    • Levenshtein distance
    • Hamming distance
    • Edit distance
    • Data quality score values
    • Date stamps of last modification
    • Source system originating data
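The four name/address combinations under "Fuzzy logic and scoring" can be sketched as a small scoring table in Python. The similarity measure (`difflib.SequenceMatcher` from the standard library), the 0.8 threshold, and the score assigned to each combination are all assumptions made for illustration; the slide does not say how the product computes or weights similarity. A Hamming distance helper is included because the slide names it.

```python
from difflib import SequenceMatcher

SIMILAR_THRESHOLD = 0.8  # assumed cut-off between "similar" and "different"

# Hypothetical scores for the four combinations listed on the slide.
MATCH_SCORES = {
    ("same", "same"): 1.0,
    ("same", "similar"): 0.9,
    ("similar", "same"): 0.8,
    ("similar", "similar"): 0.6,
}


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def compare(a: str, b: str) -> str:
    """Classify two values as 'same', 'similar', or 'different'."""
    a, b = normalize(a), normalize(b)
    if a == b:
        return "same"
    if SequenceMatcher(None, a, b).ratio() >= SIMILAR_THRESHOLD:
        return "similar"
    return "different"


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Score a candidate pair by (name, address) similarity; 0.0 means no match."""
    key = (
        compare(rec_a["name"], rec_b["name"]),
        compare(rec_a["address"], rec_b["address"]),
    )
    return MATCH_SCORES.get(key, 0.0)


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


a = {"name": "John Smith", "address": "12 Main Street"}
b = {"name": "Jon Smith", "address": "12 Main Street"}
print(match_score(a, b))  # similar name + same address
```

Pairs that score above a chosen cut-off would form a candidate group for deduplication and golden record creation.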
Data Quality: Issue Management
Data Quality Issue Management
Issue Tracker Portal – Workflow Management
Issue Tracker Portal – Issue Resolution (1)
Issue Tracker Portal – Issue Resolution (2)
Path from Data to Business Intelligence
#3 Master Data Management
• Centralize the management of information
• Control the information throughout the organization
Moving Towards MDM from Data Quality
1. Matching: Identification and linking of related entries within or across sets of data.
2. Merging: Creation of the golden data based on one or several reference sources and rules.
3. Propagating: Update other systems with Golden Data if required.
4. Monitoring: Deployment of controls to ensure ongoing conformance of data to business rules that define data quality for the organization.
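The merging and propagating steps can be sketched in Python as below. The merge rule used here (take each non-empty attribute from the most recently modified record, breaking ties by a source ranking) is an assumption for illustration; the deck says only that golden data is created from reference sources and rules. Record layout, source names, and keys are hypothetical.

```python
from datetime import date

# Hypothetical source ranking: a lower number wins when modification dates tie.
SOURCE_PRIORITY = {"CRM": 0, "ERP": 1, "LEGACY": 2}


def merge_golden(records):
    """Build one golden record from already-matched records, attribute by attribute."""
    ranked = sorted(
        records,
        key=lambda r: (-r["modified"].toordinal(), SOURCE_PRIORITY.get(r["source"], 99)),
    )
    golden = {}
    for rec in ranked:
        for field, value in rec["data"].items():
            if value and field not in golden:  # first non-empty value wins
                golden[field] = value
    return golden


def propagate(golden, records):
    """List the (source, key, field, new_value) updates needed to align each source."""
    updates = []
    for rec in records:
        for field, value in golden.items():
            if rec["data"].get(field) != value:
                updates.append((rec["source"], rec["key"], field, value))
    return updates


matched = [
    {"source": "CRM", "key": "C-17", "modified": date(2012, 3, 1),
     "data": {"name": "James Smith", "phone": "", "city": "New York"}},
    {"source": "LEGACY", "key": "0042", "modified": date(2011, 6, 15),
     "data": {"name": "Jim Smith", "phone": "212-555-0100", "city": "New York"}},
]

golden = merge_golden(matched)
updates = propagate(golden, matched)
```

Here the golden record takes the name from the newer CRM record and the phone number from the legacy record, and `updates` lists one change to push back to each source.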
MDM Architectures
Consolidated (diagram: Sources and a Master)
• Master is Single Version of Truth
• Data Quality at Master
• Updates occur at Sources
• Updates propagated to Master
Registry Style (diagram: Sources and a Master)
• Multiple Versions of Truth
• Data Quality is Ongoing
• Updates occur at Sources
• Keys and Metadata in Registry
• Updates propagated to other Sources
• Other Styles Supported
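To make the registry style concrete, the sketch below keeps only keys and metadata in the registry and uses them to work out which other sources should receive an update made at one source. The class and method names are hypothetical and do not represent any iWay API.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    master_id: str
    source_keys: dict = field(default_factory=dict)  # source system -> local key
    metadata: dict = field(default_factory=dict)


class Registry:
    """Holds cross-reference keys and metadata only; the data stays in the sources."""

    def __init__(self):
        self._entries = {}
        self._by_source = {}

    def link(self, master_id, source, local_key, **metadata):
        """Register that (source, local_key) refers to the entity master_id."""
        entry = self._entries.setdefault(master_id, RegistryEntry(master_id))
        entry.source_keys[source] = local_key
        entry.metadata.update(metadata)
        self._by_source[(source, local_key)] = master_id

    def targets_for_update(self, source, local_key):
        """Return {other_source: local_key} that should receive an update made at source."""
        master_id = self._by_source[(source, local_key)]
        keys = self._entries[master_id].source_keys
        return {s: k for s, k in keys.items() if s != source}


registry = Registry()
registry.link("M1", "CRM", "C-17", entity_type="person")
registry.link("M1", "ERP", "900")
targets = registry.targets_for_update("CRM", "C-17")
```

In the consolidated style, by contrast, the master would also store the cleansed attribute values rather than only the cross-reference keys.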
Project Successes – Pathway to Maturity
Getting to MDM – “Golden Data”
1. Start with Data Profiling
• Understand the data you have
• Identify inconsistencies in the data
• Disseminate the information about the data quality
2. Continue with Data Quality
• Validate, standardize and cleanse for purpose
• Automate the process
• De-duplication (Match & Merge)
3. End with Master Data
• Synchronize with closed loop feedback integration
• Provide a single view for all stakeholders
4. Implement Data Governance – Issue Tracking
Demonstration
Data Management Life-Cycle
Thank You! - Questions?
iWay Software
Because Everything Should Work Together.
WebFOCUS
Because Everyone Makes Decisions.