Data Management - Inside Analysis
Data Integration Strategy
Air Force Personnel Operations
Agency – Human Resources
December 2, 2011
Copyright © 2011 SAS Institute Inc. All rights reserved.
Agenda - TOC
Who is SAS?
Where is SAS in the Air Force
Essential Business Process Agility
Data Integration and Information Management
Data Quality: Cleansing and Enrichment
Data Integration/Management Benefits
Why SAS for AFPOA
Resources and Further Contact Information
SAS – Leader in Applied Analytics
Largest privately held software company in the world
2010 Revenue: $2.43 Billion – 35 Years of consecutive growth
24% reinvestment in R&D for 2010
50,000 deployments worldwide
12,000 employees
90 of the top 100 of Fortune Global 500
All 15 U.S. Departments and 90% of U.S. Federal Agencies
All 50 U.S. State Governments
Government agencies in 80 countries
Outstanding Technical Support
Services
Life Sciences: 6%
Manufacturing: 5.5%
Retail: 4.5%
Communications: 7%
Education: 3%
Services: 11%
Energy & Utilities: 3%
Other: 1%
Government: 15%
Financial Services: 42%
Where SAS is in the U.S. Air Force
In a new framework for AF Human Resources, these SAS installations could be leveraged or, if needed, streamlined.
Malmstrom AFB
Hanscom AFB
Offutt AFB
Hill AFB
Wright-Patterson AFB
Langley AFB
Chantilly, VA &
Washington, D.C.
Scott AFB
Peterson AFB
Nellis AFB
Edwards AFB
LA AFB
Tinker AFB
Kirtland AFB
Huntsville
Robins AFB
Maxwell AFB-Gunter
Barksdale AFB
Brooks, Randolph, and Lackland AFBs
Eglin AFB
Hurlburt Field
Patrick AFB
Essential for Business Process Agility
Data Integration defined in terms of Information
[Diagram: Unified Platform for Information]
Components: Data Access, Data Quality, Data Integration, Master Data, Analytic Mgmt., Decision Mgmt.
Callouts:
Data Access/Integration/Quality
Data Governance for Airman Personnel PII
Master Data: Best Record Airman Personnel
Decision Management Support: Performance and Cost
Analytical Model Management and Governance: Recalibrate or New
Data Integration and Information Management
The Opportunity
Acceleration is required to meet the current demands in the Business and Technical landscape.
Managing information for Air Force personnel, who are the most valuable asset, includes the ability to value information appropriately.
[Diagram: Information Management, surrounded by eight kinds of diversity]
Data Diversity: Big Data
Use Case Diversity: Unified Data Management
Role Diversity: IT/Business
Deployment Diversity: Cloud
Consumption Diversity: Mobility
Architecture Diversity: Enterprise Architecture
Temporal Diversity: Right-Time
Application Diversity: Operational Analytics
Data Management and Integration Capability View
Agile And Responsive To Changing Requirements
Linking IT Strategy to Business Strategy
[Diagram: capability layers]
Services: Data Services; Data & Analytic Services
Governance: Data Governance; Analytics Governance
Data Management: Data Integration, Data Quality, MDM
Analytics Management: Decision Management, Model Management & Monitoring, Model Deployment & Integration
Infrastructure Support: Text & Unstructured Data Support, Events, Security, Meta-data & Lineage, Monitoring & Deployment
Enterprise Data Access
Data Management
[Diagram: Unified Platform for Information Management]
Data Management: Data Integration, Data Quality, Master Data Management
Enterprise Data Access and Federation
Resolves:
Architecture is rigid; not flexible to changes in business needs
Managing end-to-end processes is challenging
Why SAS/DataFlux for Data Management
SAS the Original Data Warehouse Pioneer
DataFlux
35 years Data Management Experience
Thousands of Data Management Customers worldwide
End-to-end Solution Available
Proven Market Leader in Data Management
Recognized as a leading provider of enterprise data quality, data integration, data governance and MDM solutions
Gartner Leader for DQ
Gartner Leader for DI
Gartner DI Leader Best-in-Class Data Access Technology
Over 2,300 customers worldwide
Shortest Time to delivery for MDM
Scalable Grid Enabled Platform
Only Data Management vendor with integrated analytic scoring
Offices throughout the US, the UK, France, Germany and Australia
World Class Partner Network
Thousands of consultants worldwide
Growth through consistent R&D
Founded in 1997
Acquired by SAS in 2000
Operates as a wholly owned subsidiary
SAS’ Data Management Portfolio
SAS Data Integration Server
SAS Data Integration Server
SAS Enterprise Integration Server
SAS Data Quality Solution (Server minus DI Studio)
DataFlux Data Management Platform
Data Management Studio
Data Management Server
Data Federation Server
qMDM – Master Data Management
DataFlux Solution Accelerators
DataFlux Event Stream Processor
Supporting SAS Technologies
SAS/ACCESS
» Interface to <insert favorite RDBMS> / Interface to PC Files
SAS Integration Technologies
» Message Queuing APIs, Web Services, Publishing Framework
SAS Data Management Advantage – Data Access
Any Data, Anywhere, Anytime: Best-in-Class Data Access Technology
SAS/ACCESS: What is it?
SAS/ACCESS interfaces are out-of-the-box solutions that provide enterprise data access and integration between SAS and third-party databases.
SAS/ACCESS interfaces are highly optimized to enable your SAS® solutions to read, write and update data regardless of its native database or platform.
Because the data appears native to SAS, there is no need to learn Structured Query Language (SQL) or any other database-specific query language.
All data is presented to users in a unified way: complex activities like bulk loading are simplified, and cross-source integration is seamless.
» SAS leverages the underlying RDBMS for you transparently
» For example, a single data set option is used to invoke bulk loaders
» DATA MyORA.ORACLE_TABLE(BULKLOAD=YES);
SAS Data Management Advantage
Any Data, Anywhere, Anytime: Best-in-Class Data Access Technology
Enterprise-wide, netcentric to meet the operational needs of the AF
RDBMS
SAS/ACCESS Interface to DB2
SAS/ACCESS Interface to Informix
SAS/ACCESS Interface to Microsoft SQL Server
SAS/ACCESS Interface to MySQL
SAS/ACCESS Interface to ODBC
SAS/ACCESS Interface to OLE DB
SAS/ACCESS Interface to Oracle
SAS/ACCESS Interface to Sybase
SAS/ACCESS Interface to Sybase IQ
SAS/ACCESS Interface to Teradata
Non-Relational
SAS/ACCESS Interface to ADABAS
SAS/ACCESS Interface to DATACOM/DB
SAS/ACCESS Interface to IDMS/R
SAS/ACCESS Interface to IMS-DL/I
SAS/ACCESS Interface to PC Files
SAS/ACCESS Interface to SYSTEM 2000
MPP Database Appliances
SAS/ACCESS Interface to Aster Data nCluster
SAS/ACCESS Interface to Greenplum
SAS/ACCESS Interface to Netezza
SAS/ACCESS Interface to Exadata
SAS/ACCESS Interface to ODBC (ParAccel, Vertica, Microsoft PDW)
Enterprise Applications
SAS Data Surveyor for Oracle Applications
SAS Data Surveyor for SAP
SAS Data Surveyor for Siebel
SAS Data Surveyor for SalesForce.com
And more…
OLAP
Message Queues
XML
Web Services
Unstructured, Semi-Structured
SAS Data Integration and Management
Benefits and Features
Data Integration Tasks | SAS / DataFlux | Other ETL Tools
Table Creation | Low Effort | High Effort
Fast Extract | Automatic | Separate Utility
Batch & Real-Time | Low Effort | Multi Step Deployment
Data Quality | Integrated | Separate Tool
Data Services | Low Effort | Custom Code
Sorts & Join | High Performance | Low Performance
SQL Pushdown | Automatic | High Effort
Federated Views | Automatic | Separate Tool
Parallelism | Low Effort / Any Job | High Effort / Limited Job Sets
Scoring | Integrated | Custom Code
Customization | Low Effort | High Effort
SAS Data Management Workbench
Data Integration via a Flexible and Powerful Interface
Execute Data Quality Workflows
Metadata Driven Environment
» Deploy governance within the IT infrastructure
» Determine integration method
» Real-time
» Batch
» Virtual
» Reuse and redeploy the same set of business rules across applications
Review execution results
» Exceptions
» Debugging
» Tuning
» Optimization
DM System Reporting
SAS Data Management Advantage
Scalable: In-Database EL-T
Visual indicators show you where processing takes place.
Push-down is a simple option set on the SQL transform.
Users can control where processing takes place; two clicks enable users to “move first then process” via bulk loader.
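The difference between extracting rows for processing and pushing the work down into the database can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database; the table name, columns, and sample rows are invented for illustration, and nothing here represents SAS Data Integration Studio or any SAS API.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a small hypothetical personnel table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE personnel (id INTEGER, base TEXT, years REAL)")
    conn.executemany(
        "INSERT INTO personnel VALUES (?, ?, ?)",
        [(1, "Scott", 4.0), (2, "Scott", 6.0), (3, "Hill", 10.0)],
    )
    return conn

def average_years_client_side(conn):
    """Extract every row, then aggregate in the client (no push-down)."""
    totals = {}
    for _id, base, years in conn.execute("SELECT id, base, years FROM personnel"):
        count, total = totals.get(base, (0, 0.0))
        totals[base] = (count + 1, total + years)
    return {base: total / count for base, (count, total) in totals.items()}

def average_years_pushdown(conn):
    """Let the database aggregate; only the summary rows leave the database."""
    rows = conn.execute("SELECT base, AVG(years) FROM personnel GROUP BY base")
    return dict(rows)

conn = make_db()
print(average_years_client_side(conn))
print(average_years_pushdown(conn))
```

Both functions return the same averages. The push-down version moves one row per base out of the database instead of every source row, which is the point of processing in-database.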
SAS Data Management Advantage
Analytic Integration: Integrated Scoring
Integrating predictive analytics from SAS Enterprise Miner into Data Management tasks takes just two clicks.
Shared metadata provides the lineage of model dependencies and shows where models are consumed.
With Scoring Accelerator, scoring functions can be pushed “In-Database” and called via SQL function calls from DI Studio.
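As a rough analogy for calling a published scoring function from SQL, the sketch below registers a Python function with an in-memory SQLite database and invokes it inside a query. This is not the SAS Scoring Accelerator; the function name `retention_score`, the coefficients, and the `airmen` table are all hypothetical, and in SQLite the function runs in the client process rather than on a database server.

```python
import math
import sqlite3

# Hypothetical logistic-regression coefficients, invented for illustration.
INTERCEPT = -2.0
COEF_YEARS = 0.25

def retention_score(years):
    """Return a probability-like score in (0, 1) for one row."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + COEF_YEARS * years)))

conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call it by name.
conn.create_function("retention_score", 1, retention_score)

conn.execute("CREATE TABLE airmen (id INTEGER, years REAL)")
conn.executemany(
    "INSERT INTO airmen VALUES (?, ?)", [(1, 2.0), (2, 8.0), (3, 16.0)]
)

# The model is applied row by row as part of the query itself.
scored = conn.execute(
    "SELECT id, retention_score(years) FROM airmen ORDER BY id"
).fetchall()
for row in scored:
    print(row)
```

The design point is that the query, not the calling program, applies the model, so scored results can be filtered, joined, or aggregated with ordinary SQL.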
Metadata Management
Sophisticated metadata mapping technologies accelerate development
Impact Analysis/Data Lineage to see process and data relationships
Multi-user Collaboration with DEV/TEST/PROD promotion management
Powerful metadata reporting including runtime and performance analysis
User Interface Roles and Security
SAS Data Management Advantage
Best in Class: DataFlux Data Quality
Embedded into batch, near-time and real-time processes.
Rules callable through message queues, Web services and custom exits.
Data cleansing provided in native languages with localizations.
Metadata built and shared across the entire SAS Platform.
Generate and append postal addresses, geo-coding, demographic data or facts from other sources of information.
Profile operational data and monitor ongoing data activities.
Data Cleansing and Enrichment
Data quality, including integrated profiling, exploration, business rules creation, entity resolution and monitoring.
Process orchestration layers for jobs, SAS code and SQL.
Data quality is embedded into batch, near-time and real-time processes.
Out-of-the-box standardization rules; or you can create customized, reusable data quality business rules.
Data quality rules are callable through message queues, Web services and custom exits.
Data enrichment and augmentation.
Real-time transaction cleansing using standard business rules.
Data quality monitoring lets you continuously examine data in real time and over time to discover when quality falls below acceptable limits. Alerts are issued when corrective action is needed.
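The monitoring idea described above (reusable rules, a trigger percentage per rule, and an alert when a limit is exceeded) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DataFlux functionality: the `Rule` class, the rule names, the limits, and the sample rows are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Rule:
    name: str
    violates: Callable[[Dict[str, str]], bool]  # True when a row breaks the rule
    max_trigger_pct: float  # alert when more than this share of rows triggers

SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

# Hypothetical reusable business rules.
RULES = [
    Rule("ssn_format", lambda r: not SSN_PATTERN.match(r.get("ssn", "")), 10.0),
    Rule("last_name_present", lambda r: not r.get("last_name", "").strip(), 5.0),
]

def monitor(rows: List[Dict[str, str]], rules: List[Rule]) -> List[str]:
    """Return one alert message per rule whose trigger percentage exceeds its limit."""
    alerts = []
    for rule in rules:
        triggered = sum(1 for row in rows if rule.violates(row))
        pct = 100.0 * triggered / len(rows) if rows else 0.0
        if pct > rule.max_trigger_pct:
            alerts.append(
                f"{rule.name}: {pct:.1f}% of rows triggered (limit {rule.max_trigger_pct}%)"
            )
    return alerts

rows = [
    {"ssn": "563-49-1234", "last_name": "Sosulski"},
    {"ssn": "563491234", "last_name": "Sosulski"},
    {"ssn": "56349123", "last_name": "Sosulski"},
    {"ssn": "563-49-1234", "last_name": "Sosulski"},
]
alerts = monitor(rows, RULES)
print(alerts)
```

Keeping each rule as data (a name, a predicate, and a limit) is what allows the same rule set to be reused across batch and real-time paths in a design like this.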
Business Rules to Track Data Events
Events from business rules are SOA enabled
Monitor Data Quality
Monitor on the Web; Multiple repositories
Workflow capabilities
[Screenshot: DataFlux Data Management Web Studio, Monitor view. Filters: Date, Rule, Task, Source, Repository. Table columns: Rule, Date, Importance, % Triggers, # Triggers, Rows Processed, Status, Reason, Assigned User. History graph: Trigger History (Percentage of Rows Processed) by Date.]
Entity Resolution for Data Services
Knowing your Airman
[Figure: records for the same person held in five source systems (Legacy, CRM, Online, ERP, Data Warehouse) are resolved into a single Master Data record. Fields: Cust. Id, First Name, Middle, Last Name, DOB, SSN, Address. Source values vary in format and accuracy, for example Cust. Id "30391-244" vs. "30391244", first name "William" vs. "Willaim" vs. "Bubba", DOB "04/12/39" vs. "4-12-39" vs. "April 12", and SSN "563-49-1234" vs. "563491234" vs. "56349123".]
Master record: 3721B 30391-244 William James Sosulski 04/12/1939 563491234 123 Oak Street Eves
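The entity resolution example on this slide can be sketched as two steps: group records that share a match key, then build a best record from each group. The code below is a deliberately simple sketch, not the DataFlux matching engine. The field names, the exact-match key, the longest-value survivorship rule, and the last name "Smith" on the third record are assumptions made for illustration.

```python
import re
from collections import defaultdict

def match_key(record):
    """Build a crude match key: digits of the SSN plus the upper-cased last name."""
    ssn_digits = re.sub(r"\D", "", record.get("ssn", ""))
    last = record.get("last_name", "").strip().upper()
    return (ssn_digits, last)

def best_record(cluster):
    """Survivorship: for each field keep the longest non-empty value in the cluster."""
    merged = {}
    for record in cluster:
        for field, value in record.items():
            if len(value) > len(merged.get(field, "")):
                merged[field] = value
    return merged

def resolve(records):
    """Group records by match key and return one merged record per group."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record)].append(record)
    return [best_record(cluster) for cluster in clusters.values()]

records = [
    {"cust_id": "30391-244", "first_name": "William", "middle": "James",
     "last_name": "Sosulski", "ssn": "563-49-1234"},
    {"cust_id": "30391244", "first_name": "William", "middle": "",
     "last_name": "Sosulski", "ssn": "563491234"},
    # Hypothetical unrelated record; the last name is invented.
    {"cust_id": "14239", "first_name": "Bubba", "middle": "J.",
     "last_name": "Smith", "ssn": ""},
]
masters = resolve(records)
for master in masters:
    print(master)
```

An exact key like this one would miss the misspelled "Willaim" and the truncated SSN "56349123" shown in the figure; catching those requires fuzzy matching, which this sketch leaves out.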
SAS Data Management Benefits
Control & Lower TCO
Integrated DQ Functions
Specialized DQ Interfaces
SOA Enabled
Lineage Reporting from Source to Report
Over 300 Functions, 70 Built-in Transformations
Analytic Integration & Scoring
Scalable Partition-Based Parallelism
Record Setting File I/O
Best-in-Class Data Access Technology with Widest Array of Data Sources and Targets
Version Control
Why SAS for AFPOA?
Flexibility
Agile and responsive to meet the needs of the Air Force’s most important asset: Air Force Personnel
Feature | Why it Matters
Data quality rules automatically exposed as RESTful and SOAP Web Services | Seamless integration with AFPOA enterprise architecture and workflow
All tools are built on a 4th Generation SAS Programming Language and C++ | Enables AFPOA to more easily adapt SAS/DataFlux data management capabilities to changing mission needs
Unified runtime across computing environments | Enables AFPOA to attain cloud computing environments
SAS Can Be Your Trusted Partner
PERFORMANCE
#1 World Leader in Business Analytics
50,000+ Customers
12,000 Employees Worldwide
CULTURE
Relentless Innovation
Voted #1 Place to Work in U.S.
Trusted Partner to Governments and Leading Business Organizations
EXPERIENCE
50,000 SAS Sites in 127 Countries
93 of the Top 100 Companies in 2011 Fortune Global 500
35 Years Leading Analytics Solutions
THOUGHT LEADER
SAS Advanced Analytics Lab Provides Business Leadership
Domain Expertise in Key Industries
Culture of Innovation: 24% R&D Reinvestment
Resources
The following downloadable materials provide more information:
SAS Data Management - http://www.sas.com/software/datamanagement/
From this webpage, you can access downloadable materials and drill down deeper into:
Data Integration
Data Quality
Enterprise Data Access
Master Data Management
Two additional white papers, both provided to Bloor, may also be helpful:
“Data Quality Remediation” (DataFlux)
“DataFlux Data Management Methodology: A Do-It-Yourself Guide for High-Value Data Across the Enterprise” (DataFlux)
SAS® Data Integration
Thank you!
If you are interested in hearing more about SAS and how we address Information and Data Integration Management, please contact:
Gail Bamford
Industry Marketing Manager for Defense & Intelligence
[email protected]
571-227-7000 x51715