Data Management - Inside Analysis


Data Integration Strategy
Air Force Personnel Operations Agency – Human Resources
December 2, 2011
Copyright © 2011 SAS Institute Inc. All rights reserved.
Agenda - TOC
 Who is SAS?
 Where is SAS in the Air Force
 Essential Business Process Agility
 Data Integration and Information Management
 Data Quality: Cleansing and Enrichment
 Data Integration/Management Benefits
 Why SAS for AFPOA
 Resources and Further Contact Information
SAS – Leader in Applied Analytics
 Largest privately held software company in the world
 2010 Revenue: $2.43 Billion – 35 Years of consecutive growth
 24% reinvestment in R&D for 2010
 50,000 deployments worldwide
 12,000 employees
 90 of the top 100 of Fortune Global 500
 All 15 U.S. Departments and 90% of U.S. Federal Agencies
 All 50 U.S. State Governments
 Government agencies in 80 countries
 Outstanding Technical Support Services
[Chart: breakdown by industry]
Life Sciences: 6%
Manufacturing: 5.5%
Retail: 4.5%
Communications: 7%
Education: 3%
Services: 11%
Energy & Utilities: 3%
Other: 1%
Government: 15%
Financial Services: 42%
Where SAS is in the U.S. Air Force
In a new framework for AF Human Resources, these SAS installations
could be streamlined if needed or leveraged.
Malmstrom AFB
Hanscom AFB
Offutt AFB
Hill AFB
Wright-Patterson AFB
Langley AFB
Chantilly, VA & Washington, D.C.
Scott AFB
Peterson AFB
Nellis AFB
Edwards AFB
LA AFB
Tinker AFB
Kirtland AFB
Huntsville
Robins AFB
Maxwell AFB-Gunter
Barksdale AFB
Brooks, Randolph, Lackland AFBs
Eglin AFB
Hurlburt Field
Patrick AFB
Essential for Business Process Agility
Data Integration defined in terms of Information
Data Access/Integration/Quality
Data Governance for
Airman Personnel PII
Master Data
Best Record Airman Personnel
UNIFIED PLATFORM for Information
Decision Management Support
Performance and Cost
Analytical Model Management and Governance
Recalibrate or New
[Diagram components: Data Integration, Data Quality, Data Access, Analytic Mgmt., Master Data, Decision Mgmt.]
Data Integration and Information Management
The Opportunity
Acceleration is required to meet the current demands in the Business and Technical landscape.
Managing information for Air Force personnel, who are the most valuable asset, includes the ability to value information appropriately.
[Diagram: Information Management at the center, surrounded by eight kinds of diversity]
Data Diversity: Big Data
Use Case Diversity: Unified Data Management
Role Diversity: IT/Business
Deployment Diversity: Cloud
Consumption Diversity: Mobility
Architecture Diversity: Enterprise Architecture
Temporal Diversity: Right-Time
Application Diversity: Operational Analytics
Data Management and Integration
Capability View
Agile And Responsive To Changing Requirements
Linking IT Strategy to Business Strategy
[Diagram: capability layers]
DATA SERVICES / DATA & ANALYTIC SERVICES
DATA GOVERNANCE / ANALYTICS GOVERNANCE
Data Management: Data Integration, Data Quality, MDM
Analytics Management: Decision Management, Model Management & Monitoring, Model Deployment & Integration
Infrastructure Support: Text & Unstructured Data Support, Events, Security, Metadata & Lineage, Monitoring & Deployment
ENTERPRISE DATA ACCESS
Data Management
UNIFIED
PLATFORM
INFORMATION
MANAGEMENT
Data Management: Data Integration, Data Quality, Master Data Management
Enterprise Data Access and Federation
Resolves
Architecture is rigid; not flexible to changes in business needs
Managing end-to-end processes is challenging
Why SAS/DataFlux for Data Management
UNIFIED
PLATFORM
INFORMATION
MANAGEMENT
 SAS the Original Data Warehouse Pioneer
 35 years Data Management Experience
 Thousands of Data Management Customers worldwide
 End-to-end Solution Available
 Gartner Leader for DQ
 Gartner Leader for DI
 Gartner DI Leader Best-in-Class Data Access Technology
 Shortest Time to delivery for MDM
 Scalable Grid Enabled Platform
 Only Data Management vendor with integrated analytic scoring
 World Class Partner Network
 Thousands of consultants worldwide
 Growth through consistent R&D
DataFlux
Proven Market Leader in Data Management
Recognized as a leading provider of enterprise data quality, data integration, data governance and MDM solutions
Over 2,300 customers worldwide
Offices throughout the US, the UK, France, Germany and Australia
Founded in 1997
Acquired by SAS in 2000
Operates as a wholly owned subsidiary
SAS’ Data Management Portfolio

SAS Data Integration Server
UNIFIED
PLATFORM
INFORMATION
MANAGEMENT
 SAS Data Integration Server
 SAS Enterprise Integration Server
 SAS Data Quality Solution (Server minus DI Studio)

DataFlux Data Management Platform
 Data Management Studio
 Data Management Server
 Data Federation Server
 qMDM – Master Data Management
 DataFlux Solution Accelerators
 DataFlux Event Stream Processor

Supporting SAS Technologies
 SAS/ACCESS
» Interface to <insert favorite RDBMS > / Interface to PC Files
 SAS Integration Technologies
» Message Queuing APIs, Web Services, Publishing Framework
SAS Data Management Advantage –
Access?
DATA ACCESS
Any Data, Anywhere, Anytime: Best-in-Class Data Access Technology
 SAS/ACCESS: What is it?
 SAS/ACCESS interfaces are out-of-the-box solutions that
provide enterprise data access and integration between SAS and
third-party databases.
 SAS/ACCESS interfaces are highly optimized to enable your
SAS® solutions to read, write and update data regardless of its
native database or platform.
 Because the data appears native to SAS, there is no need to
learn Structured Query Language (SQL) or any other
database-specific query language.
 All data is presented to users in a unified way; complex activities
like bulk loading are simplified, and cross-source integration is seamless.
» SAS leverages the underlying RDBMS for you transparently
» For example, a single data set option invokes the bulk loader:
» DATA MyORA.ORACLE_TABLE(BULKLOAD=YES);
SAS Data Management Advantage
Any Data, Anywhere, Anytime: Best-in-Class Data Access Technology
DATA ACCESS
RDBMS
 SAS/ACCESS Interface to DB2
 SAS/ACCESS Interface to Informix
 SAS/ACCESS Interface to Microsoft SQL
 SAS/ACCESS Interface to MySQL
 SAS/ACCESS Interface to ODBC
 SAS/ACCESS Interface to OLE DB
 SAS/ACCESS Interface to Oracle
 SAS/ACCESS Interface to Sybase
 SAS/ACCESS Interface to Sybase IQ
 SAS/ACCESS Interface to Teradata
Non-Relational
 SAS/ACCESS Interface to ADABAS
 SAS/ACCESS Interface to DATACOM/DB
 SAS/ACCESS Interface to IDMS/R
 SAS/ACCESS Interface to IMS-DL/I
 SAS/ACCESS Interface to PC Files
 SAS/ACCESS Interface to SYSTEM 2000
MPP Database Appliances
 SAS/ACCESS Interface to Aster Data nCluster
 SAS/ACCESS Interface to Greenplum
 SAS/ACCESS Interface to Netezza
 SAS/ACCESS Interface to Exadata
 SAS/ACCESS Interface to ODBC (ParAccel, Vertica, Microsoft PDW)
Enterprise Applications
 SAS Data Surveyor for Oracle Applications
 SAS Data Surveyor for SAP
 SAS Data Surveyor for Siebel
 SAS Data Surveyor for SalesForce.com
 OLAP
 Message Queues
 XML
 Web Services
 Unstructured, Semi-Structured
And more…
Enterprise-wide, netcentric to meet the operational needs of the AF
SAS Data Integration and Management
Benefits and Features
Data Integration Tasks   SAS DataFlux              Other ETL Tools
Table Creation           Low Effort                High Effort
Fast Extract             Automatic                 Separate Utility
Batch & Real-Time        Low Effort                Multi Step Deployment
Data Quality             Integrated                Separate Tool
Data Services            Low Effort                Custom Code
Sorts & Join             High Performance          Low Performance
SQL Pushdown             Automatic                 High Effort
Federated Views          Automatic                 Separate Tool
Parallelism              Low Effort / Any Job      High Effort / Limited Job Sets
Scoring                  Integrated                Custom Code
Customization            Low Effort                High Effort
DATA
INTEGRATION
SAS Data Management Workbench
DATA
INTEGRATION
Data Integration via a Flexible and Powerful Interface
 Execute Data Quality Workflows
 Metadata Driven Environment
» Deploy governance within the IT
infrastructure
» Determine integration method
» Real-time
» Batch
» Virtual
» Reuse and redeploy the same set of
business rules across applications
 Review execution results
» Exceptions
» Debugging
» Tuning
» Optimization
 DM System Reporting
SAS Data Management Advantage
Scalable: In-Database EL-T
DATA
INTEGRATION
 Visual indicators show you where processing takes place.
 Push-down is a simple option set on the SQL transform.
 Users can control where processing takes place; two clicks enable
users to “move first, then process” via the bulk loader.
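As a rough illustration of the two processing locations described on this slide, the sketch below uses Python's built-in sqlite3 module as a stand-in database. It is not SAS or DI Studio code, and the table name `assignments` and its columns are hypothetical. The first function pushes the aggregation down into the database as SQL; the second moves the rows out first and processes them afterwards.

```python
import sqlite3
from collections import defaultdict


def make_demo_db():
    """Build an in-memory stand-in database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE assignments (base TEXT, headcount INTEGER)")
    conn.executemany(
        "INSERT INTO assignments VALUES (?, ?)",
        [("Scott", 10), ("Hill", 5), ("Scott", 7), ("Hill", 1)],
    )
    return conn


def totals_pushed_down(conn):
    """Process in the database: only the aggregated rows leave it."""
    rows = conn.execute(
        "SELECT base, SUM(headcount) FROM assignments GROUP BY base"
    )
    return dict(rows)


def totals_move_then_process(conn):
    """Move first, then process: every row is extracted and summed locally."""
    totals = defaultdict(int)
    query = "SELECT base, headcount FROM assignments"
    for base, headcount in conn.execute(query):
        totals[base] += headcount
    return dict(totals)
```

Both functions return the same totals; what differs is how many rows cross the boundary between the database and the client, which is the trade-off the slide's visual indicators are meant to expose.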
SAS Data Management Advantage
Analytic Integration: Integrated Scoring
DATA
INTEGRATION
 Integrating predictive analytics from SAS Enterprise Miner into
data management tasks takes just two clicks.
 Shared metadata provides the lineage of model dependencies and
shows where models are consumed.
 With Scoring Accelerator, scoring functions can be pushed
“in-database” and called via SQL function calls from DI Studio.
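To illustrate the general idea of a scoring function that runs inside the database and is called from SQL, here is a minimal sketch using sqlite3 user-defined functions as a stand-in. It does not show SAS Scoring Accelerator, Enterprise Miner or DI Studio; the model (a logistic function with invented coefficients), the function name `model_score` and the table `records` are all hypothetical.

```python
import math
import sqlite3


def model_score(x1, x2):
    """Hypothetical logistic model; the coefficients are invented."""
    z = -1.0 + 0.3 * x1 - 0.2 * x2
    return 1.0 / (1.0 + math.exp(-z))


def score_in_database(rows):
    """Register the model as an SQL function and score rows in a query."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL under the name model_score (2 args).
    conn.create_function("model_score", 2, model_score)
    conn.execute("CREATE TABLE records (id INTEGER, x1 REAL, x2 REAL)")
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    query = "SELECT id, model_score(x1, x2) FROM records ORDER BY id"
    return conn.execute(query).fetchall()
```

The point of the pattern is that the query, not the client, applies the model, so scored results can be produced where the data already lives.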
Metadata Management
 Sophisticated metadata mapping technologies accelerate development
 Impact Analysis/Data Lineage to see process and data relationships
 Multi-user Collaboration with DEV/TEST/PROD promotion management
 Powerful metadata reporting including runtime and performance analysis
 User Interface Roles and Security
DATA
INTEGRATION
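Impact analysis over lineage metadata can be pictured as a walk through a dependency graph. The sketch below is only an illustration of that idea, not how SAS metadata is stored or queried; the object names in `LINEAGE` are hypothetical.

```python
from collections import deque

# Hypothetical lineage metadata: each object maps to the objects built from it.
LINEAGE = {
    "personnel_source": ["staging_personnel"],
    "staging_personnel": ["clean_personnel"],
    "clean_personnel": ["master_record", "strength_report"],
    "master_record": ["strength_report"],
    "strength_report": [],
}


def impacted_by(lineage, changed):
    """Return every downstream object affected by a change, breadth first."""
    seen = []
    queue = deque(lineage.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already recorded via another path
        seen.append(node)
        queue.extend(lineage.get(node, []))
    return seen
```

Asking what is impacted by a change to `staging_personnel` walks forward through the graph; running the same walk over reversed edges would answer the lineage question of where a report's data came from.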
SAS Data Management Advantage
Best in Class: DataFlux Data Quality
 Embedded into batch, near-time and real-time processes.
 Rules callable through message queues, Web services and custom exits.
 Data cleansing provided in native languages with localizations.
 Metadata built and shared across the entire SAS Platform.
 Generate and append postal addresses, geo-coding, demographic data or facts from other sources of information.
 Profile operational data and monitor ongoing data activities.
DATA
QUALITY
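Profiling, mentioned in the last bullet above, usually starts with simple per-column summaries. The following is a generic sketch of that idea and not DataFlux functionality; the function names and the pattern notation (digits shown as 9, letters as A) are assumptions made for illustration.

```python
from collections import Counter


def pattern_of(value):
    """Map digits to '9' and letters to 'A'; keep other characters as-is."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values):
    """Summarize one column: row, missing and distinct counts, plus patterns."""
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "patterns": Counter(pattern_of(str(v)) for v in present),
    }
```

Run against a date-of-birth column holding values such as 04/12/39, 4-12-39 and April 12, the pattern counts make the mixed formats visible before any cleansing rule is written.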
Data Cleansing and Enrichment
 Data quality, including integrated profiling, exploration, business rules creation, entity resolution and monitoring.
 Process orchestration layers for jobs, SAS code and SQL.
 Data quality is embedded into batch, near-time and real-time processes.
 Out-of-the-box standardization rules; or you can create customized, reusable data quality business rules.
 Data quality rules are callable through message queues, Web services and custom exits.
 Data enrichment and augmentation.
 Real-time transaction cleansing using standard business rules.
 Data quality monitoring lets you continuously examine data in real time and over time to discover when quality falls below acceptable limits. Alerts are issued when corrective action is needed.
DATA
QUALITY
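The monitoring bullet above describes a threshold check: measure how often a rule fires and raise an alert when that share passes an acceptable limit. A minimal sketch of that logic follows. It is an assumption-laden illustration rather than the DataFlux implementation, and the rule `missing_ssn`, the field name `ssn` and the message format are hypothetical.

```python
def trigger_percentage(rows, rule):
    """Percentage of rows for which the rule fires (that is, the row fails)."""
    if not rows:
        return 0.0
    triggered = sum(1 for row in rows if rule(row))
    return 100.0 * triggered / len(rows)


def check_quality(rows, rule, limit_pct):
    """Return an alert message when the trigger percentage exceeds the limit."""
    pct = trigger_percentage(rows, rule)
    if pct > limit_pct:
        return (
            f"ALERT: {pct:.1f}% of rows triggered the rule "
            f"(limit {limit_pct:.1f}%)"
        )
    return None


def missing_ssn(row):
    """Hypothetical business rule: fires when the SSN field is empty."""
    return not row.get("ssn")
```

Calling `check_quality` on each batch and storing the percentages over time would give the kind of trigger history a monitoring dashboard can chart.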
Business Rules to Track Data Events
DATA
QUALITY
Events from
business rules are
SOA enabled
Monitor Data Quality
Monitor on the Web; Multiple repositories
Workflow capabilities
[Screenshot: DataFlux Data Management Web Studio, Monitor view. Filter fields (Date, Rule, Task, Source, Repository, Description, Importance) and a table of triggered rules (total count: 24) with the columns Rule, Date, Importance, # Triggers, % Triggers, Rows Processed, Status, Reason and Assigned User. A history graph plots Trigger History (Percentage of Rows Processed) by date from 9/7/2009 to 7/7/2010.]
DATA
QUALITY
Entity Resolution for Data Services
Knowing your Airman
MASTER
DATA
[Figure: records for the same person held in five source systems (Legacy, CRM, Online, ERP, Data Warehouse) are matched and merged into one master record. Each record has the columns Cust. Id, First Name, Middle, Last Name, DOB, SSN and Address.]
Values that vary across the source records:
 Cust. Id: 30391-244, 30391244, 14239, 3721B, 1001
 First Name: William, Willaim, Bubba
 Middle: James, J.
 Last Name: Sosulski, Corp.
 DOB: 04/12/39, 4-12-39, April 12, 04/12/1939
 SSN: 563-49-1234, 563491234, 56349123
 Address: 123 Oak St., Eves, IL 30319; 123 Oak St., Eves, IL; 3224 Pkwy G, Los Osos (CA 91403); [email protected]
Resolved record: 3721B 30391-244 William James Sosulski 04/12/1939 563491234 123 Oak Street Eves
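The matching step on this slide can be sketched in a few lines. This is a deliberately simplified illustration, not the DataFlux matching engine: it links records only on an exactly equal, normalized SSN, and the field names (`cust_id`, `ssn` and so on) and the "keep the longest value" merge rule are assumptions.

```python
import re
from collections import defaultdict


def normalize_ssn(value):
    """Keep digits only, so '563-49-1234' and '563491234' compare equal."""
    return re.sub(r"\D", "", value or "")


def resolve(records):
    """Group records by normalized SSN and merge each group into one record.

    Merge rule (an assumption): for each field, keep the longest non-empty
    value seen in the group. Records without an SSN are kept separate.
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = normalize_ssn(record.get("ssn")) or f"unmatched-{index}"
        groups[key].append(record)

    merged = {}
    for key, group in groups.items():
        best = {}
        for record in group:
            for field, value in record.items():
                if value and len(str(value)) > len(str(best.get(field, ""))):
                    best[field] = value
        best["ssn"] = normalize_ssn(best.get("ssn"))
        best["source_ids"] = [r["cust_id"] for r in group]
        merged[key] = best
    return merged
```

A record with a truncated SSN such as 56349123, or a misspelled name such as Willaim, would not be linked by this exact-key approach; handling those cases needs fuzzy matching across several fields, which is beyond this sketch.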
SAS Data Management Benefits
Control & Lower TCO
DATA
INTEGRATION
 Integrated DQ Functions
 Specialized DQ Interfaces
 SOA Enabled
 Lineage Reporting from Source to Report
 Over 300 Functions, 70 built-in Transformations
 Analytic Integration & Scoring
 Scalable Partition-Based Parallelism
 Record Setting File I/O
 Best-in-Class Data Access Technology with Widest Array of
Data Sources and Targets
 Version Control
Why SAS for AFPOA?
Flexibility
Agile and responsive to meet the needs of the Air Force’s most important asset:
Air Force Personnel
Feature: Data quality rules automatically exposed as RESTful and SOAP Web Services
Why it Matters: Seamless integration with AFPOA enterprise architecture and workflow
Feature: All tools are built on a 4th Generation SAS Programming Language and C++
Why it Matters: Enables AFPOA to more easily adapt SAS/DataFlux data management capabilities to changing mission needs
Feature: Unified runtime across computing environments
Why it Matters: Enables AFPOA to attain cloud computing environments
SAS Can Be Your Trusted Partner
PERFORMANCE
#1 World Leader in Business Analytics
50,000+ Customers
12,000 Employees Worldwide
CULTURE
Relentless Innovation
Voted #1 Place to Work in U.S.
Trusted Partner to Governments and Leading
Business Organizations
EXPERIENCE
50,000 SAS Sites in 127 Countries
93 of the Top 100 Companies in 2011 Fortune
Global 500
35 Years Leading Analytics Solutions
THOUGHT LEADER
SAS Advanced Analytics Lab Provides Business
Leadership
Domain Expertise in Key Industries
Culture of Innovation: 24% R&D Reinvestment
Resources
Here are the URLs and titles of downloadable materials we refer you to
for more information:
 SAS Data Management - http://www.sas.com/software/datamanagement/
From this webpage, you can access downloadable materials and drill
down deeper into
 Data Integration
 Data Quality
 Enterprise Data Access
 Master Data Management
 Here are two additional white papers you might find helpful which
were provided to Bloor.
 “Data Quality Remediation” (DataFlux)
 “DataFlux Data Management Methodology: A Do-It-Yourself Guide for
High-Value Data Across the Enterprise” (DataFlux)
SAS® Data Integration
Thank you!
If you are interested in hearing more about SAS and how we
address Information and Data Integration Management,
please contact:
Gail Bamford
Industry Marketing Manager for Defense & Intelligence
[email protected]
571-227-7000 x51715