Asuri Saranathan BEST PRACTICES IN DATA WAREHOUSING AND BUSINESS INTELLIGENCE Agenda  Introduction  Best Practices – Over View  Deep Dive  Conclusion  Q&A.

Download Report

Transcript Asuri Saranathan BEST PRACTICES IN DATA WAREHOUSING AND BUSINESS INTELLIGENCE Agenda  Introduction  Best Practices – Over View  Deep Dive  Conclusion  Q&A.

Asuri Saranathan
BEST PRACTICES IN DATA WAREHOUSING
AND BUSINESS INTELLIGENCE
Agenda
 Introduction
 Best Practices – Over View
 Deep Dive
 Conclusion
 Q&A
Introduction
Speaker
 Holds Bachelor degree in Physics and Electrical and Electronics
Engineering.
 Over 26 years of Experience in Power System and Information
Technology field.
 Has built several large scale applications including online CRM for
multinationals.
 Has managed several Data Warehousing and BI projects for Direct
Marketing, Manufacturing and Auto finance verticals.
 Functioned as Solution Architect for Data warehousing and
reporting projects.
 ISO auditor
 Certified Bullet Proof Manager from CrestComm USA.
Best Practices- Overview
What are Best Practices?
 Is it a technology?
 Is it application of a set of best tools
available in the market?
 Is it a Framework?
Best Practices Definition
 A framework of a set of processes or method
that exhibits achievement of specific results
in a specific manner over a sustained period
of time.
 The framework should have certain
characteristics in that they should be
repeatable.
Do Best Practices Evolve?
 Yes they do.
 Because of Innovation
 Changes in Technology
 Changes in Law or Governance Structure.
 Expectations, Values , Knowledge or other that
makes the practice outdated or inappropriate.
Where can it be Applied?
 Practically in all fields.
How do we apply Best Practices to Data
Warehousing and Business Intelligence?
Data Warehouse - Definition
 In an elementary form , it is the collection of
key information that can be used by the
business users to become more profitable.
 But Is this definition sufficient ?
 We need much more precise definition of
what a data ware house is .
What is a Data Warehouse?
 A Data warehouse is the
 Data ( Meta / Fact / Dimension/ Aggregation)
and
 The Process Managers ( Load / Warehouse / Query)
That make information available , enabling the user to
make informed decisions.
Deep Dive
What is the Challenge?
 Business is never Static.
 And so is Data Warehouse.
 In order to respond to today’s requirement for
instant access to corporate information , the data
warehouse should be designed to respond to this
need in a optimal way.
 Business itself probably not aware of what information is
required in the future.
 This requires a fundamentally different approach than
the traditional waterfall method of software
development for the Data warehouse.
Experience so far…
 Most Enterprise Data Warehousing projects tend to have
development cycle of between 18 – 24 months from
start to finish.
 Justification of this investment is substantial.
 Businesses would prefer a better approach to justify the
investment.
What should be done?
 Focus on Business Requirements
 A clear understanding of what is short term
and long term requirement of the data
warehouse.
 An Architecture design that would evolve.
 Identification of quick win that delivers
business benefit in the first build.
Scalability for Growth
 Scalability means ability of the underlying
Hardware and Software to support increasing
demands over a period of time.
Horizontal Scalability
High Speed Network
CPU
CPU
CPU
CPU
RAM
RAM
RAM
RAM
DB
DB
DB
DB
CPU
CPU
CPU
CPU
RAM
RAM
RAM
RAM
DB
DB
DB
DB
Multiple servers are connected thru a network and use the data partitioning
feature of the Database to tie the CPUs together.
Data Warehouse Environment
Staging Area
Source
Systems
Data warehouse
(System of Record)
Full History in 3rd Normal Form
No User Access
Summary Area
Full History
User Access
Analytical Area
User Access
Data Mart
Strategy
Metadata Mgmt
•Architecture
•Integration
•Control
•Delivery
Document / Content Mgmt
•Acquisition & Storage
•Backup & Recovery
•Content
•Retrieval
•Retention
DWH / BI
•Architecture
•Implementation
•Training and
Support
•Tuning
Data Architecture
•Entp. DM
• Value Chain
Data Quality
• Spec
•Analysis
•Measurement
•Improvement
Data Governance
Reference and MDM
•External & Internal
Code
•Customer Data
•Product data
Data Development
•Analysis
•Data Modeling
•DB Design
•Implementation
Data Operations
•Acquisition
•Recovery
•Tuning
•Purging
Data Security
•Standards
•Classification
•Administration
•Authentication
•Auditing
Environment
Organization &
Culture
Technology
Activities
Goals &
Objectives
Practices &
Techniques
Deliverables
Roles &
Responsibilities
Architecture Requirements
 Must recognize change as a constant
 Take incremental development approach
 Existing applications must continue to work
 Need to allow more data and new types of data to be
added
High Level
 Remember the different “worlds”
 On-line transaction processing (OLTP)
 Business intelligence systems (BIS)
 Users are different
 Data content is different
 Data structures are different
 Architecture & methodology must be
different
Best Practice #1
 Use a Data model that is optimized for
Information retrieval
 dimensional model
 denormalized
 hybrid approach
DW Architecture Best Practices
November 7, 2015
24
Best Practice #2
 Carefully design the data acquisition and
cleansing processes for your DW
 Ensure the data is processed efficiently and
accurately
 Consider acquiring ETL and Data Cleansing tools
 Use them well!
DW Architecture Best Practices
November 7, 2015
25
Best Practice #3
 Design a metadata architecture that allows
sharing of metadata between components of
your DW
 consider metadata standards such as OMG’s
Common Warehouse Metamodel (CWM)
DW Architecture Best Practices
November 7, 2015
26
Best Practice #4
 Take an approach that consolidates data into
‘a single version of the truth’
 Data Warehouse Bus
 conformed dimensions & facts
 OR?
DW Architecture Best Practices
November 7, 2015
27
Best Practice #5
 Consider implementing an ODS only when
information retrieval requirements are near the
bottom of the data abstraction pyramid and/or
when there are multiple operational sources that
need to be accessed
 Must ensure that the data model is integrated, not just
consolidated
 May consider 3NF data model
 Avoid at all costs a ‘data dumping ground’
DW Architecture Best Practices
November 7, 2015
28
Pitfalls to be Avoided
 Engagement of Non-BI Manger in a BI delivery Project.
 Trying to please the client and the user community.
 Expecting the Service Provider to own the Project completely.
 Bringing the Solution Architect half way into the project.
 Allowing the Business Users to drive the Data Model.
 Not having the right people with right skills in tool selection process.
 Expecting the contractor to deliver all that they presented.
 Over dependency on the Service provider or contractor in execution.
 Assuming that the Data quality will be handled somehow.
 Assuming that the Data warehouse project is over once it is
deployed.
Data Warehouse
Architecture Best Practices
Thank You