Implementing Near Real-Time Data Warehouse

www.hitachiconsulting.com
Sutha Thiru
@suthathiru
http://www.beeii.com
Commercial in Confidence
© Copyright 2009 Hitachi Consulting
Agenda
 Real-Time DW
 Scenario
 Data Load
 Custom Components
 Demo
 Real Time Stuff
 RT Challenges & Solutions
 Best Practices
Real Time
 What is Real-Time data warehousing?
 Why do we need it?
Scenario
 Global brand
 Well known in the UK
 Counts the number of customers in retail parks
 Provides cameras and counting devices
 Multi Currency / Language
 Multi Time-Zone
 Calendar specific to a client
 REAL TIME (near)
Data Load
 Cameras sending files every few minutes
 1000s of devices
 Unstructured files
 File is unique to a device
 Need to load them quickly using SSIS
 Data available on dashboard for the controllers
 Decisions made before the next set of files is produced by the device
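
The load cycle described above (pick up whatever files have arrived, load them, move them out of the way) can be sketched as a micro-batch loop. The talk does this in SSIS; the Python below is only an illustration of the pattern. The file layout (one `<timestamp>,<count>` pair per line, device id taken from the file name), the `stg_counts` staging table and both function names are assumptions, since the slides do not describe the real file format.

```python
import shutil
import sqlite3
from pathlib import Path


def parse_device_file(path: Path) -> list[tuple[str, str, int]]:
    """Parse one device file into (device_id, reading_time, people_count) rows.

    Assumed format: the file name is the device id and each line is
    "<ISO timestamp>,<count>". Lines that do not match are skipped.
    """
    rows = []
    for line in path.read_text().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        rows.append((path.stem, parts[0], int(parts[1])))
    return rows


def load_pending_files(incoming: Path, archive: Path, conn: sqlite3.Connection) -> int:
    """Load every waiting file into staging, then archive it. Returns rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stg_counts "
        "(device_id TEXT, reading_time TEXT, people_count INTEGER)"
    )
    archive.mkdir(parents=True, exist_ok=True)
    loaded = 0
    for path in sorted(incoming.glob("*.txt")):
        rows = parse_device_file(path)
        with conn:  # one transaction per file, rolled back on error
            conn.executemany("INSERT INTO stg_counts VALUES (?, ?, ?)", rows)
        # Archive only after the insert has committed, so a failed file is retried.
        shutil.move(str(path), str(archive / path.name))
        loaded += len(rows)
    return loaded
```

Calling `load_pending_files` on a schedule shorter than the devices' sending interval is what keeps the dashboard ahead of the next set of files.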
Custom Components
 System Config Reader
 Event Handler
 XMLify
 TRIM All
 SHA1 / MD5 Checksum
 Inferred Dimension
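
The slides do not say what the TRIM All and checksum components do internally. A common use in ETL is to trim every string column and then hash the row so that changed rows can be detected by comparing a single value. The sketch below assumes that purpose; `trim_all`, `row_checksum` and the column names are hypothetical, not the actual SSIS components.

```python
import hashlib


def trim_all(row: dict) -> dict:
    """Strip leading/trailing whitespace from every string value in the row."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def row_checksum(row: dict, columns: list[str], algorithm: str = "sha1") -> str:
    """Hash the chosen columns in a fixed order so equal rows give equal checksums.

    `algorithm` is any name accepted by hashlib.new, e.g. "sha1" or "md5".
    """
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join("" if row[c] is None else str(row[c]) for c in columns)
    return hashlib.new(algorithm, payload.encode("utf-8")).hexdigest()
```

Storing the checksum alongside each dimension row lets the load compare one column instead of every attribute when deciding whether a row has changed.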
Data Load
Demo
Real Time Stuff
 StreamInsight
 Change Data Capture (CDC)
 Service Broker
 Ab Initio Continuous Flows
 Java Message Service (JMS)
 Others
Real-Time Data Warehousing Challenges & Solutions
 Enabling Real-Time ETL
 Near Real-Time ETL
 Trickle Feed
 Real-Time Data Cache
 Model Real-Time Fact Table
 Direct Feed
 Real-Time Partition
 View
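
One way to read the "Real-Time Partition" and "View" items: trickle-fed rows land in a small real-time fact table, a view unions it with the large static fact table, and a periodic job merges the real-time rows across. The sketch below uses SQLite purely to show that shape; the table, view and function names are assumptions, and a SQL Server implementation would more likely use partition switching than insert and delete.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a static fact table, a real-time fact table and a view over both."""
    conn.executescript("""
        CREATE TABLE fact_count_static
            (device_id TEXT, reading_time TEXT, people_count INTEGER);
        CREATE TABLE fact_count_realtime
            (device_id TEXT, reading_time TEXT, people_count INTEGER);
        CREATE VIEW fact_count AS
            SELECT device_id, reading_time, people_count FROM fact_count_static
            UNION ALL
            SELECT device_id, reading_time, people_count FROM fact_count_realtime;
    """)


def trickle_insert(conn: sqlite3.Connection, rows: list[tuple[str, str, int]]) -> None:
    """Frequent small loads touch only the real-time table."""
    with conn:
        conn.executemany("INSERT INTO fact_count_realtime VALUES (?, ?, ?)", rows)


def merge_realtime_partition(conn: sqlite3.Connection) -> None:
    """Periodic merge: move real-time rows into the static table in one transaction."""
    with conn:
        conn.execute("INSERT INTO fact_count_static SELECT * FROM fact_count_realtime")
        conn.execute("DELETE FROM fact_count_realtime")
```

Reports query `fact_count` only, so they see the same totals before and after a merge.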
Real-Time Data Warehousing Challenges & Solutions
 Real-Time Alerting
 True Real-Time data monitoring & triggering
 Minute cycle schedule
 Real-Time Threshold
 Reporting
 Simplify Real-Time Reporting
 Increase Hardware power
 Separate Real-Time data cache
 OLAP vs. OLTP
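
The alerting items above (a minute-cycle schedule plus a real-time threshold) amount to comparing the latest figures with per-site limits on every cycle. A minimal sketch, assuming the latest customer counts have already been aggregated per site; the `Alert` type, the site names and the `check_thresholds` function are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    site: str
    customer_count: int
    threshold: int


def check_thresholds(counts_by_site: dict[str, int], thresholds: dict[str, int]) -> list[Alert]:
    """Return one alert per site whose latest count has reached its threshold.

    Sites without a configured threshold are ignored.
    """
    alerts = []
    for site, count in sorted(counts_by_site.items()):
        limit = thresholds.get(site)
        if limit is not None and count >= limit:
            alerts.append(Alert(site, count, limit))
    return alerts
```

Run once per cycle, for example every minute after the load completes, this keeps alerting separate from the heavier reporting queries.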
Best Practices
 Implement Correct Database Partitions
 Implement ROLAP Partitions (OWN RISK)
 Implement Correct Merging Strategy
 Handle Early Arriving Facts Efficiently
 Use StreamInsight
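
An early arriving fact is one whose dimension row does not exist yet, for example a count from a device that has not been registered. The usual remedy is an inferred member: insert a placeholder dimension row, flag it, and let a later dimension load fill in the real attributes. The sketch below shows that lookup in SQLite; the `dim_device` table, its columns and the function names are assumptions, not the Inferred Dimension component from the talk.

```python
import sqlite3


def create_dim_device(conn: sqlite3.Connection) -> None:
    """Create a device dimension with a flag marking inferred (placeholder) rows."""
    conn.execute("""
        CREATE TABLE dim_device (
            device_key  INTEGER PRIMARY KEY,
            device_id   TEXT UNIQUE NOT NULL,
            device_name TEXT,
            is_inferred INTEGER NOT NULL DEFAULT 0
        )
    """)


def get_or_infer_device_key(conn: sqlite3.Connection, device_id: str) -> int:
    """Return the surrogate key for device_id, creating an inferred member if missing."""
    row = conn.execute(
        "SELECT device_key FROM dim_device WHERE device_id = ?", (device_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    with conn:
        cur = conn.execute(
            "INSERT INTO dim_device (device_id, device_name, is_inferred) "
            "VALUES (?, 'Unknown', 1)",
            (device_id,),
        )
    # device_key is the INTEGER PRIMARY KEY, so it equals the inserted rowid.
    return cur.lastrowid
```

The fact row can then be loaded immediately with a valid key, and the dimension load later updates the placeholder and clears `is_inferred`.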
Thank You
Coming up…
Speaker             Title                                 Room
Stephan Stoltze     Writeback-Here Comes the Sun          Aintree
James Boother       POSH Clustering                       Lancaster
Kasper de Jonge     Building Great Models for Crescent    Pearce
Andy Leonard        Designing an SSIS Framework           Boardroom
Milos Radivojevic   TSQL Performance Recommendations      Empire
Christina E. Leo    Working with Server Side Traces       Derby
#SQLBITS