Granularity in the Data Warehouse

Raw Estimate
• The raw estimate of the number of rows of data that will reside in the data warehouse tells the architect a great deal.
• Example of an algorithmic path to calculate the space occupied by a data warehouse.
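
The algorithmic path mentioned above can be sketched roughly as follows. This is a minimal illustration, not a reproduction of the slide's figure: the `TableEstimate` class, its field names, and the sample numbers are all hypothetical, and the sketch assumes that space is approximated as rows times (row bytes plus key bytes), bounded by a low and a high estimate.

```python
from dataclasses import dataclass


@dataclass
class TableEstimate:
    """Hypothetical per-table inputs for a rough space estimate."""
    name: str
    min_row_bytes: int   # smallest plausible row size
    max_row_bytes: int   # largest plausible row size
    key_bytes: int       # key size, used here as a stand-in for index space
    min_rows: int        # fewest rows expected over the planning horizon
    max_rows: int        # most rows expected over the planning horizon


def space_range(table: TableEstimate) -> tuple[int, int]:
    """Return (low, high) byte estimates for one table."""
    low = table.min_rows * (table.min_row_bytes + table.key_bytes)
    high = table.max_rows * (table.max_row_bytes + table.key_bytes)
    return low, high


def warehouse_space_range(tables: list[TableEstimate]) -> tuple[int, int]:
    """Sum the per-table ranges into a warehouse-wide (low, high) estimate."""
    low = sum(space_range(t)[0] for t in tables)
    high = sum(space_range(t)[1] for t in tables)
    return low, high


# Illustrative numbers only; real estimates come from the source systems.
tables = [
    TableEstimate("account_activity", 80, 120, 16, 1_000_000, 5_000_000),
    TableEstimate("customer", 200, 400, 8, 50_000, 100_000),
]
low, high = warehouse_space_range(tables)
print(f"Estimated space: {low:,} to {high:,} bytes")
```

Keeping a low and a high bound, rather than a single figure, reflects that the estimate is raw: the width of the range is itself useful information for the architect.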
Input to the Planning Process
How much DASD is needed?
Space estimates, row estimates
How much lead time for ordering can be expected?
Are dual levels of granularity needed?
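
One way to picture how the row estimate feeds these planning questions is a simple threshold check. The function below is a sketch under stated assumptions: `plan_granularity` and both threshold values are hypothetical placeholders chosen for illustration, not figures from the slides. Realistic thresholds depend on the platform and the workload.

```python
def plan_granularity(estimated_rows: int,
                     dual_threshold: int = 10_000_000,
                     overflow_threshold: int = 100_000_000) -> dict:
    """Map a raw row estimate to rough planning answers.

    Both thresholds are hypothetical placeholders, not recommended values.
    """
    return {
        "dual_granularity": estimated_rows >= dual_threshold,
        "overflow_storage": estimated_rows >= overflow_threshold,
    }


print(plan_granularity(2_000_000))
print(plan_granularity(250_000_000))
```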
Data in Overflow?
• What is data overflow?
• How could it happen, and how can that problem be solved?
• Overflow storage? How to solve it?
What the Levels of Granularity Will Be

Rule of thumb: if 50% of the first iteration of the design is correct, the design effort has been a success.
• Building very small subsets quickly and carefully listening to feedback
• Prototyping
• Looking at what other people have done
• Working with an experienced user
• Looking at what the organization has now
• JAD sessions with simulated output
Some Feedback Loop Techniques
• Build the first parts of the data warehouse in very small, very fast steps, and listen carefully to the end users' comments at the end of each step of development. Be prepared to make adjustments quickly.
• If available, use prototyping and allow the feedback loop to function using observations gleaned from the prototype.
• Look at how other people have built their levels of granularity and learn from their experience.
• Go through the feedback process with an experienced user who is aware of the process occurring.
• Look at whatever the organization has now that appears to be working, and use those functional requirements as a guideline.
• Execute joint application design (JAD) sessions and simulate the output in order to achieve the desired feedback.
Granularity of data can be raised by:
• Summarizing data from the source as it goes into the target
• Averaging or otherwise calculating data as it goes into the target
• Pushing only data that is obviously needed into the target
• Using conditional logic to select only a subset of records to go into the target
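
The four techniques above can be sketched together in one small load step. This is an illustration only: the record layout, the `raise_granularity` function, and the rule that skips zero-amount records are assumptions made for the example, not part of the slides.

```python
from collections import defaultdict

# Hypothetical source detail: one record per account transaction.
source = [
    {"account": "A1", "month": "2024-01", "amount": 100.0, "teller": "T7"},
    {"account": "A1", "month": "2024-01", "amount": 50.0, "teller": "T2"},
    {"account": "A1", "month": "2024-02", "amount": 0.0, "teller": "T2"},
    {"account": "B9", "month": "2024-01", "amount": 30.0, "teller": "T7"},
]


def raise_granularity(records):
    """Load source detail into a target at a higher level of granularity."""
    groups = defaultdict(list)
    for rec in records:
        # Conditional logic: select only a subset of records (assumed rule).
        if rec["amount"] == 0:
            continue
        # Push only data that is obviously needed: "teller" is left behind.
        groups[(rec["account"], rec["month"])].append(rec["amount"])

    target = []
    for (account, month), amounts in sorted(groups.items()):
        target.append({
            "account": account,
            "month": month,
            "txn_count": len(amounts),                  # summarized
            "total_amount": sum(amounts),               # summarized
            "avg_amount": sum(amounts) / len(amounts),  # averaged
        })
    return target


for row in raise_granularity(source):
    print(row)
```

Four detailed source records become two target rows here, one per account per month, which is what raising the level of granularity means in practice: fewer, less detailed rows in the target.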
Levels of Granularity – Banking Environment
• Example