KIMBALL vs INMON

A presentation by W H Inmon
the essence of the difference between Inmon and Kimball
Inmon – there needs to be a single version of the truth
[Diagram: data marts for finance, marketing, sales, mgmt and HR around an integrated, historical, granular data warehouse labeled "single version of the truth"]
the question being answered – what is the single version of the truth? what is corporate data?
Kimball – a data warehouse is the union of all of the data marts
[Diagram: separate data marts for finance, sales and HR]
a data mart is based on business function – Ralph Kimball
the question being answered – how quickly can I build reports? how quickly can I do analysis?
over time the architectures have evolved
1990: Inmon – single version of the truth; Kimball – a union of data marts
2000: Inmon – an architecture, the corporate information factory; Kimball – conformed dimension
2010: Inmon – DW 2.0, unstructured data; Kimball – a need for integration
Kimball is today where Inmon was in 1990
What has Kimball said to all of those people who followed his teachings in 1990?
2020: Kimball – unstructured data belongs in a data warehouse
prediction – in 2020 the Kimballites will “discover” that textual data belongs in a data warehouse
from an implementation perspective
[Diagram: two environments, Kimball and Inmon, each with applications (appl) on one side and data marts for mktg, sales, finance, mgmt, HR, Engineering and Production on the other]
daily refreshment of data
[Diagram: the same two environments, with lines connecting the applications to the data marts]
each of these lines must be crossed at least once a day
with m applications (appl) and n data marts, the Kimball environment has m x n lines to cross; the Inmon environment has m + n
how many programs have to be written? have to be maintained?
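The m x n versus m + n comparison can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes the worst case in which every application feeds every data mart directly, and the function names and application counts are hypothetical, not taken from the presentation. The figure of seven data marts matches the departments named in the diagram.

```python
def direct_feeds(m: int, n: int) -> int:
    """Programs needed when each of m applications feeds each of n data marts directly."""
    return m * n


def warehouse_feeds(m: int, n: int) -> int:
    """Programs needed when m applications load one warehouse that feeds n data marts."""
    return m + n


if __name__ == "__main__":
    n_marts = 7  # mktg, sales, finance, mgmt, HR, Engineering, Production
    for m_apps in (3, 10, 25):  # hypothetical application counts
        print(
            f"{m_apps} applications, {n_marts} marts: "
            f"direct = {direct_feeds(m_apps, n_marts)}, "
            f"via warehouse = {warehouse_feeds(m_apps, n_marts)}"
        )
```

The gap widens as either number grows, which is the point behind the questions about programs to maintain and the size of the overnight batch window.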
which overnight batch processing window do you want?
reconciliation
[Diagram: the same two environments, annotated with the sample values $32000, $1000 and $1,009,087]
in which environment would you rather do reconciliation?
in which environment would you rather add a new data mart?
from an architectural perspective
star schema (Kimball): good for fast reports
relational based data warehouse (Inmon): not a short term proposition; good for a system of record
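For readers who have not seen one, a star schema is a central fact table joined to small dimension tables, which is what makes report queries short. The sketch below uses Python's built-in sqlite3 module; all table names, column names and figures are hypothetical and serve only to show the shape of such a schema and a typical report query against it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes used to slice the facts.
    CREATE TABLE dim_department (dept_key INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);

    -- Fact table: one row per measurement, keyed by the dimensions.
    CREATE TABLE fact_spend (
        dept_key INTEGER REFERENCES dim_department(dept_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        amount REAL
    );

    INSERT INTO dim_department VALUES (1, 'sales'), (2, 'finance');
    INSERT INTO dim_date VALUES (10, 2009), (11, 2010);
    INSERT INTO fact_spend VALUES (1, 10, 100.0), (1, 11, 250.0), (2, 11, 40.0);
    """
)


def spend_by_department(connection, year):
    """Return (department, total amount) pairs for one year, sorted by department."""
    return connection.execute(
        """
        SELECT d.dept_name, SUM(f.amount)
        FROM fact_spend AS f
        JOIN dim_department AS d ON d.dept_key = f.dept_key
        JOIN dim_date AS t ON t.date_key = f.date_key
        WHERE t.year = ?
        GROUP BY d.dept_name
        ORDER BY d.dept_name
        """,
        (year,),
    ).fetchall()


if __name__ == "__main__":
    print(spend_by_department(conn, 2010))
```

The report is a single join-and-aggregate query, which illustrates the "fast reports" strength; the sketch says nothing about integration across marts, which is the concern the presentation raises.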
as an end user I am confused…
there are 17 data marts that have information and I don’t know which one to go to. And they all have different information
every time there is a new requirement I have to start from scratch. And these darn data marts are hard to maintain. I have to build a new one every time there is a change in requirements
we have had data marts for five years now. We have 250 of them and only 10 of them are actually being used today…
I’ve got these auditors coming in and I don’t have any data that I trust that I can show them…
with Kimball, the star schema is the architecture
with Inmon, the relational foundation is only the start of the architecture
[Diagram: the DW 2.0 architecture in four sectors. Interactive: very current transaction data, fed by applications (Appl). Integrated: current++. Near line: less than current. Archival: older. The Integrated, Near line and Archival sectors each hold detailed subject data (Subj), continuous snapshot data, profile data and summary data, plus textual subjects (internal, external), captured text, simple pointers, text ids and text to subj linkage.]
the Inmon approach is a FULL architecture leading to DW 2.0. And DW 2.0 is a true full scale architecture
DW 2.0 supports some really important architectural features –
- the life cycle of data within the data warehouse
- the accommodation for very large amounts of data
- the recognition that cost is the ultimate limiting factor for a data warehouse
- unstructured data as an essential component
- metadata as an essential component
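The first feature in the list, the life cycle of data, can be pictured as data moving through the four DW 2.0 sectors shown in the diagram (Interactive, Integrated, Near line, Archival) as it ages. The sketch below is a simplification under stated assumptions: it treats age as the only criterion, and the age limits are hypothetical values chosen for illustration, not figures from the presentation.

```python
from datetime import date

# Hypothetical age limits in days; a real installation would choose its own.
SECTOR_LIMITS = [
    ("Interactive", 1),        # very current transaction data
    ("Integrated", 3 * 365),   # current++
    ("Near line", 7 * 365),    # less than current
]


def sector_for(record_date: date, today: date) -> str:
    """Return the sector a record would sit in, judged only by its age in days."""
    age_days = (today - record_date).days
    for name, limit in SECTOR_LIMITS:
        if age_days <= limit:
            return name
    return "Archival"  # older than every limit above


if __name__ == "__main__":
    today = date(2010, 1, 1)
    for d in (date(2010, 1, 1), date(2009, 1, 1), date(2005, 1, 1), date(2000, 1, 1)):
        print(d.isoformat(), "->", sector_for(d, today))
```

Moving older data to cheaper sectors is also one way to read the list's points about very large amounts of data and cost as the limiting factor.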
ask Kimball how he supports unstructured data?
ask Kimball how he supports metadata?
ask Kimball how he supports really large amounts of data?
ask Kimball how he supports archival data?
corporate data: structured data and unstructured data
the vast majority of corporate data is not structured
the Inmon architecture is complete; the Kimball architecture is not
[Diagram: corporate data divided into structured data and unstructured data, shown once for Kimball and once for Inmon]
[Diagram: the DW 2.0 architecture again, shown alongside place-name labels: Florida on one side; South America, NYC, Chicago, Hawaii, Sao Paulo, Mexico, Canada, Bermuda, Denver, Calgary, Los Angeles, Gold Coast, Florida, Miami, San Francisco and Seattle on the other]
Kimball only addresses one small part of architecture. Inmon addresses a much more comprehensive picture
how Inmon/Kimball fit together
[Diagram: data marts for finance, marketing, sales, mgmt and HR shown together with the integrated, historical, granular data warehouse]