Transcript CHAPTER 1

제 15 장
데이터웨어하우스에서의
OLAP
2015-07-21
Data Warehousing
1
15.0 장의 목표
• The unqualified demand for OLAP
• Review the major features and functions
of OLAP
• Grasp the intricacies of dimensional
analysis
• Examine the different OLAP models
• Consider OLAP implementation
2015-07-21
Data Warehousing
2
15.1 온라인 분석 처리를 위한 요구
• For building a data warehouse
– Top-down approach: E-R modeling technique
– Bottom-up approach: several data marts
– A practical approach: conglomeration of
supermarts
• In today’s business conditions, we find
that users need to go beyond such basic
analysis
2015-07-21
Data Warehousing
3
다차원 분석에 대한 필요
• Sales are interrelated to many business
dimensions
– The dates of the sales, the products, the distribution
channels, the stores, the sales territories, the
promotions, and a few more dimensions
• More useful query
– How much revenue did the new Product X generate
during the last three months, broken down by
individual months, in the South Central territory, by
individual stores, broken down by promotions,
compared to estimates, and compared to the
previous version of the product?
2015-07-21
Data Warehousing
4
빠른 접근과 강력한 계산
• Figure 15-1
• A list of typical calculations
–
–
–
–
–
Rolls-ups to provide summaries and aggregations
Drill-downs
Simple calculations, such as computation of margins
Share calculations
Algebraic equations involving key performance
indicators
– Moving averages and growth percentages
– Trend analysis using statistical methods
2015-07-21
Data Warehousing
5
2015-07-21
Data Warehousing
6
다른 분석 방법들의 한계
• OLTP
– Figure 15-2
• Report Writers
– Ability to point and click for generating and
issuing SQL calls, and
– Capability to format the output reports
• SQL(Structured Query Language)
– The language is too abstruse even for
sophisticated users
2015-07-21
Data Warehousing
7
2015-07-21
Data Warehousing
8
OLAP is the Answer
• Users need the ability to perform
multidimensional analysis with complex
calculations
• The basic virtues of OLAP
– Enables analysts, executives, and managers to gain
useful insights from the presentation of data
– Can reorganize metrics along several dimensions
and allow data to be viewed from different
perspectives
– Supports multidimensional analysis
– Is able to drill down or roll up within each dimension
2015-07-21
Data Warehousing
9
2015-07-21
Data Warehousing
10
OLAP 정의와 규칙
• The term OLAP in a paper entitled
– “Providing On-Line Analytical Processing to
User Analysts,” by Dr. E.F. Codd in 1993
• A short and precise definition for OLAP
– On-line Analytical Processing(OLAP) is a category of software
technology that enables analysts, managers and executives to
gain insight into data through fast, consistent, interactive access
in a wide variety of possible views of information that has been
transformed from raw data to reflect the real dimensionality of
the enterprise as understood by the user
2015-07-21
Data Warehousing
11
The initial twelve guidelines for an
OLAP system
• Multidimensional
Conceptual View
• Transparency
• Accessibility
• Consistent Reporting
Performance
• Client/Server Architecture
• Generic Dimensionality
2015-07-21
• Dynamic Sparse Matrix
handling
• Multiuser Support
• Unrestricted Crossdimensional Operations
• Intuitive Data
Manipulation
• Flexible Reporting
• Unlimited Dimensions
and Aggregation Levels
Data Warehousing
12
Six Additional Rules for an OLAP
system
•
•
•
•
•
•
•
Drill-through to Detail Level
OLAP Analysis Models
Treatment of Nonnormalized Data
Storing OLAP Result
Missing Values
Incremental Database Refresh
SQL Interface
2015-07-21
Data Warehousing
13
OLAP Characteristics
• OLAP systems
– Let business users have a multidimensional and
logical view of the data in the data warehouse
– Facilitate interactive query and complex analysis for
the users
– Allow users to drill down for greater details or roll up
for aggregations of metrics
– Provide ability to perform intricate calculations and
comparisons, and
– Present results in a number of meaningful ways,
including charts and graphs
2015-07-21
Data Warehousing
14
15.2 주요 특징과 기능
• OLAP is an information delivery system for the
data warehouse
• But OLAP is much more than that
• Contents
–
–
–
–
–
–
General Features : Figure 15-4
Dimensional Analysis
What are Hypercubes?
Drill-Down and Roll-Up
Slice-and-Dice or Rotation
Uses and Benefits
2015-07-21
Data Warehousing
15
2015-07-21
Data Warehousing
16
차원 분석
Dimensional Analysis
• The STAR schema in Figure 15-5
– PRODUCT, TIME, STORE
• Display the data for sales on a spreadsheet :
Figure 15-6
• A sample of simple queries
– Display the total sales of all products for past five
years in all stores
– Show comparison of total sales for all stores, product
by product, between years 2000 and 1999 only for
those products with reduced sales
– … on pages 355 - 356
2015-07-21
Data Warehousing
17
2015-07-21
Data Warehousing
18
2015-07-21
Data Warehousing
19
What are Hypercubes?
• Figure 15-7
– Two business dimensions of product and time
– Metrics to be analyzed
• Fixed cost, variable cost, indirect sales, direct sales, and
profit margin
– The three straight lines : multidimensional domain
structure(MDS)
• Figure 15-8
– Four dimensions
• STORE, TIME, PRODUCT, METRICS
2015-07-21
Data Warehousing
20
2015-07-21
Data Warehousing
21
2015-07-21
Data Warehousing
22
2015-07-21
Data Warehousing
23
2015-07-21
Data Warehousing
24
2015-07-21
Data Warehousing
25
드릴-다운과 롤-업
Drill-Down and Roll-Up
• In Figure 15-5
– These specific attributes of the product
dimensions
• Product name, subcategory, category, product line,
and department
• An ascending hierarchical sequence from product
name to department
• Figure 15-12
2015-07-21
Data Warehousing
26
2015-07-21
Data Warehousing
27
2015-07-21
Data Warehousing
28
2015-07-21
Data Warehousing
29
용도와 이점
• 사업 관리자, 임원, 분석가들의 증가된 생산성
• OLAP 시스템들에 내재하는 유연성은 사용자들
이 자신 스스로 분석을 실행한다
• IT 개발자들에 대한 이점
• 사용자들에게 잔무가 감소한다
• 빠른 응용들의 전달
• 질의 수행에 걸리는 시간과 네트워크 트래픽에서
의 감소를 통한 더 효과적인 연산
• 업무 측정치들과 차원들을 가진 실세계 난제들을
모델하는 능력
2015-07-21
Data Warehousing
30
15.3 OLAP MODELS
• ROLAP
– Relational online analytical processing
• MOLAP
– Multidimensional online analytical processing
• DOLAP
– Desktop online analytical processing
– Provide portability to users of OLAP
– Variation of ROLAP
2015-07-21
Data Warehousing
31
변종들의 개관
• MOLAP model
– Storing the data multidimensionally
– OLAP engine resides on a special server in
Figure 15-15
• ROLAP model
– Relies on the existing relational DBMS
– OLAP engine resides on the desktop in
Figure 15-15
2015-07-21
Data Warehousing
32
2015-07-21
Data Warehousing
33
2015-07-21
Data Warehousing
34
2015-07-21
Data Warehousing
35
MOLAP is the choice for faster
response and more intensive queries
2015-07-21
Data Warehousing
36
2015-07-21
Data Warehousing
37
15.4 OLAP 구현 고려사항
• MOLAP model
– The lack of standardization
– Scalability
• ROLAP model
– Using STAR schema
– Multidimensional representation of data in a
STAR schema arrangement
• Figure 5-20: Four Architectural Options
2015-07-21
Data Warehousing
38
2015-07-21
Data Warehousing
39
Some significant characteristics of
data in the OLAP system
• An OLAP system stores and uses much
less data compared to a data warehouse
• Data in the OLAP system is summarized
• OLAP data is more flexible for processing
and analysis
• Every instance of the OLAP system in
your environment is customized for the
purpose that instance serves
2015-07-21
Data Warehousing
40
Techniques for preparing OLAP data
for a specific group of users
• Define Subset
• 요약 Summarized
– Summarize and prepare aggregate data structures in
the way the marketing department needs for
summarizing
• 비정규화 Denormalized
• Calculate and Derive
• Index
– Choose those attributes that are appropriate for
marketing to build indexes
2015-07-21
Data Warehousing
41
2015-07-21
Data Warehousing
42
관리와 성능
• Administration
– OLAP system is part of the data warehouse
environment
– Administration of the OLAP is part of the data
warehouse administration
• A few of some key consideration for administering
and managing the OLAP system
– Expectation on what data will be accessed and how
– Selection of the right business dimensions
– Selection of the right filters for loading the data from the
data warehouse
– … on page 473
2015-07-21
Data Warehousing
43
Performance
• The presence of an OLAP system in your data
warehouse environment shifts the workload
• A corollary of shifting the complex queries to the
OLAP system is the improvement in the overall
query performance
– The OLAP system is designed for complex queries
• All the precalculations and preaggregations
results in faster response to queries at any level
of summarization
• This speed and performance do not come
without any cost
2015-07-21
Data Warehousing
44
OLAP Platforms
• 사용자의 편의성과 투입되는 비용에 따라
platform을 설정한다
• Main data warehouse system과 다른 platform이
필요한 경우
– 초기에는 data warehouse 시스템과 같이 사용하다가,
시스템이 느려지면(많은 사용자 또는 대용량의 데이
터로 인해) 다른 platform에다 구축한다
– 사용자가 분산되거나 main data warehouse와 다른
OLAP tools을 사용하면 다른 platform이 필요하다
– OLAP 시스템의 사용자들의 요구로 빈번한 refresh가
일어나는 상황이면 다른 platform 사용
2015-07-21
Data Warehousing
45
OLAP 도구와 제품
• List of a few guidelines to check products
– Do not be carried away by flashy technology
– Determine the scalability of the products
– Consider how easy it is to administer the OLAP
product
– Performance and flexibility are key ingredients in the
success of your OLAP system
– Do not worry too much about the merits between
ROLAP and MOLAP. Concentrate on the matching of
the vendor product with your users’ analytical
requirements
2015-07-21
Data Warehousing
46
OLAP 도구와 제품 선택시 선정 기준
• 데이터의 다차원 표현
• 집계, 요약, 사전계산, 유도
• Formulas and complex calculations in an
extensive library
• 교차-차원 계산
• Time intelligence
• Pivoting, cross-tabs, drill-down, and roll-up
along single or multiple dimensions
• Interface of OLAP with applications and
software
2015-07-21
Data Warehousing
47
구현 단계
•
•
•
•
•
•
•
•
차원 모델링
MDDB의 설계와 구축
OLAP 시스템으로 옮겨지는 데이터의 선택
OLAP 시스템을 위한 데이터 획득 또는 추출
OLAP 서버로의 데이터 적재
데이터 집계와 유도 데이터의 계산
데스크탑에서의 응용의 구현
사용자 훈련의 준비
2015-07-21
Data Warehousing
48