Transcript CHAPTER 1
제 15 장 데이터웨어하우스에서의 OLAP 2015-07-21 Data Warehousing 1 15.0 장의 목표 • The unqualified demand for OLAP • Review the major features and functions of OLAP • Grasp the intricacies of dimensional analysis • Examine the different OLAP models • Consider OLAP implementation 2015-07-21 Data Warehousing 2 15.1 온라인 분석 처리를 위한 요구 • For building a data warehouse – Top-down approach: E-R modeling technique – Bottom-up approach: several data marts – A practical approach: conglomeration of supermarts • In today’s business conditions, we find that users need to go beyond such basic analysis 2015-07-21 Data Warehousing 3 다차원 분석에 대한 필요 • Sales are interrelated to many business dimensions – The dates of the sales, the products, the distribution channels, the stores, the sales territories, the promotions, and a few more dimensions • More useful query – How much revenue did the new Product X generate during the last three months, broken down by individual months, in the South Central territory, by individual stores, broken down by promotions, compared to estimates, and compared to the previous version of the product? 2015-07-21 Data Warehousing 4 빠른 접근과 강력한 계산 • Figure 15-1 • A list of typical calculations – – – – – Rolls-ups to provide summaries and aggregations Drill-downs Simple calculations, such as computation of margins Share calculations Algebraic equations involving key performance indicators – Moving averages and growth percentages – Trend analysis using statistical methods 2015-07-21 Data Warehousing 5 2015-07-21 Data Warehousing 6 다른 분석 방법들의 한계 • OLTP – Figure 15-2 • Report Writers – Ability to point and click for generating and issuing SQL calls, and – Capability to format the output reports • SQL(Structured Query Language) – The language is too abstruse even for sophisticated users 2015-07-21 Data Warehousing 7 2015-07-21 Data Warehousing 8 OLAP is the Answer • Users need the ability to perform multidimensional analysis with complex calculations • The basic virtues of OLAP – Enables analysts, executives, and managers to gain useful insights from the presentation of data – Can reorganize metrics along several dimensions and allow data to be viewed from different perspectives – Supports multidimensional analysis – Is able to drill down or roll up within each dimension 2015-07-21 Data Warehousing 9 2015-07-21 Data Warehousing 10 OLAP 정의와 규칙 • The term OLAP in a paper entitled – “Providing On-Line Analytical Processing to User Analysts,” by Dr. E.F. Codd in 1993 • A short and precise definition for OLAP – On-line Analytical Processing(OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access in a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user 2015-07-21 Data Warehousing 11 The initial twelve guidelines for an OLAP system • Multidimensional Conceptual View • Transparency • Accessibility • Consistent Reporting Performance • Client/Server Architecture • Generic Dimensionality 2015-07-21 • Dynamic Sparse Matrix handling • Multiuser Support • Unrestricted Crossdimensional Operations • Intuitive Data Manipulation • Flexible Reporting • Unlimited Dimensions and Aggregation Levels Data Warehousing 12 Six Additional Rules for an OLAP system • • • • • • • Drill-through to Detail Level OLAP Analysis Models Treatment of Nonnormalized Data Storing OLAP Result Missing Values Incremental Database Refresh SQL Interface 2015-07-21 Data Warehousing 13 OLAP Characteristics • OLAP systems – Let business users have a multidimensional and logical view of the data in the data warehouse – Facilitate interactive query and complex analysis for the users – Allow users to drill down for greater details or roll up for aggregations of metrics – Provide ability to perform intricate calculations and comparisons, and – Present results in a number of meaningful ways, including charts and graphs 2015-07-21 Data Warehousing 14 15.2 주요 특징과 기능 • OLAP is an information delivery system for the data warehouse • But OLAP is much more than that • Contents – – – – – – General Features : Figure 15-4 Dimensional Analysis What are Hypercubes? Drill-Down and Roll-Up Slice-and-Dice or Rotation Uses and Benefits 2015-07-21 Data Warehousing 15 2015-07-21 Data Warehousing 16 차원 분석 Dimensional Analysis • The STAR schema in Figure 15-5 – PRODUCT, TIME, STORE • Display the data for sales on a spreadsheet : Figure 15-6 • A sample of simple queries – Display the total sales of all products for past five years in all stores – Show comparison of total sales for all stores, product by product, between years 2000 and 1999 only for those products with reduced sales – … on pages 355 - 356 2015-07-21 Data Warehousing 17 2015-07-21 Data Warehousing 18 2015-07-21 Data Warehousing 19 What are Hypercubes? • Figure 15-7 – Two business dimensions of product and time – Metrics to be analyzed • Fixed cost, variable cost, indirect sales, direct sales, and profit margin – The three straight lines : multidimensional domain structure(MDS) • Figure 15-8 – Four dimensions • STORE, TIME, PRODUCT, METRICS 2015-07-21 Data Warehousing 20 2015-07-21 Data Warehousing 21 2015-07-21 Data Warehousing 22 2015-07-21 Data Warehousing 23 2015-07-21 Data Warehousing 24 2015-07-21 Data Warehousing 25 드릴-다운과 롤-업 Drill-Down and Roll-Up • In Figure 15-5 – These specific attributes of the product dimensions • Product name, subcategory, category, product line, and department • An ascending hierarchical sequence from product name to department • Figure 15-12 2015-07-21 Data Warehousing 26 2015-07-21 Data Warehousing 27 2015-07-21 Data Warehousing 28 2015-07-21 Data Warehousing 29 용도와 이점 • 사업 관리자, 임원, 분석가들의 증가된 생산성 • OLAP 시스템들에 내재하는 유연성은 사용자들 이 자신 스스로 분석을 실행한다 • IT 개발자들에 대한 이점 • 사용자들에게 잔무가 감소한다 • 빠른 응용들의 전달 • 질의 수행에 걸리는 시간과 네트워크 트래픽에서 의 감소를 통한 더 효과적인 연산 • 업무 측정치들과 차원들을 가진 실세계 난제들을 모델하는 능력 2015-07-21 Data Warehousing 30 15.3 OLAP MODELS • ROLAP – Relational online analytical processing • MOLAP – Multidimensional online analytical processing • DOLAP – Desktop online analytical processing – Provide portability to users of OLAP – Variation of ROLAP 2015-07-21 Data Warehousing 31 변종들의 개관 • MOLAP model – Storing the data multidimensionally – OLAP engine resides on a special server in Figure 15-15 • ROLAP model – Relies on the existing relational DBMS – OLAP engine resides on the desktop in Figure 15-15 2015-07-21 Data Warehousing 32 2015-07-21 Data Warehousing 33 2015-07-21 Data Warehousing 34 2015-07-21 Data Warehousing 35 MOLAP is the choice for faster response and more intensive queries 2015-07-21 Data Warehousing 36 2015-07-21 Data Warehousing 37 15.4 OLAP 구현 고려사항 • MOLAP model – The lack of standardization – Scalability • ROLAP model – Using STAR schema – Multidimensional representation of data in a STAR schema arrangement • Figure 5-20: Four Architectural Options 2015-07-21 Data Warehousing 38 2015-07-21 Data Warehousing 39 Some significant characteristics of data in the OLAP system • An OLAP system stores and uses much less data compared to a data warehouse • Data in the OLAP system is summarized • OLAP data is more flexible for processing and analysis • Every instance of the OLAP system in your environment is customized for the purpose that instance serves 2015-07-21 Data Warehousing 40 Techniques for preparing OLAP data for a specific group of users • Define Subset • 요약 Summarized – Summarize and prepare aggregate data structures in the way the marketing department needs for summarizing • 비정규화 Denormalized • Calculate and Derive • Index – Choose those attributes that are appropriate for marketing to build indexes 2015-07-21 Data Warehousing 41 2015-07-21 Data Warehousing 42 관리와 성능 • Administration – OLAP system is part of the data warehouse environment – Administration of the OLAP is part of the data warehouse administration • A few of some key consideration for administering and managing the OLAP system – Expectation on what data will be accessed and how – Selection of the right business dimensions – Selection of the right filters for loading the data from the data warehouse – … on page 473 2015-07-21 Data Warehousing 43 Performance • The presence of an OLAP system in your data warehouse environment shifts the workload • A corollary of shifting the complex queries to the OLAP system is the improvement in the overall query performance – The OLAP system is designed for complex queries • All the precalculations and preaggregations results in faster response to queries at any level of summarization • This speed and performance do not come without any cost 2015-07-21 Data Warehousing 44 OLAP Platforms • 사용자의 편의성과 투입되는 비용에 따라 platform을 설정한다 • Main data warehouse system과 다른 platform이 필요한 경우 – 초기에는 data warehouse 시스템과 같이 사용하다가, 시스템이 느려지면(많은 사용자 또는 대용량의 데이 터로 인해) 다른 platform에다 구축한다 – 사용자가 분산되거나 main data warehouse와 다른 OLAP tools을 사용하면 다른 platform이 필요하다 – OLAP 시스템의 사용자들의 요구로 빈번한 refresh가 일어나는 상황이면 다른 platform 사용 2015-07-21 Data Warehousing 45 OLAP 도구와 제품 • List of a few guidelines to check products – Do not be carried away by flashy technology – Determine the scalability of the products – Consider how easy it is to administer the OLAP product – Performance and flexibility are key ingredients in the success of your OLAP system – Do not worry too much about the merits between ROLAP and MOLAP. Concentrate on the matching of the vendor product with your users’ analytical requirements 2015-07-21 Data Warehousing 46 OLAP 도구와 제품 선택시 선정 기준 • 데이터의 다차원 표현 • 집계, 요약, 사전계산, 유도 • Formulas and complex calculations in an extensive library • 교차-차원 계산 • Time intelligence • Pivoting, cross-tabs, drill-down, and roll-up along single or multiple dimensions • Interface of OLAP with applications and software 2015-07-21 Data Warehousing 47 구현 단계 • • • • • • • • 차원 모델링 MDDB의 설계와 구축 OLAP 시스템으로 옮겨지는 데이터의 선택 OLAP 시스템을 위한 데이터 획득 또는 추출 OLAP 서버로의 데이터 적재 데이터 집계와 유도 데이터의 계산 데스크탑에서의 응용의 구현 사용자 훈련의 준비 2015-07-21 Data Warehousing 48