Transcript Chapter 13

Chapter 13 The Data Warehouse Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel 1

13

In this chapter, you will learn: • How operational data and decision support data differ • What a data warehouse is, how data for it are prepared, and how it is implemented • What star schemas are and how they are constructed • What data mining is and what role it plays in decision support

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 2

In this chapter, you will learn (continued):

13

• What online analytical processing (OLAP) is • How SQL extensions are used to support OLAP-type data manipulations Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 3

The Need for Data Analysis • Managers must be able to track daily transactions to evaluate how the business is performing • By tapping into operational database, management can develop strategies to meet organizational goals • Data analysis can provide information about short-term tactical evaluations and strategies Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 4

13

The Need for Data Analysis (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 5

The Need for Data Analysis (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 6

Decision Support Systems • Decision support is methodology (or series of methodologies) designed to extract information from data and to use such information as a basis for decision making • Decision support system (DSS) – Arrangement of computerized tools used to assist managerial decision making within business – Usually requires extensive data “massaging” to produce information – Used at all levels within organization – Often tailored to focus on specific business areas – Provides ad hoc query tools to retrieve data and to display data in different formats Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 7

13

Decision Support Systems (continued)

13

• Composed of following four main components: – Data store component • Basically a DSS database – Data extraction and data filtering component • Used to extract and validate data taken from operational database and external data sources – End-user query tool • Used to create queries that access database – End-user presentation tool • Used to organize and present data Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 8

Decision Support Systems (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 9

Operational Data vs. Decision Support Data • Operational Data – Mostly stored in relational database – Optimized to support transactions representing daily operations • DSS Data – Give tactical and strategic business meaning to operational data – Differs from operational data in following three main areas: • Timespan • Granularity • Dimensionality Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 10

13

Operational Data vs. Decision Support Data (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 11

Operational Data vs. Decision Support Data (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 12

DSS Database Requirements • A specialized DBMS tailored to provide fast answers to complex queries.

• Four main requirements: – Database schema – Data extraction and loading – End-user analytical interface – Database size

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 13

DSS Database Requirements (continued)

13

• Database schema – Must support complex data representations – Must contain aggregated and summarized data – Queries must be able to extract multidimensional time slices Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 14

DSS Database Requirements (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 15

DSS Database Requirements (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 16

DSS Database Requirements (continued)

13

• Data extraction – Should allow batch and scheduled data extraction – Should support different data sources • Flat files • Hierarchical, network, and relational databases • Multiple vendors • Data filtering – Must allow checking for inconsistent data Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 17

DSS Database Requirements (continued)

13

• End-user analytical interface – One of most critical DSS DBMS components – Permits user to navigate through data to simplify and accelerate decision-making process Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 18

DSS Database Requirements (continued)

13

• Database size – In 2005, Wal-Mart had 260 terabytes of data in its data warehouses – DBMS must support very large databases (VLDBs) Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 19

The Data Warehouse • Integrated, subject-oriented, time-variant, nonvolatile collection of data that provides support for decision making • Usually a read-only database optimized for data analysis and query processing • Requires time, money, and considerable managerial effort to create

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 20

The Data Warehouse (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 21

The Data Warehouse (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 22

The Data Warehouse (continued) • Data mart – Small, single-subject data warehouse subset – Each is more manageable data set than data warehouse – Provides decision support to small group of people – Typically lower cost and lower implementation time than data warehouse

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 23

DSS Architectural Styles • Several are available • Provide advanced decision support features • Some capable of providing access to multidimensional data analysis

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 24

DSS Architectural Styles (continued)

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 25

Twelve Rules that Define a Data Warehouse • Data warehouse and operational environments are separated • Data warehouse data are integrated • Data warehouse contains historical data over long time horizon • Data warehouse data are snapshot data captured at given point in time • Data warehouse data are subject oriented

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 26

Twelve Rules that Define a Data Warehouse (continued) • Data warehouse data are mainly read-only with periodic batch updates from operational data – No online updates allowed • Data warehouse development life cycle differs from classical systems development • Data warehouse contains data with several levels of detail: current detail data, old detail data, lightly summarized data, and highly summarized data • Data warehouse environment is characterized by read-only transactions to very large data sets Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 27

13

13

Twelve Rules that Define a Data Warehouse (continued) • Data warehouse environment has system that traces data sources, transformations, and storage • Data warehouse’s metadata are critical component of this environment • Data warehouse contains chargeback mechanism for resource usage that enforces optimal use of data by end users Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 28

Online Analytical Processing • Advanced data analysis environment that supports decision making, business modeling, and operations research • OLAP systems share four main characteristics: – Use multidimensional data analysis techniques – Provide advanced database support – Provide easy-to-use end-user interfaces – Support client/server architecture

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 29

Multidimensional Data Analysis Techniques • Data are processed and viewed as part of a multidimensional structure • Particularly attractive to business decision makers

13

Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 30

13

Multidimensional Data Analysis Techniques (continued) • Augmented by following functions: – Advanced data presentation functions – Advanced data aggregation, consolidation and classification functions – Advanced computational functions – Advanced data modeling functions Database Systems: Design, Implementation, & Management, 7 th Edition, Rob & Coronel 31