Transcript Document

Building a Data Warehouse
with SQL Server
Presented by John Sterrett
About Me…
AITP - Pittsburgh
http://www.aitp-pgh.org/
What is a data warehouse?
Building a Data Warehouse with SQL Server
What is Business Intelligence?
• According to Wikipedia BI refers to skills, technologies, applications
and practices used to help a business acquire a better understanding
of its commercial context.
• The following is a definition from Business Intelligence with Microsoft
Office PerformancePoint Server 2007
Business intelligence (BI) is more of a concept than a single technology. The
goal is to gain insight into the business by bringing together data, formatting
it in a way that enables better analysis, and then providing tools that give
users power—not just to examine and explore the data, but to quickly
understand it.
Data Warehouse = Business Intelligence
Business Problems that BI
Addresses
•
•
•
•
•
•
Slow-to-execute queries
General system slowdowns
Manual query writing
Disparate data sources
Data is not available to all users
Too much data
What is BI to the users?
• BI is fast to query.
• BI improves your return on investment
(ROI)
• BI is a tool that allows users to find what
they need.
Business Cases
• A technology department could analyze work completed for
departments and/or clients. This could then be used to estimate
future flat fee engagements, establish seasonal hiring, balance
overtime.
• A medical center could use statistics covering a wide range of lab
values and a large number of patients to identify whether a patient
currently being treated might be at an elevated risk for a certain types
of disease.
• Application that’s great for data entry but lacks in depth reporting and
drilldown capabilities.
• A company who wants to centralize data to a single data source for
allowing decision makers the ability to make decisions as needed.
What Products Are Involved?
• Database Technologies
SQL Server 2005/2008 (Database Engine)
SQL Server Analysis Services (SSAS)
SQL Server Integration Services (SSIS)
• User Interface Technologies
SQL Server Reporting Services (SSRS)
SQL Server Management Studio (SSMS)
Performance Point 2007
SharePoint
Excel
DEMO!
• Browse a Cube using Management Studio
Goals for dimensional modeling
•
•
•
•
Understand star schema
Understand dimensional modeling
Understand changing dimensions
Understanding fact (or measure) and cube
modeling
Star Schema…
• A star schema consists of at least one fact
table and a number of dimension tables.
• Star Schema is highly recommended
schema for SSAS cubes.
Star Schema Example
Facts and Measures
• Fact table consists of at least two types of
data: keys and measures.
• Keys are usually surrogate keys that link
to the dimension tables.
• Measures are numeric values that are
usually additive that express business
metrics.
Dimensions
• Dimensions describe who, what, when, where
and why for the facts.
• Dimensions should consist of the following data
types
1. Surrogate key.
2. Primary key of the loaded source(s)
3. Any additional attributes (columns) that describe the
business entity.
Dimension Examples
Slowly Changing Dimensions
(SCD)
Hierarchies
Hierarchies serve two purposes:
• Convenience for end users.
• Provides drill down / drill up features
Where do I go from here?
• Create Use Grain Statements
What are the key metrics for your business?
What factors do you use to evaluate those key
metrics?
What level of granularity do you use evaluate
each factor?
Grain Statement Examples:
We want to see time worked, hours billed, and cost of
work by date, by employee, by department, by location,
and by projects.
We want to see sales amount and sales quantity by day,
by product, by employee, and by store location.
We want to see average score and quantity of courses
taken, by course, by day, by student, by manager, by
curriculum, and by curriculum type.
DEMO
• Build a Cube
Extracting & Loading Data
• *The process of extracting, transferring
and loading data consumes about 75% of
the Data Warehouse project.
• It is highly recommended to use SSIS for
ETL instead of native T-SQL
Basic Dimension Package
Basic Fact Package
So… You are interested in
building a Data Warehouse
• Click here to download a Virtual PC that includes sample Data
Warehouses and all of Microsoft’s BI tools.
• If you already have SQL Server 2005 and Analysis Services
configured click here to download samples (Click here for SQL
Server 2008)
• Check out this Introduction to Data Warehousing with SQL Server
Q/A
Blog: www.johnsterrett.com
LinkedIn: www.linkedin.com/in/johnsterrett
Twitter: www.twitter.com/johnsterrett
E-Mail: [email protected]
References
• Foundation of SQL Server 2005 Business
Intelligence.
• Business Intelligence with Microsoft Office
PerformancePoint Server 2007
• ACM – Intro to Data Warehousing