Transcript dbp10_ch04

David M. Kroenke’s
Database Processing:
Fundamentals, Design, and Implementation
Chapter Four:
Database Design
Using Normalization
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-1
Chapter Premise
• Received one or more tables of existing
data
• Need to store data in new database
• Problem:
Store data as received, or transform?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-2
How Many Tables?
One, two, more?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-3
Assessing Table Structure
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-4
Counting Rows in a Table
• To count the number of rows in a table use
the SQL built-in function COUNT(*):
SELECT
FROM
COUNT(*) AS NumRows
SKU_DATA;
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-5
Examining the Columns
• To determine the number and type of
columns in a table, use an SQL SELECT
statement
• To limit the number of rows retreived, use
the SQL TOP {NumberOfRows} keyword:
SELECT
FROM
TOP (10) *
SKU_DATA;
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-6
Checking Validity of Assumed
Referential Integrity Constraints
• Given two tables with an assumed foreign
key constraint:
SKU_DATA (SKU, SKU_Description, Department, Buyer)
BUYER
(BuyerName, Department)
Where SKU_DATA.Buyer must exist in BUYER.BuyerName
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-7
Checking Validity of Assumed
Referential Integrity Constraints
• To find any foreign key values that violate the
foreign key constraint:
SELECT
Buyer
FROM
SKU_DATA
WHERE
Buyer NOT IT
(SELECT Buyer
FROM
SKU_DATA, BUYER
WHERE
SKU_DATA.BUYER =
BUYER.BuyerName;
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-8
Type of Database
• Updateable
• Read-only
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-9
Updateable Database
• Normally, put tables in BCNF
• Always, remove MVD
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-10
Read-Only Database
• Likely not to use BCNF tables
• Remove MVD
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-11
Designing
Updateable Databases
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-12
Normalization:
Advantages and Disadvantages
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-13
Choosing Not to Use BCNF
• BCNF controls anomalies from functional
dependencies
• BCNF is not, always, desirable
• Classic ZIP code example
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-14
Multivaled Dependencies
• MVDs cause very problematic anomalies
• Always place MVDs into a separate tables
for operational databases.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-15
Designing
Read-Only Databases
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-16
Read-Only Database
• non-operational database
• data extracted from operational databases
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-17
Read-Only Databases
• For applications
– querying
– reporting
– data mining
• NEVER updated operationally
(new data imported)
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-18
Read-Only Databases
• Data Warehouse
• Data Mart
• Decision Support System
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-19
Read-Only Databases
• Normalization seldom an advantage
• Denormalization
– Join data
• from normalized tables
• To create a new table
– Store
• New non-normalized tables
• Not original normalized tables
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-20
Read-Only Databases
• Often, many copies
of the same data
• Each customized
for a specific
application
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-21
Customized Tables
PRODUCT_PURCHASING (SKU, SKU_Description, VendorNumber,
VendorName, VendorContact_1, VendorContact_2, VendorStreet,
VendorCity, VendorState, VendorZip)
PRODUCT_USAGE (SKU, SKU_Description, QuantitySoldPastYear,
QuantitySoldPastQuarter, QuantitySoldPastMonth)
PRODUCT_WEB (SKU, DetailPicture, ThumbnailPicture,
MarketingShortDescription, MarketingLongDescription, PartColor)
PRODUCT_INVENTORY (SKU, PartNumber, SKU_Description, UnitsCode,
BinNumber, ProductionKeyCode)
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-22
Creating Databases from
Existing Tables
• Generally, accomplished by automated
means
– Import
– SQL
• Beyond the scope of this class
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-23
Common Design Problems
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-24
The Multivalue, Multicolumn Problem
• WORKER
(WorkerID, Name, Skill-Type, BuildID1, BuildID2, BuildID3)
• Problems?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-25
The Multivalue, Multicolumn Problem
• WORKER
(WorkerID, Name, Skill-Type, BuildID1, BuildID2, BuildID3)
• Another form of a MVD
• A Different Solution:
– use separate table to store multiple values
– WORKER (WorkerID, Name, Skill-Type)
– ASSIGNMENT (WorkerID, BuildingID
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-26
Inconsistent Values: Different forms
of the same data value
• Different codings:
• CommonName = ‘Blueberry'
• CommonName = ‘Huckleberry'
• CommonName = ‘Wild Blueberry‘
• Different spellings:
• ‘Blueberry’
• ‘Blueberries’
• ‘BLUBERRY’
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-27
Inconsistent Values
• Particularly problematic are
– primary key values,or
– foreign key values
• To detect:
– Some semi-automatic methods
– Beyond the scope of this class
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-28
Null Values
•
•
•
•
Missing value
Not a blank
Not a zero
Never been provided!
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-29
Null Values
• Ambiguous:
– May indicate n/a
– May indicate applicable but unknown
– May indicate value applicable and known, but
never entered
• When is your DB fault intolerant of null
values?
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-30
The General-Purpose Remarks Column
• Black Hole
– Remarks
– Comments
– Notes
• Such a column may:
– Be used inconsistently
– Hold multiple data items
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-31
The General-Purpose Remarks Column
• Example – Figure 4-10, p. 118
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-32
The General-Purpose Remarks Column
• Decompose column into separate columns
• Manual task
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-33
David M. Kroenke’s
Database Processing
Fundamentals, Design, and Implementation
(10th Edition)
End of Presentation:
Chapter Four
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition
© 2006 Pearson Prentice Hall, modified by Dr. Lyn Mathis
4-34