OOD: DB Design - Baylor University

Download Report

Transcript OOD: DB Design - Baylor University

Database Design
Chapter 12
Part-2: Normalization
1
Outline
Part Two
 Normalization
− 1NF
− 2NF
− 3NF
 Integrity Controls
2
Study the problems with this table.
This is why we need Normalization!
Update Anomaly Carla called to inquire about her appointment
time on 12/2/2014. She also gives us her
new phone number: 777-1234
12/2/2014
Delete Anomaly
Mike called to cancel his appointment.
Insert Anomaly
Sue calls to make another appointment
for December 15th.
Appointment Table
Appt
No
1
2
3
4
5
6
7
8
9
10
Appt
Date
12/1/2014
12/1/2014
12/1/2014
12/2/2014
12/2/2014
12/2/2014
12/2/2014
12/2/2014
12/3/2014
12/3/2014
Appt
Planned
Appt
Time
Duration
Type
3:00 AM
1.00
Physical
3:00 AM
0.25
Shot
3:15 AM
0.50
Flu
10:00 AM
0.50
Migraine
10:15 AM
0.25
Shot
10:30 AM
0.25
Shot
10:45 AM
0.50
Flu
11:00 AM
1.00
Physical
10:30 AM
1.00
Physical
9:00 AM
0.50
Migraine
Patient
ID
466927
456789
194756
329657
987453
384788
438754
345875
466927
345875
First
Nm
Lisa
Sue
Brandon
Marcus
Mike
Tonya
Iliana
Carla
Lisa
Carla
Last
Nm
Garcia
Carey
Pierre
Schwartz
Jones
Johnson
Hnatt
Basich
Garcia
Basich
Phone
562-3456
432-1234
432-7877
239-5502
456-0202
432-8806
823-4303
857-5566
562-3456
857-5566
Doctor
ID
C678
A528
S626
A528
G123
S626
C678
A528
C678
C678
Doctor
Nm
Chapman
Lopez
Smith
Lopez
Gray
Smith
Chapman
Lopez
Chapman
Chapman
3
Normalization
(def) the process of converting complex data
structures into simple, stable data structures.
Every non-primary key attribute depends upon
the whole primary key and nothing but the primary key.
Purpose: Create well-structured relations/tables
Why?  Data Redundancy =  Data Quality & Integrity
Result: where we can insert, modify, and delete the rows
without errors or inconsistencies.
4
Normalization
Normalization
Normal
Form
Definition
1NF
The table has no repeating fields or groups of fields
2NF
No Partial Dependencies!
The table is in 1NF, and …
every non-key attribute is functionally dependent on the entire primary key
3NF
No Transitive Dependencies!
The table is in 2NF, and …
each non-key attribute is not functionally dependent on another non-key attribute
First Normal Form (1NF)
The table has:
 No multi-valued attributes
− EMP (ssn, name, dept, sal, dependents)
 No repeating fields or groups of fields
− EMP (ssn, name, dept, sal, dep1, dep2, dep3, … depN)
6
Functional Dependency
BA
Field A is functionally dependent on Field B
if for each value of B, there is only 1 corresponding value of A
 Alternatively, it can be read as “Field B determines Field A”
− If we know the value of B, we can obtain the value of A.
 Functional Dependency involves a one-to-one relationship between the values of fields
• EmpNo  Salary 
• Salary  EmpNo 
• StudentId  Name 
• HomeTown  LastName 
 Notes:
− Sample Data does not prove the existence of a functional dependency.
− Knowledge of problem domain is critical!
7
Second Normal Form (2NF)
in 1NF and if each non-key attribute is functionally
1NF
dependent on entire primary key
What is the PK?
______________
Need a new table
since
CatIssueDate
is determined only by
______________
2NF
8
Third Normal Form (3NF)
in 2NF and if no non-key attribute is functionally
2NF
dependent on any other non-key attribute
What is the PK?
______________
Need a new table
since State
is determined by
______________
3NF
9
Employee Data
Courses Taken Listing
EmpNo
100
Name
John
Dept
Salary
MKTG
42000
Course
Date
Completed
Dept
Phone Ext.
SPSS
6/19/2007
5325
Surveys
11/3/2011
5325
140
Sue
ACCT
41000
Tax Acc
12/1/2005
4422
110
Bob
INFO
70000
Java
3/21/2008
7373
C#
10/23/2014
7373
190
Alex
FINA
65000
Investmts.
2/1/2008
4477
150
Jennifer
INFO
49000
Java
3/11/2006
7373
Oracle
5/19/2007
7373
Notice in this report that Employees can take multiple courses.
10
First Normal Form(1NF)
No repeating fields or groups!
 Note: notice the Primary Key!
EmpCourse
EmpNo
Name
Dept
Salary
Course
Date
Completed
Dept
Phone Ext.
100
John
MKTG
42000
SPSS
6/19/2007
5325
100
John
MKTG
42000
Surveys
11/3/2011
5325
140
Sue
ACCT
41000
Tax Acc
12/1/2005
4422
110
Bob
INFO
70000
Java
3/21/2008
7373
110
Bob
INFO
70000
C#
10/23/2014
7373
190
Alex
FINA
65000
Investmts.
2/1/2008
4477
150
Jennifer
INFO
49000
Java
3/11/2006
7373
150
Jennifer
INFO
49000
Oracle
5/19/2007
7373
EmpCourse(EmpNo, Name, Dept, Salary, Course, Date Completed, DeptPhoneExt)
11
Let’s transform this table
from 1NF to 2NF
 Ensure that all non-key attributes are functionally dependent
on the entire primary key!
EmpCourse
EmpNo
Name
Dept
Salary
Course
Date
Completed
Dept
Phone Ext.
100
John
MKTG
42000
SPSS
6/19/2007
5325
100
John
MKTG
42000
Surveys
11/3/2011
5325
140
Sue
ACCT
41000
Tax Acc
12/1/2005
4422
110
Bob
INFO
70000
Java
3/21/2008
7373
110
Bob
INFO
70000
C#
10/23/2014
7373
190
Alex
FINA
65000
Investmts.
2/1/2008
4477
150
Jennifer
INFO
49000
Java
3/11/2006
7373
150
Jennifer
INFO
49000
Oracle
5/19/2007
7373
EmpCourse(EmpNo, Name, Dept, Salary, Course, Date Completed, DeptPhoneExt)
EmpNo, Course 
EmpNo 
Course 
12
2NF
 NOW, all non-key attributes are functionally dependent on the entire
primary key!
Employee
EmpNo
Name
EmpCourse
Dept
Salary
Dept
Phone Ext.
EmpNo
Course
Date
Completed
100
John
MKTG
42000
5325
100
SPSS
6/19/2007
140
Sue
ACCT
41000
4422
100
Surveys
11/3/2011
110
Bob
INFO
70000
7373
140
Tax Acc
12/1/2005
190
Alex
FINA
65000
4477
110
Java
3/21/2008
150
Jennifer
INFO
49000
7373
110
C#
10/23/2014
190
Investmts.
2/1/2008
150
Java
3/11/2006
150
Oracle
5/19/2007
Employee (EmpNo, Name, Dept, Salary, DeptPhoneExt)
EmpCourse( EmpNo, Course, Date Completed
13
Not in 3NF

Ensure that all non-primary key attributes do not depend on
each other (i.e. no transitive dependencies).
Employee
EmpNo
Name
EmpCourse
Dept
Salary
Dept
Phone Ext.
EmpNo
Course
Date
Completed
100
John
MKTG
42000
5325
100
SPSS
6/19/2007
140
Sue
ACCT
41000
4422
100
Surveys
11/3/2011
110
Bob
INFO
70000
7373
140
Tax Acc
12/1/2005
190
Alex
FINA
65000
4477
110
Java
3/21/2008
150
Jennifer
INFO
49000
7373
110
C#
10/23/2014
190
Investmts.
2/1/2008
150
Java
3/11/2006
150
Oracle
5/19/2007
Do any of the non-key attributes
(Name, Dept, or Salary)
determine the phone number?
__________  DeptPhoneExt
This one is already in 3NF since it
has only has one non-key attribute!
14
3NF
 NOW, all non-primary key attributes do not depend on each other
Employee
EmpNo
Name
EmpCourse
Dept
Salary
EmpNo
Course
Dept
Date
Completed
Dept
Dept
Phone Ext.
100
John
MKTG
42000
100
SPSS
6/19/2007
MKTG
5325
140
Sue
ACCT
41000
100
Surveys
11/3/2011
ACCT
4422
110
Bob
INFO
70000
140
Tax Acc
12/1/2005
INFO
7373
190
Alex
FINA
65000
110
Java
3/21/2008
FINA
4477
150
Jennifer
INFO
49000
110
C#
10/23/2014
190
Investmts.
2/1/2008
150
Java
3/11/2006
150
Oracle
5/19/2007
Employee(EmpNo, Name, Dept, Salary)
EmpCourse(EmpNo, Course, Date Completed
Department(Dept, DeptPhoneExt)
15
Using Normalization we learned…
 that to represent the data on this report in a relational database, we will
need to create 3 separate tables.
EmpNo
Name
Dept
Salary
100
John
MKTG
42000
140
Sue
ACCT
41000
110
Bob
INFO
70000
190
Alex
FINA
65000
150
EmpNo INFO Course
Jennifer
49000

100
SPSS
6/19/2007
100
Surveys
11/3/2011
140
Tax Acc
12/1/2005
110
Java
3/21/2008
110
C#
10/23/2014
190
Dept
MKTG
Date
Completed
Investmts.
Dept
150
Java
Phone Ext.
150
Oracle
5325
ACCT
4422
INFO
7373
FINA
4477
2/1/2008
3/11/2006
5/19/2007
16
Here's our solution!
0..*
EmpCourse
EmpNo
PK
Course
DateCompleted
1..1
Employee
EmpNo - PK
Name
DeptCode
Salary
1..*
What are the foreign keys?
1..1
Dept
DeptCode - PK
DeptPhoneExt
17
Integrity Controls
Input Controls
 prevent erroneous data
− Value limit controls
− Completeness controls
− Data validation controls
− Field combination controls
Output Controls
 accurate, current, and
complete
− Physical access to output
devices
− Discarded output
− Completeness
• pg 1 of 10
− Currency
• date & timestamp
18
Integrity Controls
Access Controls
 SIUD; Views
Complex update controls
 Handling multi-user updates – DBMS handles
Protection from hardware failures
 Backup & Recovery
 Redundancy
Transaction Logging
 2 benefits
19
Integrity Controls
Fraud Prevention
 Fraud Triangle
Techniques to  risk
 Separation of duties
 Records & Audit trails
 Monitoring
− Unusual transactions
 Asset control
− Limit physical access
 Security
20
Normalization Practice
Order
No
Cust
No
Name
Addr
City
St
Zip
Order
Date
61384
1273
Cont. Designs
123 Oak
Austin
TX
28384
11/04/13
61384
1273
Cont. Designs
123 Oak
Austin
TX
28384
61384
1273
Cont. Designs
123 Oak
Austin
TX
62890
3891
J Consultants
523 Pine
Waco
62890
3891
J Consultants
523 Pine
63129
1273
Cont. Designs
123 Oak



Promised
Date
Prod
No
Desc
Qty
Ord
Unit
Price
11/21/13
M128
Bookcase
4
200
11/04/13
11/21/13
B381
Cabinet
2
150
28384
11/04/13
11/21/13
R210
Table
1
500
TX
76712
11/15/13
11/21/13
A891
Chair
2
300
Waco
TX
76712
11/15/13
11/21/13
M128
Bookcase
8
200
Austin
TX
28384
12/10/13
12/29/13
A891
Chair
6
300
It’s already in 1NF …but you need to understand why!
Convert this to 2NF. How many tables do you now have?
Convert this to 3NF. How many tables do you now have?
21
Normalization Practice
Appointment Table
Appt
No
1
2
3
4
5
6
7
8
9
10


Appt
Date
12/1/2014
12/1/2014
12/1/2014
12/2/2014
12/2/2014
12/2/2014
12/2/2014
12/2/2014
12/3/2014
12/3/2014
Appt
Planned
Appt
Time
Duration
Type
3:00 AM
1.00
Physical
3:00 AM
0.25
Shot
3:15 AM
0.50
Flu
10:00 AM
0.50
Migraine
10:15 AM
0.25
Shot
10:30 AM
0.25
Shot
10:45 AM
0.50
Flu
11:00 AM
1.00
Physical
10:30 AM
1.00
Physical
9:00 AM
0.50
Migraine
Patient
ID
466927
456789
194756
329657
987453
384788
438754
345875
466927
345875
First
Nm
Lisa
Sue
Brandon
Marcus
Mike
Tonya
Iliana
Carla
Lisa
Carla
Last
Nm
Garcia
Carey
Pierre
Schwartz
Jones
Johnson
Hnatt
Basich
Garcia
Basich
Phone
562-3456
432-1234
432-7877
239-5502
456-0202
432-8806
823-4303
857-5566
562-3456
857-5666
Doctor
ID
C678
A528
S626
A528
G123
S626
C678
A528
C678
C678
Doctor
Nm
Chapman
Lopez
Smith
Lopez
Gray
Smith
Chapman
Lopez
Chapman
Chapman
It’s already in 1NF, and 2NF …but you need to understand why!
Convert this to 3NF. How many tables do you now have?
22