The Relational Model and Normalization (2)

Download Report

Transcript The Relational Model and Normalization (2)

The Relational
Model and
Normalization (2)
IS 240 – Database Management
Lecture #8 2004-02-17
Prof. M. E. Kabay, PhD, CISSP
Norwich University
[email protected]
Topics
 Review of Modification Anomalies
 Essence of Normalization
 Classes of Relations
 1st – 5th Normal Forms
 2nd Normal Form
 3rd Normal Form
 Boyce-Codd Normal Form (BCNF)
 4th Normal Form
 Homework
2
Copyright © 2004 M. E. Kabay. All rights reserved.
Review of Modification
Anomalies
Member
Book
1:N
Member_ID
Book_ID
Member_Name
Book_Name
Member_Address
Book_ISBN
Member_Etc
Date_Borrowed
Member_ID
Date_Returned
Book_Location
3
Copyright © 2004 M. E. Kabay. All rights reserved.
What is the
Member_ID if
the book isn’t
out?
How can we
delete a
member if there
are still records
showing the
books (s)he
borrowed and
returned?
Essence of Normalization
 We run into problems when an entity tries to
include attributes that belong to different
relations.
*“Problems occur when a relation contains
facts about two different themes.”
 E.g.,
Storing information describing a book
along with information about when it was
borrowed
Grouping attributes about a member along
with information about what (s)he
borrowed
4
*Kroenke p. 127
Copyright © 2004 M. E. Kabay. All rights reserved.
Normalization (cont’d)
 When we try to group information about
disparate entities into a single relation, we
generate modification anomalies
 Solution is to break up a relation into
separate relations that don’t mix themes
 E.g., store
information about the member,
information about the book, and
information about each borrowing
in separate relations
5
Copyright © 2004 M. E. Kabay. All rights reserved.
Classes of Relations
 1970s: Theorists analyzed anomalies
 Developed names for different types of
problems
 Named relations as normal forms
1NF = first normal form
2NF = second normal form
3NF = third normal form
BCNF = Boyce-Codd normal form
4NF = fourth normal form
5NF = fifth normal form (not used)
6
Copyright © 2004 M. E. Kabay. All rights reserved.
1NF – First Normal Form
 All the relations that we have defined are 1NF:
 Each cell contains a single value (no lists, tables,
arrays)
 All instances of an attribute (field, column) must be
instances of the same quality
 Every attribute (field, column) is uniquely identified
(same name in all tuples (records, rows)
 Every tuple (record, row) is unique
 Order of attributes and tuples is arbitrary – many
designs are functionally equivalent
 A group of one or more attributes (fields, columns) that
uniquely identifies a tuple (records, row) is called a key
 Every relation has at least one key
 No record (tuple, row) may duplicate another
7
Copyright © 2004 M. E. Kabay. All rights reserved.
1st – 5th Normal Forms
 All relations as we have defined them are
always in first normal form (1NF)
 Other normal forms are subsets of 1NF
 When we split a relation that is in 1NF we
generate some other NF
8
Copyright © 2004 M. E. Kabay. All rights reserved.
Classes of Relations (cont’d)
1NF
2NF
Boyce-Codd
3NF
BCNF
Anomalies
Anomalies
Anomalies
Domain/Key
4NF
5NF
DK/NF
9
Anomalies
Anomalies
Anomalies
That’s all,
folks!
Copyright © 2004 M. E. Kabay. All rights reserved.
1st Normal Form Anomalies
LIBRARY
Book_ID
Book_Name
Member_ID
Member_Name
Date_Borrowed
Date_Returned
Book_Location
ANOMALIES:
Member has no books out?
No one has borrowed a book?
Book_ID  (Book_Name, Book_Location)
Member_ID  Member_Name
(Book_ID, Member_ID)  Date_Borrowed, Date_Returned
10
Copyright © 2004 M. E. Kabay. All rights reserved.
2NF
 A relation is in 2NF if all its nonkey attributes
are dependent on the entire key.
BOOK
Book_ID
Book_Name
Book_Location
MEMBER
Member_ID
Member_Name
Keys
in
RED
BORROWING
Book_ID
Member_ID
Date_Borrowed
Date_Returned
11
Copyright © 2004 M. E. Kabay. All rights reserved.
Only one
2nd Normal Form (cont’d)major
per
MAJOR_INFO
Student_ID
Major_Name
Director_ID
student
Only one
director
per major
Major_Name  Director_ID
Student_ID  Major_Name  Director_ID
Transitive dependency
ANOMALIES:
How will we know who is running a major
before a student registers for that major?
How do we keep information about who’s running a major
if the last student registered withdraws from that major?
How many records will we have to update
if we change the director of a major?
12
Copyright © 2004 M. E. Kabay. All rights reserved.
3NF
STUDENT_MAJOR
Student_ID
Major_Name
MAJOR_DIRECTOR
Major_Name
Director_ID
 “A relation is in 3NF if it is in 2NF
and has no transitive dependencies.”
13
Copyright © 2004 M. E. Kabay. All rights reserved.
3NF Anomalies
PATIENT
3NF
Patient
Ward
Nurse
50
Cardiac
Able
55
ICU
Baker
70
Pulmonary
Charlie
90
ICU
Douglas
110
Cardiac
Elkins
110
Cardiac
Able
Key: (Patient, Nurse)
FD*: Nurse  Ward
Nurse is a determinant yet it is not a
possible (candidate) key
Anomaly: removing patient 110 will remove
fact that Elkins works in the Cardiac ward
14
*FD = Functional Dependency
Copyright © 2004 M. E. Kabay. All rights reserved.
Boyce-Codd Normal Form
(BCNF)
Patient
Nurse
Ward
Patient Nurse
Ward
Nurse
50
Able
Cardiac
Able
55
Baker
ICU
Baker
70
Charlie
Pulmonary Charlie
90
Douglas
ICU
Douglas
110
Elkins
Cardiac
Elkins
110
Able
 A relation is in BCNF if every determinant is a
candidate key.
15
Copyright © 2004 M. E. Kabay. All rights reserved.
Multi-Value Dependencies
 Suppose a patient can have several diseases
Patient   Disease
 The patient can also have several insurers
Patient   Insurer
 Storing complete information about patients
would require all combinations of patient,
disease and insurer; e.g.,
Patient_ID
A
A
A
A
16
Disease
Liver
Liver
Heart
Heart
Copyright © 2004 M. E. Kabay. All rights reserved.
Insurer
Prudential
Liberty
Prudential
Liberty
4th Normal Form
Patient
Disease
Patient
 A relation is in 4NF if it is in BCNF
and has no multi-value dependencies.
17
Copyright © 2004 M. E. Kabay. All rights reserved.
Insurer
Homework
 Review Chapter 5 material thoroughly
 REQUIRED: by Tuesday 24 Feb 2004
For 7 points, complete Group 1 questions
5.11 through 5.17
Study to the end of Chapter 5 and then
review the entire chapter carefully
 By next Tuesday the 24th,
Work on Ferrett et al. to complete up to and
including Project #6 in time for your midterm exam
 OPTIONAL: by THURSDAY THE 26TH FEB,
complete Ch 5 FiredUp Project questions A
through H for an extra 8 extra points.
18
Copyright © 2004 M. E. Kabay. All rights reserved.
DISCUSSION
(Reminder:
Mid-term Exam
th
Thu 26 )
19
Copyright © 2004 M. E. Kabay. All rights reserved.