Itec 3220 - Department of Mathematics and Statistics

Download Report

Transcript Itec 3220 - Department of Mathematics and Statistics

ITEC 3220M
Using and Designing Database Systems
Instructor: Prof. Z. Yang
Course Website:
http://people.yorku.ca/~zyang/itec
3220m.htm
Office: Tel 3049
Supertypes and Subtypes
• Generalization hierarchy: depicts
relationships between higher-level supertype
and lower-level subtype entities
• Supertype: contains the shared attributes
• Subtype: contains the unique attributes
• Inheritance:
– Subtype entities inherit values of all attributes of
the supertype
– An instance of a subtype is also an instance of
the supertype
2
Supertypes and Subtypes (Cont’d)
Supertype/
subtype
relationships
Attributes shared by
all entities
General
entity type
SUPERTYPE
And so forth
SUBTYPE1
Attributes unique
to subtype1
SUBTYPE2
Specialized
version of
supertype
Attributes unique
to subtype2
3
Supertypes and Subtypes (Cont’d)
• Disjoint relationships
– Unique subtypes
– Non-overlapping
– Indicated with a ‘G’
• Overlapping subtypes
– An instance of the supertype could be more
than one of the subtypes
– Indicated with a ‘Gs’
4
Generalization Hierarchy with
Overlapping Subtypes
5
Chapter 5
Logical Database Design and
Normalization of Database Tables
In this chapter, you will learn:
• How to transform ERD into relations
• What normalization is and what role it plays in database
design
• About the normal forms 1NF, 2NF, 3NF, BCNF, and 4NF
• How normal forms can be transformed from lower normal
forms to higher normal forms
• Normalization and E-R modeling are used concurrently to
produce a good database design
• Some situations require denormalization to generate
information efficiently
7
Transforming ERD into Relations
• Step one: Map regular entities
– Each regular entity type in an ER diagram is
transformed into a relation
– The name given to the relation is generally the
same as the entity type
– Each simple attribute of the entity type becomes
an attribute of the relation and the identifier of
entity becomes the primary key of the
corresponding relation
8
Example
STUDENT
Student_ID Student_Name Other_Attributes
9
Transforming ERD into Relations
(Cont’d)
• Step two: Map weak entities
– Create a new relation and include all of the
simple attributes as the attributes of this
relation. Then include the primary key of
the identifying relation as a foreign key
attribute in this new relation. The primary
key of the new relation is the combination
of this primary key of the identifying
relation and the partial identifier of the
weak entity type.
10
Example
Employee_ID
Date_of_birth
Employee_Name
Dependent_Name
EMPLOYEE
Has
DEPENDENT
Gender
EMPLOYEE
Employee_ID
Employee_Name
DEPENDENT
Dependent_Name
Employee_ID
Date_of_birth
Gender
11
Transforming ERD into Relations
(Cont’d)
• Step three: Map binary relationship
– Map Binary one-to-many relations
•First create a relation for each of the two entity
types participating in the relationship, using the
procedure described in step one.
•Next, include the primary key attribute of the
entity on the one-side of the relationship as a
foreign key in the relation that is on the many-side
of the relationship.
12
Example
Customer_ID
Customer_Name
Order_ID
1
Order_Date
M
Customer
Submits
(0,N)
Order
(1,1)
Customer
Customer_ID
Customer_Name
Order
Order_ID
Order_Date
Customer_ID
13
Transforming ERD into Relations
(Cont’d)
• Step three: Map binary relationship
(Cont’d)
– Map binary one-to-one relationships
•First, two relationships are created one for each
of the participating entity types.
•Second, the primary key of one of the relations is
included as a foreign key in the other relation.
14
Example
Nurse_ID
Nurse_Name
Centre_Name
1
1
In_charge
Nurse
Location
(0,1)
(1,1)
Nurse
Nurse_ID
Nurse_Name
Care
Centre
Centre_Name
Location
Care
Centre
Nurse_in_c
harge
15
Transforming ERD into Relations
(Cont’d)
• Step Four: Map composite Entities
– First step
• Create three relations: one for each of the two
participating entities, and the third for the composite
entity. We refer to the relation formed from the
composite entity as the composite relation
– Second step
• Identifier not assigned: The default primary key for the
composite relation consists of the two primary key
attributes from the other two relations.
• Identifier assigned: The primary key for the composite
relation is the identifier. The primary keys for the two
participating entity types are included as foreign keys in
the composite relation.
16
Example
Order_ID
Order
Order_Date
1
(1,N)
(1,1)
M
Order
Line
Product_ID
M
(1,1)
1
(0,N)
Product
Description
Quantity
Price
17
Example
Order
Order_ID
Order_Date
Order Line
Product_ID
Order_ID
Quantity
Description
Standard_Price
Product
Product_ID
18
Example
Customer_ID
Vendor_ID
Name
Customer
1
Date
M
Shipment
Shipment_No
Customer Customer_ID
Vendor_ID
M 1
Vendor
Amount
Customer_Name
Shipment Shipment_N Vendor_I
o
D
Vendor
Address
Address
Customer_I Date
D
Amount
19
Transforming ERD into Relations
(Cont’d)
• Step Five: Map unary relationship
– Map unary one-to-may relationship
•The entity type in the unary relationship is
mapped to a relation using the procedure
described in Step one. Then a foreign key
attribute is added within the same relation that
references the primary key values. A recursive
foreign key is a foreign key in a relation that
references the primary key values of that same
relation.
20
Example
Name
Employee_ID
Employee
Birthdate
M (1,1)
1 (0,N)
Employee
Employee_ID
Manages
Name
Birthdate
Manager_ID
21
Transforming ERD into Relations
(Cont’d)
• Step six: Map ternary relationship
– Convert a ternary relationship to a composite
entity
– To map a composite entity that links three regular
entities, we create a new composite relation. The
default primary key of their relation consists of
the three primary key attributes for the
participating entities. Any attributes of the
composite entity become attributes of the new
relation
22
Example
Patient_ID
Patient_Name
Patient
1
M
Results
(0,N)
(1,1)
Physician_ID
(0,N)
(1,1) M
Physician_Name
Physician
1
Date
Patient
Treatment
Time
Treatment_
Code
M
(1,1)
1
(0,N)
Treatment
Description
23
Example
Patient
Patient_ID
Patient_Name
Physician
Physician_ID
Physician_Name
Patient Treatment
Patient_ Physician_I Treatment_C Date
ID
D
ode
Treatment
Treatment_Code
Time
Result
Description
24
Transforming ERD into Relations
(Cont’d)
• Step seven: Map supertype/subtype
relationships
1. Create a separate relation for the supertype and
for each of its subtype
2. Assign to the relation created for the supertype
the attributes that are common to all members
of the supertype, including the primary key
3. Assign to the relation for each subtype the
primary key of the supertype, and only those
attributes that are unique to that subtype
4. Assign one attribute of the supertype to function
as the subtype discriminator
25
Example
Address
Employee_Name
Employee_Number
Employee_Type
Employee
Date_Hired
Gs
Hourly
Employee
Hourly_Rate
Salaried
Employee
Annual_Salary
Stock_Option
26
Example
Employee
Employee_Nu Employee_N Address
mber
ame
Hourly_Employee
H_Employee_Number
Employee_Ty Date_Hi
pe
red
Hourly_Rate
Salaried_Employee
S_Employee_Number
Annual_Salary
Stock_Option
27
Database Tables and
Normalization
• Table is the basic building block in database design
• Normalization is the process for assigning attributes
to entities
– Reduces data redundancies
– Helps eliminate data anomalies
– Produces controlled redundancies to link tables
• Normalization stages
–
–
–
–
1NF - First normal form
2NF - Second normal form
3NF - Third normal form
4NF - Fourth normal form
28
Need for Normalization
29
Anomalies In the Table
• PRO_NUM intended to be primary key
• Table displays data anomalies
– Update
• Modifying JOB_CLASS
– Insertion
• New employee must be assigned project
– Deletion
• If employee deleted, other vital data lost
30
Conversion to First Normal Form
• Repeating group
– Derives its name from the fact that a group of
multiple entries of same type can exist for any
single key attribute occurrence
• Relational table must not contain repeating
groups
• Normalizing table structure will reduce data
redundancies
• Normalization is three-step procedure
31
Conversion to First Normal Form
(continued)
• Step 1: Eliminate the Repeating Groups
– Present data in tabular format, where each
cell has single value and there are no
repeating groups
– Eliminate repeating groups, eliminate nulls
by making sure that each repeating group
attribute contains an appropriate data
value
32
Conversion to First Normal Form
(continued)
33
Conversion to First Normal Form
(continued)
• Step 2: Identify the Primary Key
– Primary key must uniquely identify attribute
value
– New key must be composed
34
Conversion to First Normal Form
(continued)
• Step 3: Identify all dependencies
– Dependencies can be depicted with help of
a diagram
– Dependency diagram:
•Depicts all dependencies found within given
table structure
•Helpful in getting bird’s-eye view of all
relationships among table’s attributes
•Makes it less likely that will overlook an
important dependency
35
Conversion to First Normal Form
(continued)
36
Conversion to First Normal Form
(continued)
• First normal form describes tabular format in
which:
– All key attributes are defined
– There are no repeating groups in the table
– All attributes are dependent on primary key
• All relational tables satisfy 1NF requirements
• Some tables contain partial dependencies
– Dependencies based on only part of the primary key
– Sometimes used for performance reasons, but should
be used with caution
– Still subject to data redundancies
37
Conversion to Second Normal
Form
• Relational database design can be
improved by converting the database
into second normal form (2NF)
• Two steps
38
Conversion to Second Normal
Form (continued)
• Step 1: Write Each Key Component
on a Separate Line
– Write each key component on separate
line, then write original (composite) key on
last line
– Each component will become key in new
table
39
Conversion to Second Normal
Form (continued)
• Step 2: Assign Corresponding
Dependent Attributes
– Determine those attributes that are
dependent on other attributes
– At this point, most anomalies have been
eliminated
40
Conversion to Second Normal Form
(continued)
41
Conversion to Second Normal
Form (continued)
• Table is in second normal form (2NF)
when:
– It is in 1NF and
– It includes no partial dependencies:
•No attribute is dependent on only portion of
primary key
42
Conversion to Third Normal Form
• Data anomalies created are easily
eliminated by completing three steps
• Step 1: Identify Each New Determinant
– For every transitive dependency, write its
determinant as PK for new table
•Determinant
– Any attribute whose value determines other values
within a row
43
Conversion to Third Normal Form
(continued)
• Step 2: Identify the Dependent
Attributes
– Identify attributes dependent on each
determinant identified in Step 1 and
identify dependency
– Name table to reflect its contents and
function
44
Conversion to Third Normal Form
(continued)
• Step 3: Remove the Dependent Attributes from
Transitive Dependencies
– Eliminate all dependent attributes in transitive
relationship(s) from each of the tables that have
such a transitive relationship
– Draw new dependency diagram to show all tables
defined in Steps 1–3
– Check new tables as well as tables modified in
Step 3 to make sure that each table has
determinant and that no table contains
inappropriate dependencies
45
Conversion to Third Normal Form
(continued)
46
Conversion to Third Normal Form
(continued)
• A table is in third normal form (3NF)
when both of the following are true:
– It is in 2NF
– It contains no transitive dependencies
47
The Boyce-Codd Normal Form
(BCNF)
• Every determinant in table is a candidate key
– Has same characteristics as primary key, but for some
reason, not chosen to be primary key
• When table contains only one candidate key, the
3NF and the BCNF are equivalent
• BCNF can be violated only when table contains
more than one candidate key
48
The Boyce-Codd Normal Form
(BCNF) (continued)
• Most designers consider the BCNF as
special case of 3NF
• Table is in 3NF when it is in 2NF and
there are no transitive dependencies
• Table can be in 3NF and fails to meet
BCNF
– No partial dependencies, nor does it
contain transitive dependencies
– A nonkey attribute is the determinant of a
key attribute
49
The Boyce-Codd Normal Form
(BCNF) (continued)
50
The Boyce-Codd Normal Form
(BCNF) (continued)
51
An Example
GRADE( Student_ID, Student_Name, Address, Major,
Course_ID, Course_Title, Instructor_Name,
Instructor_Office, Grade)
52
Normalization and Database
Design
• Normalization should be part of the design
process
• E-R Diagram provides macro view
• Normalization provides micro view of entities
– Focuses on characteristics of specific entities
– May yield additional entities
• Difficult to separate normalization from E-R
diagramming
• Business rules must be determined
53
Higher-Level Normal Forms
• Fourth Normal Form (4NF)
– Table is in 3NF
– Has no multiple sets of multivalued
dependencies
54
Conversion to 4NF
Stud-ID
Course
Service
Stud-ID
Course
1126
1212F
Red Cross
1126
1212F
1126
1620F
United Way
1126
1620F
1126
1320F
1126
1320F
Stud-ID
Service
1126
Red Cross
1126
United Way
Stud-ID
Course
1126
1212F
1126
1620F
1126
1320F
Service
1126
Red Cross
1126
United Way
Stud-ID
Course
Service
1126
1212F
Red Cross
1126
1620F
United Way
1126
1320F
Multivalued Dependencies
Set of Tables in 4NF
55
Denormalization
• Normalization is one of many database
design goals
• Normalized table requires
– Additional processing
– Loss of system speed
• Normalization purity is difficult to sustain
due to conflict in:
– Design efficiency
– Information requirements
– Processing
56
Exercise
Part Supplier Data
Part_
Description
No
1234
Logic Chips
5678
Memory Chips
Vendor_Nam
e
Fast Chips
Smart Chips
Address
Unit_C
ost
Cupertino 10.00
Phoenix
8.00
Fast Chips
Cupertino
Quality Chips
Austin
Smart Chips
Phoenix
3.00
2.00
5.00
57
Exercise(Cont’d)
• Convert the table to a relation in first normal form
(Named Part Supplier)
• List the functional dependency in the Part Supplier
and identify a candidate key
• For the relation Part Supplier, identify the
followings: an insert anomaly, a delete anomaly, and
a modification anomaly.
• Draw a relation schema and show the functional
dependencies
• Develop a set of 3NF relations from Part Supplier
58