Normalisation, Determinancy

Download Report

Transcript Normalisation, Determinancy

Database Design
Normalization
Dr. Xiaogang Peng
College of Computer Science and Software Engineering
Shenzhen University
In this class you will get to know:
What’s the problem related to DB design
(Anomalies)
 How to Deal with it (Normal Form)
 How does it relate to last class( Functional
Dependency and Relational DB theory )
 Online Material:
 http://www.Drpeng.org

Content of this Class:
Normalization concept and the process
 First normal form (the foundation!)

 Identify
Multi values VS nesting relationship
 Decompose the table
 Finding primary keys
 Set up a foundation for other Normal form
The problem
Considering the relation:
 {Sno,sdept,mn,courseNo,grade}
 Some problem exists:

 Change
the name of the manager in a dept.
 Insert new classes in a new course
 Delete the student information will cause the
course information loss.
 Anomalies
Normalisation
Defn:
Normalisation is:
a technique for producing a set of
relations with desirable properties,
given the data requirements of an
enterprise.
•
The purpose of “The properties” :
•Removing redundancy
•By table decomposition
• Normalization Level:
•1st, 2nd, 3rd, Boyce-Codd, 4th and
5th Normal Forms
•Strictness trend: 1NF->5NF
First Normal Form
Defn:
A relation is in First Normal Form
if every non-primary-key attribute
is functionally dependent on the
primary key.
Analysis :

What does “functionally
dependent
on the primary key” mean?
1)
no multi-value cells.
2)no Nesting Relation
Question:
Is the following table in 1NF?
supplierNo
supplierName partNo
S5
Wells
P1
S2
Heath
P1, P4
S7
Barron
P6
S9
Edwards
P8, P2, P6
How about this one?
supplierNo
S5
S2
S2
S7
S9
S9
S9
supplierName
Wells
Heath
Heath
Barron
Edwards
Edwards
Edwards
partNo
P1
P1
P4
P6
P8,
P2
P6
•A relation that is not in 1NF
effectively embeds other relations
within it.
•The previous relation can be
represented as:
SupplierPart(supplierNo,
supplierName, Part(partNo))
Solution:

1)Form new relations for each nonatomic
attribute or nested relation along with the
primary key of the original table. (Why?)
2)Choose the Primary key for the newly
generated Relation.(How?)
 Simple
way: by enterprise rules and we get the FDs
 Complicated case: by Relational DB Theory : Key
Finding.
 Example:

SupplierPart(supplierNo,supplierName,
Part(partNo)).
 Solution!
SupplierPart(supplierNo,supplierName)
 SupplierPart(supplierNo,partNo)

suppli
erNo
S5
S2
S2
S7
S9
S9
S9
supplierN
ame
Wells
Heath
Heath
Barron
Edwards
Edwards
Edwards
suppli
erNo
S5
S2
S2
S7
S9
S9
S9
partNo
P1
P1
P4
P6
P8,
P2
P6
Any other solutions?





SupplierPart(supplierNo,supplierName,
Part(partNo))
Hint!
supplierNosupplierName
So we can replace SupplierNo with
supplierName
Solution:
 SupplierPart(supplierNo,supplierName)
 SupplierPart(supplierName,partNo)
Try to normalise the following to 1NF
Customer number
Customer name
Customer address
Order(
Order no
Order date
Order value
Order item(
Item code
Item quantity
Item unit price ))
Cust No
Name
Address
Why we don’t use
the Order-No and
Cust-no as the
primary key ???
Order No
Cust no
Order date
Order value
Item Code
Order No
Qty
Unit Price
Another Example:
A big Table with Real World Information
Branch (branchNo, branchAddress customerNo,
customerAddress, creditCode, creditLimit,
itemNo, itemDesc, price)
 steps:

 Analyze
the nesting relation
 Decompose the table according to the solution
Find the nesting relation!
Customer_sale
branch branch custN
No
Address o
Cust
credit
Address Code
credit
Limit
itemNo Item
Desc
price
B1
B3
B1
B2
B1
CustomerSale(branchNo,
branchAddress
Customer(customerNo,
customerAddress, creditCode,
creditLimit,Item(itemNo, itemDesc,
price)))
PR
C4
3 Preston St.
CR1
150
147
Nails
0.49
MAN
C6
1 Manchester
Rd.
CR1
150
142
Pins
0.62
PR
C4
3 Preston
St.
CR1
150
144
Cups
3.27
LIV
C19
7 Liverpool
Rd.
CR2
200
147
Nails
0.49
PR
C20
5 Preston St.
CR2
200
150
Nails
0.69
And now:
Branch (branchNo, branchAddress)
Customer(customerNo, customerAddress,
creditCode, creditLimit, branchNo)
CustomerItem (customerNo, itemNo, itemDesc,
price)
What assumption that I have made in deciding
primary key for the Customer table?
What is the case that Customer and branchNo are
chosen as the primary key?
Second Normal Form
Defn:
A relation is in Second Normal Form if it is
in First Normal Form and every non-key
attribute is fully functionally dependent on
any key.
Full functional dependency
B is fully functionally dependent on A if B is
functionally depend on A, but not on any
proper subset of A.
2NF, therefore, applies to a relation with
a composite primary key.
Consider the following relation
Result(studentID, moduleCode,
moduleTitle, mark)
Is it in 1NF?
Is it in 2NF?
solution
Set up a new relation for each partial key with
its dependent attribute(s). Keep a relation with
the original primary key and any attributes that
are fully functionally dependent on it.
Result(studentID, moduleCode, moduleTitle,
mark)=>
Result(studentID, moduleCode, mark) ;
Module(moduleCode, moduleTitle)

Try to normalise the following to 2NF
Item Code
Description
Qty in stock
Minimum Qty
Item Code
Order No
Order date
Order qty
Custno
Cust Name
Cust Address
Item Code
Supplier Code
Name
Address
Date of last order
assumptions:
1)order no uniquely identifies a date
and a customer;
2)the order qty is the qty of an item
on an order;
3)the supplier code uniquely
identifies a supplier and the date of
last order;
Item code
Description
Qty in stock
Minimum Qty
Item Code
Order no
Order qty
Order no
Order date
Custno
Name
Address
Item code
Supplier Code
Supplier Code
Name
Address
Date of last order
Third Normal Form
Defn:
A relation is in Third Normal Form if it is in
Second Normal Form, and no non-key
attribute is transitively dependent on any key.
Transitive dependency
If A
B and B
C, then C is transitively
dependent on A (provided that A is not
functionally dependent on B or C).
Consider the following relation
Customer(customerID, name, address,
creditCode, creditLimit)
Assume that one credit code can apply
to several customers. But one
customer can have only one credit
code
Is it in 1NF?
Is it in 2NF?
Is it in 3NF?
Solution
Set up a relation that includes the nonkey
attributes(s) that functionally determine(s)
other nonkey attributes(s).
Customer(customerID, name, address,
creditCode, creditLimit)=>
Customer(customerID, name, address,
creditCode); Credit(creditCode,
creditLimit)

Example
Example: StudentInfo(SSNo, Name, Major,
Dept, DeptChair), SSNo-> Name,Major;
Major -> Dept; Dept -> DeptChair
 Example: R(A, B, C); FDs: A->B, B->C

Solutions
Example: StudentInfo(SSNo, Name, Major,
Dept, DeptChair),
 Solution: StudentInfo(SSNo, Name, Major),
MajDept(Major, Dept), DeptInfo(Dept,
DeptChair)
 Example: R(A, B, C); FDs: A->B, B->C
 Solution: R1(A, B); R2(B, C);

Boyce-Codd Normal Form
Defn:
A relation is in BCNF if every
determinant is a candidate key.
BCNF applies to a relation which has
more than one candidate key, and
the candidate keys are composite
and overlapping.
Consider the following relation
SSL(student, subject, lecturer)





SSL(student, subject, lecturer)
Assume that: each teacher teach only one
subject. A subject has several teachers to teach,
{student, subject}  lecturer;
lecturer  subject; student,lecturer  subject
Is it in bcnf?
solution
Decompose the relation r(As) using the
functional dependency X -> Y (X is not a
superkey) as follows: r1(all As, but remove
Y), r2(XY).
 SSL( student, subject, lecturer)=>
 SS( student, lecturer);
 SL( subject, lecturer);

Supplier(supplierNo, productCode,
supplierName, quantity)
Where supplierNo SupplierName
Is it in 1NF?
Is it in 2NF?
Is it in 3NF?
Is it in BCNF?
supplierNo
productCode
quantity
supplierNo
supplierName
OR
supplierName
productCode
quantity
supplierName
supplierNo