Normal Forms

Download Report

Transcript Normal Forms

Introduction to Schema
Refinement
Normal Forms
• Types of Normal Form
•
•
•
•
•
•
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form(3NF)
Boyce-Codd Normal Form ( BCNF)
Fourth Normal Form (4NF)
Fifth Nornal Form (5NF)
Normal Forms
• First Normal Form (1NF)
• A row of data cannot contain repeating group
of data
Cid
name
Subject
• Ie atomic value
101
Jeet
PHY
101
102
103
•
•
Jeet
Seet
Swet
CHE
PHY
SOCIAL
Here the student Jeet is used twice in the table and subject PHY
is repeated
Another method is to divide the relation into 2
Cid
101
102
Cname
Jeet
Seet
Subid
1
2
3
4
Cid
101
101
102
103
Subject
PHY
CHE
PHY
SOCIAL
Normal Forms
• Second normal form (2NF):
• A relation that is in 1NF and every non-primary
key attribute is fully functionally dependent on
the primary key.
• Does not permit partial dependency
• No attribute is dependent on only to primary
key
• Primary key consists of only a single attribute it
is automatically be in 2NF if it is in 1NF
Normal Forms
• Second normal form (2NF):
•
•
•
•
•
•
•
•
Consider the following relation, not in 2NF
Cid
101
101
102
103
Cname
Adam
Adam
Alex
Sumo
Order_id Order_name Sale_details
10
Order1
Sale1
11
order2
sale2
12
Order3
Sale3
13
Order4
Sale4
Here Cid & Order_id is PK
It is in 1NF
Not in 2NF, there are partial dependencies of columns on Primary key
Cname is only dependent on Cid
Order_name is dependent on order_id
There is no link between Cname & Sale_details
To reduce this table into 2NF, break the table into 3 different tables
Cid
101
102
103
Cname
Adam
Alex
Sumo
Order_id
10
11
12
13
Order_name
Order1
order2
Order3
Order4
Cid
Order_id
101
101
102
103
10
11
12
13
Sale_detail
s
Sale1
sale2
Sale3
Sale4
•
•
•
•
•
•
•
•
Normal Forms
Second normal form (2NF):
1NF  2NF
Order
s1
s1
s2
s3
s3
Product
p1
p2
p3
p5
p1
Customer
Anish
Anish
Ammu
Sumo
Sumo
Address
bhopal
bhopal
kollam
kottayam
kottayam
Qty
300
100
500
200
450
Unit Price
500
600
300
200
500
This relation is in 1NF, because the value of each
domain are atomic.
To convert into 2NF
First find the attribute that make primary key
No one attribute alone form a primary key
2NF does not contain partial dependency
Divide the above relation into 3
Normal Forms
•
R1
Order
s1
s2
s3
Customer
Anish
Ammu
Sumo
R2
Address
bhopal
kollam
kottayam
Product
p1
p2
p3
p5
Unit Price
500
600
300
200
R3
Order
s1
s1
s2
s3
s3
Product
p1
p2
p3
p5
p1
Qty
300
100
500
200
450
• Advantages
• Insert : can insert any row in relation R1
without using product attribute
• Delete : can delete the S1 & P2 by deleting a
row from relation R3, without losing the
information that in R1
• Update: address for a given customer is written
once, it is not repeated many times in relation
R1, so we can update the address only once
Normal Forms
• Third normal form (3NF)
•
3NF is based on the concept of transitive dependency.
• Transitive dependencies are not allowed in 3NF.
• Transitive dependency means, if in a relation if XY and YZ
hold, then X Z is also a functional dependency that holds on
R.
• Here X, Y, Z are attributes of the table and also Y should not be
a candidate key or a subset of any key (prime attribute) of the
table R.
• Ie all non-prime attribute of table must be dependent on
primary key
• Example.
• Student3
•
•
•
•
•
•
•
•
•
•
•
•
•
Normal Forms
Third normal form (3NF)
3 FD’s here. that is
Fd1
Stdidgrade
Fd2
Stdid  marks
Fd3
Marks  grade
We can see that marks is not a prime attribute of student3.
Stdid  grade is a transitive dependency because of Fd2 and Fd3.
This is not allowed in 3NF.
Normal Forms
• Third normal form (3NF)
• A relation R is said to be in 3NF, if R is in 2NF and also no non
prime attribute of R is transitively dependent on the key of R.
• The above relation schema student3 is in 2NF, since there are
no partial dependencies on a key exists.
• But it is not in 3NF because of the transitive dependency stdid
 grade via ‘marks’.
• We can normalize student3 by decomposing it in to two 3NF
relation schemas,
• Student3A and student3B as follows.
•
• Student3A (stdid, branch, sem, rn, name, marks)
• Student3B (marks, grade)
Normal Forms
• Third normal form (3NF)
• Student3A and student3B as follows.
•
• Student3A (stdid, branch, sem, rn, name, marks)
• Student3B (marks, grade)
Normal Forms
•
•
•
•
•
Boyce Codd Normal form (BCNF)
It higher form of 3NF
This is because every relation in BCNF is als ion 3NF.
However a relation in 3NF may not be in BCNF.
A relation schema R is in BCNF if whenever a
non trivial functional dependency X  A holds in R,
then X is a superkey of R.
• The only difference between BCNF and 3NF is that
the condition (b) of 3NF is absent from BCNF.
•
Normal Forms
• Boyce Codd Normal form (BCNF)
•
Here we can see that the relation Lots1A is not in BCNF, but
it is in 3NF.
• FD5 violates BCNF because area is not a superkey.
• Fd1 and Fd2 satisfies BCNF because the LHS are super keys.
• So remove the attribute (county name) and place it in another
relation.
Normal Forms
• Boyce Codd Normal form (BCNF)
•
•
•
•
•
Normal Forms
Fourth Normal Form (4NF)
A relation is in 4NF if
It is in BCNF
And
It has no multivalued dependency
Subject
MCA
•
•
•
•
•
•
Computer
Trainer
Aji
Jinson
Lisha
JK
Textbooks
DBMS
JAVA
OS
C++
In this relation
Each subject has a well-defined set of trainers
Eg: MCA subject has 3 trainers
Each subject has a well-defined set of textbooks
Eg: mca subject has 2 textbooks
The textbook that are used for a given subject are
independent of the trainers
Normal Forms
• Fourth Normal Form (4NF)
•
•
The table has been converted to a realtion by filling in
Subject
Trainer
Textbooks
all of empty rows
MCA
AJI
DBMS
The relation is in 1NF
MCA
AJI
JAVA
MCA
MCA
MCA
MCA
Computer
Computer
•
•
Jinson
Jinson
Lisha
Lisha
JK
JK
DBMS
JAVA
DBMS
JAVA
OS
C++
The primary key of this relation consists of all three
attributes
Since there is no determinants other than the primary
key, this relation is actually in BCNF
Normal Forms
• Fourth Normal Form (4NF)
• Suppose for teaching MCA a new trainer
comes, it is necessary to create 2 new tuples ,
one for each of the 2 text books.
• See that it is not necessary to include all faculty
• Decomposition cannot be made on the basis of
functional dependencies, because there are no
functional dependencies in the relation.
• So we introduce multi valued dependencies
(MVDs) in the relation.
•
•
•
•
•
•
•
•
•
•
•
•
•
Normal Forms
Multivalued dependencies and 4NF
Subject  trainer
Subjecttextbook
Double arrows are used here.
Read as
“ subject multidetermines trainer”
or “trainer is multidependent on subject”)
we know that a subject does not have a single corresponding trainer,
ie..
functional dependency subject  trainer does not hold.
But each subject has a well defined set of corresponding trainers.
By well defined here means that for a given subject(MCA) and a given
text book (DBMS) the set of trainers (AJI,Jinson,Lisha) matching the
pair (MCA,Computer) in the relation depends on the value of MCA
alone.
It makes no difference what particular value of text book we choose.
The second MVD can also be interpreted like this.
Normal Forms
• Definition of multi valued dependency
• Let R be a table, and let A, B, C be arbitrary subsets
of the set of attributes of R.
• Then we say that B is multidependent on A ,
AB.
• If and only if the set of B values matching a given ( A
value, C value pair) in R depends only on the A
value and is independent of the C value.
• MVDs always go together in pairs. That is given the
table R (A, B, C), the MVD
• AB holds if and only if A C also holds.
Normal Forms
• Fourth normal form
• This is based on multivalued functional
dependencies.
• A relation schema R is in 4NF with respect to a
set of dependencies F (that includes FDs and
• MVDs) if, for every non trivial multivalued
Subject
Trainer
Textbooks
dependency
MCA
AJI
DBMS
MCA
AJI
JAVA
• X Y, X is a super key of R. MCA Jinson
DBMS
MCA
Jinson
JAVA
• Consider the table.
MCA
Lisha
DBMS
MCA
Computer
Computer
Lisha
JK
JK
JAVA
OS
C++
Normal Forms
• Fourth normal form
• The table or relation CFX is not in fourth normal
form because The MVDs subject  textbook
and subjecttrainer are not satisfying any of
the 2 conditions of fourth normal form.
• So decomposing it into tables
Subject
MCA
MCA
MCA
Computer
Trainer
AJI
Jinson
Lisha
JK
Subject
MCA
MCA
Computer
Computer
Textbooks
DBMS
JAVA
OS
C++
Normal Forms
• Lossless join decomposition
• Consider the example EMP
ename
Smith
Smith
Smith
smith
Pname
X
Y
X
Y
Dname
John
Anna
Anna
John
• Suppose we decompose the EMP table into
Emp_projects and Emp_dependents.
• Emp_projects
Emp_dependents
ename
Smith
Smith
Pname
X
Y
ename
Smith
Smith
Dname
John
Anna
Normal Forms
• Lossless join decomposition
• Suppose we again join these tables we can
see that we get the original EMP table.
• So this decomposition of EMP table in to
Emp_projects and Emp_dependents is a
lossless join decomposition because
nothing is lost after a decomposition.
Normal Forms
• Lossless join decomposition
• Suppose we decompose the supply table in to
two that is R1 and R2.
• We get
•
•
Normal Forms
Lossless join decomposition
If we again join these two tables R1 and R2 we will get
• the join of these tables will not give our original table
supply. So this is a lossy join
• decomposition because after decomposing the Supply
table we have lost some values.
Normal Forms
• Join dependencies and fifth normal form
• In some cases there may be no lossless join
decomposition of a table R into 2 tables but
there may be a lossless join decomposition into
more than 2 tables.
• For example in the supply table
Normal Forms
• Join dependencies and fifth normal form
• If we decompose the supply table in to 3
Normal Forms
• Join dependencies and fifth normal form
• Here we can see that if we again join these
tables R1, R2, R3 we will get the original table.
• We can see that by joining just R1 and R2 will
not get the supply table.
• But by joining all these 3 tables we will get the
supply table
Normal Forms
• Join dependencies and fifth normal form
• So we are moving to another type of dependency called Join
dependency.
• If a join dependency is present in a table we perform decomposition to
fifth normal form(5NF)
• Here for the supply table the join dependency is specified by JD (R1,
R2, R3)
• This is because by joining R1 and R2 and R3 tables we will get the
original table ‘Supply’.
• JD (R1, R2, R3) can also be written as
• JD( (sname, partname), (sname, projname), (partname,projname) )
• We can see that JD( R1, R2) is not valid for the supply table because on
joining R1 and R2 we will not get the Supply table.
• Trivial join dependency
•
For a table R, a join dependency specified as JD(R1, R2, R3…) is
trivial, if any of these Ri’ s is the table R.
Normal Forms
• Fifth normal form
• It is also called project join normal form.
• A relation schema is in fifth normal form (5NF) ,
if for every nontrivial join dependency JD( R1,
R2, R3…), every Ri is a superkey of R.
• example
Normal Forms
• Fifth normal form
• The key of this table is (sname, partname,
projname)
• We have seen that it has a join dependency
• JD { (sname,partname),(sname,projname),
(partname,projname) }
• Here the projections are (sname,partname),
(sname,projname) and (partname,projname).
Normal Forms
• Fifth normal form
• We can say that this table supply is not in 5NF
because of this join dependency
• Each of these projections do not form a
superkey of supply.
• Superkey of supply is (sname,projname,partname).
• (sname,partname) is not a superkey.
• (sname,projname) is not a super key.
• (partname,projname) is not a superkey.
Normal Forms
• Fifth normal form
• So we have to normalise this table supply in to
tables that satisfy 5NF.
• We are decomposing the table supply by
considering the JD.
• Take each of the projections in the JD and form
tables as
Normal Forms
• Fifth normal form
Normal Forms
• Fifth normal form
• See that each of these R1, R2, R3 are in fifth
normal form because there are no non trivial
join dependencies in each of these tables.
•
• A join dependency is very difficult to detect in
practice.
• So it is not normally applied in a database.
Normal Forms
• Fifth normal form
Normal Forms