Functional Dependencies

Download Report

Transcript Functional Dependencies

Functional Dependencies and
Normalization
Normalization


Normalization is a formalized procedure to
eliminating redundancy from data by the
progressive use of ‘non-lose decomposition’,
which involves splitting records without losing
information.
In reducing the data model to the state where each
bit of information is only held in one place, the
update process is much simpler, more efficient and
inconsistencies in the database are impossible.
Normalization (cont.)
5NF
4NF
3NF
2NF
1NF
Redundancy
Redundancy
Redundancy
Normalization (cont.)


Normalization is based on the idea that an
attribute may depend on another attribute
in some way.
There are 2 different kinds of dependencies
involved up to 5 NF


Functional dependency
Multivalued dependence
Functional Dependence
S#,P#
S#
CITY
P#
QTY
S1
Khon Kaen
P1
100
S1
Khon Kaen
P2
100
S2
Saraburii
P1
200
S2
Saraburii
P2
200
S3
Saraburii
P2
300
S4
Bangkok
P2
400
S4
Bangkok
P4
400
S4
Bangkok
P5
400
QTY
QTY is functionally dependent on S#andP#
S# and P# are the determinant of QTY
Functional Dependence


In a relation including attribute A and B, B is
functional dependent on A if, for every valid
occurrence, the value A determines the value B
An occurrence can not be used to show that a
dependency is true, only that it is false



A and B can be composite
If B is ‘Functional Dependent on’ A, then A ‘is the
determinant of B’
All fields are functionally dependent on the primary key
– or indeed any candidate key – be definition.
First Normal Form

A relation is in First Normal form if,
and only if, it contains no multivalue or no repeating groups.
First NF (cont.)
NO
Name
Province
PayDate1
Amount1 PayDate2
E001
Somchai
Khon Kaen
15/04/2004
5,000.00 30/04/2004 5,000.00
E002
Sompong
Sarakham
15/04/2004
4,500.00 30/04/2004 4,500.00
E003
Somchay
Ubon
15/04/2004
5,200.00 30/04/2004 5,200.00
Repeat
Amount2
Problem
Multi-value
Staff
ENO
Name
Dno
DeptName
ProjNo
E001
Somchai
D01
Physic
P01, P02 NMR, Laser
E002
Sompong
D01
Physic
E003
Somchay
D02
Computer
Science
P04, P05
Voice ordering, Speech
Coding
E004
SomSiri
D02
Computer
Science
P04, P06
Voice ordering, Speech
Synthesis
Problem
1. Difficult to manipulate data
2. Redundancy
• UPDATE ANOMALIES
P03
ProjName
Medical Image
processing
Insert
P06
Speech Corpus
Staff
ENO
Name
Dno
DeptName
ProjNo
E001
Somchai
D01
Physic
P01, P02 NMR, Laser
E002
Sompong
D01
Physic
E003
Somchay
D02
Computer
Science
P04, P05
Voice ordering, Speech
Coding
E004
SomSiri
D02
Computer
Science
P04, P06
Voice ordering, Speech
Synthesis
P03
ProjName
Medical Image
processing
We can not insert new project if the project has not assigned to
any employee yet.
UPDATE ANOMALIES
Staff
ENO
Name
Dno
DeptName
ProjNo
E001
Somchai
D01
Physic
P01, P02 NMR, Laser
E002
Sompong
D01
Physic
E003
Somchay
D02
Computer
Science
P04, P05
Voice ordering, Speech
Coding
E004
SomSiri
D02
Computer
Science
P04, P06
Voice ordering, Speech
Synthesis
P03
ProjName
Medical Image
processing
Change ProjName from Voice Ordering to Speech Ordering
need to change all in Database
DELETE Problem
Staff
ENO
Name
Dno
DeptName
ProjNo
E001
Somchai
D01
Physic
P01, P02 NMR, Laser
E002
Sompong
D01
Physic
E003
Somchay
D02
Computer
Science
P04, P05
Voice ordering, Speech
Coding
E004
SomSiri
D02
Computer
Science
P04, P06
Voice ordering, Speech
Synthesis
P03
Delete Employee E003 Somchay
Project P03 Medical Image Processing was deleted also
ProjName
Medical Image
processing
Solution


Remove the repeating group
In case of multi-valued



Create new relation
Columns = Key + multi-valued
Take its determinant with it
Repeating group
Employee
ENO
Name
Province
PayDate1
E001
Somchai
Khon Kaen
15/04/2004
5,000.00 30/04/2004 5,000.00
E002
Sompong
Sarakham
15/04/2004
4,500.00 30/04/2004 4,500.00
E003
Somchay
Ubon
15/04/2004
5,200.00 30/04/2004 5,200.00
Amount2
PayCheck
Employee
ENO
Amount1 PayDate2
Name
Province
ENO
PayDate
Amount
E001
15/04/2004
5,000.00
E001
Somchai
Khon Kaen
E001
30/04/2004
5,000.00
E002
Sompong
Sarakham
E002
15/04/2004
4,500.00
E003
Somchay
Ubon
E002
30/04/2004
4,500.00
E003
15/04/2004
5,200.00
E003
30/04/2004
5,200.00
Multi-Valued
Staff
ENO
Name
E001
Somchai
E002
Dno
DeptName
ProjNo
ProjName
D01
Physic
P01, P02
NMR, Laser
Sompong
D01
Physic
P03
E003
Somchay
D02
Computer Science
P04, P05
Voice ordering, Speech Coding
E004
SomSiri
D02
Computer Science
P04, P06
Voice ordering, Speech Synthesis
Medical Image processing
Multi-Valued
ENO
Name
Dno
DeptName
ProjNo
ProjName
E001
Somchai
D01
Physic
P01
NMR
E001
Somchai
D01
Physic
P02
Laser
E002
Sompong
D01
Physic
P03
Medical Image processing
E003
Somchay
D02
Computer Science
P05
Voice ordering
E003
Somchay
D02
Computer Science
P04
Speech Coding
E004
SomSiri
D02
Computer Science
P04
Voice ordering
E004
SomSiri
D02
Computer Science
P06
Speech Synthesis
Insert Project still has problem
Second Normal Form
(2NF)

A relation is in first normal form if and
only if


It is in 1NF
Every non-key attribute is dependent on
all parts of the primary key.
Staff
2NF ?
ENO
Name
Dno
E001
Somchai
D01
Physic
P01
NMR
E001
Somchai
D01
Physic
P02
Laser
E002
Sompong
D01
Physic
P03
Medical Image processing
E003
Somchay
D02
Computer Science
P05
Voice ordering
E003
Somchay
D02
Computer Science
P04
Speech Coding
E004
SomSiri
D02
Computer Science
P04
Voice ordering
E004
SomSiri
D02
Computer Science
P06
Speech Synthesis
KEY = ENO + ProjNo
DeptName
ProjNo
ProjName
Answer is No. Because ProjNo is
dependent on ProjNo. (not all
part of Key)
Problem
ENO
Name
Dno
DeptName
ProjNo
ProjName
E001
Somchai
D01
Physic
P01
NMR
E001
Somchai
D01
Physic
P02
Laser
E002
Sompong
D01
Physic
P03
Medical Image processing
E003
Somchay
D02
Computer Science
P05
Voice ordering
E003
Somchay
D02
Computer Science
P04
Speech Coding
E004
SomSiri
D02
Computer Science
P04
Voice ordering
E004
SomSiri
D02
Computer Science
P06
Speech Synthesis
We can not insert Project if have not yet
assigned project to any employee
Solution


Remove the attribute involved
Take its determinant with it
Normalize
ENO
Name
Dno
DeptName
ProjNo
ProjName
E001
Somchai
D01
Physic
P01
NMR
E001
Somchai
D01
Physic
P02
Laser
E002
Sompong
D01
Physic
P03
Medical Image processing
E003
Somchay
D02
Computer Science
P05
Voice ordering
E003
Somchay
D02
Computer Science
P04
Speech Coding
E004
SomSiri
D02
Computer Science
P04
Voice ordering
E004
SomSiri
D02
Computer Science
P06
Speech Synthesis
Result
Project
PERSON
ENO
Name
Dno
DeptNa
me
E001
Somchai
D01
Physic
E003
Somchay
D02
Computer
Science
E004
SomSiri
D02
Computer
Science
Proj ProjName
No
PERSON_Proj
ENO
Proj
No
P01
NMR
E001
P01
P02
Laser
E001
P02
P03
Medical Image
processing
E002
P03
E003
P04
E004
P05
E004
P06
P04
Speech Coding
P05
Voice ordering
P06
Speech Synthesis
PERSON(ENO,NAME,Dno,DeptName)
PROJECT(ProjNo,ProjName)
PERSON_PROJ(ENO,ProjNo)
Third Normal Form

A relation is in 3NF if, and only if:



It is in 2NF
Every non-key attribute is functionally
dependent upon the key. (No non-key
attribute is functional dependent on
another non-key attribute)
Or non-key attribute no transitive
dependent on key
Transitive dependent


R(A,B,C,D) ; A is Key, others are non- key
If A → B and B → C
can say
A → B → C (C transitive dependent on A)
3NF?
Project
PERSON
ENO
Name
Dno
DeptNa
me
E001
Somchai
D01
Physic
E003
Somchay
D02
Computer
Science
E004
SomSiri
D02
Computer
Science
Answer is No
Because DeptName is dependent
on Dno (has transitive dependent
on key)
Proj ProjName
No
PERSON_Proj
ENO
Proj
No
P01
NMR
E001
P01
P02
Laser
E001
P02
P03
Medical Image
processing
E002
P03
E003
P04
E004
P05
E004
P06
P04
Speech Coding
P05
Voice ordering
P06
Speech Synthesis
Solution


Remove the offending attributes
Take the determinant along
Result
PERSON
ENO
Name
E001
Somchai
D01
E003
Somchay
D02
E004
SomSiri
D02
Department
Dno
Dno
Project
Proj ProjName
No
PERSON_Proj
ENO
Proj
No
P01
NMR
E001
P01
P02
Laser
E001
P02
P03
Medical Image
processing
E002
P03
E003
P04
E004
P05
E004
P06
DeptNa
me
P04
Speech Coding
D01
Physic
P05
Voice ordering
D02
Computer
Science
P06
Speech Synthesis
D02
Computer
Science
Note

The third normal form is often reached in practice by
inspection, in a single step. Its meaning seems intuitively
clear; it represents a formalization of designer’s
common sense.

This level of normalization is widely accepted as the
initial target for a design which eliminates redundancy.

However, there are higher normal forms which, although
less frequently invoked, highlight further redundancy
problems which may affect the designer
Boyce-Codd Normal Form
(BCNF)


A relation is in BCNF if, and only if,
every determinant is a candidate key.
BCNF is a refinement to third normal
form, and tightens its duration.
Multivalued Dependence

In a relation including attribute A, B and C,
B is multivalued dependent on A if the
set of B values matching a given A+C value
pair, depends only on the A value.