CS 195 Course Outline & Introduction to Java

Download Report

Transcript CS 195 Course Outline & Introduction to Java

CSC 411/511:
DBMS Design
Schema Refinement and
Normal Forms
Chapter 19
Dr. Nan Wang
1
The Evils of Redundancy
• Redundancy causes several problems
associated with relational schemas:
–
–
–
–
–
Redundant storage
Insert anomalies
Delete anomalies
Update anomalies
What are the problems in the following schema?
Hourly_Emps (ssn, name, lot, rating, hourly_wages,
hours_worked)
Dr. Nan Wang
Example
• Hourly_Emps (ssn, name, lot, rating, hourly_wages,
hours_worked)
• Abbreviate an attribute name to a single letter
• Refer to a relation schema by a string of letters, one per
attribute.
• In the example, hourly_wages is determined by rating.
This IC is an example of a functional dependency (RW).
• What does it lead to?
S
Dr. Nan Wang
N
L
R W H
123-22-3666 Attishoo
48 8
10 40
231-31-5368 Smiley
22 8
10 30
131-24-3650 Smethurst 35 5
7
30
434-26-3751 Guldu
35 5
7
32
612-67-4134 Madayan
35 8
10 40
Example
• Problems:
– Redundant storage
–
–
Update anomaly:
Can
we
change W in just
the 1st tuple of
SNLRWH?
Solutions to the
anomalies?
Dr. Nan Wang
S
N
L
R W H
123-22-3666 Attishoo
48 8
10 40
231-31-5368 Smiley
22 8
10 30
131-24-3650 Smethurst 35 5
7
30
434-26-3751 Guldu
35 5
7
32
612-67-4134 Madayan
35 8
10 40
Example
• Problems:
– Insertion anomaly:
What if we want to
insert an employee
and don’t know the
hourly wage for his
rating?
– Solution?
•
•
Null values
What if the s is
unknow?
Dr. Nan Wang
S
N
L
R W H
123-22-3666 Attishoo
48 8
10 40
231-31-5368 Smiley
22 8
10 30
131-24-3650 Smethurst 35 5
7
30
434-26-3751 Guldu
35 5
7
32
612-67-4134 Madayan
35 8
10 40
Example
• Problems:
– Deletion anomaly: If
we delete all
employees with rating
5, we lose the
information about the
wage for rating 5!
– Solutions?
Dr. Nan Wang
S
N
L
R W H
123-22-3666 Attishoo
48 8
10 40
231-31-5368 Smiley
22 8
10 30
131-24-3650 Smethurst 35 5
7
30
434-26-3751 Guldu
35 5
7
32
612-67-4134 Madayan
35 8
10 40
Fighting the Evils of Redundancy
• Redundancy is at the root of evil - true or not?
• Integrity constraints, in particular functional
dependencies, can be used to identify schemas
with such problems and to suggest refinements.
• RW
• Decomposition(replace SNLRWH with SNLRH
and RW)
S
N
L
R H
R W
123-22-3666 Attishoo
48 8 40
8 10
231-31-5368 Smiley
22 8 30
5 7
131-24-3650 Smethurst 35 5 30
Dr. Nan Wang
434-26-3751 Guldu
35 5 32
612-67-4134 Madayan
35 8 40
Hourly_Emps2
Wages
Fighting the Evils of Redundancy
• RW
• Decomposition(replace SNLRWH with SNLRH
and RW)
• Do we solve all this problems?
Hourly_Emps2
–
–
–
–
Redundant storage
Insert anomalies
Delete anomalies
Update anomalies
R W
8 10
5 7
Dr. Nan Wang
S
N
L
R H
123-22-3666 Attishoo
48 8 40
231-31-5368 Smiley
22 8 30
131-24-3650 Smethurst 35 5 30
434-26-3751 Guldu
35 5 32
612-67-4134 Madayan
35 8 40
Wages
Fighting the Evils of Redundancy
• Main refinement technique: decomposition
(replacing ABCD with, say, AB and BCD, or
ACD and ABD).
• Problems related to decomposition:
–
Is there reason to decompose a relation?
•
•
–
Normal forms help us to decide whether or not to
decompose a relation in further.
How?
What problems (if any) does the decomposition cause?
•
•
Dr. Nan Wang
Lossless join decomposition
Dependency-preservation decomposition
Properties of decomposition
•
Lossless join decomposition
–
•
Enable us to recover any instance of the decomposed relation
from corresponding instances of the smaller relations.
Dependency-preservation decomposition
–
–
Enable us to enforce any constraints on the original relation
by simple enforcing some constraints on each of smaller
relations.
Key constrains
S
N
L
R H
123-22-3666 Attishoo
48 8 40
231-31-5368 Smiley
22 8 30
131-24-3650 Smethurst 35 5 30
Dr. Nan Wang
434-26-3751 Guldu
35 5 32
612-67-4134 Madayan
35 8 40
Hourly_Emps2
R W
8 10
5 7
Wages
Good?
–
–
List N, W, H
How?
S
S
N
L
R W H
123-22-3666 Attishoo
48 8
10 40
231-31-5368 Smiley
22 8
10 30
131-24-3650 Smethurst 35 5
7
30
434-26-3751 Guldu
35 5
7
32
612-67-4134 Madayan
35 8
10 40
N
L
R H
123-22-3666 Attishoo
48 8 40
R W
231-31-5368 Smiley
22 8 30
8 10
131-24-3650 Smethurst 35 5 30
Dr. Nan Wang
434-26-3751 Guldu
35 5 32
612-67-4134 Madayan
35 8 40
5 7
11
Good DB designer
• Normal forms and what problems they do or do
not alleviate?
• Decomposition and what problems with
decompositions.
• Questions:
– Is a relation in a given normal form?
– Is a decomposition dependency preserving?
Dr. Nan Wang
12
Functional Dependencies (FDs)
• A functional dependency X  Y holds over relation R
if, for every allowable instance r of R:
–
–
t1  r, t2  r, t1.X =t2.X implies t1.Y =t2.Y
i.e., given two tuples in r, if the X values agree, then the Y values
must also agree. (X and Y are sets of attributes.)
S
Dr. Nan Wang
N
L
R W H
123-22-3666 Attishoo
48 8
10 40
231-31-5368 Smiley
22 8
10 30
131-24-3650 Smethurst 35 5
7
30
434-26-3751 Guldu
35 5
7
32
612-67-4134 Madayan
35 8
10 40
Functional Dependencies (FDs)
• An FD is a statement about all allowable relations.
–
–
Must be identified based on semantics of application.
Given some allowable instance r1 of R, we can check if it
violates some FD f, but we cannot tell if f holds over R!
• K is a candidate key for R means that K R
–
However, KR does not require K to be minimal!
Dr. Nan Wang
FDs- example
FD: ABC
Question:
Do all tuples in the relation satisfy the Functional Dependents ABC
Can we add a tuple <a1,b1,c2,d1> to the relation? Why?
Dr. Nan Wang
15
exercise
• List all the functional dependencies that this
relation instance satisfies.
• Z → Y, X → Y, and XZ → Y
Dr. Nan Wang
16
Example: Constraints on Entity Set
• Consider relation obtained from Hourly_Emps:
–
Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)
• Notation: We will denote this relation schema by
listing the attributes: SNLRWH
–
–
This is really the set of attributes {S,N,L,R,W,H}.
Sometimes, we will refer to all attributes of a relation by using
the relation name. (e.g., Hourly_Emps for SNLRWH)
• Some FDs on Hourly_Emps:
–
–
ssn is the key: SSNLRWH
rating determines hrly_wages: RW
Dr. Nan Wang
Wages R W
Example (Contd.)
8 10
Hourly_Emps2
5 7
S
N
• Problems due to RW :
– Update anomaly: Can
we change W in just
the 1st tuple of SNLRWH?
– Insertion anomaly: What if
we want to insert an employee
and don’t know the hourly
wage for his rating?
– Deletion anomaly: If we
delete all employees with
rating 5, we lose the
S
information about the wage
for rating 5!
R H
123-22-3666 Attishoo
48 8 40
231-31-5368 Smiley
22 8 30
131-24-3650 Smethurst 35 5 30
434-26-3751 Guldu
35 5 32
612-67-4134 Madayan
35 8 40
N
L
R W H
123-22-3666 Attishoo
48 8
10 40
231-31-5368 Smiley
22 8
10 30
131-24-3650 Smethurst 35 5
7
30
434-26-3751 Guldu
35 5
7
32
35 8
10 40
Will 2 smaller tables be better? 612-67-4134 Madayan
Dr. Nan Wang
L
Reasoning About FDs
• Given some FDs, we can usually infer additional
FDs:
– For example:
– Workers(ssn,name,lot,did,since)
– Ssndid holds,
– did lot holds, implies ssnlot
• An FD f is implied by a set of FDs F if f holds
whenever all FDs in F hold.
– F+ = closure of F is the set of all FDs that are implied by a
given set F of FDs.
Dr. Nan Wang
Closure of a set of FDs
•
Infer or compute the closure of a given set of F
of FDs
– Following three rules (Armstrong’s Axioms)
• Armstrong’s Axioms (X, Y, Z are sets of
attributes):
– Reflexivity: If X  Y, then YX
–
–
Augmentation: If XY, then XZYZ for any Z
Transitivity: If XY and YZ, then XZ
• These are sound and complete inference rules for
FDs!
Dr. Nan Wang
Soundness and Completeness
• Completeness: given F, the rules allows us to
determine all dependencies in F 
• Soundness:
we cannot generate FDs not in

F
• Addition rules:
– Union: if XY, XZ then XYZ
– Decomposition: if XYZ, then XY and XZ
Dr. Nan Wang
Soundness and Completeness
• R=(A,B,C) and F={AB, BC}
– Reflexivity: if Y is subset of X, then XY
– Transitivity : if XY and YZ then XZ
• AC
– Augmentation : if XY then XZYZ for any Z
• ACBC
• ABBC
• ABAC
– ABC
F+ ={A B, BC, AC, ACBC,ABAC. …}
Dr. Nan Wang
Reasoning About FDs
• FD is a statement about the world as we
understand it
–
Ssndid, did lot implies ssnlot
• Is it correct to state that
“Since X functionally determines Y, if we know X, we know Y.”
Or “if XY, then X identifies Y”
Dr. Nan Wang
Reasoning About FDs (Contd.)
• Couple of additional rules
–
–
Union: If X  Y and X Z, then XYZ
Decomposition: If XYZ, then XY and XZ
• Example:
and:
–
–
–
Contracts(cid,sid,jid,did,pid,qty,value),
C is the key: CCSJDPQV
Project purchases each part using single contract: JPC
Dept purchases at most one part from a supplier: SDP
Dr. Nan Wang
Reasoning About FDs (Contd.)
•
•
•
C is the key: CCSJDPQV
Project purchases each part using single contract:
JPC
Dept purchases at most one part from a supplier:
SDP
• JPC, CCSJDPQV imply JPCSJDPQV
• SDP implies SDJJP
• SDJJP, JPCSJDPQV imply
SDJCSJDPQV
Dr. Nan Wang
Exercise
• Consider a relation R with five attributes ABCDE.
You are given the following dependencies: A→B,
BC→E, and ED→A.
• List all keys for R.
• CDE, ACD, BCD
Dr. Nan Wang
26