Transcript Lecture 2

Main concepts of relational model
The relational model was proposed in 1970 by
Edgar Codd
 The relational model assumes that data is stored
in two-dimensional tables called relations
As an example, information about company cars
(vehicle identification number, number plate,
mark and year) can be stored in the following
relation:

VIN
NUMBER_PLATE
MARK
YEAR
GTRE545WRTH256452
EPA60PL
Ford Fusion I
2003
THER186ACVG636853
EL432PL
Opel Corsa II
2005
WDH144TETU063632
EZG42PL
Citroen C3 III
2011
Definition (Relation schema): Let R be a finite set of elements
A1,A2,...,An called attributes. Each Ai (i=1..n) is assigned to
the set of values D(Ai) called the domain of Ai. The set R is
called a relation schema.
Example 1 (Relation schema):
STUDENT:={INDEX_NO, FIRST_NAME, LAST_NAME, YEAR,
FACULTY}
D(INDEX_NO):= {927502, 138571, 714072, 841053,
595723, 965024},
D(FIRST_NAME):= {David, Agnes, Andrew, Charles,
Eva, Clara},
D(LAST_NAME):= {Holl, Lewis, Harris, Klark,
Barker, Morgan},
D(YEAR):= {1, 2, 3},
D(FACULTY):= {Mathematics, Computer Science,
Management, Law, Economics, Physics, Chemistry,
Biology}.
Definition (Tuple): Consider the relation schema
R:={A1,A2,...,An} and the set D(R) := D(A1)∪D(A2) ∪… ∪ D(An).
Let f: R  D(R) be a function such that f(Ai)D(Ai) for each
i=1..n. The function f is called a tuple. Tuples defined on R
will be denoted by r, r1, r2, s, t,....
Table representation of tuples
Let r be a tuple defined on R={A1,A2,...,An}. r is a function so it
can be expressed as
r = {(A1,r(A1)), (A2,r(A2)),...,(An,r(An))}
where r(Ai)  D(Ai), i=1,...,n.
This notation implies the following table representation of r:
A1
A2
...
An
r(A1)
r(A2)
...
r(An)
Definition (Relation): Let T(R) denotes the set of all tuples
defined on the set R. A finite subset of the set T(R) is a called
a relation with the schema R. This subset is also called an
instance of the schema R. Relations with the schema R
(instances of R) will be denoted by I, J, K,....
Table representation of relations
Let I be a relation with the schema R={A1,A2,...,An}. According to
the definition I:={r1, r2,...,rm}, where ri: R  D(R), ri(Aj) 
D(Aj), i=1,...,m, j=1,...,n. Since each ri function has the same
set of arguments, therefore relation I can be described by the
following table:
A1
A2
...
An
r1(A1)
r1(A2)
...
r1(An)
r2(A1)
r2(A2)
...
r2(An)
...
...
...
...
rm(A1)
rm(A2)
...
rm(An)
Example 1 (Relation)
I(STUDENT):={
{138571, Eva, Klark, 2, Law},
{927502, David, Holl, 1, Biology},
{714072, Andrew, Lewis, 3, Computer
Science},
{965024, Charles, Barker, 1,
Mathematics}
}
is a relation with the STUDENT schema
(I(STUDENT) is an instance of the STUDENT
schema).
INDEX_NO
FIRST_NAME LAST_NAME YEAR
FACULTY
138571
Eva
Klark
2
Law
927502
David
Holl
1
Biology
714072
Andrew
Lewis
3
Computer Science
965024
Charles
Barker
1
Mathematics
Example 1 (cont. Relation)
I(STUDENT):={
{138571, Eva, Klark, 2, Law},
{927502, David, Holl, 1, Biology},
{714072, Andrew, Lewis, 3, Computer
Science},
{965024, Charles, Barker, 1,
Mathematics}
}
is a relation with the STUDENT schema
(I(STUDENT) is an instance of the STUDENT
schema).
In the relational model the best known formal
query languages are relational algebra and
relational calculus.
 The relational algebra is based on a set of
operators (e.g. selection, projection, join,
union, intersect, etc.) which are applied to
relation instances. Queries in relational algebra
are expressed in the procedural manner: each
query specifies a sequence of operations needed
to compute the desired answer.
 The relational calculus provides a declarative
manner of formulating queries. It assumes that a
query specifies only the desired answer without
describing a precise sequence of required
operations.

The relation algebra specifies standard set
operations (union, intersect, difference, cross
product) which are applied to relations. Let
R={A1,A2,...,An} be a relation schema and let T(R)
denotes the set of all tuples defined on R. Then
the union, intersection, and the difference of relations
I(R) and J(R) are defined as follows.
Definition.
I(R)  J(R) := {rT(R); rI(R)  rJ(R)} (union)
I(R)  J(R) := {r  T(R); rI(RrJ(R)} (intersection)
I(R)\J(R) := {r  T(R); rI(RrJ(R)} (difference)
Example 2
Consider the following instances I and J of
PHONE:={PHONE_ID, NUMBER, MODEL} :
R = {A1,A2,,A3} (R = PHONE, A1 = PHONE_ID A2 = NUMBER, A3 = MODEL)
D(A1)=D(PHONE_ID)={6,3,9,11},
D(A2)=D(NUMBER)={426235020, 491503592, 525246252, 426548716},
D(A3)=D(MODEL)={Siemens Gigaset W341, Sony Xperia WE, Nokia Lumia
329, Panasonic E425},
r1 := {(A1,r(A1)), (A2,r(A2)),(A3,r(A3))},
r1 := {(PHONE_ID ,6), (NUMBER, 426235020),(MODEL, Siemens Gigaset
W341)},
r2 := {(A1,r(A1)), (A2,r(A2)),(A3,r(A3))},
r2 := {(PHONE_ID ,3), (NUMBER, 491503592),(MODEL, Sony Xperia WE)},
r3 := {(A1,r(A1)), (A2,r(A2)),(A3,r(A3))},
r3 := {(PHONE_ID ,9), (NUMBER, 525246252),(MODEL, Nokia Lumia
329)},
r4 := {(A1,r(A1)), (A2,r(A2)),(A3,r(A3))},
r4 := {(PHONE_ID ,11), (NUMBER, 426548716),(MODEL, Panasonic
E425)},
I:={r1,r2,r3}, I(PHONE)={6, 426235020, Siemens Gigaset W341}, {3,
491503592, Sony Xperia WE}, {9, 525246252, Nokia Lumia 329},
{11, 426548716, Panasonic E425 }
I(PHONE)
PHONE_ID
NUMBER
MODEL
6
426235020
Siemens Gigaset W341
3
491503592
Sony Xperia WE
9
525246252
Nokia Lumia 329
11
426548716
Panasonic E425
PHONE_ID
NUMBER
MODEL
1
506923054
Nokia Lumia 6556
3
491503592
Sony Xperia WE
4
543012943
Samsung Galaxy A45
J(PHONE)
I(PHONE)  J(PHONE) I(R)J(R) := {rT(R); rI(R)  rJ(R)}
PHONE_ID
NUMBER
MODEL
1
506923054
Nokia Lumia 6556
3
491503592
Sony Xperia WE
4
543012943
Samsung Galaxy A45
6
426235020
Siemens Gigaset W341
9
525246252
Nokia Lumia 329
11
426548716
Panasonic E425
I(PHONE)
PHONE_ID
NUMBER
MODEL
6
426235020
Siemens Gigaset W341
3
491503592
Sony Xperia WE
9
525246252
Nokia Lumia 329
11
426548716
Panasonic E425
PHONE_ID
NUMBER
MODEL
1
506923054
Nokia Lumia 6556
3
491503592
Sony Xperia WE
4
543012943
Samsung Galaxy A45
J(PHONE)
I(PHONE)  J(PHONE) I(R)  J(R) := {r  T(R); rI(RrJ(R)}
PHONE_ID
NUMBER
MODEL
3
491503592
Sony Xperia WE
I(PHONE)
PHONE_ID
NUMBER
MODEL
6
426235020
Siemens Gigaset W341
3
491503592
Sony Xperia WE
9
525246252
Nokia Lumia 329
11
426548716
Panasonic E425
PHONE_ID
NUMBER
MODEL
1
506923054
Nokia Lumia 6556
3
491503592
Sony Xperia WE
4
543012943
Samsung Galaxy A45
J(PHONE)
I(PHONE) \ J(PHONE) I(R)\J(R) := {r  T(R); rI(RrJ(R)}
PHONE_ID
NUMBER
MODEL
6
426235020
Siemens Gigaset W341
9
525246252
Nokia Lumia 329
11
426548716
Panasonic E425
The projection operator p chooses a set of
specified attributes from a given relation.
Let R={A1,A2,...,An} be relation schema and let
P  {A1,A2,...,An}.
Definition 2.4 (Tuple restriction) : A tuple
sT(P) is called a restriction of rT(R) to P if
and only if
s(Ai)=r(Ai) for each AiP.
The tuple restriction is denoted by r[P].
Example 1 (cont. Tuple restriction)
Let
STUDENT:={INDEX_NO, FIRST_NAME, LAST_NAME,
YEAR, FACULTY}
be relation schema and let
P  {INDEX_NO, LAST_NAME, FACULTY}.
Next let
r = {138571, Eva, Klark, 2, Law}
be a given tuple, then
r[P] = {138571, Klark, Law}
INDEX_NO
FIRST_NAME LAST_NAME YEAR
138571
Eva
INDEX_NO
LAST_NAME FACULTY
138571
Klark
Klark
Law
2
FACULTY
Law
Definition (Projection) Given a relation I(R) and the set PR, the set
pP(I(R)) := {rT(P); $(sI(R)) (r = s[P])} is called the projection of
I(R) on P.
Example 1 (cont. Projection)
Let
STUDENT:={INDEX_NO, FIRST_NAME, LAST_NAME, YEAR,
FACULTY}
be relation schema and let
I(STUDENT):={
{138571, Eva, Klark, 2, Law},
{927502, David, Holl, 1, Biology},
{714072, Andrew, Lewis, 3, Computer Science},
{965024, Charles, Barker, 1, Mathematics}
}
be a realtion, next let
P  {INDEX_NO, LAST_NAME, FACULTY},
then
pP(STUDENT) ={{138571, Klark, Law},{927502, Holl,
Biology},
{714072, Lewis, Computer Science}, {965024, Barker,
Mathematics}}
INDEX_NO
FIRST_NAME
LAST_NAME
YEAR
FACULTY
138571
Eva
Klark
2
Law
927502
David
Holl
1
Biology
714072
Andrew
Lewis
3
Computer Science
965024
Charles
Barker
1
Mathematics
INDEX_NO
LAST_NAME
FACULTY
138571
Klark
Law
927502
Holl
Biology
714072
Lewis
Computer Science
965024
Barker
Mathematics
The selection operator s is used to produce a
horizontal subset of a given relation by
selecting only the tuples which meet the
specified selection condition.
Definition (Selection): Let I(R) be an instance
of relation schema R={A1,A2,...,An}, SC a
selection condition (a boolean expression
involving terms connected by logical
connectives). The set
sSC(I(R)) := {rI(R); SC(r)=true}
is called selection of I(R) on the basis of the SC
condition.
Consider the following instance I(R) of R={A1, A2,
A3, A4}.
I(R)
A1
A2
A3
A4
5
2
a
x
11
9
b
b
6
3
c
y
20
17
d
d
The following instance is the selection of I(R) on
the basis of the condition
SC: (A1>A2 and A3=A4)
s{A1>A2 and A3=A4}(I(R))
A1
A2
A3
A4
11
9
b
b
20
17
d
d
The join operator is applied to combine tuples
from two or more relations.
Definition (Joins): Let R and S be relation
schemas such that [RS]. The join of two
relations I(R) and J(S) is defined as the
ralation
I(R)BCJ(S):= {rT(RS); r[R]I(R)  r[S]J(S)}
 If
R=S then I(R) )BCJ(S) is equivalent to
I(R)J(S).
Example (Join)
Consider the following relation instances I(R) and J(S) of the schemas
R={A,B,C,D,E} and S={A,B,F,G,H}, respectively.
I(R)
A
B
C
D
E
9
h
b
x
h
5
o
e
k
o
12
r
i
m
r
17
t
d
d
t
J(S)
A
B
C
D
E
F
G
9
h
B
x
h
n
f
17
t
d
d
t
d
s
I(R) BC J(S)
A
B
C
D
E
F
G
9
h
b
x
h
n
f
17
t
d
d
t
d
s
A functional dependency defines a relationship between attributes of a
given relation schema.
Definition (Functional dependency) Let R={A1, A2,...,An} be a relation
schema and let X and Y be any subsets of R. We say that an attribute Y is
functionally dependent on attribute X (XY) if and only if

(r, s I(R)) (r[X] = s[X] r[Y] = s[Y]).
for any instance I(R) of the R schema.
In particular, an instance J(R) of R satisfies a functional dependency XY
(X,YR) if for any two tuples t and u of J(R), t[X] = u[X] t[Y] = u[Y].
Assume that a set of attributes X1, X2,...,Xn functionally determines several
attributes Y1, Y2,...,Ym:
X1X2...XnY1
X1X2...XnY2
…
X1X2...XnYm
The above dependencies will be written as X1X2...Xn Y1Y2...Ym.
Example Consider a relation schema
COMPANY_CAR={VIN, NUMBER_PLATE, MARK, YEAR}.
In COMPANY_CAR, for instance, attributes
NUMBER_PLATE, MARK, and YEAR are functionally
dependent on attribute VIN. These dependencies are
represented as follows (in short VINNUMBER_PLATE
MARK YEAR):
VINNUMBER_PLATE
VINMARK
VINYEAR
The above dependencies hold for any instance of the
COMPANY_CAR schema, in particular for the following
instance called I(COMPANY_CAR):
VIN
NUMBER_PLATE MARK
YEAR
GTRE545WRTH256452
EPA60PL
Ford Fusion I
2003
THER186ACVG636853
EL432PL
Opel Corsa II
2005
WDH144TETU063632
EZG42PL
Citroen C3 III
2011
Note that the functional dependency
MARKYEAR is also satisfied in
I(COMPANY_CAR). However it does not hold
for any instance of COMPANY_CAR. It is
enough to consider the instance below to
show that the condition occurring in the
Definition does not hold:
VIN
number_plate mark
year
GTRE545WRTH256452
EPA60PL
Ford Fusion I
2003
THER186ACVG636853
EL432PL
Opel Corsa II
2005
WDH144TETU063632
EZG42PL
Citroen C3 III
2011
YDAT45729683278503
EL476PL
Citroen C3 III
2005
In other words, the sentence “cars having the
same mark have the same production year”
is not true in general case.
Let F be a subset of the set of all functional
dependencies over a relation schema R,
F:={XY, X,YR}. The set of all functional
dependencies which can be derived from a
given set F is called the closure of F, denoted
as F+. The following rules, called Armstrong's
axioms allow to construct the set F+
(X,Y,ZR):
1. YX  XY (reflexivity)
2. XY  XZY  Z (augmentation)
3. XY  YZ  XZ (transitivity)
Introducing functional dependencies allows to
extend the notion of a relation schema.
Let A denote the previously defined relation
schema R, A={A1,A2,...,An} (each Ai is
assigned to the set of values D(Ai)).
Let F be a set of all functional dependencies
defined on A, F={XY, X,YA}. The ordered
pair R=(A,F) is called a relation schema.
Each relation schema R=(A,F) possesses a certain
minimal subset of attributes whose values
uniquely identify tuples of any instance of R.
Such a subset of attributes is called a key
(candidate key) for the schema. In other words,
a subset KA is a key if and only if the following
conditions hold
 There are no tuples of any instance of R which
have the same values in K.
 No proper subset of K have unique
identification property (key must be minimal).
In the words case, a key can contain all the
attributes of the given relation schema. Note
that a schema can possess several keys. A
primary key is one of all the keys selected by a
database designer.
Example 2 Consider the relation schema
COMPANY_CAR=(A,F) in which A={VIN,
NUMBER_PLATE, MARK, YEAR}.
To determine keys for COMPANY_CAR all its
instances must be taken into account. The
attributes VIN and NUMBER_PLATE always
uniquely identify tuples occurring in any
instance of the COMPANY_CAR schema.
Besides there is no proper subset of these
attributes which has unique identification
property. The primary key for COMPANY_CAR
is one of its keys.
A set of attributes of a given relation can
points to a primary key in another relation.
This set is called foreign key. The values
inserted to a foreign key column must match
the values stored in the primary key column
of the reference relation.
Example Relations and their keys (person_id is
the primary key of Person, phone_id is the
primary key of Phone, p_id is the foreign key
of Phone which points to person_id.
Example Relations and their keys (person_id is the primary
key of PERSON, phone_id is the primary key of PHONE, p_id
is the foreign key of PHONE which points to person_id.
PERSON
person_i
first_name
last_name
1
Emily
Orman
2
Martin
Lindsey
3
Anthony
Cain
PHONE
phone_id
number
type
p_id
5
345-034-923
M
1
7
582-532-572
M
3
11
582-04-43
S
1
The p_id column can store only the values occurring in the
column person_id of PERSON.