Transcript Relations

The Relational Data Model
Properties of Relations
Keys and Constraints
CG084&085 /01-02
2/ 1
Relational Data Base
• The data base is a collection of relations or
tables.
• Data organisation is based on the relational data
model.
• A relation is actually a set of tuples – but we
can visualise it as a table with columns and
rows.
• Each tuple in a relation has the same number
and type of attributes.
CG084&085 /01-02
2/ 2
Relational Model
• A data model is a collection of constructs used
to organise data.
• The Relational Model is based on a simple and
uniform construct – the relation (Codd, 1970).
• A relation is a mathematical concept based on
the idea of sets.
• The strength of the relational model comes
from the formal foundation provided by the
theory of relations.
CG084&085 /01-02
2/ 3
Relational Concepts – Relations
• A relation consists of a relation scheme and a
relation instance.
• The relation scheme can be thought of as a
heading for the table.
• It consists of the relation name, and a set of
pairs (attribute name and domain), where
• A domain represents the set of (atomic) values
that the attribute can take.
CG084&085 /01-02
2/ 4
Relations – cont…
• e.g.
Student (Name: string, RegNo: string,
Address: string, PhoneNo: string,
Age: integer, AvgeMark: real)
• A relation instance can be visualised as the
“body” of a table.
• It consists of a set of rows (tuples) each
containing values corresponding to the various
attributes in the relation scheme.
CG084&085 /01-02
2/ 5
Relations – cont…
• Student Table
• Alternative terminologies for the relational model
Informal Terms
Table
Column
Row
Values in a column
Table Definition
Populated Table
CG084&085 /01-02
Formal Terms
Relation
Attribute/Domain
Tuple
Domain
Relation Scheme
Relation Instance/Extension
2/ 6
Properties of Relations
• No duplicate tuples in a relation
– because a relation is a set of tuples
• The tuples in a relation have no order
– ditto
• No duplicate attributes
– because the relation scheme contains a set of attributes
• The attributes have no order
– they are identified by name, not position
• Every tuple must have attribute values drawn
from each of the domains of the relation scheme
(or alternatively the special “value” NULL)
CG084&085 /01-02
2/ 7
Position is Not Important
A1
A4
A3
A2
A1
A4
A3
A2
Tuples are
unordered
If you change the place of a row in a table, the meaning of the
data remains the same.
A1
A2
A3
A4
A1
A4
A3
A2
Attributes are
unordered
If you change the place of a column in a table, the meaning of the
data remains the same.
CG084&085 /01-02
2/ 8
Mathematical Relations
Mathematical definition of a relation:
• Consider two sets, Cars and Colours.
– Cars = {NissanMicra, FordEscort}
– Colours = {Red, Blue, Silver}
(the notation { … } denotes a set)
• Let’s find all combinations of elements: first
element from Cars, and second from Colours.
– {(NissanMicra, Red), (NissanMicra, Blue),
(NissanMicra, Silver), (FordEscort, Red),
(FordEscort, Blue), (FordEscort, Silver)}
• Mathematically, this set is called the Cartesian
Product of Cars and Colours.
CG084&085 /01-02
2/ 9
Mathematical Relations – cont…
A relation is a sub-set of the Cartesian Product of its
domains, e.g.
BlueCars = {(NissanMicra, Blue), (FordEscort, Blue)}
The above set can be written mathematically using
conditions as:
{(x, y) such that x belongs to Cars
and y belongs to Colours and y = Blue}
Cars = {NissanMicra, FordEscort}
Colours = {Red, Blue, Silver}
BlueCars = {(NissanMicra, Blue), (FordEscort, Blue)}
CG084&085 /01-02
2/ 10
Mathematical Relations – cont…
Consider another set for models:
Years = {2008, 2009, 2006, 2007}
Let’s define a relation for LatestBlueCars:
{(x, y, z) such that x belongs to Cars
and y belongs to Colours and y = Blue
and z belongs to Years and z > 2007}
So we can write:
LatestBlueCars = {(NissanMicra, Blue, 2008),
(NissanMicra, Blue, 2009),
(FordEscort, Blue, 2008),
(FordEscort, Blue, 2009)}
CG084&085 /01-02
2/ 11
Relational Concepts – Domains
• A domain is a set of atomic values, e.g.
– Gender = {“Male”, “Female”}
– DepartmentCodes = {“CG”, “EN”, “BE”, …}
– NationalPhoneNums = {“0191-227 3521”,
“0161-275 6138”, …}
• An atomic value is indivisible.
– A data item is treated as a whole for simplicity.
– Although “0191-227-3521” can be divided into
0191, 227, and 3521, in the relational model it is
considered as one atomic value.
CG084&085 /01-02
2/ 12
Domains – cont…
• A data type may be used to define a domain,
e.g. integer, real.
• A format can be defined to specify the pattern
of the legal values, e.g.
– NationalPhoneNums: dddd-ddd dddd
– Date: MM/DD/YYYY (where MM = {01, 02, …,
12}, DD = {01, 02, …, 31}, YYYY = {0000,
0001, …, 1999, 2000, 2001, …, 9999})
CG084&085 /01-02
2/ 13
Relational Constrains
Constraints are conditions that must hold on all
relation instances.
• Domain Constraint.
– Values of attributes must belong to the appropriate
domain.
• Key Constraint.
– A sub-set (Primary Key) of the attributes of a
relation should identify each tuple uniquely.
• Entity Integrity Constraint.
– No Primary Key value can be null.
• Referential Integrity Constraint.
– Specifies data dependencies between relations.
CG084&085 /01-02
2/ 14
Domain Constraint
• The value of each attribute A must be an atomic
value drawn from the domain of A.
• E.g. if a Person relation has a Sex attribute
defined on the Gender domain, the only
allowable values are “Male” and “Female”.
• Each domain has an associated data type.
• Standard data types include INT(EGER),
DECIMAL, DATE, CHAR, VARCHAR, etc.
CG084&085 /01-02
2/ 15
Key Constraint
• Since a relation is a set, all tuples must be
unique (no duplicates).
• No two tuples can have the same combinations
of values for all of their attributes.
• The Key Constraint makes sure that all tuples
in a relation are unique.
• A sub-set of the attributes of a relation may be
used to guarantee that no two tuples have the
same values for these attributes.
CG084&085 /01-02
2/ 16
Relational Keys
Superkey
• A set of attributes (one or more) that uniquely
identifies a tuple within a relation.
• Every relation must have at least one superkey:
the set of all its attributes.
Key
• A minimum set of attributes that guarantees
uniqueness of each tuple within a relation. (No
proper sub-set of the key does so).
E.g. RegNo is a key of the Student relation.
CG084&085 /01-02
2/ 17
Relational Keys – cont…
Candidate Key
• A relation may have more than one key. Each such key
is called a Candidate Key.
Primary Key
• A Candidate Key chosen to identify each tuple in a
relation uniquely – usually underlined.
CG084&085 /01-02
2/ 18
Entity Integrity Constraint
NULL
• Represents a value for an attribute that is
currently unknown or is not applicable for this
tuple.
• Deals with incomplete or exceptional data.
• Represents the absence of a value and is not the
same as zero or spaces, which are values.
CG084&085 /01-02
2/ 19
Entity Integrity Constraint
• No Primary Key (PK) attribute can be NULL.
• A NULL PK would mean that a tuple cannot be
identified, which is against the concept of PK,
i.e. to identify tuples uniquely.
• If two tuples had NULLs in the PK, there
would be no way of distinguishing them.
• Non-key attributes may or may not allow null
values.
CG084&085 /01-02
2/ 20
Relational Keys – cont…
Single attribute key
Every value in this column
is defined to be different
from all other values in that
column.
Multi-attribute key (composite key)
Every combination of values in these
columns is defined to be different from
all other combined values in these
columns.
CG084&085 /01-02
2/ 21
Referential Integrity Constraint
• A constraint involving two relations: the
referencing relation and the referenced relation
(other constraints involve just a single relation).
• Tuples in the referencing relation R1 have
attributes FK (called foreign key attributes) that
reference the primary key attributes PK of the
referenced relation R2.
• A tuple t1 in R1 is said to reference a tuple t2 in
R2 if t1[FK] = t2[PK].
CG084&085 /01-02
2/ 22
Referential Integrity – cont…
Foreign Key
• A set of attributes in a relation that exactly
“matches” a Primary Key in another relation.
• The names of the attributes don’t have to be the
same, but they must be of the same domain.
• A Foreign Key in a relation A, matching a
Primary Key in a relation B, represents a
relationship between A and B.
CG084&085 /01-02
2/ 23
Referential Integrity – cont…
Primary Key
value here
These values are all the same
CG084&085 /01-02
Foreign key
values here
2/ 24
Referential Integrity – cont…
For example, consider two relations:
Student (Name, RegNo, Address, …, TutorId)
Lecturer (Name, IdNo, RoomNo, …)
Student [TutorId] is a FK referencing
Lecturer [SSN], the PK of Lecturer.
CG084&085 /01-02
2/ 25
Referential Integrity – cont…
Referential Integrity says
• A FK value cannot reference a non-existent PK.
• Therefore we cannot delete a tuple from
Lecturer whilst a tuple in Student has a FK value
the same as the PK of that tuple.
But a FK can be NULL
• If a Lecturer tuple is deleted, the TutorId values
of all his / her tutees can be set to NULL,
meaning that those students have no Tutor.
CG084&085 /01-02
2/ 26
Referential Integrity – cont.
CG084&085 /01-02
2/ 27
Relational Schemas
A relational data base usually contains many
relations.
A relational schema S contains many relation
schemes and a set of integrity constraints IC:
• S = {R1, R2, …, Rm}
A relational data base state DB over S is a set of the
relation instances:
• DB = {r1, r2, …, rm}
• Each relation instance in DB satisfies the
integrity constraints defined in IC.
CG084&085 /01-02
2/ 28
Company Data Base Schema
CG084&085 /01-02
2/ 29
Company Data Base Schema –
Constraints
CG084&085 /01-02
2/ 30
Company Database State
CG084&085 /01-02
2/ 31
Relational Data Model
Implementation
• The Relational Data Model is a formalism and
an abstract structure for data organisation.
• A particular DBMS provides implementation of
the Relational Data Model in general.
• SQL (Structured Query Language) is the
standard for commercial DBMSs.
• It is composed of three sub-languages.
– DDL: Data Definition Language.
– DML: Data Manipulation Language.
– DCL: Data Control Language.
CG084&085 /01-02
2/ 32
Data Definition in DDL
• DDL is used to specify relations (as tables) and
the various types of constraints.
• We will use the SQL2 standard, a version of
SQL proposed in 1992.
• Before SQL2, all tables were considered to be
part of one global schema.
• A schema is defined using the CREATE
SCHEMA command.
CG084&085 /01-02
2/ 33
Data Definition in DDL
• For the Company data base, the schema might
be created by:
CREATE SCHEMA Company
AUTHORIZATION NEliot;
• NEliot is the authorisation identifier of the user.
• Not implemented in Oracle!
CG084&085 /01-02
2/ 34
Data Definition in DDL – cont.
• SQL2 provides data types for string, numeric, date,
and time data.
– CHAR, VARCHAR
– INT(EGER), NUMERIC or DEC(IMAL)
– FLOAT or REAL
– DATE, TIME, TIMESTAMP, INTERVAL
• Domains can be defined using CREATE DOMAIN.
– Suppose we want to define a domain for the RegNo
attribute of Student:
CREATE DOMAIN RegNoType AS CHAR(8);
– CHAR(8) means a string of 8 characters.
– Oracle doesn’t implement domains!
CG084&085 /01-02
2/ 35
Data Definition in DDL – cont.
Defining relations.
• A relation becomes a TABLE.
• Relations are defined using CREATE TABLE.
• CREATE TABLE specifies both Attributes and
Constraints.
• Each attribute is given a name, a data type or
domain, and may have constraints specified,
e.g. NOT NULL.
• Constraints on the whole table, e.g. composite
Primary and Foreign Keys, are specified after
all of the attributes have been defined.
CG084&085 /01-02
2/ 36
Data Definition in DDL – cont.
Creating Employee Table
EMPLOYEE (FNAME, MINIT, LNAME, SSN,
BDATE, ADDRESS, SEX, SALARY,
SUPERSSN*, DNO*)
CG084&085 /01-02
2/ 37
Data Definition in DDL – cont.
Alternative, specifying default values and
referential actions:
CG084&085 /01-02
2/ 38
Summary
DBMS requires structured data.
• Structure is recorded in the Relation Scheme.
• Structure depends on the kind of data model.
• A Relational DBMS uses the Relational Model.
• A Relation is a mathematically defined concept.
• A Relation is visualised as a table with special
properties.
Data model implementation using SQL DDL.
• Relations become Tables.
• Relations, attributes, and constraints are defined
using CREATE TABLE.
CG084&085 /01-02
2/ 39