Transcript 投影片 1

Database Systems
Chapter 6
The Relational Algebra
主講人:陳建源
研究室 :法401
Email: [email protected]
日期:99/11/30
Outline
1. Relational Algebra Overview
2. Unary Relational Operations
3. Set Operations
4. Binary Relational Operations
5. Complete Set of Relational Algebra
6. Additional Relational Operations
7. Examples Queries
8. Relational calculus
9. Summary
1. Relational Algebra Overview

Basic concepts



The relational algebra is a set of operations to
manipulate relations
Used to specify retrieval requests (queries)
Query result is in the form of a relation
1. Relational Algebra Overview

Relational Algebra consists of several groups of operations
 Unary Relational Operations
 SELECT (symbol:  (sigma))
 PROJECT (symbol:  (pi))
 RENAME (symbol:  (rho))
 Relational Algebra Operations From Set Theory
 UNION (  ), INTERSECTION (  ), DIFFERENCE (or
MINUS, – )
 CARTESIAN PRODUCT (  )
 Binary Relational Operations
 JOIN (several variations of JOIN exist)
 DIVISION
 Additional Relational Operations
 OUTER JOINS, OUTER UNION
 AGGREGATE FUNCTIONS (These compute summary of
information: for example, SUM, COUNT, AVG, MIN, MAX)
2. Unary Relational Operations

SELECT operation 


Selects the tuples (rows) from a relation R
that satisfy a certain selection condition c
Form of the operation:  c(R)


The condition c is an arbitrary Boolean
expression (AND, OR, NOT) on the attributes
of R
Resulting relation has the same attributes
as R
(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY > 30000) (EMPLOYEE)
2. Unary Relational Operations

PROJECT operation 




Keeps only certain attributes (columns) from
a relation R specified in an attribute list L
Form of operation:  L(R)
Resulting relation has only those attributes of R
specified in L
The PROJECT operation eliminates duplicate
tuples in the resulting relation
2. Unary Relational Operations
 FNAME,LNAME,SALARY(EMPLOYEE)
 SEX,SALARY(EMPLOYEE)
2. Unary Relational Operations

RENAME operation (rho)
General form:

S (B1, B2, …, Bn )(R) changes both:



S(R) changes:


the relation name to S, and
the column (attribute) names to B1, B2, …..Bn
the relation name only to S
(B1, B2, …, Bn )(R) changes:

the column (attribute) names only to B1, B2, …..Bn
2. Unary Relational Operations

Sequences of operations

Several operations can be combined to
form a relational algebra expression (query)

e.g., retrieve the names and salaries of
employees
who work in department 5:
 FNAME,LNAME,SALARY ( DNO=5(EMPLOYEE))
2. Unary Relational Operations

Alternatively, we specify explicit intermediate
relations for each step
DEPT5_EMPS   DNO=5(EMPLOYEE)
RESULT   FNAME,LNAME,SALARY(DEPT5_EMPS)

Attributes can optionally be renamed in the
resulting left-hand-side relation
TEMP   DNO=5(EMPLOYEE)
R(FIRSTNAME,LASTNAME,SALARY) 
 FNAME,LNAME,SALARY(TEMP)
2. Unary Relational Operations
3. Set Operations

Operations from set theory





Binary operations from mathematical set theory
UNION: R1  R2
INTERSECTION: R1  R2
SET DIFFERENCE: R1 - R2
CARTESIAN PRODUCT: R1  R2
3. Set Operations

Union compatibility


For ,  , -, the operand relations R1(A1,
A2, ..., An) and R2(B1, B2, ..., Bn) must have
the same number of attributes, and the
domains of corresponding attributes must be
compatible
The resulting relation for ,  , -, has the
same attribute names as the first operand
relation R1 (by convention)
The Relational Algebra (cont.)

3. Set Operations
3. Set Operations

Cartesian product (cross product, cross join)



R(A1, A2, ..., Am, B1, B2, ..., Bn) 
R1(A1, A2, ..., Am)  R2 (B1, B2, ..., Bn)
A tuple t exists in R for each combination of
tuples t1 from R1 and t2 from R2 such that
t[A1, A2, ..., Am]=t1 and t[B1, B2, ..., Bn]=t2
If R1 has n1 tuples and R2 has n2 tuples, then R
will have n1*n2 tuples
3. Set Operations


CARTESIAN PRODUCT is a meaningless
operation on its own
It is useful when followed by a SELECT operation
that matches values of attributes coming from the
component relations
3. Set Operations
Example:







FEMALE_EMPS  SEX=’F’(EMPLOYEE)
EMPNAMES  FNAME, LNAME, SSN(FEMALE_EMPS)
EMP_DEPENDENTS  EMPNAMES  DEPENDENT
ACTUAL_DEPENDENTS 
SSN=ESSN(EMP_DEPENDENTS)
RESULT 
FNAME, LNAME,DEPENDENT_NAME(ACTUAL_DEPENDENTS)
The Relational Algebra (cont.)
3. Set Operations
4. Binary Relational Operations

JOIN Operations

THETA JOIN


R(A1, A2, ..., Am, B1, B2, ..., Bn) 
R1(A1, A2, ..., Am) c R2 (B1, B2, ..., Bn)
The condition c is called a join condition of the form


<condition> AND <condition> AND . . . AND <condition>
Each condition is of the form Ai  Bj, Ai and Bj have the same
domain
 is one of the comparison operators {=, <, , >, , }
4. Binary Relational Operations

Example
DEPT_MGR  DEPARTMENT

MGRSSN=SSN
EMPLOYEE
THETA JOIN is similar to a CARTESIAN PRODUCT
followed by a SELECT, e.g.,
DEP_EMP  DEPARTMENT  EMPLOYEE
DEPT_MGR   MGRSSN=SSN (DEP_EMP)
4. Binary Relational Operations

EQUIJOIN

The join condition c involves only equality
comparisons
(Ai=Bj) AND ... AND (Ah=Bk); 1<i,h<m, 1<j,k<n



Ai, ..., Ah are called the join attributes of R1
Bj, ..., Bk are called the join attributes of R2
Notice that in the result of an EQUIJOIN one or more
pairs of attributes have identical values in every tuple
e.g., MGRSSN and SSN in Figure 6.6
4. Binary Relational Operations

NATURAL JOIN (*)
R  R1 *(join attributes of R1),(join attributes of R2) R2
 In a NATURAL JOIN, the redundant join
attributes of R2 are eliminated from R
 The equality condition is implied and need not be
specified
 Example
Retrieve each EMPLOYEE's name and the name of the
DEPARTMENT he/she works for:
T  EMPLOYEE *(DNO),(DNUMBER) DEPARTMENT
RESULT   FNAME,LNAME,DNAME(T)
4. Binary Relational Operations


If the join attributes have the same names in both
relations, they need not be specified and we can
write R  R1 * R2
Example
Retrieve each EMPLOYEE's name and the name
of
his/her SUPERVISOR:
SUPERVISOR(SUPERSSN,SFN,SLN) 
 SSN,FNAME,LNAME(EMPLOYEE)
T  EMPLOYEE * SUPERVISOR
RESULT   FNAME,LNAME,SFN,SLN(T)
4. Binary Relational Operations

Note: In the original definition of NATURAL JOIN,
the join attributes were required to have the same
names in both relations
4. Binary Relational Operations

The natural join or equijoin operation can also
be specified among multiple tables, leading to
an n-way join

For example, consider the following three-way join:
((PROJECT DNUM=DNUMBERDEPARTMENT)
MGRSSN=SSN EMPLOYEE)
4. Binary Relational Operations

A relation can have a set of join attributes to
join it with itself, e.g.,
JOIN ATTRIBUTES
RELATIONSHIP
EMPLOYEE(1).SUPERSSN= EMPLOYEE(2) supervises
EMPLOYEE(2).SSN
EMPLOYEE(1)



This type of operation (called recursive closure
algebra) is applied to a recursive relationship
One can think of this as joining two distinct copies
of the relation, although only one relation actually
exists
In this case, renaming can be useful
4. Binary Relational Operations

Example 1
Retrieve each EMPLOYEE's name and the name
of his/her SUPERVISOR:
SUPERVISOR(SSSN,SFN,SLN)   SSN,FNAME,LNAME(EMPLOYEE)
T  EMPLOYEE
SUPERSSN=SSSNSUPERVISOR
RESULT   FNAME,LNAME,SFN,SLN(T)
4. Binary Relational Operations
Example 2
Retrieve all employees supervised by ‘James Borg’ at level 1:
BORG_SSN   SSN(FNAME=’James’ AND LNAME=’Borg’(EMPLOYEE))
SUPERVISION(SSN1, SSN2)   SSN, SUPERSSN(EMPLOYEE)
RESULT1(SSN)   SSN1(SUPERVISION
SSN2=SSN BORG_SSN)
4. Binary Relational Operations
Example 2 (cont.)
Retrieve all employees supervised by ‘James Borg’ at level 2:
RESULT2(SSN)   SSN1(SUPERVISION
SSN2=SSN
RESULT1)
All employees supervised at levels 1 and 2 by ‘James Borg:’
RESULT  RESULT2  RESULT1
4. Binary Relational Operations

DIVISION Operation



The DIVISION operation is applied to two
relations R(Z) ÷ S(X), where X  Z
That is, the result of DIVISION is a relation
T(Y) = R(Z) ÷ S(X), Y = Z - X
Example

Retrieve the names of employees who work on
all the projects that ‘John Smith’ works on
4. Binary Relational Operations
SMITH   FNAME=’John’ AND LNAME=’Smith’(EMPLOYEE)
SMITH_PNOS 
PNO(WORKS_ON
ESSN=SSN SMITH)
SSN_PNOS  ESSN,PNO(WORKS_ON)
SSNS(SSN)  SSN_PNOS ÷ SMITH_PNOS
RESULT  FNAME, LNAME(SSNS * EMPLOYEE)
4. Binary Relational Operations
4. Binary Relational Operations
TRS
4. Binary Relational Operations

The DIVISION operator can be expressed as a
sequence of , , and - operations as follows:
T1  Y(R)
T2  Y((S  T1) - R)
T  T1 - T2
5. Complete Set of Relational Algebra

Complete Set of Relational Algebra
Operations


All the operations discussed so far can be
described as a sequence of only the operations
SELECT, PROJECT, UNION, SET
DIFFERENCE, and CARTESIAN PRODUCT
Hence, the set {, , , -,} is called a
complete set of relational algebra operations
5. Complete Set of Relational Algebra


Any query language equivalent to these
operations is called relationally complete
For database applications, additional operations
are needed that were not part of the original
relational algebra. These include:
1. Aggregate functions and grouping
2. OUTER JOIN and OUTER UNION
5. Complete Set of Relational Algebra
5. Complete Set of Relational Algebra
5. Complete Set of Relational Algebra

Query tree

Represents the input relations of query as leaf nodes
of the tree
 Represents the relational algebra operations as
internal nodes
5. Complete Set of Relational Algebra
(((
(PROJECT))
(DEPARTMENT))
(EMPLOYEE))
6. Additional Relational Operations

Generalized projection



A useful operation for developing reports with
computed values output as columns
Form of operation
 F1, F2, …, Fn (R)
F1, F2, …, Fn are functions over attributes in R
6. Additional Relational Operations

Example
 Relation
EMPLOYEE(SSN, SALARY, DEDUCTION,
YEARS_SERVICE)
 Relation expression
REPORT  (Ssn, Net_salary, Bonus, Tax) (ssn, salary –
deduction, 2000*years_service, 0.25*salary (EMPLOYEE))
SSN Net_salary
Bonus
Tax
6. Additional Relational Operations

Aggregate functions

Functions such as SUM, COUNT, AVERAGE,
MIN, MAX are often applied to sets of values
or sets of tuples in database applications
<grouping attributes>  <function list> (R)

The grouping attributes are optional

<function list> is a list of
(<function> <attribute>) pairs
6. Additional Relational Operations

Example 1
For each department, retrieve the department
number, the number of employees, and the
average salary (in the department):

Attributes renaming
R(DNO,NUMEMPS,AVGSAL) 
DNO  COUNT SSN, AVERAGE SALARY (EMPLOYEE)
 DNO is called the grouping attribute

No attributes renaming
DNO  COUNT SSN, AVERAGE SALARY (EMPLOYEE)
6. Additional Relational Operations
The resulting attributes nam are in the form
<function>_<attribute>
6. Additional Relational Operations

Example 2 -- no grouping attributes are
specified
Retrieve the average salary of all employees (no
grouping needed)
R(AVGSAL)   AVERAGE SALARY (EMPLOYEE)

The functions are applied to the attribute values of all the
tuples in the relation, so the resulting relation has a single
tuple only
6. Additional Relational Operations

OUTER JOIN



In a regular EQUIJOIN or NATURAL JOIN
operation, tuples in R1 or R2 that do not have
matching tuples in the other relation do not
appear in the result
Some queries require all tuples in R1 (or R2 or
both) to appear in the result
When no matching tuples are found, nulls are
placed for the missing attributes
6. Additional Relational Operations

LEFT OUTER JOIN
R1
R2
Lets every tuple in R1 appear in the result
 Example
List all employee names and also the name
of the departments they manage if they
happen to manage a department:

TEMP  (EMPLOYEE
SSN=MGRSSN
DEPARTMENT)
RESULT  FNAME, MINIT, LNAME, DNAME(TEMP)
6. Additional Relational Operations
6. Additional Relational Operations

RIGHT OUTER JOIN
R1
R2
 Lets every tuple in R2 appear in the result

FULL OUTER JOIN
R1
R2
 Lets every tuple in R1 or R2 appear in the
result
6. Additional Relational Operations

OUTER UNION



It was developed to take the union of tuples from
two relations if the relations are not union
compatible
This operation will take the UNION of tuples in two
relations that are partially compatible
Example


STUDENT(Name, SSN, Department, Advisor) and
FACULTY(Name, SSN, Department, Rank).
The resulting relation
R(Name, SSN, Department, Advisor, Rank)
7. Example Queries

Query 1

Retrieve the name and address of all employees
who work for the ‘Research’ department:
RESEARCH_DEPT 
DNAME=’Research’(DEPARTMENT)
RESEARCH_EMPS 
(RESEARCH_DEPT
DNUMBER=DNO
EMPLOYEE)
RESULT 
FNAME, LNAME, ADDRESS(RESEARCH_EMPS)
7. Example Queries

Query 2

For every project located in ‘Stafford’, list the project
number, the controlling department number, and the
department manager’s last name, address, and birthdate:
STAFFORD_PROJS  PLOCATION=’Stafford’(PROJECT)
CONTR_DEPT 
(STAFFORD_PROJS
DNUM=DNUMBER DEPARTMENT)
PROJ_DEPT_MGR 
(CONTR_DEPT
RESULT 
MGRSSN=SSN
EMPLOYEE)
 PNUMBER, DNUM, LNAME, ADDRESS, BDATE(PROJ_DEPT_MGR)
8. Relational Calculus


Declarative expression
 Specify a retrieval request nonprocedural language
Any retrieval that can be specified in basic relational
algebra
 Can also be specified in relational calculus
8. Relational Calculus

Tuple variables



Ranges over a particular database relation
Satisfy COND(t):
Specify:

Range relation R of t
 Select particular combinations of tuples
 Set of attributes to be retrieved (requested
attributes)
8. Relational Calculus

General expression of tuple relational calculus
is of the form:

Truth value of an atom


Evaluates to either TRUE or FALSE for a specific
combination of tuples
Formula (Boolean condition)

Made up of one or more atoms connected via logical
operators AND, OR, and NOT
8. Relational Calculus



Universal quantifier (∀)
Existential quantifier (∃)
Define a tuple variable in a formula as free or
bound
8. Relational Calculus
8. Relational Calculus
8. Relational Calculus

Transform one type of quantifier into other with
negation (preceded by NOT)

AND and OR replace one another
 Negated formula becomes unnegated
 Unnegated formula becomes negated
8. Relational Calculus
8. Relational Calculus

Guaranteed to yield a finite number of tuples as
its result


Otherwise expression is called unsafe
Expression is safe

If all values in its result are from the domain of the
expression
8. Relational Calculus

Differs from tuple calculus in type of variables
used in formulas


Variables range over single values from domains of
attributes
Formula is made up of atoms

Evaluate to either TRUE or FALSE for a specific set
of values
•
Called the truth values of the atoms
8. Relational Calculus

QBE language

Based on domain relational calculus
9. Summary

Relational Algebra






Unary Relational Operations
Relational Algebra Operations From Set Theory
Binary Relational Operations
Additional Relational Operations
Examples of Queries in Relational Algebra
Relational calculus

Based predicate calculus