Transcript 投影片 1
Database Systems
Chapter 6
The Relational Algebra
主講人:陳建源
研究室 :法401
Email: [email protected]
日期:99/11/30
Outline
1. Relational Algebra Overview
2. Unary Relational Operations
3. Set Operations
4. Binary Relational Operations
5. Complete Set of Relational Algebra
6. Additional Relational Operations
7. Examples Queries
8. Relational calculus
9. Summary
1. Relational Algebra Overview
Basic concepts
The relational algebra is a set of operations to
manipulate relations
Used to specify retrieval requests (queries)
Query result is in the form of a relation
1. Relational Algebra Overview
Relational Algebra consists of several groups of operations
Unary Relational Operations
SELECT (symbol: (sigma))
PROJECT (symbol: (pi))
RENAME (symbol: (rho))
Relational Algebra Operations From Set Theory
UNION ( ), INTERSECTION ( ), DIFFERENCE (or
MINUS, – )
CARTESIAN PRODUCT ( )
Binary Relational Operations
JOIN (several variations of JOIN exist)
DIVISION
Additional Relational Operations
OUTER JOINS, OUTER UNION
AGGREGATE FUNCTIONS (These compute summary of
information: for example, SUM, COUNT, AVG, MIN, MAX)
2. Unary Relational Operations
SELECT operation
Selects the tuples (rows) from a relation R
that satisfy a certain selection condition c
Form of the operation: c(R)
The condition c is an arbitrary Boolean
expression (AND, OR, NOT) on the attributes
of R
Resulting relation has the same attributes
as R
(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY > 30000) (EMPLOYEE)
2. Unary Relational Operations
PROJECT operation
Keeps only certain attributes (columns) from
a relation R specified in an attribute list L
Form of operation: L(R)
Resulting relation has only those attributes of R
specified in L
The PROJECT operation eliminates duplicate
tuples in the resulting relation
2. Unary Relational Operations
FNAME,LNAME,SALARY(EMPLOYEE)
SEX,SALARY(EMPLOYEE)
2. Unary Relational Operations
RENAME operation (rho)
General form:
S (B1, B2, …, Bn )(R) changes both:
S(R) changes:
the relation name to S, and
the column (attribute) names to B1, B2, …..Bn
the relation name only to S
(B1, B2, …, Bn )(R) changes:
the column (attribute) names only to B1, B2, …..Bn
2. Unary Relational Operations
Sequences of operations
Several operations can be combined to
form a relational algebra expression (query)
e.g., retrieve the names and salaries of
employees
who work in department 5:
FNAME,LNAME,SALARY ( DNO=5(EMPLOYEE))
2. Unary Relational Operations
Alternatively, we specify explicit intermediate
relations for each step
DEPT5_EMPS DNO=5(EMPLOYEE)
RESULT FNAME,LNAME,SALARY(DEPT5_EMPS)
Attributes can optionally be renamed in the
resulting left-hand-side relation
TEMP DNO=5(EMPLOYEE)
R(FIRSTNAME,LASTNAME,SALARY)
FNAME,LNAME,SALARY(TEMP)
2. Unary Relational Operations
3. Set Operations
Operations from set theory
Binary operations from mathematical set theory
UNION: R1 R2
INTERSECTION: R1 R2
SET DIFFERENCE: R1 - R2
CARTESIAN PRODUCT: R1 R2
3. Set Operations
Union compatibility
For , , -, the operand relations R1(A1,
A2, ..., An) and R2(B1, B2, ..., Bn) must have
the same number of attributes, and the
domains of corresponding attributes must be
compatible
The resulting relation for , , -, has the
same attribute names as the first operand
relation R1 (by convention)
The Relational Algebra (cont.)
3. Set Operations
3. Set Operations
Cartesian product (cross product, cross join)
R(A1, A2, ..., Am, B1, B2, ..., Bn)
R1(A1, A2, ..., Am) R2 (B1, B2, ..., Bn)
A tuple t exists in R for each combination of
tuples t1 from R1 and t2 from R2 such that
t[A1, A2, ..., Am]=t1 and t[B1, B2, ..., Bn]=t2
If R1 has n1 tuples and R2 has n2 tuples, then R
will have n1*n2 tuples
3. Set Operations
CARTESIAN PRODUCT is a meaningless
operation on its own
It is useful when followed by a SELECT operation
that matches values of attributes coming from the
component relations
3. Set Operations
Example:
FEMALE_EMPS SEX=’F’(EMPLOYEE)
EMPNAMES FNAME, LNAME, SSN(FEMALE_EMPS)
EMP_DEPENDENTS EMPNAMES DEPENDENT
ACTUAL_DEPENDENTS
SSN=ESSN(EMP_DEPENDENTS)
RESULT
FNAME, LNAME,DEPENDENT_NAME(ACTUAL_DEPENDENTS)
The Relational Algebra (cont.)
3. Set Operations
4. Binary Relational Operations
JOIN Operations
THETA JOIN
R(A1, A2, ..., Am, B1, B2, ..., Bn)
R1(A1, A2, ..., Am) c R2 (B1, B2, ..., Bn)
The condition c is called a join condition of the form
<condition> AND <condition> AND . . . AND <condition>
Each condition is of the form Ai Bj, Ai and Bj have the same
domain
is one of the comparison operators {=, <, , >, , }
4. Binary Relational Operations
Example
DEPT_MGR DEPARTMENT
MGRSSN=SSN
EMPLOYEE
THETA JOIN is similar to a CARTESIAN PRODUCT
followed by a SELECT, e.g.,
DEP_EMP DEPARTMENT EMPLOYEE
DEPT_MGR MGRSSN=SSN (DEP_EMP)
4. Binary Relational Operations
EQUIJOIN
The join condition c involves only equality
comparisons
(Ai=Bj) AND ... AND (Ah=Bk); 1<i,h<m, 1<j,k<n
Ai, ..., Ah are called the join attributes of R1
Bj, ..., Bk are called the join attributes of R2
Notice that in the result of an EQUIJOIN one or more
pairs of attributes have identical values in every tuple
e.g., MGRSSN and SSN in Figure 6.6
4. Binary Relational Operations
NATURAL JOIN (*)
R R1 *(join attributes of R1),(join attributes of R2) R2
In a NATURAL JOIN, the redundant join
attributes of R2 are eliminated from R
The equality condition is implied and need not be
specified
Example
Retrieve each EMPLOYEE's name and the name of the
DEPARTMENT he/she works for:
T EMPLOYEE *(DNO),(DNUMBER) DEPARTMENT
RESULT FNAME,LNAME,DNAME(T)
4. Binary Relational Operations
If the join attributes have the same names in both
relations, they need not be specified and we can
write R R1 * R2
Example
Retrieve each EMPLOYEE's name and the name
of
his/her SUPERVISOR:
SUPERVISOR(SUPERSSN,SFN,SLN)
SSN,FNAME,LNAME(EMPLOYEE)
T EMPLOYEE * SUPERVISOR
RESULT FNAME,LNAME,SFN,SLN(T)
4. Binary Relational Operations
Note: In the original definition of NATURAL JOIN,
the join attributes were required to have the same
names in both relations
4. Binary Relational Operations
The natural join or equijoin operation can also
be specified among multiple tables, leading to
an n-way join
For example, consider the following three-way join:
((PROJECT DNUM=DNUMBERDEPARTMENT)
MGRSSN=SSN EMPLOYEE)
4. Binary Relational Operations
A relation can have a set of join attributes to
join it with itself, e.g.,
JOIN ATTRIBUTES
RELATIONSHIP
EMPLOYEE(1).SUPERSSN= EMPLOYEE(2) supervises
EMPLOYEE(2).SSN
EMPLOYEE(1)
This type of operation (called recursive closure
algebra) is applied to a recursive relationship
One can think of this as joining two distinct copies
of the relation, although only one relation actually
exists
In this case, renaming can be useful
4. Binary Relational Operations
Example 1
Retrieve each EMPLOYEE's name and the name
of his/her SUPERVISOR:
SUPERVISOR(SSSN,SFN,SLN) SSN,FNAME,LNAME(EMPLOYEE)
T EMPLOYEE
SUPERSSN=SSSNSUPERVISOR
RESULT FNAME,LNAME,SFN,SLN(T)
4. Binary Relational Operations
Example 2
Retrieve all employees supervised by ‘James Borg’ at level 1:
BORG_SSN SSN(FNAME=’James’ AND LNAME=’Borg’(EMPLOYEE))
SUPERVISION(SSN1, SSN2) SSN, SUPERSSN(EMPLOYEE)
RESULT1(SSN) SSN1(SUPERVISION
SSN2=SSN BORG_SSN)
4. Binary Relational Operations
Example 2 (cont.)
Retrieve all employees supervised by ‘James Borg’ at level 2:
RESULT2(SSN) SSN1(SUPERVISION
SSN2=SSN
RESULT1)
All employees supervised at levels 1 and 2 by ‘James Borg:’
RESULT RESULT2 RESULT1
4. Binary Relational Operations
DIVISION Operation
The DIVISION operation is applied to two
relations R(Z) ÷ S(X), where X Z
That is, the result of DIVISION is a relation
T(Y) = R(Z) ÷ S(X), Y = Z - X
Example
Retrieve the names of employees who work on
all the projects that ‘John Smith’ works on
4. Binary Relational Operations
SMITH FNAME=’John’ AND LNAME=’Smith’(EMPLOYEE)
SMITH_PNOS
PNO(WORKS_ON
ESSN=SSN SMITH)
SSN_PNOS ESSN,PNO(WORKS_ON)
SSNS(SSN) SSN_PNOS ÷ SMITH_PNOS
RESULT FNAME, LNAME(SSNS * EMPLOYEE)
4. Binary Relational Operations
4. Binary Relational Operations
TRS
4. Binary Relational Operations
The DIVISION operator can be expressed as a
sequence of , , and - operations as follows:
T1 Y(R)
T2 Y((S T1) - R)
T T1 - T2
5. Complete Set of Relational Algebra
Complete Set of Relational Algebra
Operations
All the operations discussed so far can be
described as a sequence of only the operations
SELECT, PROJECT, UNION, SET
DIFFERENCE, and CARTESIAN PRODUCT
Hence, the set {, , , -,} is called a
complete set of relational algebra operations
5. Complete Set of Relational Algebra
Any query language equivalent to these
operations is called relationally complete
For database applications, additional operations
are needed that were not part of the original
relational algebra. These include:
1. Aggregate functions and grouping
2. OUTER JOIN and OUTER UNION
5. Complete Set of Relational Algebra
5. Complete Set of Relational Algebra
5. Complete Set of Relational Algebra
Query tree
Represents the input relations of query as leaf nodes
of the tree
Represents the relational algebra operations as
internal nodes
5. Complete Set of Relational Algebra
(((
(PROJECT))
(DEPARTMENT))
(EMPLOYEE))
6. Additional Relational Operations
Generalized projection
A useful operation for developing reports with
computed values output as columns
Form of operation
F1, F2, …, Fn (R)
F1, F2, …, Fn are functions over attributes in R
6. Additional Relational Operations
Example
Relation
EMPLOYEE(SSN, SALARY, DEDUCTION,
YEARS_SERVICE)
Relation expression
REPORT (Ssn, Net_salary, Bonus, Tax) (ssn, salary –
deduction, 2000*years_service, 0.25*salary (EMPLOYEE))
SSN Net_salary
Bonus
Tax
6. Additional Relational Operations
Aggregate functions
Functions such as SUM, COUNT, AVERAGE,
MIN, MAX are often applied to sets of values
or sets of tuples in database applications
<grouping attributes> <function list> (R)
The grouping attributes are optional
<function list> is a list of
(<function> <attribute>) pairs
6. Additional Relational Operations
Example 1
For each department, retrieve the department
number, the number of employees, and the
average salary (in the department):
Attributes renaming
R(DNO,NUMEMPS,AVGSAL)
DNO COUNT SSN, AVERAGE SALARY (EMPLOYEE)
DNO is called the grouping attribute
No attributes renaming
DNO COUNT SSN, AVERAGE SALARY (EMPLOYEE)
6. Additional Relational Operations
The resulting attributes nam are in the form
<function>_<attribute>
6. Additional Relational Operations
Example 2 -- no grouping attributes are
specified
Retrieve the average salary of all employees (no
grouping needed)
R(AVGSAL) AVERAGE SALARY (EMPLOYEE)
The functions are applied to the attribute values of all the
tuples in the relation, so the resulting relation has a single
tuple only
6. Additional Relational Operations
OUTER JOIN
In a regular EQUIJOIN or NATURAL JOIN
operation, tuples in R1 or R2 that do not have
matching tuples in the other relation do not
appear in the result
Some queries require all tuples in R1 (or R2 or
both) to appear in the result
When no matching tuples are found, nulls are
placed for the missing attributes
6. Additional Relational Operations
LEFT OUTER JOIN
R1
R2
Lets every tuple in R1 appear in the result
Example
List all employee names and also the name
of the departments they manage if they
happen to manage a department:
TEMP (EMPLOYEE
SSN=MGRSSN
DEPARTMENT)
RESULT FNAME, MINIT, LNAME, DNAME(TEMP)
6. Additional Relational Operations
6. Additional Relational Operations
RIGHT OUTER JOIN
R1
R2
Lets every tuple in R2 appear in the result
FULL OUTER JOIN
R1
R2
Lets every tuple in R1 or R2 appear in the
result
6. Additional Relational Operations
OUTER UNION
It was developed to take the union of tuples from
two relations if the relations are not union
compatible
This operation will take the UNION of tuples in two
relations that are partially compatible
Example
STUDENT(Name, SSN, Department, Advisor) and
FACULTY(Name, SSN, Department, Rank).
The resulting relation
R(Name, SSN, Department, Advisor, Rank)
7. Example Queries
Query 1
Retrieve the name and address of all employees
who work for the ‘Research’ department:
RESEARCH_DEPT
DNAME=’Research’(DEPARTMENT)
RESEARCH_EMPS
(RESEARCH_DEPT
DNUMBER=DNO
EMPLOYEE)
RESULT
FNAME, LNAME, ADDRESS(RESEARCH_EMPS)
7. Example Queries
Query 2
For every project located in ‘Stafford’, list the project
number, the controlling department number, and the
department manager’s last name, address, and birthdate:
STAFFORD_PROJS PLOCATION=’Stafford’(PROJECT)
CONTR_DEPT
(STAFFORD_PROJS
DNUM=DNUMBER DEPARTMENT)
PROJ_DEPT_MGR
(CONTR_DEPT
RESULT
MGRSSN=SSN
EMPLOYEE)
PNUMBER, DNUM, LNAME, ADDRESS, BDATE(PROJ_DEPT_MGR)
8. Relational Calculus
Declarative expression
Specify a retrieval request nonprocedural language
Any retrieval that can be specified in basic relational
algebra
Can also be specified in relational calculus
8. Relational Calculus
Tuple variables
Ranges over a particular database relation
Satisfy COND(t):
Specify:
Range relation R of t
Select particular combinations of tuples
Set of attributes to be retrieved (requested
attributes)
8. Relational Calculus
General expression of tuple relational calculus
is of the form:
Truth value of an atom
Evaluates to either TRUE or FALSE for a specific
combination of tuples
Formula (Boolean condition)
Made up of one or more atoms connected via logical
operators AND, OR, and NOT
8. Relational Calculus
Universal quantifier (∀)
Existential quantifier (∃)
Define a tuple variable in a formula as free or
bound
8. Relational Calculus
8. Relational Calculus
8. Relational Calculus
Transform one type of quantifier into other with
negation (preceded by NOT)
AND and OR replace one another
Negated formula becomes unnegated
Unnegated formula becomes negated
8. Relational Calculus
8. Relational Calculus
Guaranteed to yield a finite number of tuples as
its result
Otherwise expression is called unsafe
Expression is safe
If all values in its result are from the domain of the
expression
8. Relational Calculus
Differs from tuple calculus in type of variables
used in formulas
Variables range over single values from domains of
attributes
Formula is made up of atoms
Evaluate to either TRUE or FALSE for a specific set
of values
•
Called the truth values of the atoms
8. Relational Calculus
QBE language
Based on domain relational calculus
9. Summary
Relational Algebra
Unary Relational Operations
Relational Algebra Operations From Set Theory
Binary Relational Operations
Additional Relational Operations
Examples of Queries in Relational Algebra
Relational calculus
Based predicate calculus