No Slide Title
Download
Report
Transcript No Slide Title
Relational Algebra
Relational Query Languages
Query = “retrieval program”
Language examples:
Theoretical:
1. Relational Algebra
2. Relational Calculus
a. tuple relational calculus (TRC)
b. domain relational calculus (DRC)
Practical
1. SQL (SEQUEL from System R)
2. QUEL (Ingres)
3. Datalog (Prolog-like)
Theoretical QL’s:
give semantics to practical QL’s
key to understand query optimization in relational DBMSs
Relational Algebra
Basic operators
select ( )
project (p )
union ( )
set difference ( - )
cartesian product ( x )
rename ( )
The operators take one or two relations as inputs and give a
new relation as a result.
relation
relation
relational
operator
relation
Example Instances
bid
101
102
103
104
bname
Interlake
Interlake
Clipper
Marine
color
blue
red
green
red
Boats
Schema:
Boats(bid, bname, color)
Sailors(sid, sname, rating, age)
Reserves( sid, bid, day)
R1 sid
22
58
bid
101
103
day
10/10/96
11/12/96
sid
S1 22
31
58
sname
dustin
lubber
rusty
rating
7
8
10
age
45.0
55.5
35.0
sid
S2 28
31
44
58
sname
yuppy
lubber
guppy
rusty
rating
9
8
5
10
age
35.0
55.5
35.0
35.0
Projection
Examples:
p age(S2)
;
p sname,rating(S2)
Retains only attributes that are in the “projection list”.
Schema of result:
exactly the columns in the projection list, with the same names that they
had in the input relation.
Projection operator has to eliminate duplicates
(How do they
arise? Why remove them?)
Note: real systems typically don’t do duplicate elimination unless the
user explicitly asks for it. (Why not?)
Projection
sid
28
31
44
58
sname rating age
yuppy
9
35.0
lubber
8
55.5
guppy
5
35.0
rusty
10 35.0
S2
sname
rating
yuppy
lubber
guppy
rusty
9
8
5
10
p sname,rating (S 2)
age
35.0
55.5
p age(S2)
Selection ()
Selects rows that satisfy selection condition.
Result is a relation.
Schema of result is same as that of the input relation.
Do we need to do duplicate elimination?
sid
28
31
44
58
sname rating age
yuppy
9
35.0
lubber
8
55.5
guppy
5
35.0
rusty
10 35.0
rating 8(S2)
sname
yuppy
rusty
rating
9
10
p sname,rating( rating 8(S2))
Selection
Notation:
p(r)
p is called the selection predicate , r can be the name of a table,
or another query
Predicate:
1. Simple
attr1 = attr2
Attr = constant value
(also, <, > , etc)
2. Complex
predicate1 AND predicate2
predicate1 OR predicate2
NOT (predicate)
Union and Set-Difference
All of these operations take two input relations, which
must be union-compatible:
Same number of columns (attributes).
`Corresponding’ columns have the same type.
For which, if any, is duplicate elimination required?
Union
sid
22
31
58
sname rating age
dustin
7
45.0
lubber
8
55.5
rusty
10 35.0
S1
sid
28
31
44
58
sname rating age
yuppy
9
35.0
lubber
8
55.5
guppy
5
35.0
rusty
10 35.0
S2
sid sname rating age
22
31
58
44
28
dustin
lubber
rusty
guppy
yuppy
7
8
10
5
9
S1S2
45.0
55.5
35.0
35.0
35.0
Set Difference
sid
22
31
58
sname rating age
dustin
7
45.0
lubber
8
55.5
rusty
10 35.0
sid sname rating age
22 dustin 7
45.0
S1 S2
S1
sid
28
31
44
58
sname rating age
yuppy
9
35.0
lubber
8
55.5
guppy
5
35.0
rusty
10 35.0
S2
sid sname rating age
28 yuppy
9
35.0
44 guppy
5
35.0
S2 – S1
Cartesian-Product
S1 R1: Each row of S1 paired with each row of R1.
Like the c.p for mathematical relations: every tuple of S1 “appended” to every
tuple of R1
Q: How many rows in the result?
Result schema has one field per field of S1 and R1, with field
names `inherited’ if possible.
May have a naming conflict: Both S1 and R1 have a field with the
same name.
In this case, can use the renaming operator…
Cartesian Product Example
sid
22
31
58
sname
dustin
lubber
rusty
rating
7
8
10
age
45.0
55.5
35.0
sid bid
day
22 101 10/10/96
58 103 11/12/96
R1
S1
(sid) sname rating age
R1 X S1 =
(sid) bid day
22
dustin
7
45.0
22
101 10/10/96
22
dustin
7
45.0
58
103 11/12/96
31
lubber
8
55.5
22
101 10/10/96
31
lubber
8
55.5
58
103 11/12/96
58
rusty
10
35.0
22
101 10/10/96
58
rusty
10
35.0
58
103 11/12/96
Rename ( )
Allows us to refer to a relation by more than one name
and to rename conflicting names
Example:
x (E)
returns the expression E under the name X
If a relational-algebra expression E has arity n, then
x (A1, A2, …, An) (E)
returns the result of expression E under the name X, and with the
attributes renamed to A1, A2, …., An.
Ex.
temp1(sid1,sname,rating, age, sid2, bid, day) (R1 x S1)
Compound Operator: Intersection
In addition to the 6 basic operators, there are
several additional “Compound Operators”
These add no computational power to the language, but are
useful shorthands.
Can be expressed solely with the basic ops.
Intersection takes two input relations, which must be
union-compatible.
Q: How to express it using basic operators?
R S = R (R S)
Intersection
sid
22
31
58
sname rating age
dustin
7
45.0
lubber
8
55.5
rusty
10 35.0
S1
sid
28
31
44
58
sname rating age
yuppy
9
35.0
lubber
8
55.5
guppy
5
35.0
rusty
10 35.0
S2
sid sname rating age
31 lubber 8
55.5
58 rusty
10
35.0
S1 S2
Compound Operator: Join
Joins are compound operators involving cross product,
selection, and (sometimes) projection.
Most common type of join is a “natural join” (often
just called “join”). R
S conceptually is:
Compute R S
Select rows where attributes that appear in both relations have equal values
Project all unique atttributes and one copy of each of the common ones.
Note: Usually done much more efficiently than this.
Useful for putting “normalized” relations back
together.
Natural Join Example
sid
22
58
bid
101
103
day
10/10/96
11/12/96
sid
22
31
58
sname rating age
dustin
7
45.0
lubber
8
55.5
rusty
10
35.0
R1
R1
S1
S1 =
sid
22
58
sname
dustin
rusty
rating
7
10
age
45.0
35.0
bid
101
103
day
10/10/96
11/12/96
Other Types of Joins
Condition Join (or “theta-join”):
R c S c (R S)
(sid)
22
31
sname
dustin
lubber
rating
7
8
S1
age
45.0
55.5
(sid)
58
58
S1.sid R1.sid
bid
103
103
day
11/12/96
11/12/96
R1
Result schema same as that of cross-product.
May have fewer tuples than cross-product.
Equi-join: special case: condition c contains only conjunction of
equalities.
Compound Operator: Division
Useful for expressing “for all” queries like:
Find sids of sailors who have reserved all boats.
For A/B attributes of B are subset of attrs of A.
May need to “project” to make this happen.
E.g., let A have 2 fields, x and y; B have only field y:
AB
x y
B( x, y A)
A/B contains all tuples (x) such that for every y tuple in B, there is an
xy tuple in A.
Examples of Division A/B
sno
s1
s1
s1
s1
s2
s2
s3
s4
s4
pno
p1
p2
p3
p4
p1
p2
p2
p2
p4
A
pno
p2
pno
p2
p4
B2
pno
p1
p2
p4
sno
s1
s2
s3
s4
sno
s1
s4
A/B1
A/B2
sno
s1
B1
B3
A/B3
Expressing A/B Using Basic Operators
Division is not essential op; just a useful shorthand.
(Also true of joins, but joins are so common that systems implement joins
specially.)
Idea: For A/B, compute all x values that are not `disqualified’
by some y value in B.
x value is disqualified if by attaching y value from B, we obtain an xy tuple
that is not in A.
Disqualified x values =
p x((p x( A)B) A)
A/B = p x ( A) Disqualified x values
Banking Example
branch (branch-name, branch-city, assets)
customer (customer-name, customer-street, customer-only)
account (account-number, branch-name, balance)
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Example Queries
loan (loan-number, branch-name, amount)
Find all loans of over $1200
amount >1200 (loan)
Find the loan number for each loan of an amount greater than
$1200
ploan-number (amount > 1200 (loan))
Example Queries
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Find the names of all customers who have a loan, an
depositor account, or both, from the bank
pcustomer-name (borrower) pcustomer-name (depositor)
Find the names of all customers who have a loan and
a depositor account at bank.
pcustomer-name (borrower) pcustomer-name (depositor)
Example Queries
loan (loan-number, branch-name, amount)
depositor (customer-name, account-number)
borrower (customer-name, loan-number)
Find the names of all customers who have a loan at the Perryridge
branch but do not have an depositor account at any branch of the bank.
pcustomer-name (branch-name = “Perryridge” (borrower
– pcustomer-name (depositor)
loan))
Example Queries
account (account-number, branch-name, balance)
Find the largest account balance
Rename account relation as d
The query is:
p balance(account) -
paccount.balance( account.balance < d.balance (account x d (account)))
Example Queries
account (account-number, branch-name, balance)
depositor (customer-name, account-number)
Find all customers who have an account from at least
the “Downtown” and the Uptown” branches.
Query 1
pCN(BN=“Downtown”(depositor
account))
pCN(BN=“Uptown”(depositor
account))
where CN denotes customer-name and BN denotes
branch-name.
Query 2
customer-name, branch-name (depositor
account)
temp(branch-name) ({(“Downtown”), (“Uptown”)})
Example Queries
Find all customers who have an account at all
branches located in Boston.
pcustomer-name, branch-name (depositor account)
pbranch-name (branch-city = “Boston” (branch))
Extended Relational Operations
Additional Operators that extend the power of the
language
Based on SQL… make the language less clean
Generalized projections
Outer Joins
Update
General Projection
Notation:
p e1, e2, …, en (Relation)
ei: can include any arithmetic operation – not only attributes
Example:
credit =
cname
Jones
Turner
Smith
limit
5000
3000
4000
balance
2000
2500
3000
Then:
p cname, limit – balance =
cname limit -balance
Jones
3000
Turner
500
Smith
1000
Outer Joins
Motivation:
loan
loan
bname
Downtown
Redwood
Perry
borrower =
lno
L-170
L-230
L-260
amt
3000
4000
1700
bname
lno
Downtown L-170
Redwood L-230
Join result loses:
•any record of Perry
•any record of Smith
cname
Jones
Turner
Smith
lno
L-170
L-230
L-155
borrower
amt
3000
4000
cname
Jones
Turner
Outer Join (
Left outer Join (
)
)
preserves all tuples in left relation
loan
borrower =
bname
Downtown
Redwood
Perry
lno
L-170
L-230
L-260
amt
3000
4000
1700
bname
Right outer Join (
)
Downtown
preserves all tuples in right relation Redwood
NULL
cname
Jones
Turner
NULL
lno
L-170
L-230
L-155
amt
3000
4000
NULL
cname
Jones
Turner
Smith
Outer Join (cont)
Full Outer Join
(
)
preserves all tuples in both relations
bname
Downtown
Redwood
Perry
NULL
lno
L-170
L-230
L-260
L-155
amt
3000
4000
1700
NULL
cname
Jones
Turner
NULL
Smith
Update ()
1) Deletion: r r – s
account account – bname = Perry (account)
2) Insertion: r r s
branch branch {( BU, Boston, 9M)}
3) Update: r
account
pe1, e2, …, en (r)
p bname, acct_no, bal * 1.05 (account)