CS 157B Database Systems
Download
Report
Transcript CS 157B Database Systems
CS 157B
Database Systems
Dr. T Y Lin
Delete old p37 and change old
p40.
CS 157B
Database Systems
Review Relational Algebra
2.4 An Algebraic Query Language
• The set of real numbers has a natural
algebraic structure.
• The set of relations (tables) has a natural
algebraic structure, called Relational
algebra
• For now, we will not look at it as a query
language
2.4 An Algebraic Query Language
•
•
•
•
•
•
•
•
•
•
•
•
•
•
2.4.1 Why Do We Need a Special Query Language?
2.4.2 What is an Algebra?
2.4.3 Overview of Relational Algebra
2.4.4 Set Operations on Relations
2.4.5 Projection
2.4.6 Selection
2.4.7 Cartesian Product
2.4.8 Natural Joins
2.4.9 Theta-Joins
2.4.10 Combining Operations to Form Queries
2.4.11 Naming and Renaming
2.4.12 Relationships Among Operations
2.4.13 A Linear Notation for Algebraic Expressions
2.4.14 Exercises for Section 2.4
2.4.1 Why Do We Need a Special
Query Language?
2.4.2 What is an Algebra?
2.4.3 Overview of Relational
Algebra
• A sample From CS157A Exam:
Choose appropriate names for the following SQL
SELECT
TITLE,LENGTH
FROM
Movie
Special Relational Operators
• Projection
• Selection
• Natural Join
Traditional Set Operators
4. Cartesian Product
5. Union
6. Intersect
7. Minus
2.4.3 Overview of Relational
Algebra
• NUMERICAL ALGEBRA handles
operations involving numbers:
• RELATIONAL ALGEBRA handles
operations among relations.
2.4.3 Overview of Relational
Algebra
+
*
/
Projection
Selection
Natural Join
Cartesian
Product
Union
Intersect
Minus
2.4.3 Overview of Relational
Algebra
• Both numerical algebra and relational
algebra are defined by (respective sets of)
• Algebraic Laws.
2.4.3 Overview of Relational
Algebra
Numerical Algebra is defined by
• Algebraic laws
that we have learned in calculus, algebra
and etc since high school days
2.4.3 Overview of Relational
Algebra
• Algebraic laws
of relational algebra are
•
“New” to most of us
QUERY OPTIMIZATION
• In query optimization the query is
transformed by compiler into such a form
that can solve the problem “fastest ”
QUERY OPTIMIZATION
•
•
Illustration: 2 * 3 + 5 * 3
From the laws of Numerical algebra, it
can be computed as follows:.
Method I - It computes in three operations
• 2 * 3 + 5 * 3 = 6 (first) + 15(second)
= 21 (third)
QUERY OPTIMIZATION
•
•
Illustration: 2 * 3 + 5 * 3
From the laws of Numerical algebra, it
can Also be computed as follows:.
Method II- transform into equivalent one.
= (2 + 5) * 3 = 7 (first) * 3 = 21 (second)
• It computes in two operations
So compiler choose Method II
Roles of Relational Algebra
• QUERY OPTIMIZATION
• Similar Idea is used in Relational Algebra
• So we need to know the
• Algebraic Laws
of Relational Algebra well
• That is the main goal for Ch5 and part of
Ch16
2.4.4 Set Operations on Relations
1. Recall Venn Diagram
Y
X
a
b
d
c
e
2.4.4 Set Operations on Relations
Name
Address
Gender
Birthdate
Carrie Fisher
123 Maple st., Hollywood
F
9/9/99
Mark hamill
456 Oak road., Brentwood
M
8/8/88
Relation R
Name
Address
Gender
Birthdate
Carrie Fisher
123 Maple st., Hollywood
F
9/9/99
Harrison Ford
789 Palm Dr., Beverly Hills
M
7/7/77
Relation S
Figure 2.12 Two Relations
Example 2.8
• Relation R and S
• UNION (R S) is
Name
Address
Gender
Birthdate
Carrie Fisher
123 Maple st., Hollywood
F
9/9/99
Mark Hamill
456 oak Rd., Brentwood
M
8/8/88
Harrison Ford
789 Palm Dr., Beverly Hills
M
7/7/77
• Note: the two tuples for Carrie Fisher from
two relations appear only once in result
2.4.4 Set Operations on Relations
• The Intersection (RS)
Name
Address
Gender
Birthdate
Carrie Fisher
123 Maple st.,
Hollywood
F
9/9/99
• Now, Only the Carrie Fisher tuple appears, because only
it is in both relations. The Difference is R-S is
Name
Address
Gender
Birthdate
Mark
Hamill
456 oak Rd.,
Brentwood
M
8/8/88
• Fisher and Hamill tuples appear in R and thus are
candidates for R-S. thus Fisher appear in S.
• so is not R-S.
2.4.5 Projection
Projection
Title
Year
Length
Genre
Studioname
producerC#
Star Wars
1977
124
SciFi
Fox
12345
Galaxy
1999
104
Comedy
DreamWorks
67890
Wayne’s
World
1992
95
Comedy
Paramount
99999
Figure 2.13 the Relation Movies
Example 2.9
Title,year,length (Movies)
Title
Year
Length
Star Wars
1977
124
Galaxy Quest
1999
104
Wayne’s World
1992
95
genre (Movies)
Genre
SciFi
Comedy
Comedy
2.4.6 Selection
Example 2.10
• The relation Movies be as in Fig 2.13. then
the value of expression
length >= 100(Movie) is
Title
Year
Length
Genre
StudioName
producerC#
Star Wars
1977
124
SciFi
Fox
12345
Galaxy
1999
104
Comedy DreamWorks
67890
Example 2.11
• Set tuples in the relation movies that
represent Fox Movies at least 100 minutes
long.
• You can use AND for more than one
condition.
Length >= 100 AND studioName = ‘Fox’ (Movies)
Title
Year
Length
Genre
StudioName
producerC#
Star Wars
1977
124
SciFi
Fox
12345
Is the only one in the resulting relation.
2.4.7 Cartesian Product
2.4.7 Cartesian Product
A
B
1
2
3
4
Relation R X S
Relation R
A
R.B
S.B
C
D
1
2
2
5
6
1
2
4
7
8
B
C
D
2
5
6
1
2
9
10
11
4
7
8
3
4
2
5
6
9
10
11
3
4
4
7
8
3
4
9
10
11
Relation S
2.4.8 Natural Joins
2.4.8 Natural Joins
A
B
1
2
3
4
Relation R
Relation
RS
A
R.B
S.B
C
D
1
2
2
5
6
1
2
4
7
8
B
C
D
2
5
6
1
2
9
10
11
4
7
8
3
4
2
5
6
9
10
11
3
4
4
7
8
3
4
9
10
11
Relation S
Example 2.13
The natural join is:
A
B
C
D
1
2
5
6
3
4
7
8
2.4.9 Theta-Joins
2.4.9 Theta-Joins
U A<D V
A
B
C
1
2
3
6
7
8
9
7
8
A
B
C
D
(a)
1
2
3
4
1
2
3
5
Relation U
B
C
D
6
7
8
10
2
3
4
9
7
8
10
2
3
5
7
8
10
Relation V
(b)
Figure 2.16 Natural join of relations
Example 2.15
• Consider all nine pairs of tuples, one from
each relation.
• Whether A component from U-tuple is less
than the D Component of the V-tuple.
A
U.B
U.C
V.B
V.C
D
1
2
3
2
3
4
1
2
3
2
3
5
1
2
3
7
8
10
6
7
8
7
8
10
9
7
8
7
8
10
Figure 2.17 Result of U A < D V
Example 2.16
• U and V that has more complex condition :
U A < D AND (U.B != V.B) V
We require for successful pairing not only that the A component of
U-tuple be less than D component of the V-tuple, but that the two
tuples disagree on their respective B components
A
U.B
U.C
V.B
V.C
D
1
2
3
7
8
10
is the only one to satisfy both conditions, so this relation is the result
of the theta-join.
2.4.10 Combining Operations to
Form Queries
Combining Operations
• Let us find “the title and year of movies
made by fox that are at least 100 minutes
long.
Title
Year
Length
InColor
StudioNa
m
Producer
C#
Star Wars
1977
124
true
fox
12345
Might
Ducks
1991
104
true
Disey
Paramoun
t
67890
Wayne’s
world
1992
95
true
99999
Steps to create expression tree
1. note that all data is in one table.
2. select those Movie tuples that have
length>=100
3. Select those Movie tuples that have
studioName=‘Fox’
4. Find the intersection of steps 2 ad 3. we
want 2 AND 3.
5. Project the relation form step 4 onto
attributes title and year.
Expression Tree
Title,year (σ length >=100 (Movies) ∩ σ StudioName =‘Fox’ (Movies)
Title,year (σ length >=100 AND StudioName =‘Fox’ (Movies)
title, year
∩
σlength>=100
σStudioName=‘Fox
’
movie
Movie
Now the same expression
represented linearly
• πtitle,
year(σlength>=100(movie)∩σstudioName=‘fox’(movie
))
• Note that every parentheses represents
relation.
• Movie is a relation after the intersection we
get a relation that comply with our
condones and finally we use projection to
get our two attributes relation.
• Get the social security number & last
name of each employee who works in
department #5.
•
SSN, LNAME (DNO = 5(Employee) )
• Get the snum who locate in London.
•
SNUM (CITY = London(Suppliers))
2.4.11 Naming and Renaming
Example 2.18
A
B
X
C
D
1
2
2
5
6
1
2
4
7
8
1
2
9
10
11
3
4
2
5
6
3
4
4
7
8
3
4
9
10
11
Figure 2.19 R X ρS(X,C,D) (S)
2.4.12 Relationships Among
Operations
• RS= L (C(R S))
• RS= A,U.B, U.C, D(U.B=V.B AND U.C=V.B (UV))
• A <D AND U.BV.B (UV) =
A <D (UV) U.BV.B (UV)
2.4.12 Relationships Among
Operations
• Many of these are obvious; we will pick
them up in illustrating examples
Please prepare a cheat sheet for the
“algebraic laws”
• RS = R –(R –S)
• RCS= C(R S)
2.4.13 A Linear Notation for
Algebraic Expressions
2.4.14 Exercises for Section 2.4
• Build the four relations in 2.4.14 using the
data in Figure 2.20-2.21
• Submit the “create-populate files”