From Parse Trees to Logical Query Plans

Download Report

Transcript From Parse Trees to Logical Query Plans

CS 255: Database System
Principles
slides: From Parse Trees to Logical
Query Plans
By:- Arunesh Joshi
Id:-006538558
Agenda
•
•
•
•
Conversion to Relational Algebra.
Removing Sub queries From Conditions.
Improving the Logical Query Plan.
Grouping Associative/Commutative Operators.
Parsing
• Goal is to convert a text string containing a
query into a parse tree data structure:
– leaves form the text string (broken into lexical
elements)
– internal nodes are syntactic categories
• Uses standard algorithmic techniques from
compilers
– given a grammar for the language (e.g., SQL),
process the string and build the tree
Example: SQL query
SELECT title
FROM StarsIn
WHERE starName IN (
SELECT name
FROM MovieStar
WHERE birthdate LIKE ‘%1960’
);
(Find the movies with stars born in 1960)
Assume we have a simplified grammar for SQL.
Example: Parse Tree
<Query>
<SFW>
SELECT
<SelList> FROM <FromList>
<Attribute>
title
SELECT
WHERE
<RelName>
<Tuple> IN <Query>
StarsIn
<SelList>
<Attribute>
name
FROM
<Condition>
<FromList>
<RelName>
MovieStar
<Attribute>
( <Query> )
starName
<SFW>
WHERE
<Condition>
<Attribute> LIKE <Pattern>
birthDate
‘%1960’
The Preprocessor
• It replaces each reference to a view with a
parse (sub)-tree that describes the view (i.e., a
query)
• It does semantic checking:
– are relations and views mentioned in the schema?
– are attributes mentioned in the current scope?
– are attribute types correct?
Convert Parse Tree to Relational
Algebra
• The complete algorithm depends on specific
grammar, which determines forms of the parse
trees
• Here is a flavor of the approach
Conversion
• Suppose there are no subqueries.
• SELECT att-list FROM rel-list WHERE cond
is converted into
PROJatt-list(SELECTcond(PRODUCT(rel-list))), or
att-list(cond( X (rel-list)))
SELECT movieTitle
FROM StarsIn, MovieStar
WHERE starName = name AND birthdate LIKE '%1960';
<Query>
<SFW>
SELECT <SelList> FROM <FromList>
<Attribute>
movieTitle
WHERE
<Condition>
<RelName> , <FromList>
StarsIn
<RelName>
AND <Condition>
<Attribute> LIKE <Pattern>
MovieStar
birthdate
<Condition>
<Attribute> = <Attribute>
starName
name
'%1960'
Equivalent Algebraic Expression
Tree
movieTitle
starname = name AND birthdate LIKE '%1960'
X
StarsIn
MovieStar
Handling Subqueries
• Recall the (equivalent) query:
SELECT title
FROM StarsIn
WHERE starName IN (
SELECT name
FROM MovieStar
WHERE birthdate LIKE ‘%1960’
);
• Use an intermediate format called twoargument selection
Example: Two-Argument Selection
title

StarsIn
<condition>
<tuple>
<attribute>
starName
IN
name
birthdate LIKE ‘%1960’
MovieStar
Converting Two-Argument
Selection
• To continue the conversion, we need rules for
replacing two-argument selection with a
relational algebra expression
• Different rules depending on the nature of the
sub query
• Here is shown an example for IN operator and
uncorrelated query (sub query computes a
relation independent of the tuple being tested)
Rules for IN

R
C
<Condition>
t
IN
S
X
R

S
C is the condition that equates
attributes in t with corresponding
attributes in S
Example: Logical Query Plan
title
starName=name

StarsIn
name
birthdate LIKE ‘%1960’
MovieStar
What if Subquery is Correlated?
• Example is when subquery refers to the current
tuple of the outer scope that is being tested
• More complicated to deal with, since subquery
cannot be translated in isolation
• Need to incorporate external attributes in the
translation
• Some details are in textbook
Improving the Logical Query Plan
• There are numerous algebraic laws
concerning relational algebra operations
• By applying them to a logical query plan
judiciously, we can get an equivalent query
plan that can be executed more efficiently
• Next we'll survey some of these laws
Example: Improved Logical Query Plan
title
starName=name
StarsIn
name
birthdate LIKE ‘%1960’
MovieStar
Associative and Commutative
Operations
•
•
•
•
product
natural join
set and bag union
set and bag intersection
associative: (A op B) op C = A op (B op C)
commutative: A op B = B op A
Laws Involving Selection
• Selections usually reduce the size of the
relation
• Usually good to do selections early, i.e.,
"push them down the tree"
• Also can be helpful to break up a complex
selection into parts
Selection Splitting
•  C1 AND C2 (R) =  C1 (  C2 (R))
•  C1 OR C2 (R) = ( C1 (R)) Uset ( C2 (R))
if R is a set
•  C1 (  C2 (R)) =  C2 (  C1 (R))
Selection and Binary Operators
• Must push selection to both arguments:
–  C (R U S) =  C (R) U  C (S)
• Must push to first arg, optional for 2nd:
–  C (R - S) =  C (R) - S
–  C (R - S) =  C (R) -  C (S)
• Push to at least one arg with all attributes
mentioned in C:
– product, natural join, theta join, intersection
– e.g.,  C (R X S) =  C (R) X S, if R has all the atts in C
Pushing Selection Up the Tree
• Suppose we have relations
– StarsIn(title,year,starName)
– Movie(title,year,len,inColor,studioName)
• and a view
– CREATE VIEW MoviesOf1996 AS
SELECT *
FROM Movie
WHERE year = 1996;
• and the query
– SELECT starName, studioName
FROM MoviesOf1996 NATURAL JOIN StarsIn;
The Straightforward Tree

starName,studioName
year=1996
Movie
StarsIn
Remember the rule
C(R S) = C(R)
S?
The Improved Logical Query Plan

starName,studioName


starName,studioName
starName,studioName
year=1996
year=1996
year=1996 year=1996
StarsIn
Movie
Movie
push selection
up tree
StarsIn
Movie
push selection
down tree
StarsIn
Grouping Assoc/Comm Operators
• Groups together adjacent joins, adjacent unions, and
adjacent intersections as siblings in the tree
• Sets up the logical QP for future optimization when
physical QP is constructed: determine best order for
doing a sequence of joins (or unions or intersections)
U
D
U
A
B
E
U
F
A
C
B
D
C
E
F
Thank You