Transcript Document

Query Compiler
By:Payal Gupta
Shirali Choksi
Professor :Tsau Young Lin
Query Compiler
 The query-compiler is a set of tools for the inspection
of the process of query compilation.
 It shows how a SQL query is parsed, translated in
relational algebra and optimized.
Query Compiler
 Query Compiler perform the following operations :
 parse the query which is represented as a parse tree.
 Represent parse tree as an expression tree of relational
algebra.
 Turn relational algebra into physical query plan.
Query Compiler - Parsing
Query
Parser
Preprocessor
Logical Plan
Generator
Query rewriter
logical query plan
Syntax Analysis And Parse Tree
 The job of a parse tree is:
 It takes text written in SQL language and convert it
into a parse tree whose nodes are correspond to either.
 ATOMS-are keywords, constants, operators, names
and parenthesis.
 Syntax categories : names for families of query’s
subpart.
Grammar
 <Query> ::= <SFW>
 <Query> ::= (<Query>)
Rules
 <SFW> ::= SELECT <SelList> FROM <FromList>
WHERE <Condition>
 Select-List :
<SelList> ::= <Attribute>,<SelList>
<SelList>::= <Attribute>
 From-List :
<FromList>::= <Relation>,<FromList>
<FromList>::= <Relation>
Rules
 Conditions:
<Condition>::= <Condition> AND <Condition>
<Condition>::= <Tuple> IN <Query>
<Condition>::= <Attribute> = <Attribute>
<Condition>::= <Attribute> LIKE
<Pattern>
 Tuple:
<Tuple>::= <Attribute>
Tables
 StarsIn(movieTitle, movieyear, starName)
 MovieStar(name, address, gender, birthdate)
 We want to find titles of movies that have at least one
star born in 1960.
Query should be like….
 SELECT movieTitle
FROM StarsIn
WHERE starName I N (
SELECT name
FROM Moviestar
WHERE birthdate LIKE '%1960'
);
 <Query>
Parse Tree
<SFW>
SELECT <SelList> FROM <FromList> WHERE <Condition>
<Attribute>
movieTitle
<RelName>
StarsIn
<Tuple> IN <Query>
<Attribute>
starName
( <Query> )
<SFW>
SELECT <SelList> FROM <FromList> WHERE <Condition>
<Attribute>
Name
<RelName>
MovieStar
birthdate
<Attribute> LIKE <Pattern>
‘%1960’
Preprocessor
 It does semantic checking.
 Functions of preprocessor:
1. Check relations uses.
2. Check and resolves attribute uses.
3. Check types.
Logical Query Plan
 associative and commutative laws:
RxS=SxR
(R U S) U T = R U (S U T)
 Laws for bags and sets can differ:
 For Ex. For sets,
A ns (B Us C) = (A ns B) Us (A ns C) but this can’s
work for bags.
Selection with binary operator
 Rules:
1. For a union, the selection must be pushed to both
arguments.
2. For a difference, the selection must be pushed to the first
argument and optionally may be pushed to the second.
3. For join and difference, the selection may be pushed to
the first or second argument.
Trivial Rules
 Any selection on an empty relation is empty.
 If C is an always-true condition (e.g., x > 10 on a
relation that forbids x = NULL), then Selection of c(R)
= R.
 If R is empty, then R U S = S.
Pushing Selections
 It is, replacing the left side of one of the rules by its
right side.
 In pushing selections we first a selection as far up the
tree as it would go, and then push the selections down
all possible branches.
 Let’s take an example:
 S t a r s I n ( t i t l e , year, starName)
 Movie(title, year, length, incolor, studioName,
producerC#)
 Define view MoviesOf 1996 by:
CREATE VIEW MoviesOfl996 AS
SELECT *
FROM Movie
,WHERE year = 1996;
 "which stars worked for which studios in 1996?“ can be
given by a SQL Query:
SELECT starName, studioName
FROM MoviesOfl996 NATURAL JOIN StarsIn;
ΠstarName,studioName
O Year=1996
StarsIn
Movie
Logical query plan constructed from definition of a query and view
Improving the query plan by moving selections up and down the tree
ΠstarName,studioName
O Year=1996
Movie
O Year=1996
StarsIn
Laws Involving Projection
 "pushing" projections really involves introducing a
new projection somewhere below an existing
projection.
 projection keeps the number of tuples the same and
only reduces the length of tuples.
 To describe the transformations of extended
projection Consider a term E + x on the list for a
projection, where E is an attribute or an expression
involving attributes and constants and x is an output
attribute.
Example
 Let R(a, b, c) and S(c, d, e) be two relations. Consider
the expression x,+,,,, b+y(R w S). The input attributes of
the projection are a,b, and e, and c is the only join
attribute. We may apply the law for pushing projections
below joins to get the equivalent expression:
Πa+e->x,b->y(Πa,b,c(R)
Πc,e(S))
 Eliminating this projection and getting a third
equivalent expression:Πa+e->x, b->y( R
Πc,e(S))
 In addition, we can perform a projection entirely
before a bag union. That is:
ΠL(R UB S)= ΠL(R) )UB ΠL(S)
Laws About Joins and Products
 laws that follow directly from the definition of the join:
R c S = O c( R * S)
R
S = ΠL( O c ( R * S) ) , where C is the condition that
equates each pair of attributes from R and S with the same
name. and L is a list that includes one attribute from each
equated pair and all the other attributes of R and S.
 We identify a product followed by a selection as a join of
some kind.
Laws Involving Duplicate Elimination
 The operator δ which eliminates duplicates from a bag
can be pushed through many but not all operators.
 In general, moving a δ down the tree reduces the size of
intermediate relations and may therefore beneficial.
 Moreover, sometimes we can move δ to a position
where it can be eliminated altogether,because it is
applied to a relation that is known not to possess
duplicates.
 δ (R)=R if R has no duplicates. Important cases of such
a relation R include:
a) A stored relation with a declared primary key, and
b) A relation that is the result of a γ operation, since
grouping creates a relation with no duplicates.
 Several laws that "push" δ through other operators are:
 δ (R*S)
=δ(R) * δ(S)
 δ (R
S)=δ(R)
δ(S)
 δ (R c S)=δ(R) c δ(S)
 δ ( O c (R))= O c (δ(R))
 We can also move the δ to either or both of the
arguments of an intersection:
 δ (R ∩B S) = δ(R) ∩B S = R ∩B δ (S) = δ(R) ∩B δ (S)
Laws Involving Grouping and Aggregation
 When we consider the operator γ, we find that the
applicability of many transformations depends on the
details of the aggregate operators used. Thus we cannot
state laws in the generality that we used for the other
operators. One exception is that a γ absorbs a δ .
Precisely:
 δ(γL(R))=γL(R)
 let us call an operator γ duplicate-impervious if the
only aggregations in L are MIN and/or MAX then:
 γ L(R) = γ L (δ(R)) provided γL is duplicate-impervious.
Example
 Suppose we have the relations
MovieStar(name , addr , gender, birthdate)
StarsIn(movieTitle, movieyear, starname)
and we want to know for each year the birthdate of the
youngest star to appear in a movie that year. We can
express this query as:
SELECT movieyear, MAX(birth date)
FROM MovieStar, StarsIn
WHERE name = starName
GROUP BY movieyear;
γ movieYear, MAX ( birthdate )
O
name = starName
MovieStar StarsIn
Initial logical query plan for the query
 Some transformations that we can apply to Fig are
1. Combine the selection and product into an equijoin.
2.Generate a δ below the γ , since the γ is duplicateimpervious.
3. Generate a Π between the γ and the introduced δ to
project onto movie-Year and birthdate, the only
attributes relevant to the γ
γ movieYear, MAX ( birthdate )
Π movieYear, birthdate
δ
name = starName
MovieStar StarsIn
Another query plan for the query
γ movieYear, MAX ( birthdate )
Π movieYear, birthdate
name = starName
δ
δ
Π birthdate,name Π movieYear,starname
MovieStar
StarsIn
third query plan for Example
Thank You