CS186 Final Review

Download Report

Transcript CS186 Final Review

CS186 Final Review
Query Optimization
Overview
• Query  Relational Algebra  Tree
– Implementation choices at each operator
– Order of operators
sname
• What is a query plan?
– Tree of R.A. ops (and
some others) with choice
of algorithm for each op.
bid=100
rating > 5
?
sid=sid
Reserves
Sailors
Join
• Joins (Chapter 14.4)
– (Page-oriented) Nested loops join
– Blocked nested loops:
• simple, exploits buffer pages
– Indexed nested loops:
• best if 1 relation small and one indexed
– Sort/Merge Join
• sort cost + merge cost (M+N  M*N)
– Hash Join
• partition cost (2(M+N)) + match cost (M+N)
Query  Tree
SELECT S.sname
FROM Reserves R, Sailors S
WHERE R.sid=S.sid AND
R.bid=100 AND S.rating>5
• |S| = 500, |R|=1000
sname
rating > 5
(On-the-fly)
(Page-Oriented
sid=sid Nested loops)
Sailors
• 100 boats
(On-the-fly)
sname
bid=100
• Rating from 1 to 10
Reserves
(On-the-fly)
(Page-Oriented
sid=sid Nested loops)
rating>5
(On-the-fly)
Sailors
bid=100
(Scan &
Write to
temp T2)
Reserves
Query  Tree
• Left-deep plans
– Fully pipelined plans: Intermediate results not written
to temporary files.
– All left-deep trees are fully pipelined? (SM join?)
• Relational Algebra Equivalences
– Chapter 15.3
• Cost estimation
– Reduction Factors (also called “selectivity”)
• values are uniformly distributed
• histogram
Choice of Algorithm for Operator
• Rating from 1 to 10, 100 boats
• |S| = 500, |R|=1000
(On-the-fly)
sname
(On-the-fly)
sname
(Page-Oriented
sid=sid Nested loops)
(Sort-Merge Join)
bid=100
(On-the-fly)
(Scan &
rating > 5 Write to
temp T2)
Reserves
Sailors
4250 IOs
sid=sid
(Scan;
write to bid=100
temp T1)
Reserves
rating > 5
Sailors
3560 IOs
(Scan;
write to
temp T2)
Choice of Algorithm for Operator
• What affect choices?
– # Buffer Pages
– Statistic of the relation
– Index?
• Clustered
• Unclustered: access each tuple  1 I/O
Conclusion
• What you should know?
– Given a SQL query, how to give the best
(good) query plan?
• Best plan for accessing each relation
• Enumeration to get trees (left-deep join)
• Select algorithm for each operator
– Cost estimation
– # of Buffer?
– Index