Transcript Query Evaluation
Query Evaluation
Donghui Zhang Northeastern University Partially using Prof. Hector Garcia-Molina’s slides (Notes06, Notes07) http://www-db.stanford.edu/~ullman/dscb.html
1
Query Evaluation
SQL Query Query Result SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Michael Jordan Donghui Zhang Server ???
• Produce query result Michael Jordan Donghui Zhang 2
Query Evaluation Steps
• Query Compiling: get logical Q.P.
• Query Optimization: choose a physical Q.P.
• Query Execution: execute 3
SQL query query compiling parse convert parse tree logical query plan apply laws “ improved” l.q.p
estimate result sizes l.q.p. +sizes consider physical plans query optimization {P1,P2,…..} answer statistics query execution execute Pi pick best {(P1,C1),(P2,C2)...} estimate costs 4
Query Compiling Parse SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Background knowledge: Grammar.
• Input: SQL query.
• Output: a parse tree.
• Start with a simple grammar: – Only SFW (no group by, having, nested query) – Simple AND condition (no OR, UNION, EXISTS, IN, …) – One table (no conditions like E.did=D.did) 5
Query Compiling Grammar Parse SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 •
Query Compiling Parse Tree Parse
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 SELECT
< 5000
Query Compiling Convert SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Input: a parse tree.
• Output: a logical query plan.
• Algorithm: followed by .
• E.Name
( E.SSN<5000 AND E.Age>50 (E) ) • Alternatively, a l.q.p tree.
E.Name
E.SSN<5000 AND E.Age>50 Emp E 8
Query Compiling Apply Laws SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Replace with , push [and ] down.
• Only used for multiple tables. So skip.
E.Name
E.SSN<5000 AND E.Age>50 Emp E 9
SQL query query compiling parse convert parse tree logical query plan apply laws “ improved” l.q.p
estimate result sizes l.q.p. +sizes consider physical plans query optimization {P1,P2,…..} answer statistics query execution execute Pi pick best {(P1,C1),(P2,C2)...} estimate costs 10
Query Optimization Estimate Result Sizes SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • The size of each input table is stored as meta data.
• Intermediate result: size not known, but needed to estimate I/O cost of physical plan.
• But for the simple case, can be evaluated on the fly. So no need to estimate the size of . So skip.
11
Query Optimization Consider Physical Plans SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Associate each RA operator with an implementation scheme.
• Multiple implementation schemes? Enumerate all.
Plan 1 (always work!) E.Name
on-the-fly E.SSN<5000 AND E.Age>50 scan Emp E 12
Query Optimization Consider Physical Plans SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • For the other physical plans, need to know what indices exist.
• Primary index: controls the actual storage of a table.
– Suppose a primary B+-tree index exists on SSN.
• Secondary index: built on some other attribute. Does not store the actual record. Each leaf entry stores a set of page IDs in the primary index.
– Suppose a secondary B+-tree index exists on Age.
1 2 3 4 SSN index 5 6 e.g. entry in Age index: Age=50, pageIDs= {1, 4, 6} 13
Query Optimization Consider Physical Plans SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Plan 2 E.Name
E.SSN<5000 AND E.Age>50 on-the-fly range search in SSN index Emp E 14
Query Optimization Consider Physical Plans SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Plan 3 E.Name
E.SSN<5000 AND E.Age>50 on-the-fly range search in Age index, follow pointers to SSN index Emp E 15
Query Optimization Estimate Costs • Estimate #I/Os for each physical plan.
• Pick the cheapest one.
SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Input: physical plan.
• Additional input: – meta data (e.g. how many levels a B+-tree has) – assumptions (e.g. the root node of every B+-tree is pinned) – memory buffer size.
16
Query Optimization Estimate Costs Meta Data SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • All the database tables.
• For each table R: – Schema – T(R): #records in R – For every attribute A: • V(R, A): #distinct values of A • min(R, A): minimum value of A • max(R, A): maximum value of A – Primary index: #levels, #leaf nodes.
– Secondary index: #levels, #leaf nodes, average #pageIDs per leaf entry.
17
Query Optimization Estimate Costs sample input SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Assume for table E: – Schema = (SSN: int, Name: string, Age: int, Salary: int) – T(E) = 100 tuples. – For attribute SSN: • V(E, SSN)=100, min(E, SSN)=0000, max(E, SSN)=9999 – For attribute Age: • V(E, Age)=20, min(E, Age)=21, max(E, Age)=60 – Primary index on SSN: 3 level B+-tree, 50 leaf nodes.
– Secondary index on Age: 2 level B+-tree, 10 leaf nodes, every leaf entry points to 3.5 pageIDs (on average).
• Assumptions: all B+-tree roots are pinned. Can reach the first leaf page of a B+-tree directly.
• Memory buffer size: 2 pages.
18
Query Optimization Estimate Costs SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Cost = 50. (The primary index has 50 leaf nodes. Assume we can reach the first leaf page of a B+-tree directly.) Plan 1 (always work!) E.Name
on-the-fly E.SSN<5000 AND E.Age>50 scan Emp E 19
Query Optimization Estimate Costs SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Cost = 25. SSN<5000 selects half of the employees, so 50/2=25 leaf nodes. • Note: if condition is E.SSN>5000, needs 1 more I/O.
Plan 2 E.Name
on-the-fly E.SSN<5000 AND E.Age>50 range search in SSN index Emp E 20
Query Optimization Estimate Costs SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Cost = 10/4 + 20/4 * 3.5
= 21. #I/Os in the Age index Plan 3 E.Name
E.SSN<5000 AND E.Age>50 Emp E #I/Os in the SSN index on-the-fly range search in Age index, follow pointers to SSN index 21
Query Optimization Estimate Costs SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Cost = 10/4 + 20/4 * 3.5
= 21. Age index has 10 leaf nodes.
Plan 3 E.Name
E.SSN<5000 AND E.Age>50 Emp E Check 1/4 of them, since [51,60] is 1/4 of [21,60].
on-the-fly range search in Age index, follow pointers to SSN index 22
Query Optimization Estimate Costs SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50 • Cost = 10/4 + 20/4 * 3.5
= 21. 20 distinct ages divided by 4 to get #ages in [51,60].
Plan 3 E.Name
E.SSN<5000 AND E.Age>50 Emp E times 3.5 (#pageIDs per page) to get #I/Os in the SSN index.
on-the-fly range search in Age index, follow pointers to SSN index 23
Query Optimization Pick Best
physical plan
Plan 1: scan Plan 2: range search SSN index Plan 3: range search Age index SELECT E.Name
FROM Emp E WHERE E.SSN<5000 AND E.Age>50
I/O cost
50 25 21 Pick!
24
SQL query query compiling parse convert parse tree logical query plan apply laws “ improved” l.q.p
estimate result sizes l.q.p. +sizes consider physical plans query optimization {P1,P2,…..} answer statistics query execution execute Pi pick best {(P1,C1),(P2,C2)...} estimate costs 25
Another case study: two tables.
• Extended grammar: – Only SFW (no group by, having, nested query) – Simple AND condition (no OR, UNION, EXISTS, IN, …) – Allow two tables (allow conditions like E.did=D.did) • Example query: SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 26
Query Compiling Grammar Parse SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 •
Query Compiling Parse Tree Parse
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 SELECT
28
Query Compiling Parse Tree Parse
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 SELECT
Query Compiling Parse Tree Parse
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 SELECT
=
Query Compiling Convert SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Algorithm: then then .
• E.Name. D.Dname
( E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 (E D) ) • The l.q.p tree: E.Name, D.Dname
E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Emp E Dept D 31
Query Compiling Apply Laws SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Always always: (try to) replace with !
E.Name, D.Dname
E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Emp E Dept D 32
Query Compiling Apply Laws SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Always always: (try to) replace • Also, push down.
with !
E.Name, D.Dname
E.SSN<5000 AND D.budget=1000 Emp E Dept D 33
Query Compiling Apply Laws SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Always always: (try to) replace • Also, push down.
with !
E.Name, D.Dname
E.SSN<5000 AND D.budget=1000 Emp E Dept D 34
Query Compiling Apply Laws SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Always always: (try to) replace • Also, push down.
with !
E.Name, D.Dname
E.SSN<5000 Emp E D.budget=1000 Dept D 35
Query Compiling Theory Behind Apply Laws SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Let p = predicate with only E attributes q = predicate with only D attributes m = E & D’s common attributes are equal • We have: p q m (E D) = p (E) q (D) 36
SQL query query compiling parse convert parse tree logical query plan apply laws “ improved” l.q.p
estimate result sizes l.q.p. +sizes consider physical plans query optimization {P1,P2,…..} answer statistics query execution execute Pi pick best {(P1,C1),(P2,C2)...} estimate costs 37
Query Optimization Consider Physical Plans SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Because join is so important, let’s skip result size estimation for now, and let’s assume selections are not pushed down.
E.Name, D.Dname
E.SSN<5000 AND D.budget=1000 Emp E Dept D 38
Four Join Algorithms
• Iteration join (nested loop join) • Merge join • Hash join • Join with index 39
Example E D
over common attribute
Did
• E: – T(E)=10,000 – primary index on SSN, 3 levels. – |E|= 1,000 leaf nodes.
• D: – T(D)=5,000 – primary index on Did. 3 levels.
– |D| = 500 leaf nodes.
• Memory available = 101 blocks 40
Iteration Join
• • 1. for every block in E 2.
scan through D ; 3.
join records in the E records in the D block with block.
I/O cost = |
E
| + |
E
| * |
D
| = 1000 + 1000*500 = 501,000.
Works good for small buffer (e.g. two blocks).
41
• Can we do better?
Use our memory (1) Read 100 blocks of E (2) Read all of D (using 1 block) + join (3) Repeat until done • I/O cost = |
E
| + |
E
|/100 * |
D
| = 1000 + 10*500 = 6,000.
42
• Can we do better?
Reverse join order: D E. i.e. For every 100 D blocks, go through E.
• I/O cost = |
D
| + |
D
|/100 * |
E
| = 500 + 5*1000 = 5,500.
43
• Merge join (conceptually) (1) if R1 and R2 not sorted, sort them (2) i 1; j 1; While (i T(R1)) (j T(R2)) do if R1{ i }.C = R2{ j }.C then outputTuples else if R1{ i }.C > R2{ j }.C then j else if R1{ i }.C < R2{ j }.C then i j+1 i+1 44
Procedure Output-Tuples While (R1{ i }.C = R2{ j }.C) [jj j; (i T(R1)) do while (R1{ i }.C = R2{ jj }.C) (jj T(R2)) do [output pair R1{ i }, R2{ jj }; jj jj+1 ] i i+1 ] 45
Example
i 1 2 3 4 5 R1{i}.C
10 20 20 30 40 R2{j}.C
5 20 20 30 30 50 52 j 1 2 3 4 5 6 7 46
Merge Join Cost
• Recall that |E|=1000, |D|=500. And |D| is already sorted on Did.
• External sort E: pass 0, by reading and writing E, produces a file with 10 sorted runs. Another read is enough.
• No need to write! Can pipeline to join operator.
• Cost = 3*1000 + 500 = 3,500.
47
• Hash join (conceptual) – Hash function h, range 0 k – Buckets for R1: G0, G1, ... Gk – Buckets for R2: H0, H1, ... Hk Algorithm (1) Hash R1 tuples into G buckets (2) Hash R2 tuples into H buckets (3) For i = 0 to k do match tuples in Gi, Hi buckets 48
Simple example
R1 R2 2 4 3 12 5 8 9 5 4 3 13 8 11 14
hash: even/odd
Even Odd: Buckets 2 4 8 R1 4 12 8 14 R2 3 5 9 5 3 13 11 49
Hash Join Cost
• Read + write both E and D for partitioning, then read to join.
• Cost = 3 * (1000 + 500) = 4,500.
50
• Join with index (Conceptually) For each r E do Find the corresponding D tuple by probing index.
• Assuming the root is pinned in memory, Cost = |E| + T(E)*2 = 1000 + 10,000*2 = 21,000.
51
Note: • The costs are different if integrate selection conditions!
• E.g. for the index join, only check half of E. So should be 500+5,000*2=10,500.
• Selection condition which is not used during join should be evaluated to filter the join result. E.g. index join checked D without evaluating the selection condition on D.
52
physical plan with selections being pushed down SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 • Finally, let’s consider pushing down selections.
• Now that the join operator takes intermediate results (which could be written to disk), we need to estimate their sizes… E.Name, D.Dname
E.SSN<5000 Emp E D.budget=1000 Dept D 53
SQL query query compiling parse convert parse tree logical query plan apply laws “ improved” l.q.p
estimate result sizes l.q.p. +sizes consider physical plans query optimization {P1,P2,…..} answer statistics query execution execute Pi pick best {(P1,C1),(P2,C2)...} estimate costs 54
Estimating result size
• Keep statistics for relation R – T(R) : # tuples in R – S(R) : # of bytes in each R tuple – V(R, A) : # distinct values in R for attribute A – min(R, A) – max(R, A) 55
Example R A B C D cat 1 10 a cat 1 20 b dog 1 30 a dog 1 40 c bat 1 50 d A: 20 byte string B: 4 byte integer C: 8 byte date D: 5 byte string T(R) = 5 S(R) = 37 V(R,A) = 3 V(R,C) = 5 V(R,B) = 1 V(R,D) = 4 56
Size estimates for W = R1 x R2
T(W) = T(R1) T(R2) S(W) = S(R1) + S(R2) 57
Size estimate for W =
A=a
(R)
S(W) = S(R) T(W) = ?
58
Example R A B C D cat 1 10 a cat 1 20 b dog 1 30 a dog 1 40 c bat 1 50 d W = z=val (R) T(W) = T(R) V(R,Z) V(R,A)=3 V(R,B)=1 V(R,C)=5 V(R,D)=4 59
Assumption:
Values in select expression Z = val are uniformly distributed over possible V(R,Z) values.
60
What about W =
z val (R) ?
T(W) = ?
• T(W) = T(R)/2?
61
• Solution: Estimate values in range Example R f = 5 20 Z Min=1 V(R,Z)=10 W= z 16 (R) Max=20 (fraction of range) T(W) = f T(R) 62
Size estimate for W = R1 R2
Let x = attributes of R1 y = attributes of R2 Case 1
X
Y =
Same as R1 x R2 63
Case 2
W = R1 R2 X
Y = A
R1 A B C R2 A D Assumption: V(R1,A) V(R2,A) V(R2,A) V(R1,A) Every A value in R1 is in R2 Every A value in R2 is in R1 64
Computing T(W) when V(R1,A)
V(R2,A)
R1 A B C R2 A D Take 1 tuple Match 1 tuple matches with T(R2) V(R2,A) tuples...
so T(W) = T(R2) V(R2, A) T(R1) 65
• V(R1,A) V(R2,A) T(W) = T(R2) T(R1) V(R2,A) • V(R2,A) V(R1,A) T(W) = T(R2) T(R1) V(R1,A) [A is common attribute] 66
In general W = R1 R2
T(W) = T(R2) T(R1) max{ V(R1,A), V(R2,A) } 67
S(W) = S(R1) + S(R2) - S(A)
size of attribute A 68
Note: for complex expressions, need intermediate T,S,V results.
E.g. W = [ A=a (R1) ] R2 Treat as relation U T(U) = T(R1)/V(R1,A) S(U) = S(R1) Also need V (U, *) !! 69
To estimate Vs
E.g., U = A=a (R1) Say R1 has attribs A,B,C,D V(U, A) = V(U, B) = V(U, C) = V(U, D) = 70
Example R1 A B C D cat 1 10 10 cat 1 20 20 dog 1 30 10 dog 1 40 30 cat 1 50 10 V(R1,A)=3 V(R1,B)=1 V(R1,C)=5 V(R1,D)=3 U = A=a (R1) V(U,A) =1 V(U,B) =1 V(U,C) = T(R1) V(R1,A) V(U,D) ... somewhere in between 71
For an arbitrary attribute D other than A (the attribute being selected)
V(R1,D) ranges from 1 to T(R1), and V(U,D) ranges from 1 to T(R1)/V(R1,A).
Let’s make
V
(
R
1 ,
D
)
T
(
R
1 )
V
(
U
,
D
)
T
(
R
1 ) /
V
(
R
1 ,
A
) Or, V(U,D) = V(R1,D)/V(R1,A) 72
For Joins U = R1(A,B) R2(A,C)
V(U,A) = min { V(R1, A), V(R2, A) } V(U,B) = V(R1, B) V(U,C) = V(R2, C) 73
Example:
Z = R1(A,B) R2(B,C) R3(C,D) R1 R2 R3 T(R1) = 1000 V(R1,A)=50 V(R1,B)=100 T(R2) = 2000 V(R2,B)=200 V(R2,C)=300 T(R3) = 3000 V(R3,C)=90 V(R3,D)=500 74
Partial Result: U = R1 R2
T(U) = 1000 2000 V(U,A) = 50 200 V(U,B) = 100 V(U,C) = 300 75
Z = U R3
T(Z) = 1000 2000 3000 V(Z,A) = 50 200 300 V(Z,B) = 100 V(Z,C) = 90 V(Z,D) = 500 76
• E: • D:
Example
– T(E)=10,000 – primary index on SSN, 3 levels. – |E|= 1,000 leaf nodes.
SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 – V(E,SSN)=10,000: from 0000 to 9999.
– T(D)=5,000 – primary index on Did. 3 levels.
– |D| = 500 leaf nodes.
– V(D,budget)=20: from 100 to 10,000.
• Memory available = 11 blocks • ?? What’s the best physical plan?
Note: |E’| = 500 |D’| = 25 77
l.q.p
E.Name, D.Dname
E.SSN<5000 Emp E D.budget=1000 Dept D SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 78
p.q.p #1
SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 iteration join; D is outer table E.Name, D.Dname
E.SSN<5000 range search D.budget=1000 scan Emp E Dept D Cost = 500 (read D) + 25 (write D’) + 25 + ceiling(25/10)*500 = 2050 79
p.q.p #2
SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 sort merge E.Name, D.Dname
E.SSN<5000 range search Emp E D.budget=1000 scan Dept D Cost = 5*500 (sort E’; no write) + 500 (read D) = 3000 80
p.q.p #3
SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 E.SSN<5000 range search E.Name, D.Dname
hash join Emp E D.budget=1000 scan Dept D Cost = 3*500 (for E’) + 500 (read D) + 25 (write D’) + 3*25 (for D’) = 3000 Note: M should be bigger than sqrt(min{|E’|, |D’|})+1. - Why?
- What if not?
81
p.q.p #4
E.Name, D.Dname
D.budget=1000 index nested loop join E.SSN<5000 range search Dept D Emp E SELECT E.Name, D.Dname
FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Cost = 500 (scan E’) + 5000*(3-1) (for D) = 10,500 82
Some notes
• For BNL, merge, hash joins: always push selection! • For index join, do not push selection on the inner table (the one whose primary key is involved in the join condition).
• For BNL, make the smaller table be the outer table – join could be free if it fits in memory!
83