Nested Loops Joins

Download Report

Transcript Nested Loops Joins

Query Execution
Nested Loop Joins (15.2)
Chinmay Bhawe( 119 )
April 16, 2012
Topic to be covered
O Tuple-Based Nested-Loop Join
O An Iterator for Tuple-Based Nested-Loop Join
O A Block-Based Nested-Loop Join Algorithm
O Analysis of Nested-Loop Join
15.3.1
Tuple-Based Nested-Loop Join
O The simplest variation of nested-loop join has
loops that range over individual tuples of the
relations involved. In this algorithm, which we call
tuple-based nested-loop join, we compute the join
as follows
R
S
Continued
O For each tuple s in S DO
For each tuple r in R Do
if r and s join to make a tuple t THEN
output t;
O If we are careless about how the buffer the
blocks of relations R and S, then this
algorithm could require as many as T(R)T(S)
disk .there are many situations where this
algorithm can be modified to have much
lower cost.
Continued
O One case is when we can use an index on
the join attribute or attributes of R to find
the tuples of R that match a given tuple of S,
without having to read the entire relation R.
O The second improvement looks much more
carefully at the way tuples of R and S are
divided among blocks, and uses as much of
the memory as it can to reduce the number
of disk I/O's as we go through the inner loop.
We shall consider this block-based version
of nested-loop join.
15.3.2
An Iterator for Tuple-Based Nested-Loop Join
O Open() {
O R.Open();
O S.open();
O A:=S.getnext();
}
GetNext() {
Repeat {
r:= R.Getnext();
IF(r= Not found) {/* R is exhausted for the
current s*/
R.close();
s:=S.Getnext();

IF( s= Not found) RETURN Not Found;
/* both R & S are exhausted*/
R.Close();
r:= R.Getnext();
}
}
until ( r and s join)
RETURN the join of r and s;
}
Close() {
R.close ();
S.close ();
}
15.3.3
A Block-Based Nested-Loop Join Algorithm
We can Improve Nested loop Join by compute R
|><| S.
1.
Organizing access to both argument
relations by blocks.
2.
Using as much main memory as we can to
store tuples belonging to the relation S, the
relation of the outer loop.
The nested-loop join algorithm
FOR each chunk of M-1 blocks of S DO BEGIN
read these blocks into main-memory buffers;
organize their tuples into a search structure whose
search key is the common attributes of R and S;
FOR each block b of R DO BEGIN
read b into main memory;
FOR each tuple t of b DO BEGIN
find the tuples of S in main memory that
join with t ;
output the join of t with each of these
tuples;
END ;
END ;
END ;
15.3.4
Analysis of Nested-Loop Join
Assuming S is the smaller relation, the
number of chunks or iterations of outer loop
is B(S)/(M - 1). At each iteration, we read hf 1 blocks of S and
B(R) blocks of R. The number of disk I/O's is
thus
B(S)/M-1(M-1+B(R)) or B(S)+B(S)B(R)/M-1
Continued
Assuming all of M, B(S), and B(R) are large,
but M is the smallest of these, an
approximation to the above formula is
B(S)B(R)/M. That is, cost is proportional to
the product of the sizes of the two relations,
divided by the amount of available main
memory.
Example
O
B(R) = 1000, B(S) = 500, M = 101
O
O
O
O
O
O
O
O
Important Aside: 101 buffer blocks is not as unrealistic
as it sounds. There may be many queries at the same
time, competing for main memory buffers.
Outer loop iterates 5 times
At each iteration we read M-1 (i.e. 100) blocks of S and
all of R (i.e. 1000) blocks.
Total time: 5*(100 + 1000) = 5500 I/O’s
Question: What if we reversed the roles of R and S?
We would iterate 10 times, and in each we would read
100+500 blocks, for a total of 6000 I/O’s.
Compare with one-pass join, if it could be done!
We would need 1500 disk I/O’s if B(S) M-1
Continued…….
1.
2.
The cost of the nested-loop join is not
much greater than the cost of a one-pass
join, which is 1500 disk 110's for this
example. In fact.if B(S) 5 lZI - 1, the nestedloop join becomes identical to the one-pass
join algorithm of Section 15.2.3
Nested-loop join is generally not the most
efficient join algorithm.
Summary of the topic
In This topic we have learned about how the
nested tuple Loop join are used in database
using query execution and what is the process
for that.
Any Questions
?
THANK YOU