Why Not Store Everything in Main Memory? Why use disks?

Download Report

Transcript Why Not Store Everything in Main Memory? Why use disks?

09_Queries_LECTURE2
CODE GENERATION
implements the operator, JOIN (equi-join, we will use  for it.)
J1. NESTED LOOP
JOIN S on S.S#=E.S#
For each record in S, (the outer loop over records from the outer or driver relation),
retrieve every record from E, (the inner loops over the inner relation),
test join condition,
if it's true, concatenate the tuples (project off unwanted columns) and output,
else go to next E (inner-relation) record.
SELECT SNAME,C#,GRADE
S
FROM S=STUDENT,E=ENROLL
E
S#|SNAME |LCODE
25|CLAY
|NJ5101
32|THAISZ|NJ5102
S.S#=E.S#
FALSE
S#|C#|GR
32|8 |89
ResultSNAME
CLAY
CLAY
WHERE S.S#=E.S#;
|C# |GRADE
| 7 | 68
| 6 | 76
32|7 |91
38|GOOD
|FL6321
25|7 |68
17|BAID
|NY2091
25|6 |76
57|BROWN |NY2092
32|6 |62
Section 9
# 19
CODE GENERATION
implements the operator, JOIN (equi-join, we will use  for it.)
J1. NESTED LOOP
Second inner loop pass:
SELECT SNAME,C#,GRADE
S
FROM S=STUDENT,E=ENROLL
E
S#|SNAME |LCODE
25|CLAY
|NJ5101
S#|C#|GR
32|8 |89
32|THAISZ|NJ5102
32|7 |91
S.S#=E.S
TRUE
38|GOOD
|FL6321
25|7 |68
17|BAID
|NY2091
25|6 |76
57|BROWN |NY2092
32|6 |62
WHERE S.S#=E.S#;
res SNAME |C# |GRADE
CLAY |
CLAY |
THAISZ|
THAISZ|
THAISZ|
7
6
8
7
6
|
|
|
|
|
68
76
89
91
62
Section 9
# 20
CODE GENERATION
implements the operator, JOIN (equi-join, we will use  for it.)
J1. NESTED LOOP
Third inner loop pass:
SELECT SNAME,C#,GRADE
S
FROM S=STUDENT,E=ENROLL
E
S#|SNAME |LCODE
25|CLAY
S#|C#|GR
|NJ5101
32|8 |89
32|THAISZ|NJ5102
32|7 |91
38|GOOD
17|BAID
|FL6321
|NY2091
57|BROWN |NY2092
25|7 |68
S.S#=E.S
FALSE
WHERE S.S#=E.S#;
res SNAME |C# |GRADE
CLAY |
CLAY |
THAISZ|
THAISZ|
THAISZ|
7
6
8
7
6
|
|
|
|
|
68
76
89
91
62
25|6 |76
32|6 |62
Section 9
# 21
CODE GENERATION
implements the operator, JOIN (equi-join, we will use  for it.)
J1. NESTED LOOP
4th inner loop pass:
SELECT SNAME,C#,GRADE
S
FROM S=STUDENT,E=ENROLL
E
S#|SNAME |LCODE
25|CLAY
S#|C#|GR
|NJ5101
32|8 |89
32|THAISZ|NJ5102
32|7 |91
38|GOOD
|FL6321
25|7 |68
17|BAID
|NY2091
57|BROWN |NY2092
S.S#=E.S
FALSE
WHERE S.S#=E.S#;
res SNAME |C# |GRADE
CLAY |
CLAY |
THAISZ|
THAISZ|
THAISZ|
7
6
8
7
6
|
|
|
|
|
68
76
89
91
62
25|6 |76
32|6 |62
Section 9
#9
CODE GENERATION
implements the operator, JOIN (equi-join, we will use  for it.)
J1. NESTED LOOP
R 5th and last inner loop pass:
SELECT SNAME,C#,GRADE
S
FROM S=STUDENT,E=ENROLL
E
S#|SNAME |LCODE
25|CLAY
S#|C#|GR
|NJ5101
32|8 |89
32|THAISZ|NJ5102
32|7 |91
38|GOOD
|FL6321
25|7 |68
17|BAID
|NY2091
25|6 |76
57|BROWN |NY2092
S.S#=E.S
FALSE
WHERE S.S#=E.S#;
res SNAME |C# |GRADE
CLAY |
CLAY |
THAISZ|
THAISZ|
THAISZ|
7
6
8
7
6
|
|
|
|
|
68
76
89
91
62
32|6 |62
Section 9
# 23
CODE GENERATION
implements the operator, JOIN (equi-join, we will use  for it.)
J2. When there is an Index on one join attribute the join can be done in one pass (called
Indexed Nested Loop.
If there is an index on E.S#, get r in S, get matching E-tuples using the index (need not scan
entire inner relation, E, each time as was necessary with J1) .
R=SELECT SNAME,C#,GRADE
S
FROM S=STUDENT,E=ENROLL
S#|SNAME |LCODE
25|CLAY |NJ5101
32|THAISZ|NJ5102
38|GOOD |FL6321
17|BAID |NY2091
57|BROWN |NY2092
E
RRN|S#|C#|GR
0 |32|8 |89
1 |32|7 |91
2 |25|7 |68
3 |25|6 |76
4 |32|6 |62
WHERE S.S#=E.S#;
R
SNAME |C#
CLAY | 7
CLAY | 6
THAISZ| 8
THAISZ| 7
THAISZ| 6
|GRADE
| 68
| 76
| 89
| 91
| 62
RRN |S#
2,3 |25
0,1,4|32
Dense Index on E.S#
Section 9
# 24
CODE GENERATION
implements the operator, JOIN (equi-join, we will use  for it.)
J3. MERGE JOIN: If both S.S# and E.S# are clustered, then scan both S and E once in order,
keeping in mind that S.S# is the primary key (uniqueness property), but E.S# is not.
R=SELECT SNAME,C#,GRADE
S
S#|SNAME |LCODE
17|BAID |NY2091
25|CLAY |NJ5101
32|THAISZ|NJ5102
38|GOOD |FL6321
57|BROWN |NY2092
FROM S=STUDENT,E=ENROLL
E
S.S#=E.S#
S.S#=E.S#
FALSE
S.S#=E.S#
TRUE
TRUE
S.S#=E.S#
FALSE
S#|C#|GR
25|7 |68
25|6 |76
32|6 |62
32|8 |89
32|7 |91
WHERE S.S#=E.S#;
R
SNAME |C#
CLAY | 7
CLAY | 6
THAISZ| 6
THAISZ| 8
THAISZ| 7
|GRADE
| 68
| 76
| 62
| 89
| 91
J3'. SORT-MERGE JOIN: If R.A and S.B are not ordered, sort them first (into R' clustered on A
and S' clustered on B), then apply MERGE (J2 above).
Section 9
# 25
CODE GENERATION
implements the operator, JOIN
J4. HASH-JOIN: RIDs hashed to buckets (pages). Corresponding buckets retrieved and scanned
GRACE JOIN: (first example of a hash-join technique): Allocate M pages of memory to the join
process. Partition the M page frames as follows: One page frame for new pages as they are
read from disk (called that page frame, IN), then the remaining B+1 become hash buckets
R0,..,RB ( Note: B = M-2 ). Use hash function, h(S#)=MODB(S#)
Partial Sort (Build) Phase:
Partially-Sort-S: Read each S-page to IN, hash each record using h(S#) to R0,...,RB. If the
bucket, say Ri, is full (collision), flush it to temporary disk file named Si.
Partially-Sort-E: Read each E-page into IN, hash each record with h(S#). Upon collision in a
bucket, say Rj, flush its' contents to a temporary disk file, named Ej.
Probe Phase: With each pair of temporary files, S0 & E0, S1 & E2, S2 & E2, ... in turn, do as
follows: Re-partition memory with two page frames called IN and OUT. With the rest, form
one large hash area. For Si, construct an internal a hash table in the hash area using another
hash function, k(S#) FOR Ei, PROBE the hash table using k(S#) for matches, output the join
of all matches to OUT.
Section 9
# 26
CODE GENERATION
implements the operator, JOIN
J4. HASH-JOIN: RIDs hashed to buckets (pages). Matching buckets are retrieved and scanned.
R=SELECT SNAME,C#,GRADE
FROM S=STUDENT,E=ENROLL
WHERE S.S#=E.S#;
GRACE JOIN: (first example of a hash-join technique): Allocate M=4 pages of memory to the
join process. Partition the M pages as follows: - One page for IN (putting new pages as they
are read from disk), M-1=3 pages for buckets R0,..,R2. Build Phase.
Use hash function, h(S#)=MOD3(S#)
S
RID|S#|SNAME | LCODE
R0
1,0|17|BAID |NY2091
S0 57|BROWN |NY2092
1,1|25|CLAY |NJ5101
R1
2,0|32|THAISZ|NJ5102
2,1|57|BROWN |NY2092
3,0|38|GOOD |FL6321
R2
2,0|32|THAISZ|NJ5102
3,0|38|GOOD
|FL6321
1,0|17|BAID
|NY2091
2,1|57|BROWN |NJ5101
|NY2092
1,1|25|CLAY
E
RID|S#|C#|GR| LCODE
1,0|17|5 |96|NJ5101
1,1|25|7 |68|ND4456
2,0|25|6 |76|NY2091
2,1|32|8 |89|NY2091
3,0|32|7 |91|FL6320
3,1|34|6 |62|ND4456
IN
Collision! Dump R2 Then flush all.
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
Do the same with E.
Partial-Sort-S:
Read each S-page to IN, hash
each record using h(S#) to R0,R1,R2.
Upon collision in any of the buckets,
flush to temporary disk file,
called S0,S1,S2.
|NJ5101
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
E1
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 27
CODE GENERATION
implements the operator, GRACE
JOIN
Hash Phase: With each pair of temporary files, S0 & E0, S1 & E2, S2 & E2,... in turn, do as follows:
Re-partition memory into IN, OUT and one large hash area.
Probe Phase: For Si, BUILD internal a hash table in the hash area using another hash function, k(S#)=MOD4
(open addr for collisions)
FOR Ei, PROBE hash table using k for matches,
output join of matches to OUT.
Start with S0 and E0. But E0 empty (no output will be produced) so skip.
PROBE S1 and E1: 1. Read S1 page-1 to IN. 2. Hash IN to HASH.
OUT
0
1
2
3
25|CLAY
|NJ5101
25|CLAY
|NJ5101
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 28
CODE GENERATION
implements GRACE JOIN
PROBE S1, E1: 1. Read S1 page-1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
CLAY|7 |68
CLAY|6 |76
0
1 25|CLAY |NJ5101
2
3
25|7 |68|ND4456
25|6 |76|NY2091
OUT
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 29
CODE GENERATION
PROBE S1, E1:
implements GRACE JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT (Since OUT is full, flush OUT first.). But no match!
8. Flush HASH and IN when done with S1, E1 Probe (before starting Probe S2, E2).
CLAY|7 |68
CLAY|6 |76
0
1 25|CLAY |NJ5101
2
3
34|6 |62|ND4456
OUT
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 30
CODE GENERATION
PROBE S1, E1:
implements GRACE JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
PROBE S2, E2:
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
CLAY|7 |68
CLAY|6 |76
0
1
2
3
17|BAID |NY2091
32|THAISZ|NJ5102
OUT
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 31
CODE GENERATION
PROBE S1, E1:
implements GRACE JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
PROBE S2, E2:
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
CLAY|7 |68
CLAY|6 |76
0
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
38|GOOD |FL6321
OUT
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 32
CODE GENERATION
PROBE S1, E1:
PROBE S2, E2:
implements GRACE JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
5.Read E2 to IN. 6.Hash to HASH. 7. If match, Concatenate to OUT...
CLAY|7 |68BAID |5|96
CLAY|6 |76
0 38|GOOD |FL6321
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
17|5 |96|NJ5101
32|8 |89|NY2091
OUT
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 33
CODE GENERATION
PROBE S1, E1:
PROBE S2, E2:
CLAY|7 |68
CLAY|6 |76
implements GRACE JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
5.Read E2 to IN. 6.Hash to HASH. 7. If match, Concatenate to OUT...
BAID |5|96
THAISZ|8|89
38|GOOD |FL6321
0
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
32|8 |89|NY2091
OUT
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 34
CODE GENERATION
PROBE S1, E1:
PROBE S2, E2:
CLAY |7|68
CLAY |6|76
implements GRACE JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
5.Read E2 to IN. 6.Hash to HASH.
7. If match, Concatenate to OUT...(repeat until E2 empty)
8. Flush HASH and IN when done Probing E1.
THAISZ|7|91
BAID |5|96
THAISZ|8|89
0 38|GOOD |FL6321
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
32|7 |91|FL6320
OUT
S0
57|BROWN |NY2092
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 35
CODE GENERATION
implements the operator, HYBRID
HASH JOIN
J4. HASH-JOIN: (a better way than GRACE JOIN):
HYBRID HASH JOIN of SS#E (developed by former chair of NDSU CS Dept., Dr. L. Shapiro):
Partition the M pages of main memory allocated to the join process as follows: One page for the IN, 0ne
page for the OUT, B pages for hash buckets, R1..RB, Leave the rest for a large internal hash table, R0.
PARTIAL SORT S: Read each S page to IN, hash each record using h(S#)=MODB(S#) to R0..RB. If a record
hashes to R0, apply an internal hash function, k(S#) which hashes the record to a slot in R 0. Use open
addressing for k-collisions. When an h-collision occurs in any page Ri i=1..B, flush that page to a disk
file, called Si i=1..B.
PARTIALLY SORT E: Read each E page into IN, hash each record with h(S#) to R0..RB If record hashes to
R0, apply internal hash function k(S#), and concatenate records that match to OUT. If a collision occurs
in any page, Ri i=1..B, flush to temporary disk file, Ei i=1..B.
PROBE PHASE for pairs, Si and Ei i=1..B, do the same as in Grace Join.
Hybrid Hash Join can be done with Bit Filtering to eliminate non-participating tuples early and avoid wasted
processing of non-participating tuples Much more detail and example walkthroughs can be found in the
HTML version of these notes (also available from "Other Materials"
http://www.cs.ndsu.nodak.edu/~perrizo/classes/765/09query.html
Section 9
# 36
CODE GENERATION
h=MOD3 k=MOD4
implements the operator, HYBRID HASH JOIN
SELECT SNAME,C#,GRADE FROM S,E WHERE S.S#=E.S#;
Read each S-page to IN, hash each record using h(S#)=MOD3(S#) to R0,R1,R2. If h(S#)=0, k(S#) determines
R0 slot (open addressing for collisions in R0). h collisions in R1, R2 flush to file, S1, S2 resp.
Similarly (note all h hashes goes to R1 or R2, So all E records flush to E1 and E2 and R0 is flushed too.
S
E
RID|S#|SNAME | LCODE
1,0|17|BAID |NY2091
1,1|25|CLAY |NJ5101
2,0|32|THAISZ|NJ5102
2,1|57|BROWN |NY2092
3,0|38|GOOD |FL6321
RID|S#|C#|GR| LCODE
1,0|17|5 |96|NJ5101
1,1|25|7 |68|ND4456
2,0|25|6 |76|NY2091
2,1|32|8 |89|NY2091
3,0|32|7 |91|FL6320
3,1|34|6 |62|ND4456
S0
OUT
0
1 57|BROWN |NY2092
2
3
R0
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
R1
25|7
25|6
34|6
17|5
E2 32|8
32|7
E1
R2
2,0|32|THAISZ|NJ5102
3,0|38|GOOD
|FL6321
1,0|17|BAID
|NY2091
2,1|57|BROWN |NJ5101
|NY2092
1,1|25|CLAY
|NJ5101
IN
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 37
CODE GENERATION
implements HYBRID JOIN
PROBE S1, E1: 1. Read S1 page-1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
S0
CLAY|7 |68
CLAY|6 |76
0
1 25|CLAY |NJ5101
2
3
25|7 |68|ND4456
25|6 |76|NY2091
OUT
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 38
CODE GENERATION
PROBE S1, E1:
implements HYBRID JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT (Since OUT is full, flush OUT first.). But no match!
8. Flush HASH and IN when done with S1, E1 Probe (before starting Probe S2, E2).
S0
CLAY|7 |68
CLAY|6 |76
0
1 25|CLAY |NJ5101
2
3
34|6 |62|ND4456
OUT
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 39
CODE GENERATION
PROBE S1, E1:
implements HYBRID JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
PROBE S2, E2:
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
S0
CLAY|7 |68
CLAY|6 |76
0
1
2
3
17|BAID |NY2091
32|THAISZ|NJ5102
OUT
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 40
CODE GENERATION
PROBE S1, E1:
implements HYBRID JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
PROBE S2, E2:
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
S0
CLAY|7 |68
CLAY|6 |76
0
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
38|GOOD |FL6321
OUT
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 41
CODE GENERATION
PROBE S1, E1:
PROBE S2, E2:
implements HYBRID JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
5.Read E2 to IN. 6.Hash to HASH. 7. If match, Concatenate to OUT...
S0
CLAY|7 |68BAID |5|96
CLAY|6 |76
0 38|GOOD |FL6321
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
17|5 |96|NJ5101
32|8 |89|NY2091
OUT
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 42
CODE GENERATION
PROBE S1, E1:
PROBE S2, E2:
implements HYBRID JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
5.Read E2 to IN. 6.Hash to HASH. 7. If match, Concatenate to OUT...
S0
CLAY|7 |68
CLAY|6 |76
BAID |5|96
THAISZ|8|89
38|GOOD |FL6321
0
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
32|8 |89|NY2091
OUT
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 43
CODE GENERATION
PROBE S1, E1:
PROBE S2, E2:
implements HYBRID JOIN
1. Read S1 pg1 to IN. 2. Hash IN to HASH. 3. Read E1 to IN 4. Concatenate to OUT.
5. Read (next page of) E1 to IN. 6. Hash to HASH.
7. If match, Concatenate to OUT... 8. Flush HASH and IN when done Probing E1.
1. Read S2 pg1 to IN 2. MOD3 hash IN to HASH (open addressing for collisions).
3. Read S2 pg2 to IN 4. MOD3 hash IN to HASH (open addressing for collisions).
5.Read E2 to IN. 6.Hash to HASH.
7. If match, Concatenate to OUT...(repeat until E2 empty)
8. Flush HASH and IN when done Probing E1.
S0
CLAY |7|68
CLAY |6|76
THAISZ|7|91
BAID |5|96
THAISZ|8|89
0 38|GOOD |FL6321
1
2 17|BAID |NY2091
3 32|THAISZ|NJ5102
32|7 |91|FL6320
OUT
S1
25|CLAY
S2
17|BAID |NY2091
32|THAISZ|NJ5102
38|GOOD |FL6321
E0
25|7
25|6
34|6
17|5
E2 32|8
32|7
Hash E1
IN
|NJ5101
|68|ND4456
|76|NY2091
|62|ND4456
|96|NJ5101
|89|NY2091
|91|FL6320
Section 9
# 44
CODE generation of HYBRID HASH JOIN using R2 as internal hash bucket!
h=MOD3 k=MOD4
SELECT SNAME,C#,GRADE
FROM S,E WHERE S.S#=E.S#;
BUILD S: Read each S-page to IN, hash each record using h(S#)=MOD3(S#) to R0,R1,R2. If h(S#)=2, k(S#)
determines R2 slot (open addressing for R2). h collisions in R1, R0 flush to file, S1, S0 resp.
S
E
RID|S#|SNAME | LCODE
1,0|17|BAID |NY2091
1,1|25|CLAY |NJ5101
2,0|32|THAISZ|NJ5102
2,1|57|BROWN |NY2092
3,0|38|GOOD |FL6321
RID|S#|C#|GR| LCODE
1,0|17|5 |96|NJ5101
1,1|25|7 |68|ND4456
2,0|25|6 |76|NY2091
2,1|32|8 |89|NY2091
3,0|32|7 |91|FL6320
3,1|34|6 |62|ND4456
S0
S1
57|BROWN |NY2092
25|CLAY
|NJ5101
OUT
R0
R1
R2
0
1
2
3
2,0|32|THAISZ|NJ5102
1,0|17|BAID |FL6321
|NY2091
3,0|38|GOOD
2,1|57|BROWN |NJ5101
|NY2092
1,1|25|CLAY
E0
E1
IN
Section 9
# 45
CODE generation implements HYBRID HASH JOIN using R2 as internal hash bucket!
h=MOD3 k=MOD4 SELECT SNAME,C#,GRADE
FROM S,E WHERE S.S#=E.S#;
BUILD E: Read first E-page to IN, hash each record using h(S#)=MOD3(S#) to R0,R1,R2. If h(S#)=2, k(S#)
determines R2 slot (open addressing for R2). If match, concatenate to OUT. h collisions in R1, R0 flush
to file, E1, E0 respectively.
S
E
RID|S#|SNAME | LCODE
1,0|17|BAID |NY2091
1,1|25|CLAY |NJ5101
2,0|32|THAISZ|NJ5102
2,1|57|BROWN |NY2092
3,0|38|GOOD |FL6321
RID|S#|C#|GR| LCODE
1,0|17|5 |96|NJ5101
1,1|25|7 |68|ND4456
2,0|25|6 |76|NY2091
2,1|32|8 |89|NY2091
3,0|32|7 |91|FL6320
3,1|34|6 |62|ND4456
57|BROWN |NY2092
S0
S1 S1
BAID
|5 |96
OUT
25|CLAY
|NJ5101
S2
R0
R1
R2
0
1
2
3
E0
32|THAISZ|NJ510
2
17|BAID
|NY2091
38|GOOD |FL6321
1,0|17|5 |96|NJ5101
1,1|25|7 |68|ND4456
E1
IN
Section 9
# 46
CODE GEN
implements HYBRID
h=MOD3 k=MOD4 SELECT SNAME,C#,GRADE
HASH JOIN using R2 as internal hash bucket!
FROM S,E WHERE S.S#=E.S#;
BUILD E: Read second E-page to IN, hash each record using h(S#)=MOD3(S#) to R0,R1,R2. If h(S#)=2,
k(S#) determines R2 slot (open addressing for R2). If match, concatenate to OUT. h collisions in R1, R0
flush to file, E1, E0 respectively.
S
E
RID|S#|SNAME | LCODE
1,0|17|BAID |NY2091
1,1|25|CLAY |NJ5101
2,0|32|THAISZ|NJ5102
2,1|57|BROWN |NY2092
3,0|38|GOOD |FL6321
RID|S#|C#|GR| LCODE
1,0|17|5 |96|NJ5101
1,1|25|7 |68|ND4456
2,0|25|6 |76|NY2091
2,1|32|8 |89|NY2091
3,0|32|7 |91|FL6320
3,1|34|6 |62|ND4456
57|BROWN |NY2092
S0
S1 S1
BAID |5 |96
THAISZ|8 |89
OUT
25|CLAY
|NJ5101
S2
R0
25|7 |68|ND4456
R1
R2
0
1
2
3
E0
32|THAISZ|NJ510
2
17|BAID
|NY2091
38|GOOD |FL6321
2,0|25|6 |76|NY2091
2,1|32|8 |89|NY2091
E1
IN
Section 9
# 47
CODE GEN
implements HYBRID
h=MOD3 k=MOD4 SELECT SNAME,C#,GRADE
HASH JOIN using R2 as internal hash bucket!
FROM S,E WHERE S.S#=E.S#;
BUILD E: Read third E-page to IN, hash each record using h(S#)=MOD3(S#) to R0,R1,R2. If h(S#)=2, k(S#)
determines R2 slot (open addressing for R2). If match, concatenate to OUT. h collisions in R1, R0 flush
to file, E1, E0 respectively. When done building E, flush R2.
S
E
RID|S#|SNAME | LCODE
1,0|17|BAID |NY2091
1,1|25|CLAY |NJ5101
2,0|32|THAISZ|NJ5102
2,1|57|BROWN |NY2092
3,0|38|GOOD |FL6321
RID|S#|C#|GR| LCODE
1,0|17|5 |96|NJ5101
1,1|25|7 |68|ND4456
2,0|25|6 |76|NY2091
2,1|32|8 |89|NY2091
3,0|32|7 |91|FL6320
3,1|34|6 |62|ND4456
57|BROWN |NY2092
S0
S1 S1
BAID |5 |96
THAISZ|7 |91
THAISZ|8 |89
OUT
25|CLAY
|NJ5101
S2
R0
R1
R2
0
1
2
3
25|7 |68|ND4456
25|6 |76|NY2091
32|THAISZ|NJ510
2
17|BAID
|NY2091
38|GOOD |FL6321
3,0|32|7 |91|FL6320
3,1|34|6 |62|ND4456
E0
E1
IN
25|7 |68|ND4456
25|6 |76|NY2091
34|6 |62|ND4456
Section 9
# 48
CODE GEN
implements HYBRID
HASH JOIN using R2 as internal hash bucket probe:
Note: If memory allocation is static, use all Ri pages for internal hash function! So k=MOD 8
h=MOD3 k=MOD4 SELECT SNAME,C#,GRADE
FROM S,E WHERE S.S#=E.S#;
PROBE S:
Notice how much more efficient the probe phase of HH JOIN is than Grace JOIN when the internal
Hash table is chosen to apply to the right bucket!
(And how it may not be faster, if that decision is badly made!)
57|BROWN |NY2092
S0
S1 S1
BAID |5 |96
THAISZ|8 |89
THAISZ|6
THAISZ|7 |76
|91
CLAY |7 |68
R2
OUT
0
1
2
3
4
5
6
7
25|CLAY
|NJ5101
S2
E0
E1
25|7 |68|ND4456
25|6 |76|NY2091
IN
25|7 |68|ND4456
25|6 |76|NY2091
34|6 |62|ND4456
Section 9
# 49