CS 579 Database Systems

Download Report

Transcript CS 579 Database Systems

Theory, Practice & Methodology
of Relational Database
Design and Programming
Copyright © Ellis Cohen 2002-2008
Outer Joins
These slides are licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 License.
For more information on how you may use them,
please see http://www.openlineconsult.com/db
1
Overview of Lecture
Natural Outer Joins
1:M Outer Joins
Un-Natural Outer Joins
Outer Join Conditions
© Ellis Cohen 2001-2008
2
Natural Outer Joins
© Ellis Cohen 2001-2008
3
Inner Joins are Restrictive
CoolEmps
empno
7782
7698
Given Emps, a table of employees, and
CoolEmps, a list of the cool employees,
what are the jobs of the cool employees
Restricts Emps information to just
those employees in CoolEmps
Emps
empno
ename
job
deptno
7782
CLARK
CLERK
10
7369
SMITH
CLERK
20
7499
ALLEN
SALESMAN
30
7698
BLAKE
MANAGER
30
EMPNO
----7782
7698
JOB
--------CLERK
MANAGER
SELECT empno, job FROM
Emps NATURAL JOIN CoolEmps
© Ellis Cohen 2001-2008
4
Joins Do Matching
CoolEmps
empno
7782
If CoolEmps can contain employees who
are no longer with the company,
7401
the join shows information only about
employees who are in both tables
7698
8315
Emps
empno
ename
job
deptno
7782
CLARK
CLERK
10
7369
SMITH
CLERK
20
7499
ALLEN
SALESMAN
30
7698
BLAKE
MANAGER
30
EMPNO
----7782
7698
© Ellis Cohen 2001-2008
JOB
--------CLERK
MANAGER
5
Can We Make Joins Inclusive?
CoolEmps
empno
7782
Suppose we want our result to include a
tuple for every employee in CoolEmps,
whether they still are employed or not.
If they are still employed, we still want to
show their job.
7401
7698
If they aren't still employed, they don't have
a job, so just show NULL for their job
8315
How do we generate this?
Emps
empno
ename
job
deptno
7782
CLARK
CLERK
10
7369
SMITH
CLERK
20
7499
ALLEN
SALESMAN
30
7698
BLAKE
MANAGER
30
EMPNO
----7782
7401
7698
8315
© Ellis Cohen 2001-2008
JOB
--------CLERK
MANAGER
6
Outer Join = Inner Join + Remainder
CoolEmps
Inner
Join
empno
7782
7401
7698
CoolEmps
not in Inner
Join
8315
Emps
empno
ename
job
deptno
7782
CLARK
CLERK
10
7369
SMITH
CLERK
20
7499
ALLEN
SALESMAN
30
7698
BLAKE
MANAGER
30
EMPNO
----7782
7698
JOB
--------CLERK
MANAGER
EMPNO JOB
----- --------7401
8315
EMPNO
----7782
7401
7698
8315
© Ellis Cohen 2001-2008
JOB
--------CLERK
MANAGER
7
Outer Joins are Inclusive
CoolEmps
Include all tuples in the left operand
empno
7782
7401
What are the jobs of the cool employees
(showing jobs of NULL for those who
are no longer with the company)?
7698
8315
Emps
empno
ename
job
deptno
7782
CLARK
CLERK
10
7369
SMITH
CLERK
20
7499
ALLEN
SALESMAN
30
7698
BLAKE
MANAGER
30
EMPNO
----7782
7401
7698
8315
JOB
--------CLERK
MANAGER
SELECT empno, job FROM
CoolEmps NATURAL LEFT JOIN Emps
© Ellis Cohen 2001-2008
8
Right vs Left Joins
SELECT empno, job
FROM Emps NATURAL RIGHT JOIN CoolEmps
Equivalent
SELECT empno, job FROM
CoolEmps NATURAL LEFT JOIN Emps
© Ellis Cohen 2001-2008
9
1:M Outer Joins
© Ellis Cohen 2001-2008
10
Including all Children in 1:M
Show the names and department names of all employees.
Include employees who are unassigned
Emps
*
Depts
empno
ename
… deptno
7782
CLARK
…
10
10
ACCOUNTING
…
7499
ALLEN
…
30
20
RESEARCH
…
8214
JOAN
…
30
SALES
…
7698
BLAKE
…
40
OPERATIONS
…
deptno
30
dname
…
included in inner join
* not included in inner join,
but wanted in the result set
The outer join provides extended information about a tuple
(e.g. an employee), without losing information about tuples
which do no match (e.g. are unassigned)
© Ellis Cohen 2001-2008
11
1:M Child Outer Joins
Emps
*
Depts
empno
ename
… deptno
7782
CLARK
…
10
10
ACCOUNTING
…
7499
ALLEN
…
30
20
RESEARCH
…
8214
JOAN
…
30
SALES
…
7698
BLAKE
…
40
OPERATIONS
…
Show the names
and department
names of
all employees.
Include employees
who are
unassigned.
30
deptno
dname
ENAME
----CLARK
ALLEN
JOAN
BLAKE
…
DNAME
---------ACCOUNTING
SALES
SALES
SELECT ename, dname
FROM Emps NATURAL LEFT JOIN Depts
© Ellis Cohen 2001-2008
12
Including all Parents in 1:M
Show names and department names of the employees in each
department.
Include departments with no employees.
Depts
deptno
*
*
dname
…
Emps
empno
ename
… deptno
10
ACCOUNTING
…
7782
CLARK
…
10
20
RESEARCH
…
7499
ALLEN
…
30
30
SALES
…
8214
JOAN
…
40
OPERATIONS
…
7698
BLAKE
…
30
included in inner join
* not included in inner join,
but wanted in the result set
© Ellis Cohen 2001-2008
13
1:M Parent Outer Joins
Depts
deptno
*
*
dname
…
Emps
empno
ename
… deptno
10
ACCOUNTING
…
7782
CLARK
…
10
20
RESEARCH
…
7499
ALLEN
…
30
30
SALES
…
8214
JOAN
…
40
OPERATIONS
…
7698
BLAKE
…
DNAME
-----------ACCOUNTING
RESEARCH
SALES
SALES
OPERATIONS
ENAME
-----CLARK
ALLEN
BLAKE
30
Show names and
department names of
the employees
in each department.
Include departments
with no employees.
SELECT dname, ename
FROM Depts NATURAL LEFT JOIN Emps
© Ellis Cohen 2001-2008
14
Outer Join = Inner Join + Remainder
Depts
deptno
*
*
dname
Emps
…
empno
ename
… deptno
10
ACCOUNTING
…
7782
CLARK
…
10
20
RESEARCH
…
7499
ALLEN
…
30
30
SALES
…
8214
JOAN
…
40
OPERATIONS
…
7698
BLAKE
…
DNAME
-----------ACCOUNTING
SALES
SALES
ENAME
-----CLARK
ALLEN
BLAKE
DNAME
ENAME
------------ -----RESEARCH
OPERATIONS
Inner
Join
Depts
not in Inner
Join
© Ellis Cohen 2001-2008
30
DNAME
-----------ACCOUNTING
RESEARCH
SALES
SALES
OPERATIONS
ENAME
-----CLARK
ALLEN
BLAKE
15
Types of Joins
Inner Join - when joining A & B, a row in A
has a corresponding row in the result only if
it matches some row in B (& vice versa)
Outer Joins:
LEFT JOIN - join A & B, but every row in A
has at least one corresponding row in the
result
RIGHT JOIN - join A & B, but every row in B
has at least one corresponding row in the
result
FULL JOIN - join A & B, but every row in A
and every row in B has at least one
corresponding row in the result
© Ellis Cohen 2001-2008
16
Including All Parents & Children
Show the names and department names of
all employees in all departments,
including employees who are unassigned,
and
including departments with no employees
Depts
deptno
*
*
dname
…
Emps
empno
ename
… deptno
10
ACCOUNTING
…
7782
CLARK
…
10
20
RESEARCH
…
7499
ALLEN
…
30
30
SALES
…
8214
JOAN
…
40
OPERATIONS
…
7698
BLAKE
…
© Ellis Cohen 2001-2008
*
30
17
1:M Full Join
Depts
deptno
*
*
dname
…
Emps
empno
ename
… deptno
10
ACCOUNTING
…
7782
CLARK
…
10
20
RESEARCH
…
7499
ALLEN
…
30
30
SALES
…
8214
JOAN
…
40
OPERATIONS
…
7698
BLAKE
…
*
30
DNAME
ENAME
------------ -----JOAN
ACCOUNTING
CLARK
RESEARCH
SALES
ALLEN
SALES
BLAKE
OPERATIONS
SELECT dname, ename
FROM Depts NATURAL FULL JOIN Emps
© Ellis Cohen 2001-2008
18
Grouping with Outer Joins Question
SELECT deptno, count(*) AS knt
FROM Emps
GROUP BY deptno
Counts # of employees in each
department (that has employees)
SELECT deptno, count(empno) AS knt
FROM (Depts NATURAL LEFT JOIN Emps)
GROUP BY deptno
What does this return?
Why use count(empno) instead of count(*)?
© Ellis Cohen 2001-2008
19
Outer Join & Group
Depts
deptno
*
*
dname
…
Emps
empno
ename
… deptno
10
ACCOUNTING
…
7782
CLARK
…
10
20
RESEARCH
…
7499
ALLEN
…
30
30
SALES
…
8214
JOAN
…
40
OPERATIONS
…
7698
BLAKE
…
Depts NATURAL LEFT JOIN Emps
(Just show deptno & empno)
DEPTNO EMPNO
------ -----10
7782
20
30
7499
30
7698
40
30
GROUP BY deptno
deptno
10
20
30
40
count(*)
1
1
2
1
count(
empno)
1
0
2
0
# of employees in each dept!
© Ellis Cohen 2001-2008
20
Count Unassigned Employees
Depts
deptno
*
*
dname
…
Emps
empno
ename
… deptno
10
ACCOUNTING
…
7782
CLARK
…
10
20
RESEARCH
…
7499
ALLEN
…
30
30
SALES
…
8214
JOAN
…
40
OPERATIONS
…
7698
BLAKE
…
Depts NATURAL FULL JOIN Emps
DEPTNO EMPNO
------ -----8214
10
7782
20
30
7499
30
7698
40
*
30
GROUP BY deptno
deptno
10
20
30
40
© Ellis Cohen 2001-2008
count(*)
count(
empno)
1
1
1
2
1
1
1
0
2
0
21
Un-Natural Outer Joins
© Ellis Cohen 2001-2008
22
Un-Natural Outer Joins
For each project, list its name, and the name of its
manager (show NULL if there is no manager)
SELECT pname, ename
FROM (Projs LEFT JOIN Emps ON pmgr = empno)
Projs
pno
pname
pmgr
persons
budget
pstart
pend
Emps
empno
ename
job
mgr
hiredate
sal
comm
deptno
© Ellis Cohen 2001-2008
23
Counts & Outer Joins?
What's the difference between the various counts?
SELECT count(*) AS tknt,
count(pno) AS pknt,
count(pmgr) AS pmknt,
count(empno) AS eknt,
count(DISTINCT pmgr) AS xpmknt,
count(DISTINCT empno) AS xeknt
FROM (Projs LEFT JOIN Emps ON pmgr = empno)
PNO
--1
2
3
4
5
6
PMGR
---2311
2955
4001
2311
2311
Emps
Projs
pno
pname
pmgr
persons
budget
pstart
pend
empno
ename
job
mgr
hiredate
sal
comm
deptno
100 employees
© Ellis Cohen 2001-2008
24
Counts & Outer Joins
SELECT count(*) AS tknt,
count(pno) AS pknt,
count(pmgr) AS pmknt,
count(empno) AS eknt,
count(DISTINCT pmgr) AS xpmknt,
count(DISTINCT empno) AS xeknt
FROM (Projs LEFT JOIN Emps ON pmgr = empno)
PNO
--1
2
3
4
5
6
PMGR
---2311
2955
4001
2311
2311
tknt
pknt
pmknt
eknt
xpmknt
xeknt
# projects: 6
# projects: 6
# managed projects: 5
# managed project s: 5
# project managers: 3
# project managers: 3
© Ellis Cohen 2001-2008
25
Reverse Outer Joins
For each employee, list their name
and the name of the projects they manage,
including employees who manage no projects
SELECT pname, ename
FROM (Emps LEFT JOIN Projs ON pmgr = empno)
Emps
empno
ename
job
mgr
hiredate
sal
comm
deptno
Projs
pno
pname
pmgr
persons
budget
pstart
pend
© Ellis Cohen 2001-2008
26
Counts & Reverse Outer Joins?
What's the difference between the various counts?
SELECT count(*) AS tknt,
count(pno) AS pknt,
count(pmgr) AS pmknt,
count(empno) AS eknt,
count(DISTINCT pmgr) AS xpmknt,
count(DISTINCT empno) AS xeknt
FROM (Emps LEFT JOIN Projs ON pmgr = empno)
Emps
empno
ename
job
mgr
hiredate
sal
comm
deptno
Projs
pno
pname
pmgr
persons
budget
pstart
pend
PNO
--1
2
3
4
5
6
PMGR
---2311
2955
4001
2311
2311
100 employees
© Ellis Cohen 2001-2008
27
Counts & Reverse Outer Joins
SELECT count(*) AS tknt,
count(pno) AS pknt,
count(pmgr) AS pmknt,
count(empno) AS eknt,
count(DISTINCT pmgr) AS xpmknt,
count(DISTINCT empno) AS xeknt
FROM (Emps LEFT JOIN Projs ON pmgr = empno)
tknt
pknt
pmknt
eknt
xpmknt
xeknt
# employees: 100
# managed projects: 5
# managed projects: 5
# employees: 100
# project managers: 3
# employees: 100
© Ellis Cohen 2001-2008
28
Grouped Counts &
Reverse Outer Joins?
What's the difference between the various counts?
(Assume that some projects are unmanaged)
SELECT deptno,
count(*) AS tknt,
count(pmgr) AS pmknt,
count(DISTINCT pmgr) AS xpmknt
FROM (Emps LEFT JOIN Projs ON pmgr = empno)
GROUP BY deptno
Emps
empno
ename
job
mgr
hiredate
sal
comm
deptno
Projs
pno
pname
pmgr
persons
budget
pstart
pend
© Ellis Cohen 2001-2008
29
Grouped Counts &
Reverse Outer Joins
SELECT deptno,
count(*) AS tknt,
count(pmgr) AS pmknt,
count(DISTINCT pmgr) AS xpmknt
FROM (Emps LEFT JOIN Projs ON pmgr = empno)
GROUP BY deptno
For each department
tknt
# employees in that department
pknt
# projects managed by employees in
that department
xpmknt
# project managers in that department
© Ellis Cohen 2001-2008
30
Outer Join
Conditions
© Ellis Cohen 2001-2008
31
Outer Join Conditions vs. Restrictions?
For each budgeted project, list its name, and the
name of its manager (show NULL if there is no
manager)
SELECT pname, ename
FROM (Projs LEFT JOIN Emps ON pmgr = empno)
WHERE budget IS NOT NULL
What is the meaning of this?
SELECT pname, ename
FROM (Projs LEFT JOIN Emps
ON (pmgr = empno) AND
(budget IS NOT NULL))
© Ellis Cohen 2001-2008
32
JOIN + WHERE
Projs
Emps
pno
pname
budget
pmgr
empno
ename
1
P1
1000
1111
1111
ALLEN
2
3
P2
P3
2000
3333
2222
3333
MARTIN
BLAKE
4
P4
4444
KING
Projs LEFT JOIN Emps ON pmgr = empno
PNAME BUDGET ENAME
----- ------ ----P1
1000 ALLEN
P2
2000
P3
BLAKE
P4
PNAME ENAME
----- ----P1
ALLEN
P2
WHERE budget IS NOT NULL
© Ellis Cohen 2001-2008
33
Just JOIN ON
Projs
Emps
pno
pname
budget
pmgr
empno
ename
1
P1
1000
1111
1111
ALLEN
2
3
P2
P3
2000
3333
2222
3333
MARTIN
BLAKE
4
P4
4444
KING
Projs LEFT JOIN Emps ON
pmgr = empno AND budget IS NOT NULL
PNAME ENAME
----- ----P1
ALLEN
P2
P3
P4
Lists all projects
ENAME shown only if
• project is managed
• project has budget
© Ellis Cohen 2001-2008
34
Outer Join Conditions vs. Restrictions
For each budgeted project, list its name, and the
name of its manager (show NULL if there is no
manager)
SELECT pname, ename
FROM (Projs LEFT JOIN Emps ON pmgr = empno)
WHERE budget IS NOT NULL
For each project, list its name. If the project is
budgeted, also show the name of the manager if
there is one (else shown NULL)
SELECT pname, ename
FROM (Projs LEFT JOIN Emps
ON (pmgr = empno) AND
(budget IS NOT NULL))
© Ellis Cohen 2001-2008
35
Aggregation Problem
Write SQL
For each employee,
list the # of budgeted projects
they manage (show 0 for
employees that don't manage
budgeted projects)
© Ellis Cohen 2001-2008
36
Answer: Aggregation Problem
For each employee, list the # of budgeted projects
they manage (show 0 for employees that don't
manage budgeted projects)
SELECT empno, count(pno) AS knt
FROM (Emps LEFT JOIN Projs
ON empno = pmgr
AND budget IS NOT NULL)
GROUP BY empno
SELECT empno, count(
CASE WHEN budget IS NOT NULL
THEN pno END)
FROM (Emps LEFT JOIN Projs
ON empno = pmgr)
GROUP BY empno
© Ellis Cohen 2001-2008
37
Related Problems & Answers
For every employee that manages
budgeted projects, list the # they
manage
SELECT pmgr, count(*) FROM Projs
WHERE budget IS NOT NULL
GROUP BY pmgr
For every employee that manages
projects, list the # of budgeted ones
they manage
SELECT pmgr, count(
CASE WHEN budget IS NOT NULL
THEN pno END)
FROM Projs
GROUP BY pmgr
© Ellis Cohen 2001-2008
38