Learning Through Writing
Download
Report
Transcript Learning Through Writing
Banner and the SQL Select
Statement: Part Four (Multiple
Connected Select Statements)
Mark Holliday
Department of Mathematics and
Computer Science
Western Carolina University
18 November 2005
(updated: 18 November 2005)
Outline
The Goal
The Concepts
A First Example
Single Table Selects
Joins
Multiple Connected Select Statements
A First Example
Outline
The Relational Model: Single Table
Lab 1: TOAD, Schema Browser
Some Structured Query Language (SQL)
Basics
Lab 2: TOAD, SQL Editor
Single Table Selects
Outline
WHERE clause: single condition, multiple
conditions
Lab 3:
Order By; Aggregate Functions
Lab 4:
Group By; Having
Lab 5:
Joins
Outline
Why multiple tables?
Inner Joins
Lab 6:
Outer joins
Lab 7:
Multiple Connected Select Statements
Outline
Set Operators
Lab 8:
Subqueries
Use directly: FROM clause
Use as a set: new operators
Use as a single value: aggregate functions
Lab 9:
A Query Development Methodology
SET Operators
Intuition:
A SQL Select statement returns a
table
A table is a set
we can put a set operator in between
two select statements to create a new
set (that is, table)
SET Operators
(franz)
Types of Set Operators :
UNION – return all rows, exclusing
duplicates
UNION ALL -- returns all rows, including
duplicates
INTERSECT -- returns rows retrieved in
both queries
MINUS -- returns remaining rows when
results of second query are subtracted
from the first query
Set Operators
Two restrictions of set operators are:
The two tables must contain the same
number of columns.
All corresponding columns in the two
tables need to be of the same data
type.
Example Tables
Relation (stvrelt) table (13 rows)
(franz)
Legacy (stvlgcy) table (10 rows)
UNION
(franz)
The purpose of the SQL UNION command
is to combine the results of two
queries.
In this respect, UNION is similar to
JOIN
they are both used to combine related
information from multiple tables.
UNION
(franz)
When using UNION, only distinct values are
selected (similar to SELECT DISTINCT).
The syntax is as follows:
[SQL Statement 1]
UNION
[SQL Statement 2]
UNION
SELECT stvrelt_code, stvrelt_desc
varchar(1); varchar(30)
(franz)
FROM stvrelt
Relation table
UNION
SELECT stvlgcy_code, stvlgcy_desc
varchar(1); varchar(30)
FROM stvlgcy
Legacy
table
Note: The default resultant set from UNION is
DISTINCT rows.
UNION ALL
(franz)
The difference between UNION ALL and
UNION is that,
while UNION only selects distinct values,
UNION ALL selects all values.
The syntax for UNION ALL is as follows:
[SQL Statement 1]
UNION ALL
[SQL Statement 2]
UNION ALL
(franz)
Here is our example using Banner
tables:
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
UNION ALL
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
INTERSECT
(franz)
Similar to the UNION command, INTERSECT
operates on two SQL statements.
The difference is that, while UNION essentially
acts as an OR operator
(value is selected if it appears in either the first or
the second statement),
the INTERSECT command acts as an AND
operator
(value is selected only if it appears in both
statements).
INTERSECT
The syntax is as follows:
[SQL Statement 1]
INTERSECT
[SQL Statement 2]
(franz)
INTERSECT
(franz)
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
INTERSECT
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
MINUS
(franz)
The MINUS operates on two SQL
statements.
It takes all the results from the first
SQL statement, and then subtract out
the ones that are present in the second
SQL statement to get the final answer.
If the second SQL statement includes
results not present in the first SQL
statement, such results are ignored.
MINUS
The syntax is as follows:
[SQL Statement 1]
MINUS
[SQL Statement 2]
(franz)
MINUS
(franz)
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
MINUS
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
SET Operators
(franz)
A note of special importance:
When using the various SQL SET operators
that we have covered,
it is especially important that you
understand your data!
In the last example,
if we reversed the order of the MINUS
operator,
we would have completely different
results.
SET Operators
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
MINUS
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
(franz)
SET Operators
would return FIVE rows.
Starting from the
“stvlgcy” table, we
would look at “stvrelt”.
Any records in “stvrelt”
that were not in
“stvlgcy” would be
ignored.
The records in “stvrelt”
that were the same as
“stvlgcy” would be
removed.
The items grayed out in
the “stvlgcy” table
would be selected.
(franz)
Laboratory Eight
Objectives:
Develop competence with set operators
Steps:
First Query
Laboratory Eight
Problem: Find the phone numbers
of the people who do not live in
the area code 828 region.
Laboratory Eight
Answer:
SELECT sprtele_phone_number
FROM sprtele
MINUS
SELECT sprtele_phone_number
FROM sprtele
WHERE sprtele_area_code = 828
Subquery
Intuition:
The Set Operators used so far (UNION, UNION
ALL, INTERSECT, MINUS) operate on output tables
(i.e. sets) but only in between select statement
Question: Can we use set operations so as to
operate on an output table within another select
statement?
Answer: Yes!
The select statement inside the outer select
statement is called a subquery or a nested query.
Subquery
Where in a select statement can we nest a
subquery (i.e. where do we use a set)?
The FROM clause since the FROM clause lists
tables
just have one of those tables be
generated from the subquery
FROM Clause Subquery
(franz)
Suppose we want to select non-busineses from ‘spriden’, including
a count of addresses, where there is more than one address
type for that pidm:
Example of a subquery in the FROM clause:
SELECT spriden_id, spriden_last_name, spriden_first_name, p_cnt
FROM spriden,
(SELECT spraddr_pidm, count(spraddr_atyp_code) p_cnt
FROM spraddr
GROUP BY spraddr_pidm)
WHERE spriden_pidm = spraddr_pidm
and p_cnt > 1
and spriden_entity_ind = 'P‘
Subquery
Can we use a subquery anywhere else?
Yes, but we need some help.
The conditions in WHERE, and HAVING
clauses we have seen all use
comparison operators that work on single
values (=, <>, >, <, <=, >=) or
multiple values only in restricted ways (e.g.
LIKE)
Subquery
We need operators to compare a value
with a set of values
the set of values will be the output table of the
subquery
the resulting expressions can be conditions in
the WHERE and HAVING clauses
Introduce new operators that work with
subqueries
IN, NOT IN, ANY, ALL, EXISTS, NOT EXISTS
Special operators for subqueries
(franz)
IN/NOT IN -- Check to see if a value is in a
specified list of values returned in the subquery.
ANY -- Compare a value with any value in a
list.
ALL -- Compare a value with all values in a list.
EXISTS/NOT EXISTS -- Check for the
existence of rows returned by a subquery.
IN
(franz)
Check to see if a value is in a specified list of
values returned in the subquery
SELECT product_id, name
FROM products
WHERE product_id IN
(SELECT product_id
FROM product
WHERE name LIKE ‘%e%’)
NOT IN
(franz)
Check to see if a value is not in a specified
list of values returned in the subquery.
SELECT product_id, name
FROM products
WHERE product_id NOT IN
(SELECT product_id
FROM purchases)
ANY
(franz)
Compare a value with any value in a list.
You have to place an “=, <>, >, <, <=, or >=“
operator before ANY in the query.
SELECT employee_id, last_name
FROM employees
WHERE salary < ANY
(SELECT low_salary
FROM salary_grades)
ALL
(franz)
Compare a value with all values in a list.
You have to place an “=, <>, >, <, <=, or
>=“ operator before ALL in the query.
SELECT employee_id, last_name
FROM employees
WHERE salary > ALL
(SELECT high_salary
FROM salary_grades)
Correlated Subquery
If the table variable declared in the outer
query is used in the subquery
the subquery is said to be correlated
(otherwise, it is uncorrelated)
In an uncorrelated subquery, the
subquery is just evaluated once during
the outer query.
Correlated Subquery
In a correlated subquery, the set of rows
output from the subquery can vary for
each value of the outer table variable.
the subquery is reevaluated for each
value of the outer table variable.
EXISTS and NOT EXISTS tend to be
used in correlated subqueries
as in the examples following
EXISTS
(franz)
Check for the existence of rows returned by
a subquery.
SELECT employee_id, last_name
FROM employees outer
WHERE EXISTS
(SELECT employee_id
FROM employees inner
WHERE inner.manager_id =
outer.employee_id)
EXISTS
(franz)
‘EXISTS’ just checks for the existence of
rows returned by the subquery, not the
actual values.
To make your query run faster, you can
just return a literal value.
EXISTS
(franz)
A re-write of our previous example:
SELECT employee_id, last_name
FROM employees outer
WHERE EXISTS
(SELECT 1
FROM employees inner
WHERE inner.manager_id =
outer.employee_id)
NOT EXISTS
(franz)
Retrieve products that have not been
purchased:
SELECT product_id, name
FROM products outer
WHERE NOT EXISTS
(SELECT 1
FROM purchases inner
WHERE
inner.product_id = outer.product_id)
A Multi-Condition WHERE Clause Subquery
(franz)
The WHERE clause can have conditions besides the one
using the subquery.
Suppose we want to select non-busineses from ‘spriden’ that
do NOT have a record in ‘spbpers’:
SELECT spriden_id, spriden_last_name,
spriden_first_name
FROM spriden
WHERE
spriden_entity_ind = 'P'
AND spriden_pidm not in
(SELECT spbpers_pidm
FROM spbpers)
Single Value Subquery
Question: Can we do even more with subqueries?
More Specific Question: Can we use the output
table generated by a subquery in conditions that
use the single value comparison operators (=, >,
…)?
Answer: Yes! (sometimes)
Some SQL select statements are guaranteed to
return an output table that is a set with only one
value.
Which ones? Those with an aggregate function in
the SELECT clause.
Single Value Subquery
(franz)
SELECT "column_name1"
FROM "table_name"
WHERE "column_name2"
[Comparison Operator]
(SELECT AGGREGATE
FUNCTION("column_name1“)
FROM "table_name"
WHERE [Condition])
[Comparison Operator] can be =, >, <, >=, <=.
or "LIKE."
Single Value Subquery and
HAVING Clause Subquery
(franz)
This is not an example from Banner, but from a made-up table.
SELECT product_type_id, AVG(price)
FROM products
GROUP BY product_type_id
HAVING AVG(price) <
(SELECT MAX(AVG(price))
FROM products
GROUP BY product_type_id)
The following data illustrates this subquery example...
Single Value Subquery
The same table alias is not used in both the
outer query and the subquery
=> the query is uncorrelated
=> the subquery only needs to be evaluated
once
Single Value Subquery
(franz)
In the products table, there are multiple rows
for each product_type_id.
Each row has a price, along with other
information (which is not shown in our
example).
Single Value Subquery
(franz)
For each product_type_id in the products table:
•we sum the prices (and divide by the total
number of prices)
•then determine the average price for
product_type_id.
•Each row has an average price, unique to the
product_type_id [GROUP BY].
Single Value Subquery
SELECT product_type_id, AVG(price)
FROM products
GROUP BY product_type_id
(franz)
Single Value Subquery
(franz)
Among these average prices, we locate the maximum price out
of all the averages – which is $26.22.
Single Value Subquery
From the product_type_id average prices,
select the ones that are less than $26.22.
HAVING AVG(price) <
(SELECT MAX(AVG(price))
FROM products
GROUP BY product_type_id)
(franz)
Our original query
(franz)
SELECT product_type_id, AVG(price)
FROM products
GROUP BY product_type_id
HAVING AVG(price) <
(SELECT MAX(AVG(price))
FROM products
GROUP BY product_type_id)
Break it down into smaller steps. Then put the larger query together.
Laboratory Nine
Objectives:
Develop competence with subqueries
Steps:
First Query
Second Query
Laboratory Nine
First Query
Problem: Find the phone numbers of the
people who do not live in the area code
828 region.
Requirement: Use an uncorrelated
subquery.
Laboratory Nine
Solution:
SELECT s1.sprtele_phone_number
FROM sprtele s1
WHERE s1.sprtele_pidm NOT IN
(SELECT s2.sprtele_pidm
FROM sprtele s2
WHERE s2.sprtele_area_code = 828)
Laboratory Nine
Second Query
Problem: Find the phone numbers of the
people who do not live in the area code
828 region.
Requirement: Use an correlated
subquery.
Laboratory Nine
Solution:
SELECT s1.sprtele_phone_number
FROM sprtele s1
WHERE NOT EXISTS
(SELECT s2.sprtele_pidm
FROM sprtele s2
WHERE s1.sprtele_pidm = s2.sprtele_pidm
and s2.sprtele_area_code = 828)
A Query Development Methodology
Problem: queries can be complicated
=> easy to introduce bugs
=> difficult to find bugs
=> complexity of the select statement
=> large size of input table
=> limited number of instances of input table tested
Solution:
be aware of types of bugs
incremental development
multiple small input data sets with correct
output known
A Query Development Methodology
Types of Bugs:
Compile-time: Banner emits error
message about illegal syntax when
query is submitted
Run-time: Banner emits error
message when query is executing
that an illegal operation occurred
Logic: No error message!
A Query Development Methodology
Logic Errors
Your query has legal syntax and does
not cause any illegal operations =>
result table is generated
But your query is not doing what you
think it is doing
Hard to detect (the result table may
be the correct result table for this
particular input table instance)
A Query Development Methodology
Incremental Development:
Don’t try to come up with a complete solution
(select statement) all at once.
Develop a select statement for a small part of the
query
test that partial solution using all the input data
sets to make sure it works
if it does not work, edit it, and try again
Extend the select statement to include more
of the complete query and repeat
A Query Development Methodology
Multiple small input data sets
Develop instances of the input table that
are
small (so understandable)
test all the cases for the possible real input
to check the tentative partial select statement is
correct
usually artificial (made just for testing) to satisfy
the previous constraints