Learning Through Writing

Download Report

Transcript Learning Through Writing

Banner and the SQL Select
Statement: Part Four (Multiple
Connected Select Statements)
Mark Holliday
Department of Mathematics and
Computer Science
Western Carolina University
18 November 2005
(updated: 18 November 2005)
Outline
 The Goal
 The Concepts




A First Example
Single Table Selects
Joins
Multiple Connected Select Statements
A First Example
 Outline
 The Relational Model: Single Table
 Lab 1: TOAD, Schema Browser
 Some Structured Query Language (SQL)
Basics
 Lab 2: TOAD, SQL Editor
Single Table Selects
 Outline
 WHERE clause: single condition, multiple
conditions
 Lab 3:
 Order By; Aggregate Functions
 Lab 4:
 Group By; Having
 Lab 5:
Joins
 Outline





Why multiple tables?
Inner Joins
Lab 6:
Outer joins
Lab 7:
Multiple Connected Select Statements
 Outline
 Set Operators
 Lab 8:
 Subqueries
 Use directly: FROM clause
 Use as a set: new operators
 Use as a single value: aggregate functions
 Lab 9:
 A Query Development Methodology
SET Operators
Intuition:
A SQL Select statement returns a
table
A table is a set
we can put a set operator in between
two select statements to create a new
set (that is, table)
SET Operators
(franz)
Types of Set Operators :
 UNION – return all rows, exclusing
duplicates
 UNION ALL -- returns all rows, including
duplicates
 INTERSECT -- returns rows retrieved in
both queries
 MINUS -- returns remaining rows when
results of second query are subtracted
from the first query
Set Operators
Two restrictions of set operators are:
The two tables must contain the same
number of columns.
All corresponding columns in the two
tables need to be of the same data
type.
Example Tables
Relation (stvrelt) table (13 rows)
(franz)
Legacy (stvlgcy) table (10 rows)
UNION
(franz)
The purpose of the SQL UNION command
is to combine the results of two
queries.
In this respect, UNION is similar to
JOIN
 they are both used to combine related
information from multiple tables.
UNION
(franz)
When using UNION, only distinct values are
selected (similar to SELECT DISTINCT).
The syntax is as follows:
[SQL Statement 1]
UNION
[SQL Statement 2]
UNION
SELECT stvrelt_code, stvrelt_desc
varchar(1); varchar(30)
(franz)

FROM stvrelt

Relation table
UNION
SELECT stvlgcy_code, stvlgcy_desc

varchar(1); varchar(30)
FROM stvlgcy
 Legacy
table
Note: The default resultant set from UNION is
DISTINCT rows.
UNION ALL
(franz)
 The difference between UNION ALL and
UNION is that,
 while UNION only selects distinct values,
 UNION ALL selects all values.
The syntax for UNION ALL is as follows:
[SQL Statement 1]
UNION ALL
[SQL Statement 2]
UNION ALL
(franz)
Here is our example using Banner
tables:
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
UNION ALL
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
INTERSECT
(franz)
Similar to the UNION command, INTERSECT
operates on two SQL statements.
The difference is that, while UNION essentially
acts as an OR operator
 (value is selected if it appears in either the first or
the second statement),
the INTERSECT command acts as an AND
operator
 (value is selected only if it appears in both
statements).
INTERSECT
The syntax is as follows:
[SQL Statement 1]
INTERSECT
[SQL Statement 2]
(franz)
INTERSECT
(franz)
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
INTERSECT
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
MINUS
(franz)
 The MINUS operates on two SQL
statements.
 It takes all the results from the first
SQL statement, and then subtract out
the ones that are present in the second
SQL statement to get the final answer.
 If the second SQL statement includes
results not present in the first SQL
statement, such results are ignored.
MINUS
The syntax is as follows:
[SQL Statement 1]
MINUS
[SQL Statement 2]
(franz)
MINUS
(franz)
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
MINUS
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
SET Operators
(franz)
A note of special importance:
 When using the various SQL SET operators
that we have covered,
 it is especially important that you
understand your data!
In the last example,
 if we reversed the order of the MINUS
operator,
 we would have completely different
results.
SET Operators
SELECT stvlgcy_code, stvlgcy_desc
FROM stvlgcy
MINUS
SELECT stvrelt_code, stvrelt_desc
FROM stvrelt
(franz)
SET Operators
would return FIVE rows.
 Starting from the
“stvlgcy” table, we
would look at “stvrelt”.
 Any records in “stvrelt”
that were not in
“stvlgcy” would be
ignored.
 The records in “stvrelt”
that were the same as
“stvlgcy” would be
removed.
 The items grayed out in
the “stvlgcy” table
would be selected.
(franz)
Laboratory Eight
 Objectives:
 Develop competence with set operators
 Steps:
 First Query
Laboratory Eight
Problem: Find the phone numbers
of the people who do not live in
the area code 828 region.
Laboratory Eight
Answer:
SELECT sprtele_phone_number
FROM sprtele
MINUS
SELECT sprtele_phone_number
FROM sprtele
WHERE sprtele_area_code = 828
Subquery
Intuition:
 The Set Operators used so far (UNION, UNION
ALL, INTERSECT, MINUS) operate on output tables
(i.e. sets) but only in between select statement
 Question: Can we use set operations so as to
operate on an output table within another select
statement?
 Answer: Yes!
 The select statement inside the outer select
statement is called a subquery or a nested query.
Subquery
 Where in a select statement can we nest a
subquery (i.e. where do we use a set)?
 The FROM clause since the FROM clause lists
tables
 just have one of those tables be
generated from the subquery
FROM Clause Subquery
(franz)
Suppose we want to select non-busineses from ‘spriden’, including
a count of addresses, where there is more than one address
type for that pidm:
Example of a subquery in the FROM clause:
SELECT spriden_id, spriden_last_name, spriden_first_name, p_cnt
FROM spriden,
(SELECT spraddr_pidm, count(spraddr_atyp_code) p_cnt
FROM spraddr
GROUP BY spraddr_pidm)
WHERE spriden_pidm = spraddr_pidm
and p_cnt > 1
and spriden_entity_ind = 'P‘
Subquery
 Can we use a subquery anywhere else?
 Yes, but we need some help.
 The conditions in WHERE, and HAVING
clauses we have seen all use
 comparison operators that work on single
values (=, <>, >, <, <=, >=) or
 multiple values only in restricted ways (e.g.
LIKE)
Subquery
 We need operators to compare a value
with a set of values
 the set of values will be the output table of the
subquery
 the resulting expressions can be conditions in
the WHERE and HAVING clauses
 Introduce new operators that work with
subqueries
 IN, NOT IN, ANY, ALL, EXISTS, NOT EXISTS
Special operators for subqueries
(franz)
 IN/NOT IN -- Check to see if a value is in a
specified list of values returned in the subquery.
 ANY -- Compare a value with any value in a
list.
 ALL -- Compare a value with all values in a list.
 EXISTS/NOT EXISTS -- Check for the
existence of rows returned by a subquery.
IN
(franz)
Check to see if a value is in a specified list of
values returned in the subquery
SELECT product_id, name
FROM products
WHERE product_id IN
(SELECT product_id
FROM product
WHERE name LIKE ‘%e%’)
NOT IN
(franz)
Check to see if a value is not in a specified
list of values returned in the subquery.
SELECT product_id, name
FROM products
WHERE product_id NOT IN
(SELECT product_id
FROM purchases)
ANY
(franz)
 Compare a value with any value in a list.
 You have to place an “=, <>, >, <, <=, or >=“
operator before ANY in the query.
SELECT employee_id, last_name
FROM employees
WHERE salary < ANY
(SELECT low_salary
FROM salary_grades)
ALL
(franz)
 Compare a value with all values in a list.
 You have to place an “=, <>, >, <, <=, or
>=“ operator before ALL in the query.
SELECT employee_id, last_name
FROM employees
WHERE salary > ALL
(SELECT high_salary
FROM salary_grades)
Correlated Subquery
If the table variable declared in the outer
query is used in the subquery
the subquery is said to be correlated
(otherwise, it is uncorrelated)
In an uncorrelated subquery, the
subquery is just evaluated once during
the outer query.
Correlated Subquery
In a correlated subquery, the set of rows
output from the subquery can vary for
each value of the outer table variable.
the subquery is reevaluated for each
value of the outer table variable.
EXISTS and NOT EXISTS tend to be
used in correlated subqueries
as in the examples following
EXISTS
(franz)
Check for the existence of rows returned by
a subquery.
SELECT employee_id, last_name
FROM employees outer
WHERE EXISTS
(SELECT employee_id
FROM employees inner
WHERE inner.manager_id =
outer.employee_id)
EXISTS
(franz)
‘EXISTS’ just checks for the existence of
rows returned by the subquery, not the
actual values.
To make your query run faster, you can
just return a literal value.
EXISTS
(franz)
A re-write of our previous example:
SELECT employee_id, last_name
FROM employees outer
WHERE EXISTS
(SELECT 1
FROM employees inner
WHERE inner.manager_id =
outer.employee_id)
NOT EXISTS
(franz)
Retrieve products that have not been
purchased:
SELECT product_id, name
FROM products outer
WHERE NOT EXISTS
(SELECT 1
FROM purchases inner
WHERE
inner.product_id = outer.product_id)
A Multi-Condition WHERE Clause Subquery
(franz)
The WHERE clause can have conditions besides the one
using the subquery.
Suppose we want to select non-busineses from ‘spriden’ that
do NOT have a record in ‘spbpers’:
SELECT spriden_id, spriden_last_name,
spriden_first_name
FROM spriden
WHERE
spriden_entity_ind = 'P'
AND spriden_pidm not in
(SELECT spbpers_pidm
FROM spbpers)
Single Value Subquery
 Question: Can we do even more with subqueries?
 More Specific Question: Can we use the output
table generated by a subquery in conditions that
use the single value comparison operators (=, >,
…)?
 Answer: Yes! (sometimes)
 Some SQL select statements are guaranteed to
return an output table that is a set with only one
value.
 Which ones? Those with an aggregate function in
the SELECT clause.
Single Value Subquery
(franz)
SELECT "column_name1"
FROM "table_name"
WHERE "column_name2"
[Comparison Operator]
(SELECT AGGREGATE
FUNCTION("column_name1“)
FROM "table_name"
WHERE [Condition])
[Comparison Operator] can be =, >, <, >=, <=.
or "LIKE."
Single Value Subquery and
HAVING Clause Subquery
(franz)
This is not an example from Banner, but from a made-up table.
SELECT product_type_id, AVG(price)
FROM products
GROUP BY product_type_id
HAVING AVG(price) <
(SELECT MAX(AVG(price))
FROM products
GROUP BY product_type_id)
The following data illustrates this subquery example...
Single Value Subquery
 The same table alias is not used in both the
outer query and the subquery
 => the query is uncorrelated
 => the subquery only needs to be evaluated
once
Single Value Subquery
(franz)
 In the products table, there are multiple rows
for each product_type_id.
 Each row has a price, along with other
information (which is not shown in our
example).
Single Value Subquery
(franz)
For each product_type_id in the products table:
•we sum the prices (and divide by the total
number of prices)
•then determine the average price for
product_type_id.
•Each row has an average price, unique to the
product_type_id [GROUP BY].
Single Value Subquery
SELECT product_type_id, AVG(price)
FROM products
GROUP BY product_type_id
(franz)
Single Value Subquery
(franz)
Among these average prices, we locate the maximum price out
of all the averages – which is $26.22.
Single Value Subquery
From the product_type_id average prices,
select the ones that are less than $26.22.
HAVING AVG(price) <
(SELECT MAX(AVG(price))
FROM products
GROUP BY product_type_id)
(franz)
Our original query
(franz)
SELECT product_type_id, AVG(price)
FROM products
GROUP BY product_type_id
HAVING AVG(price) <
(SELECT MAX(AVG(price))
FROM products
GROUP BY product_type_id)
Break it down into smaller steps. Then put the larger query together.
Laboratory Nine
 Objectives:
 Develop competence with subqueries
 Steps:
 First Query
 Second Query
Laboratory Nine
First Query
Problem: Find the phone numbers of the
people who do not live in the area code
828 region.
Requirement: Use an uncorrelated
subquery.
Laboratory Nine
Solution:
SELECT s1.sprtele_phone_number
FROM sprtele s1
WHERE s1.sprtele_pidm NOT IN
(SELECT s2.sprtele_pidm
FROM sprtele s2
WHERE s2.sprtele_area_code = 828)
Laboratory Nine
Second Query
Problem: Find the phone numbers of the
people who do not live in the area code
828 region.
Requirement: Use an correlated
subquery.
Laboratory Nine
Solution:
SELECT s1.sprtele_phone_number
FROM sprtele s1
WHERE NOT EXISTS
(SELECT s2.sprtele_pidm
FROM sprtele s2
WHERE s1.sprtele_pidm = s2.sprtele_pidm
and s2.sprtele_area_code = 828)
A Query Development Methodology
 Problem: queries can be complicated
=> easy to introduce bugs
=> difficult to find bugs
 => complexity of the select statement
 => large size of input table
 => limited number of instances of input table tested
 Solution:
 be aware of types of bugs
 incremental development
 multiple small input data sets with correct
output known
A Query Development Methodology
Types of Bugs:
Compile-time: Banner emits error
message about illegal syntax when
query is submitted
Run-time: Banner emits error
message when query is executing
that an illegal operation occurred
Logic: No error message!
A Query Development Methodology
Logic Errors
Your query has legal syntax and does
not cause any illegal operations =>
result table is generated
But your query is not doing what you
think it is doing
Hard to detect (the result table may
be the correct result table for this
particular input table instance)
A Query Development Methodology
 Incremental Development:
 Don’t try to come up with a complete solution
(select statement) all at once.
 Develop a select statement for a small part of the
query
 test that partial solution using all the input data
sets to make sure it works
if it does not work, edit it, and try again
 Extend the select statement to include more
of the complete query and repeat
A Query Development Methodology
 Multiple small input data sets
 Develop instances of the input table that
are
 small (so understandable)
 test all the cases for the possible real input
to check the tentative partial select statement is
correct
 usually artificial (made just for testing) to satisfy
the previous constraints