Transcript Document
SQL
• SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in languages like C++.
• What makes SQL viable is that its queries are “optimized” quite well, yielding efficient query executions.
Select-From-Where Statements
• The principal form of a query is:
SELECT
desired attributes
FROM
one or more tables
WHERE
condition about tuples of the tables
Our Running Example
• Our
SQL
schema.
queries will be based on the following database
Movie
(title, year, length, inColor, studioName, producerC)
StarsIn
(movieTitle, movieYear, starName)
MovieStar
(name, address, gender, birthdate)
MovieExec
(name, address, cert#, netWorth)
Studio
(name, address, cert#, netWorth) Find all movies produced by Disney Studios in 1990.
SELECT * FROM Movie WHERE studioName = 'Disney' AND year = 1990;
Meaning of Single-Relation Query
• Begin with the relation in the
FROM
clause.
• Apply the selection indicated by the
WHERE
clause.
• Apply the extended projection indicated by the
SELECT
clause.
(Extended) Projection in SQL
SELECT title, length FROM Movie WHERE studioName = 'Disney' AND year = 1990; SELECT title AS name, length AS duration FROM Movie WHERE studioName = 'Disney' AND year = 1990; SELECT title AS name, length*0.016667 AS lenghtInHours FROM Movie WHERE studioName = 'Disney' AND year = 1990; SELECT title AS name, length/60 AS length, 'hrs.' AS inHours FROM Movie WHERE studioName = 'Disney' AND year = 1990;
Selection in SQL
• The selection of the relational algebra is available through the WHERE clause of SQL. • We may build expressions by using the operators:
= <> < > <= >=
• The string constants are surrounded by single quotes . – studioName = 'Disney' • Numeric constants are for e.g.: -12.34, 1.23E45
• Boolean operators are:
AND
,
OR
,
NOT
.
SELECT title FROM Movie WHERE (year > 1970) AND NOT (inColor='C');
Selection in SQL (Cont.)
• Which Disney movies are after 1970 or have length greater than 90 mins?
SELECT title FROM Movie WHERE (year > 1970 OR length < 90) AND studioName='Disney'; • The parenthesis are needed because the precedence of OR is less than that of AND.
Comparision of strings
• Strings can as well be compared (lexicographically) with the same operators:
= <> < > <= >=
• For instance ‘fodder’<‘foo’ ‘bar’ < ‘bargain’
•
Patterns
WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches.
• General form:
LIKE
NOT LIKE
%
= “any string”
_
= “any character.”
Examples
SELECT title FROM Movie . Suppose we remember a movie “Star WHERE title LIKE 'Star %';
something
”.
SELECT title FROM Movie WHERE title LIKE '%''s%'; Two consecutive apostrophes in a string represent itself and not the end of the string.
Comparison of Strings (Continued)
• What if the pattern we wish to use in a
LIKE
expression involves the characters
%
or
_
? • We should “escape” their special meaning proceeding them by some escape character. • • In UNIX and C we use backslash \ as the escape character.
SQL
allows us to use any character we like.
• s
LIKE
'x%%x%%'
ESCAPE
'x' –
x
will be the escape character.
– A string that is matched by this pattern is for example: %aaaa% SELECT title FROM Movie WHERE title
LIKE
'x%%x%%'
ESCAPE
'x';
Ordering the Input
• We may ask the tuples produced by a query to be presented in sorted order. •
ORDER BY
•
Example
. Find the Disney movies of 1990.
–
Movie(
title, year, length, inColor, studioName, producerC#
)
– To get the movies listed by length, shortest first, and among movies of equal length, sort alphabetically:
SELECT
*
FROM
Movie
WHERE
studioName = 'Disney'
ORDER BY
length, title; • Ordering is ascending, unless you specify the DESC keyword to
an attribute
.
• Ties are broken by the second attribute on the ORDER BY list, etc.
NULL Values
• Tuples in SQL relations can have NULL as a value for one or more components.
• Meaning depends on context. Two common cases: –
Missing value
: e.g., we know the
length
some value, but we don’t know what it is.
has –
Inapplicable
: e.g., the value of attribute
spouse
for an unmarried person.
Comparing NULL’s to Values
• The logic of conditions in SQL is really 3 valued logic: TRUE , FALSE , UNKNOWN .
• When any value is compared with NULL , the truth value is UNKNOWN .
• But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN).
Three-Valued Logic
• To understand how AND, OR, and NOT work in 3-valued logic, think of – TRUE = 1, FALSE = 0, and UNKNOWN = ½.
• • •
AND = MIN OR = MAX NOT(x) = 1-x
• Example: TRUE
AND
(FALSE
OR NOT
(UNKNOWN)) = MIN(1, MAX(0, (1 - ½ ))) = MIN(1, MAX(0, ½ )) = MIN(1, ½ ) = ½.
Surprising Example
SELECT
*
FROM
Movie
WHERE
length <=120
OR
length > 120; • Suppose that we have some NULL values in the length.
• What’s the result?
Products and Joins in SQL
• SQL has a simple way to couple relations in one query: list each relation in the FROM clause.
– All the relations in the FROM clause are coupled through Cartesian product – Then we can put conditions in the WHERE clause in order to get the various kinds of join.
•
Example
. We want to know the name of the producer of
Star Wars
. • To answer we need the information from both of the relations: – –
Movie(
title, year, length, inColor, studioName, producerC
) MovieExec
(name, address, cert, netWorth) SELECT name FROM Movie, MovieExec WHERE title = 'Star Wars' AND producerC = cert;
Disambiguating Attributes
• When we involve
two or more
relations in a query, we can have attributes with the
same
name among these relations. • We solve the problem of
disambiguating
between them by putting the name of the relation followed by a dot and then the name of the attribute.
• Example. Suppose we wish to find pairs (star, movie executive) living in the same address.
– –
MovieStar(
name, address, gender, birthdate
) MovieExec
(name, address, cert, netWorth)
SELECT
MovieStar.name, MovieExec.name
FROM
MovieStar, MovieExec
WHERE
MovieStar.address = MovieExec.address;
Tuple Variables
• Sometimes we need to ask a query that involves two or more tuples from the same relation. We may list a relation R as many times we want in the from clause but
we need a way to refer to each occurrence of R
.
• SQL allows us to define, for each occurrence in the FROM clause, an alias which we call “
tuple variable
.” •
Example
. We like to know about two stars who share an address.
SELECT
Star1.name, Star2.name
FROM
MovieStar
AS
Star1, MovieStar
AS
Star2
WHERE
Star1.address = Star2.address
AND
Star1.name < Star2.name;
AS
is not supported in Oracle.
In Oracle
SELECT
Star1.name, Star2.name
FROM
MovieStar Star1, MovieStar Star2
WHERE
Star1.address = Star2.address
AND
Star1.name <> Star2.name;
Tuple Variables (Continued)
• Why we have the condition –
Star1.name < Star2.name
?
• Without this condition we would produce also pairs of identical star names. • This conditions forces us to produce each pair of stars with a common address
only
. • Why we used the operator
<
and not
<>
?
• If we had used <> the we would have produced pairs of married stars twice, like:
Star1.name
Alec Baldwin
Star2.name
Kim Basinger Kim Basinger Alec Baldwin
Conversion to Relational Algebra
• Another approach to interpret SQL queries is to relate them to relational algebra. • Start with the relations in the
FROM
clause and take their
Cartesian Product
.
• Having created the product, we apply a
selection
operator to it by converting the
WHERE
clause to a selection condition.
• Finally from the list of attributes in the
SELECT
clause we do a
projection
.
An Unintuitive Consequence of SQL semantics • Suppose
R
,
S
,
T
are
unary relations
• We wish to compute
R
(S
T)
. each having attribute
A
alone. • We might expect the following SQL query to do the job.
SELECT
R.A
FROM
R, S, T
WHERE
R.A = S.A
OR
R.A = T.A
• However, consider the situation in which
T
is empty. Since
R.A = T.A
never be satisfied, we might expect the query to produce exactly
R
T
. can • But using the interpretation the result is empty.
– If we use the conversion to RA, the Cartesian product
R x S x T is
.
Union, Intersection, and Difference of Queries • If two SQL queries produce relations with the same set of attributes then we can combine the queries using the set operations:
UNION
,
INTERSECT
and
EXCEPT
. •
Example
. Suppose we want the names and addresses of all female movie stars who are also movie executives with a net worth over $1,000,000.
– –
MovieStar(
name, address, gender, birthdate
) MovieExec
(name, address, cert, netWorth) (SELECT name, address FROM MovieStar WHERE gender = 'F')
INTERSECT
(SELECT name, address FROM MovieExec WHERE netWorth > 1000000);
Union, Intersection, and Difference of Queries (Continued) • Example. Give the names and addresses of movie stars
who are not
also movie executives.
(SELECT name, address FROM MovieStar)
EXCEPT
(SELECT name, address FROM MovieExec); • In
ORACLE
the EXCEPT is
MINUS
.
• Example. We want all the titles and years of movies that appeared in either the Movie or StarsIn relation.
(SELECT title, year FROM Movie)
UNION
(SELECT title, year FROM StarsIn);