Data Modeling - Hiram College

Download Report

Transcript Data Modeling - Hiram College

SQL Data Manipulation
(CB Chapter 5)
CPSC 356 Database
Ellen Walker
Hiram College
(Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
SQL
• Widely used (only?) standard query language for
relational databases
• Once SEQUEL (Structured English QUEry
Language), now Structured Query Language
• Still evolving (SQL-92, SQL:1999, SQL:2003)
– Vendors have their own versions that (mostly) follow
standards
• Objectives
– Easy to learn, easy to use
– Create and modify the database and query from it
• DDL defines database schemas
• DML manipulates database instances
SQL is Declarative, not Procedural
• Statements describe the desired results
• Statements do not specify a sequence of
operations to get those results
• (Contrast with relational algebra)
SQL Data Manipulation Language
•
•
•
•
•
Select
Insert
Delete
Update
Add clauses to specify constraints
– From (tables)
– Where (condition)
Basic Select Statement
• SELECT attributes FROM relations WHERE
conditions
• e.g. Create a table of hotel names, room numbers
and room types for all hotels
SELECT hotelName, RoomNo, RoomType
FROM Hotel, Room
WHERE Hotel.HotelNo = Room.HotelNo;
SQL SELECT: Naïve implementation
1. Make a temporary table consisting of the
Cartesian Product of all the tables in the
FROM clause
2. Copy rows that match the condition in the
WHERE clause (relational algebra select
operation)
3. Choose columns specified in SELECT
statement (relational algebra project
operation)
Example Translation
SELECT P.Name
FROM Professor P, Teaching T
WHERE P.Id = T.ProfId AND T.Semester=‘F1994’
pName(sId=ProfID & Semester=‘F1994’ (Professor x Teaching)
Or
pName(sSemester=‘F1994’ (Professor |x|Id=ProfID Teaching)
Three-valued Logic
• SELECT ... WHERE grade = ‘A’
– If grade is A, expression is TRUE
– If grade is B, C, D or F, expression is FALSE
– If grade is NULL , expression is UNKNOWN
• Once one value is UNKNOWN, the
expression is
– TRUE AND UNKNOWN = UNKNOWN
– FALSE AND UNKNOWN = UNKNOWN
– (same for OR and NOT)
Removing Duplicates
• SELECT hotelNo FROM Booking;
– Returns as many hotelNo’s as bookings, including
duplicates
• SELECT DISTINCT hotelNo FROM Booking;
– Returns a list of unique hotelNo’s (no more than
the number of hotels!)
Complex Condition in WHERE
• Create a table of hotel names, room numbers and
room types for all hotels but the Ritz
SELECT hotelName, roomNo, roomType
FROM Hotel, Room
WHERE Hotel.hotelNo = Room.roomNo and hotelName < >
‘Ritz’;
• Constants like ‘Ritz’ go in single quotes.
• Operators include =, < >, < , >, < =, > =
• AND, OR, and NOT (and parentheses) to combine
expressions
More WHERE conditions
SELECT staffNO, fName, lName, salary
FROM Staff
WHERE…
…salary BETWEEN 20,000 and 30,000
…position IN {‘Manager’, ‘Supervisor’}
…fName LIKE {‘Al%n’} Allen or Alan or Allison …
…lName LIKE {‘_ _ _ _’} all 4-letter names
…mName IS NOT NULL has a middle name
Sorting the Results
• List all rooms, sorted by price (most
expensive first)
SELECT hotelName, roomNo, roomType, price
FROM Hotel, Room
WHERE Hotel.hotelNo = Room.hotelNo
ORDER BY price DESC;
• Multiple sort keys can be specified, e.g.
…ORDER BY lName, fName ASC;
Creating Calculated Attributes
• Get room numbers, starting date and length of stay
for all rooms in the Ritz (Hotel #1).
SELECT roomNo, DateFrom,
(DateTo-DateFrom) AS lengthOfStay
FROM Booking
WHERE hotelNo = 1;
• SQL supports the usual mathematical operations,
also Now().
• AS allows you to name a calculated attribute
Aggregate “Attributes”
• Aggregate functions
–
–
–
–
–
COUNT
SUM
AVG
MIN
MAX
• Used in SELECT clause
• Operate on non-NULL values of the given
attribute
3 Ways to Count
• COUNT (attribute)
– Count the number of tuples that have non-NULL
values of attribute
• COUNT DISTINCT (attribute)
– Count the number of different values of attribute
that appear in the relation
• COUNT (*)
– Count the total number of tuples that appear in the
relation (its cardinality)
Aggregation Example
• How many rooms does the Ritz have?
SELECT count (*) as RitzRooms
FROM Hotel, Room
WHERE Hotel.hotelNo=Room.hotelNo and
Hotel.hotelName=‘Ritz’
More Aggregation Examples
• What is the average price of all rooms in the
database?
SELECT AVG(Price) AS averageRoomPrice
FROM Room;
• How much is the least expensive room in the
database?
SELECT MIN(Price) AS minRitzPrice
FROM Room;
Grouped Queries
• Results are grouped by one or more
attributes specified in GROUP BY clause.
• For each “group”, a single output tuple is
generated.
• Every attribute that is not aggregated must be
mentioned in the GROUP BY clause, or must
have the same value in every tuple in the
group.
Grouped Query Example
•
List the minimum and maximum room price
for each hotel
SELECT hotelName,
MIN(price) as MinPrice, MAX(price) as MaxPrice
FROM Hotel, Room
WHERE Hotel.hotelNo = Room.hotelNo
GROUP BY HotelName
Grouping by Multiple Fields
• Create a table of Hotel, Room, Number of
Days Booked for all rooms
SELECT hotelName, roomNo,
SUM(dateTo – dateFrom) AS daysBooked
FROM Hotel AS h, Booking AS b
WHERE h.hotelNo = b.hotelNo
GROUP BY hotelName, roomNo;
HAVING
• HAVING lets you put conditions on groups,
like WHERE lets you put conditions on tuples
• HAVING doesn’t make sense without GROUP
BY
• HAVING conditions must refer to aggregates
or you could put them in a WHERE clause!
Revised SQL Evaluation
1. Make a temporary table consisting of the Cartesian
Product of all the tables in the FROM clause
2. Keep only the rows that match the condition in the
WHERE clause (relational algebra select operation)
3. Split result of step 2 into groups of tuples that agree
on all attributes of the GROUP BY attribute list
4. Keep only those groups that satisfy the HAVING
condition
5. Choose columns specified in SELECT statement
(relational algebra project operation)
6. Reorder the tuples according to the ORDER BY
directive
HAVING example
• List all hotels whose average room price is
greater than 100 along with their average
room prices.
SELECT hotelName, AVG(price) as avgPrice
FROM Hotel, Room
WHERE Hotel.hotelNo=Room.hotelNo
GROUP BY hotelName
HAVING avgPrice > 100;
• WHERE cannot refer to avgPrice because it
works on a tuple-by-tuple basis.
Implicit vs. Explicit Joins
• Simple SELECT with multiple tables is an
implicit Join
– SELECT hotelName,roomNo FROM Hotel,Room
WHERE Hotel.hotelNo = Room.hotelNo
• Or we can specify explicit Joins
– … FROM Hotel JOIN Room WHERE … ;
– … FROM Hotel NATURAL JOIN Room;
– … FROM Hotel LEFT JOIN Room …
Includes hotels with no rooms!
Nested Query (attribute)
• Subquery serves as “attribute” in WHERE to use with
standard comparison operators
– SELECT attributes
– FROM tables
– WHERE attribute op ( SELECT … )
• Inner select is often an aggregate function
• Can add “ALL” or “SOME” before inner select, if
result is not a 1x1 table.
• Don’t overdo it! (p. 159)
Nested Query (attribute) Example
• Find all schools whose average test score is
below the district average
• Assume average scores have already been
computed and stored for each school
SELECT schoolNo, testAverage
FROM TestResultsBySchool
WHERE testAverage < (SELECT AVG(score) from
TestResults)
Nested Query (attribute) in HAVING
• Find all schools whose average test score is
below the district average
• Compute school averages on the fly
SELECT schoolNo, AVG(score) AS testAverage
FROM TestResults
GROUP BY schoolNo
HAVING testAverage < (SELECT AVG(score) from
Results)
No WHERE clause is required here.
Nested Query (set)
• Subquery serves as “set” in WHERE to use
with operators IN and NOT IN, EXISTS and
NOT EXISTS
– SELECT attributes
– FROM tables
– WHERE attribute IN ( SELECT … )
• Inner select is a single column for IN
• Inner select can be any shape for EXISTS
Nested Query (set) Example
• Find all guests who will be staying over New
Year’s eve, 2003
SELECT guestName FROM Guest
WHERE guestNo IN (
SELECT guestNo FROM Booking
WHERE dateFrom <= 12/31/2002 AND
dateTo >= 1/1/2003 );
Nested Query (set) Example
• Find all rooms that are not booked for tonight
SELECT RoomNo as r1, HotelNo as h1
FROM Booking
WHERE NOT EXISTS (
SELECT RoomNo as r2, HotelNo as h2
FROM Booking
WHERE r1=r2 AND h1=h2 AND
dateFrom <= Now() and dateTo > Now()
);
• Note the use of attributes from “outer” SELECT in
condition of “inner” SELECT.
• This is like nested loops in C++
Straight Set Operations
• If tables are the same “shape” (same set of
attributes), we can use UNION, INTERSECT,
and EXCEPT (set-difference) directly.
• Examples that follow assume we have four
tables: Student, Alumni, Faculty, and Staff,
and that each individual has a single unique
personal ID number that appears in all tables.
Intersection Example
• List names of all Student Staff
SELECT fName,lName
FROM Student
WHERE PID IN (
(SELECT PID from Student) INTERSECT
(SELECT PID from Staff)
);
Union Example
• List names of everyone who is either Faculty or Staff
SELECT fName,lName from (
(SELECT PID, fName,lName FROM Faculty)
UNION
(SELECT PID, fName,lName FROM Staff)
);
• (Why do we need the nesting?)
Set Difference Example
• List names of all Faculty members who are
not also Alumni.
SELECT fName,lName from
(SELECT PID, fName,lName FROM Faculty)
EXCEPT
(SELECT PID, fName,lName FROM Alumni)
);
Inserting into the Database
• INSERT with complete tuple (in order)
INSERT into Hotel
VALUES (5, ‘Hyatt’, ‘Chicago’);
• INSERT with subset of attributes
INSERT into Hotel(hotelName, hotelNo)
VALUES ( ‘Hyatt’, 5); city is set to NULL (or default)
• INSERT computed tuples
INSERT into Hotel VALUES (SELECT … )
Views as Macros
• Create views with CREATE VIEW
• Use views in SELECT as if they were tables
• DBMS will substitute SQL for view creation
where view’s name is used
– Like a function in a programming language
(technically a macro)
Updating the Database
•
Update selected rows
UPDATE table
SET attribute = value, a2 = v2, …
…WHERE conditions
•
Update all rows
– UPDATE table SET attribute = computed
attribute
– (e.g. … SET salary = salary*1.03) give 3% raise
Deleting from the Database
• DELETE FROM table
• WHERE condition
• All of the variations of WHERE clauses that
are allowed in SELECT are allowed in
DELETE.