No Slide Title

Download Report

Transcript No Slide Title

MIS 3053
Database Design & Applications
The University of Tulsa
Professor: Akhilesh Bajaj
RM/SQL Lecture 5
© Akhilesh Bajaj, 2000, 2002, 2003, 2004. All Rights Reserved.
Some Questions We May Have
Can we use multiple distincts in a SELECT?
No. We use only one distinct at most. The distinct gives us
unique rows only in the result. Not using distinct returns all rows,
including duplicates.
What is the keyword for descending order in order by?
Desc
Are string comparisons case sensitive?
Yes. So ‘George Jones’ is different from ‘george jones’.
Is BETWEEN inclusive of the end values?
Yes.
Review of SQL in Lecture 3
Basics of SQL:
- the SELECT clause
- the WHERE clause
- the FROM clause
- the AS clause
- the LIKE operator
- the ORDER BY clause
- the UNION operation
- the INTERSECT operation
- the MINUS operation
Goals for Today
• Aggregate functions:
- Average: AVG
-Minimum: MIN
- Maximum: MAX
- Total: SUM
- Count: COUNT
• The GROUP BY clause.
• The HAVING clause.
• The IS NULL / IS NOT NULL conditions.
Relational Schema For Example SQL Queries
Professors (f_id, f_name, f_address, f_specialty, highest_degree)
Courses (c_id, c_name, num_credits)
Sections (c_id, s_id, time_held, week_days_held,
date_began, date_ended, cl_id)
Students (st_id, st_name, gender, st_gpa)
Grades (st_id, c_id, s_id, grade)
Classrooms (cl_id, capacity, location)
Teach (f_id, c_id, s_id)
Structured Query Language (SQL): Query Part
Aggregate functions:
• These are functions that take values (possibly
duplicated), one from each row in a group of rows, and return a
single value for that group of rows.
• The basic aggregate functions are: AVG, MIN, MAX, SUM,
COUNT. AVG and SUM have to work on groups of number
values. MIN, MAX, COUNT can work on non-numeric types also.
Example:
The * operator here simply counts the
SELECT count(*)
number of rows returned in the result, incl nulls.
FROM grades AS G
SELECT COUNT(st_id) will count all st_ids that
WHERE G.c_id = ‘MIS3053’
AND G.s_id = ‘AFall2006’; are non null, duplicates included.
SELECT COUNT (DISTINCT st_id) will count
All st_ids that are non null and distinct (no duplicates)
Structured Query Language (SQL): Query Part
In Class Examples:
1. The number of sections ever offered for MIS3053.
-use sections OR teach OR grades
2. Find the average GPA of students in MIS3053Fall2010A.
3. Find the maximum GPA amongst all male students.
4. Business Question: The pass percentage of course MIS3053.
Business Analyst translates to:
-((distinct students enrolled / total students enrolled) * 100)
-(Number of ‘F’s in MIS3053/Total number enrolled in MIS3053)
Which is better?
SELECT (COUNT(DISTINCT G.st_id) /COUNT(*) ) *100 AS 'DB Pass Percentage'
FROM grades AS G
WHERE G.c_id='MIS3053';
Using ((distinct students enrolled / total students enrolled) * 100)
Using (Number of ‘F’s in MIS3053/Total number enrolled in
MIS3053):
Using 1 nested query or temp table (T1)
SELECT (( T1.Fcount/
COUNT(*)) *100) AS ‘DB Fail Percentage’
FROM grades AS g,
(SELECT COUNT(*) AS FCount FROM grades AS g1 WHERE g1.c_id='MIS3053' AND g1.grade='F') AS T1
WHERE g.c_id='MIS3053';
SELECT (T1.Fcount/T2.all_db_students ) *100 AS 'DB Fail Percentage'
FROM (SELECT COUNT(*) AS Fcount FROM grades AS G1 WHERE G1.c_id='MIS3053' AND G1.grade='F') AS T1,
(SELECT COUNT(*) AS all_db_students FROM grades AS G2 WHERE G2.c_id='MIS3053') AS T2
;
Using 2 nested queries, or temp tables, T1 and T2
Calculating Pass Percentage in MIS 3053
Structured Query Language (SQL): Query Part
The GROUP BY clause:
• This clause allows us to apply the aggregate functions to
groups of rows, so that one aggregate value is returned per
group.
Example:
SELECT G.c_id, G.s_id , MAX(grade)
FROM grades As G
GROUP BY G.c_id, G.s_id;
Structured Query Language (SQL): Query Part
The HAVING clause:
• This clause allows us to apply condition(s) to each group, and only
select those groups that satisfy the condition(s).
As before, the groups are formed using the GROUP BY
clause. HAVING is unlike WHERE (in WHERE the condition applies
to each row, as opposed to each group).
Example:
SELECT COUNT(*)
FROM grades AS G
GROUP BY G.c_id, G.s_id
HAVING G.c_id = ‘MIS3053’;
Only groups with
c_id = “MIS3053”
are chosen.
Practice Examples:
1. The number of Professors whose specialty is
‘Databases’ or ‘Computer Science’ or ‘MIS’.
2. The specialties which have at least 5 Professors. (using GROUP BY
and then using a nested query)
SELECT DISTINCT P.F_Specialty
FROM Professors AS P
GROUP BY P.F_Specialty
HAVING COUNT(*)>=5;
SELECT *
FROM [SELECT P1.F_specialty AS Specialty ,COUNT(*) AS NumberProfessors
FROM Professors AS P1
GROUP BY P1.F_Specialty] AS T1
WHERE T1.NumberProfessors>=5;
Example of nested query
Structured Query Language (SQL): Query Part
The HAVING clause:
• Consider a query of the form:
SELECT …………
FROM ……….
WHERE ……..
GROUP BY ………
HAVING …………;
Here is a conceptual picture of what happens:
The cartesian product is formed first (FROM) --> T1.
Some rows are selected from this product (WHERE) --> T2.
The GROUP BY is applied to the rows in T2. Some groups are
formed -->T3.
The HAVING clause is applied to T3, and only some of the groups
are selected --> RESULT.
[N.B.: T1, T2, … RESULT are temporary tables to help us visualize the process.]
Structured Query Language (SQL): Query Part
The IS NULL / IS NOT NULL condition:
• This allows us to test whether one or more attributes have
null values.
Example:
SELECT *
FROM professors As P
WHERE P.f_specialty IS NULL;
• Aggregate functions except COUNT(*) ignore null values when
computing the aggregate value. COUNT(*) counts all rows including
ones with NULL. COUNT (clmn) returns count of rows that have
non-null values for clmn.
• The result of a comparison involving NULL values is treated
as FALSE.
NB:COUNT (DISTINCT Clmn) is not yet supported on MS Access.
Structured Query Language (SQL): Query Part
Differences between WHERE and HAVING:
• The WHERE clause conditions apply to all the rows in
the cartesian product in the FROM clause.
• The HAVING clause conditions apply to all the groups that
have been formed, using the GROUP BY clause. Only groups
that satisfy the HAVING conditions are shown in the result.
• If a WHERE and HAVING are used in the same query, then
this is what happens: the conditions in the WHERE clause are
applied first. The rows that satisfy these conditions are taken
and grouped, based on the GROUP BY clause. Each group
in the GROUP BY clause is then tested to see if it satisfies
the conditions in the HAVING clause. Groups that do not are
removed.
Structured Query Language (SQL): Query Part
Example1
The names and st_ids of all students with GPA higher than the
average GPA of students taking MIS3053Fall2010A.
-Do Using subquery
-Do using nested query
Structured Query Language (SQL): Query Part
Example 2
The name and section count of the course with the highest
number of sections that have been offered.
-We cannot use nested aggregate operators like MAX(COUNT(*))
etc.
Using Subquery:
SELECT C.C_name, COUNT(*)
FROM Sections AS s, courses AS c
WHERE s.c_id=c.c_id
GROUP BY c.c_id, c.c_name
HAVING COUNT(*)>=
(SELECT COUNT(*)
FROM sections AS S1
GROUP BY S1.c_id);
------------------------------------------------------------------------
SELECT C.C_name, COUNT(*)
FROM Sections AS s, courses AS c,
(SELECT MAX(T1.Scount) AS MAXC
FROM (SELECT S1.c_id AS CNum, COUNT(*) AS Scount
FROM sections AS S1
GROUP BY S1.C_id) AS T1) AS T2
WHERE s.c_id=c.c_id
GROUP BY c.c_id, c.c_name,T2.MAXC
HAVING COUNT(*)>=T2.MAXC;
-----------------------------------------------------------------------SELECT C.Cname, T1.scount
FROM courses AS c,
(SELECT S1.Cid AS CNum, COUNT(*) AS Scount
FROM sections AS S1
GROUP BY S1.Cid) AS T1,
(
SELECT MAX(T2.Scount) AS MAXC
FROM (SELECT S2.Cid AS CNum, COUNT(*) AS Scount
FROM sections AS S2
GROUP BY S2.Cid) AS T2) AS T3
WHERE c.cid=T1.cnum
AND T1.scount=T3.MAXC
GROUP BY c.cname,T1.scount
;
____________________________________
SELECT TOP 1 C.c_name AS ‘CourseName’, COUNT(*)
FROM Sections AS s, courses AS c
WHERE s.c_id=c.c_id
GROUP BY c.c_id, c.c_name
ORDER BY COUNT(*) DESC;
//ALTERNATE OF TOP 1 for MYSQL is LIMIT 1 at the bottom of the query
____________________________________________________
Structured Query Language (SQL): Query Part
Example 3
The names of all female students in the school with GPAs higher
than the average student GPA.
-Calculate the average student GPA in a sub query.
Structured Query Language (SQL): Query Part
Example 4
The average enrollment per section for course MIS3053.
Do using a nested query:
Create a temporary table that looks like this:
T1
Section_id
Enrollment
Fall2010 A 17
Fall2011A
19
Fall2012A
21
Fall2013A
24
Then take the average of the enrollment column in T1!

Structured Query Language (SQL): Insertion Part
• SQL uses the INSERT statement. We can either insert one
row (the values have to be specified for each column) or a
set of rows (using a SELECT statement to create the set of rows).
In the large majority of applications, one row is inserted at
a time, and we shall concentrate on that.
E.g., INSERT INTO Professors
(f_id, f_name, f_address, f_specialty, highest_degree)
VALUES (1,’Akhilesh’, ‘200 Forbes Ave.’, ‘MIS’, ‘Ph.D.’);
If a value is unknown, the keyword NULL is used in lieu of the
value. Note that some columns (like primary key columns) have
NULL values prohibited.
Multiple row inserts (http://en.wikipedia.org/wiki/Insert_(SQL)):
INSERT INTO phone_book2
SELECT * FROM phone_book
WHERE name IN ('John Doe', 'Peter Doe');
Structured Query Language (SQL): Updation Part
• SQL uses the UPDATE statement.
E.g., UPDATE students
SET st_gpa = 1.05*st_gpa
WHERE st_gpa>=3.0;
UPDATE students
SET st_gpa=1.05*st_gpa
WHERE st_gpa>=SELECT AVG(st_gpa)
FROM students;
Update multiple columns at same time (important for our project):
UPDATE Table1
SET Names1 = ‘Akhilesh’, Degree = ‘PhD’
WHERE Names2 = ‘Bajaj’;
Structured Query Language (SQL): Deletion Part
• SQL uses the DELETE statement.
E.g., DELETE FROM students;
This deletes all rows from the students table.
Potentially dangerous!
DELETE FROM students
WHERE st_gpa<2.0;
DELETE FROM students
WHERE st_gpa < SELECT AVG(st_gpa)
FROM students;