CS 122 Week 5: Functions
Download
Report
Transcript CS 122 Week 5: Functions
CS 122 Week 5:
Functions
Tarik Booker
California State University, Los Angeles
October 21, 2014
What We Will Cover…
Functions (interspersed in Chapter 3, Chapter 5)
What is a function?
String Functions
p 117 - 120
Date and Time Functions
p 120 - 123
Control Flow Functions
p 125
Comparison Functions
p 125-126
Numerical Functions
(not in book)
Aggregate Functions
Chapter 5
Definition of a Function
What is a function?
Relation between a set of inputs, and outputs
Each input is related to exactly one output
Typically an output f corresponds to an input x
f(x)
Seen this before?
Ex: f(x) = x2
Only one answer!
(say “f of x”)
What’s the answer?
Inputs are sometimes referred to as argument(s) of the function
SQL Functions?
Built-in code in MySQL used to perform a specific task.
Consists of:
Zero or more input values
A named function identifier
One output value
With SQL Functions, we can:
Format and extract data
Use mathematical computing
Convert data
Function Syntax
How do I write a function in MySQL?
Function_name(argument_list)
Argument_list = [arg1, arg2, arg3, …]
Note: (For MySQL in Windows, at least) function_name is case insensitive
Ex:
SELECT LENGTH(‘Hello’);
The answer is 5. Why?
SELECT NOW();
What is the answer? Why?
Function Syntax (2)
Tips:
Don’t try to memorize every function!
Different functions are used by different DBMS!
Vendor lock-in
Look them up (in book or internet)
Check your parentheses!!!
Keep careful watch on parentheses, or:
Type out parentheses first, then add arguments
Nested Functions
Can type each function separately in Notepad++
Substitute functions back in
Function Syntax (3)
You can use functions in most SQL clauses:
Yes:
SELECT, ON, WHERE, GROUP BY (later), HAVING (later), ORDER BY
NO:
FROM
Strings and String Functions
Strings
In SQL, strings start with an INDEX OF 1, NOT ZERO!!!!!
THIS IS NOT LIKE C, C++, C#, Java!!! (201-203 people, take heed!)
Ex:
‘hello’ in SQL Starts with index value 1
‘hello’ in C,C++,Java starts with index value of 0
SQL: ‘hello’ = ‘1, 2, 3, 4, 5’
SUBSTRING(‘hello’, 2, 5)
What is the answer?
‘ello’
String Functions
Common String Functions in MySQL:
CHAR_LENGTH(str)
Character Length
LENGTH(str)
String Length
CONCAT (str1, str2, …)
Concatenate
CONCAT_WS(separator, str1, str2, …)
Concatenate with Separator (a string)
FORMAT (number, decimal_places)
Numerical Formatting
LEFT(str, len), RIGHT(str, len)
Take the L or R of a String of Length len
LOCATE(substr, str), LOCATE (substr, str, pos) Return the position of a substring (0
if not matching)
LOWER(str), UPPER(str)
Convert to lowercase or uppercase
String Functions (2)
SUBSTRING(str, pos, len)
Get the Substring at that Position and Length
TRIM(str), LTRIM(str), RTRIM(str)
Remove spaces (both or on a side)
Can find more at:
http://dev.mysql.com/doc/refman/5.6/en/string-functions.html
Date and Time Functions
CURDATE()
Current Date
DATEDIFF(date1, date2)
Difference between 2 dates (in days)
DATE_ADD(date, INTERVAL value type)
Add an INTERVAL to a date (as date)
DATE_SUB(date, INTERVAL value type)
Subtract an INTERVAL from a date
DATE_FORMAT(date, format_string)
Select a format for a date
SELECT date_format(‘2014-01-31’, ‘%m%d%y’)
DAY(date)
Returns the day (number)
WEEK(date)
Returns the week (52 total)
MONTH(date)
Returns the month (number)
YEAR(date)
Returns the year (number)
Date and Time Functions (2)
DAYOFMONTH(date)
DAYOFWEEK(date)
DAYOFYEAR(date)
NOW()
TO_DAYS(date)
Date and time
What happens if you: SELECT TO_DAYS(‘2014-10-21’);
From Year 0. What is that?
There are many more date/time functions!!!
http://dev.mysql.com/doc/refman/5.6/en/functions.html
http://dev.mysql.com/doc/refman/5.6/en/date-and-time-functions.html
The INTERVAL Keyword
You can perform math on dates using the INTERVAL keyword!
Ex:
Using the ADDDATE function:
SELECT EntryDate, DATE_ADD(entrydate, INTERVAL 10 year) from Artists
SELECT EntryDate, DATE_SUB(entrydate, INTERVAL 10 year) from Artists
Can use DAY, MONTH, and YEAR in your intervals
Ex:
SELECT artistname, DATEDIFF(‘2007-12-31 23:59:59’, entrydate) from Artists;
Control Flow Functions
IF(condition, true_expression, false_expression)
SELECT IF(M.Lastname = ‘Sanders’, ‘yes’, ‘no’)
FROM Members M;
IFNULL(expr1, expr2)
If expr1 is NOT NULL, IFNULL() returns expr1. Otherwise, expr2.
SELECT IFNULL(webaddress, “NONE”)
FROM Artists;
Comparison Functions
GREATEST(comma separated list of values)
LEAST(comma separated list of values)
What happens if any of the values are null?
Numerical Functions
ABS (value)
Absolute Value
CEIL(value), CEILING(value), FLOOR(value)
Ceiling/Floor
SIN(radians), COS(radians), TAN(radians)
Trigonometric Functions
PI()
pi
POW(X,Y)
Exponent (XY)
SQRT(X)
Square Root
ROUND(value), ROUND(value, decimal_places)
TRUNCATE(value, decimal_places)
Numerical Functions (2)
Formatting vs. Truncating
Formatting Rounds the Value
Truncating Cuts off the Value (at the position)
SELECT FORMAT (3.14159, 4);
Ans: 3.1416
SELECT TRUNCATE (3.14159, 4);
Ans: 3.1415
Note: FORMAT is a string function and returns a string!!
TRUNCATE is a mathematical function, and returns a number!!
http://dev.mysql.com/doc/refman/5.6/en/numeric-functions.html
Datatype Conversion
MySQL is not very strict about data types!
SELECT T.LengthSeconds/60 FROM Tracks T;
Lengthseconds is actually an int!
This is not C++/Java!
Nested Functions
Nested Functions are Quite Common in SQL
Nested Functions = Functions within Functions!
For Strings:
SELECT CONCAT(LEFT(M.Firstname, 1), LEFT(M.LastName, 1)) FROM Members M
Dates:
SELECT DAY (Curdate());
Control Flow Functions:
IF(Grade>90, ‘A’, IF(Grade>80, ‘B’, IF(Grade >70, ‘C’,’NC’)));
SELECT Lastname, Birthday, IF(YEAR(Birthday) <1965, ‘old’,
IF(YEAR(Birthday)<1975,’medium’, ‘young’))
FROM Members;
Functions and Column Aliases
You can use column aliases in any of these clauses:
GROUP BY (later)
HAVING (MySQL yes, PostgresSQL no)
ORDER BY
Note: You can’t refer to column aliases in the following clauses:
SELECT
FROM
WHERE
Functions and Column Aliases (2)
Note: If you have a long function expression and want to use it in the WHERE
clause as well, a column alias won’t work!
This will NOT WORK:
SELECT ArtistName, YEAR(EntryDate) AS Y
FROM Artists
WHERE Y > 2000;
You have to manually duplicate (copy and paste) the expression in the WHERE
clause:
This WILL WORK:
SELECT ArtistName, YEAR(EntryDate) AS Y
FROM Artists
WHERE YEAR(EntryDate) > 2000;
Aggregate Functions
Aggregate – to combine (many parts added together)
Aggregate Functions / Non-aggregate Functions
Similarities:
Both take (some type) of input
Both perform operations on the input
Both have a single output
Differences:
Input to aggregate function is a set of data
Aggregate Functions (2)
Column
Non-aggregate function:
Row
SELECT UPPER(LastName)
Row
FROM Members;
Row
Runs once for each row
Aggregate Function:
SELECT COUNT(*)
Column
FROM Members;
All rows are part of input for one function!
Row
Row
Row
Aggregate Functions (3)
Five Main Aggregate Functions:
COUNT(*)
COUNT(expression)
AVG(expression)
MIN(expression)
MAX(expression)
SUM(expression)
Others, but we won’t cover in class (you are not responsible for them)
Aggregate Arguments
Every aggregate function takes a single argument
May be another function, but not another aggregate function!
Argument also may be from a CASE Statement
Aggregate Arguments (2)
Ex:
SELECT SUM(lengthseconds BETWEEN 60 AND 120) FROM Tracks;
This actually produces the same results as:
SELECT COUNT(TrackNum)
FROM Tracks
WHERE Lengthseconds BETWEEN 60 AND 120;
Why? What does the first query do? The second? Why do they give the same
result?
COUNT
COUNT(*)
Counts the number of rows in a table
Excludes NULLS (doesn’t count them)
SELECT COUNT(*) AS ‘Number of Titles’
FROM Titles;
Returns 6
COUNT (expressions)
Counts the number of rows in a table (same)
SELECT COUNT(titleID) AS ‘Number of Titles’
FROM Titles;
More COUNT Examples
SELECT COUNT(*) FROM Artists;
SELECT COUNT(artistID) FROM Artists;
SELECT COUNT(WebAddress) FROM Artists;
SELECT COUNT(*) FROM Artists
WHERE !(WebAddress is NULL);
SELECT COUNT(*) FROM Artists
WHERE WebAddress is NULL;
Note : Why does this give you 0 (zero)?
SELECT COUNT(WebAddress) FROM Artists
WHERE Webaddress IS NULL;
Warnings with COUNT!
Be Careful what you Aggregate!
Ex: Show the number of artists who have recorded in the alternative genre:
Wrong:
SELECT COUNT(A.ArtistID)
FROM Artists A JOIN Titles T ON (A.ArtistID = T.ArtistID)
WHERE T.Genre = ‘Alternative’;
Why? This counts duplicates!
Correct:
SELECT COUNT(DISTINCT A.ArtistID)
FROM Artists A JOIN Titles T ON (A.ArtistID = T.ArtistID)
WHERE T.Genre = ‘Alternative’;
SUM (Aggregate Function)
SUM(expression)
Sums all data under expression
Excludes NULLs (doesn’t count NULL as 0)
Ex: Take a total of all runtimes:
SELECT SUM(Lengthseconds) AS ‘Total Length’
FROM Tracks;
Ex (2): Figure out how long the studio “MakeTrax” has been recording music(for tax
purposes). Present the output in minutes – call it “MakeTraxMinutes”
SELECT SUM(Lengthseconds/60) AS MakeTraxMinutes
FROM Tracks JOIN Titles USING(TitleID) JOIN Studios USING(StudioID)
WHERE StudioName = 'maketrax';
SUM (Aggregate Function) Steps
Beginning Steps:
Query Construction:
1.
Where the studios at?
Studios
1.
What's my order? SELECT, FROM, WHERE
2.
Where the track lengths at?
Tracks
2.
What do I select?
3.
What's between?
4.
So join Studios, titles, tracks.
5.
What do I join on? StudioID TitleID
6.
Then? My results. Lengthseconds.
7.
Do I want it in seconds? No. Minutes: Lengthseconds /
60
8.
What's my filter condition?
The studio name must be Maketrax.
9.
How do I check for studio name?
There's a column called 'studioname'. Therefore Must check
studioname = 'maketrax'
Results: Lengthseconds / 60.
Titles
3.
What do I call it?
MakeTraxMinutes
4.
What do I join from?
Tracks JOIN Titles USING(TitleID) JOIN Studios USING(StudioID)
5.
What do I check my filter condition with?
WHERE clause.
6.
What? studio name
7.
How?
WHERE studioname = 'maketrax'
I also need to sum my results: Results are SELECT
Lengthseconds/60 - So SELECT SUM(Lengthseconds/60)
Final Query:
SELECT SUM(Lengthseconds / 60) AS MakeTraxMinutes
FROM Tracks JOIN Titles USING(TitleID) JOIN Studios USING(StudioID)
WHERE Studioname = 'Maketrax';
AVG (Aggregate Function)
AVG (expression)
Takes the mean (average)
Averages all data under expression
Excludes NULLs (doesn’t count NULL as 0)
Ex: Average all track lengths that are listed:
SELECT AVG(LengthSeconds)
FROM Tracks;
Ex2: What’s the average length of track produced by studio “MakeTrax”?
AVG (Aggregate Function) (2)
Find the average age in years of all members:
SELECT AVG(DATEDIFF(CURDATE(), Birthday))/365.25
FROM Members;
Why?
DATEDIFF – Years between now and birthday
Values are returned in days
Take the AVG of that
Divide by 365.25
365.25? What happens every 4 years?
Excluding Data From Aggregation
To include or exclude items from being aggregated, use the WHERE clause
Ex: Count the number of male members:
SELECT COUNT(*)
FROM Members
WHERE Gender = ‘M’;
Ex: Count the number of female members:
SELECT COUNT(*)
FROM Members
WHERE Gender = ‘F’;
AVG Example
create table AvgDemo(
id int primary key,
name varchar(30) not null,
sex char,
age int);
insert into AvgDemo values(1, "Sue", 'f', null);
insert into AvgDemo values(2, "Fred", 'm', 20);
insert into AvgDemo values(3, "Mary", 'f', 30);
insert into AvgDemo values(4, "Marion", null, 40);
insert into AvgDemo values(5, "Bob", 'm', 50);
insert into AvgDemo values(6, "Jack", 'm', 60);
// in a real data base, we wouldn’t use an “age” field
// -why not?
AVG Example (2)
Lets run:
SELECT AVG(age) FROM avgdemo;
Do we need to run:
SELECT AVG(Age) FROM AvgDemo
WHERE Age is NOT NULL;
SELECT AVG(Age) FROM AvgDemo
WHERE Sex=‘F’;
SELECT AVG(Age) FROM AvgDemo
WHERE Sex=‘M’;
MIN and MAX
MIN(expression)
Returns the minimum value under the expression
Ex: Return the minimum runtime
SELECT MIN(LengthSeconds) AS ‘Shortest Track’
FROM Tracks;
MAX(expression)
Returns the maximum value under the expression
Ex: Return the maximum runtime
SELECT MAX(LengthSeconds) AS ‘Longest Track’
FROM Tracks;
More Aggregate Functions
There are many statistical aggregate functions:
STDDEV_POP(expression)
STDDEV_SAMP(expression)
Ex: SELECT STDDEV_POP(Age) FROM AvgDemo;
VAR_POP(expression)
VAR_SAMP(expression)
There are many, many more…
Aggregate Function Usage
You can use aggregate functions directly in these clauses:
SELECT clause
HAVING clause (later)
ORDER BY clause
Aggregates and the WHERE Clause
The WHERE clause filters data from being aggregated
Filters the table produced by the FROM clause
The WHERE clause filters data BEFORE it is aggregated (counted, summed, …)
So, YOU MAY NOT USE AGGREGATE FUNCTIONS DIRECTLY IN THE WHERE
CLAUSE
We will account for this later…
Aggregates and the WHERE Clause
WRONG!!!!!
SELECT TrackTitle
FROM Tracks
WHERE LengthSeconds > AVG(LengthSeconds);
This actually tries to filter out something it hasn’t calculated yet
(AVG(LengthSeconds))
DO NOT DO THIS!!!!!
How Aggregate Queries Work:
It may be helpful to think of Aggregate Queries as a two-step process:
Step 1: Filter out things you don’t want to aggregate.
Step 2: Aggregate (Count, sum, avg, min, max, etc.)
Because the WHERE Clause operates in the FIRST STEP, you can’t use it in the
second step.
We will address this later…