CS 122 Week 5: Functions

Download Report

Transcript CS 122 Week 5: Functions

CS 122 Week 5:
Functions
Tarik Booker
California State University, Los Angeles
October 21, 2014
What We Will Cover…

Functions (interspersed in Chapter 3, Chapter 5)

What is a function?

String Functions
p 117 - 120

Date and Time Functions
p 120 - 123

Control Flow Functions
p 125

Comparison Functions
p 125-126

Numerical Functions
(not in book)

Aggregate Functions
Chapter 5
Definition of a Function

What is a function?

Relation between a set of inputs, and outputs

Each input is related to exactly one output


Typically an output f corresponds to an input x

f(x)

Seen this before?

Ex: f(x) = x2


Only one answer!
(say “f of x”)
What’s the answer?
Inputs are sometimes referred to as argument(s) of the function
SQL Functions?

Built-in code in MySQL used to perform a specific task.


Consists of:

Zero or more input values

A named function identifier

One output value
With SQL Functions, we can:

Format and extract data

Use mathematical computing

Convert data
Function Syntax

How do I write a function in MySQL?
Function_name(argument_list)
Argument_list = [arg1, arg2, arg3, …]


Note: (For MySQL in Windows, at least) function_name is case insensitive
Ex:
SELECT LENGTH(‘Hello’);

The answer is 5. Why?
SELECT NOW();

What is the answer? Why?
Function Syntax (2)

Tips:

Don’t try to memorize every function!

Different functions are used by different DBMS!




Vendor lock-in
Look them up (in book or internet)
Check your parentheses!!!

Keep careful watch on parentheses, or:

Type out parentheses first, then add arguments
Nested Functions

Can type each function separately in Notepad++

Substitute functions back in
Function Syntax (3)

You can use functions in most SQL clauses:

Yes:


SELECT, ON, WHERE, GROUP BY (later), HAVING (later), ORDER BY
NO:

FROM
Strings and String Functions

Strings

In SQL, strings start with an INDEX OF 1, NOT ZERO!!!!!

THIS IS NOT LIKE C, C++, C#, Java!!! (201-203 people, take heed!)

Ex:

‘hello’ in SQL Starts with index value 1

‘hello’ in C,C++,Java starts with index value of 0

SQL: ‘hello’ = ‘1, 2, 3, 4, 5’

SUBSTRING(‘hello’, 2, 5)

What is the answer?
‘ello’
String Functions

Common String Functions in MySQL:

CHAR_LENGTH(str)
Character Length

LENGTH(str)
String Length

CONCAT (str1, str2, …)
Concatenate

CONCAT_WS(separator, str1, str2, …)
Concatenate with Separator (a string)

FORMAT (number, decimal_places)
Numerical Formatting

LEFT(str, len), RIGHT(str, len)
Take the L or R of a String of Length len

LOCATE(substr, str), LOCATE (substr, str, pos) Return the position of a substring (0
if not matching)

LOWER(str), UPPER(str)
Convert to lowercase or uppercase
String Functions (2)

SUBSTRING(str, pos, len)
Get the Substring at that Position and Length

TRIM(str), LTRIM(str), RTRIM(str)
Remove spaces (both or on a side)

Can find more at:

http://dev.mysql.com/doc/refman/5.6/en/string-functions.html
Date and Time Functions

CURDATE()
Current Date

DATEDIFF(date1, date2)
Difference between 2 dates (in days)

DATE_ADD(date, INTERVAL value type)
Add an INTERVAL to a date (as date)

DATE_SUB(date, INTERVAL value type)
Subtract an INTERVAL from a date

DATE_FORMAT(date, format_string)
Select a format for a date
SELECT date_format(‘2014-01-31’, ‘%m%d%y’)

DAY(date)
Returns the day (number)

WEEK(date)
Returns the week (52 total)

MONTH(date)
Returns the month (number)

YEAR(date)
Returns the year (number)
Date and Time Functions (2)

DAYOFMONTH(date)

DAYOFWEEK(date)

DAYOFYEAR(date)

NOW()

TO_DAYS(date)

Date and time

What happens if you: SELECT TO_DAYS(‘2014-10-21’);

From Year 0. What is that?
There are many more date/time functions!!!

http://dev.mysql.com/doc/refman/5.6/en/functions.html

http://dev.mysql.com/doc/refman/5.6/en/date-and-time-functions.html
The INTERVAL Keyword

You can perform math on dates using the INTERVAL keyword!

Ex:



Using the ADDDATE function:

SELECT EntryDate, DATE_ADD(entrydate, INTERVAL 10 year) from Artists

SELECT EntryDate, DATE_SUB(entrydate, INTERVAL 10 year) from Artists
Can use DAY, MONTH, and YEAR in your intervals
Ex:

SELECT artistname, DATEDIFF(‘2007-12-31 23:59:59’, entrydate) from Artists;
Control Flow Functions

IF(condition, true_expression, false_expression)
SELECT IF(M.Lastname = ‘Sanders’, ‘yes’, ‘no’)
FROM Members M;

IFNULL(expr1, expr2)

If expr1 is NOT NULL, IFNULL() returns expr1. Otherwise, expr2.
SELECT IFNULL(webaddress, “NONE”)
FROM Artists;
Comparison Functions

GREATEST(comma separated list of values)

LEAST(comma separated list of values)

What happens if any of the values are null?
Numerical Functions

ABS (value)
Absolute Value

CEIL(value), CEILING(value), FLOOR(value)
Ceiling/Floor

SIN(radians), COS(radians), TAN(radians)
Trigonometric Functions

PI()
pi

POW(X,Y)
Exponent (XY)

SQRT(X)
Square Root

ROUND(value), ROUND(value, decimal_places)

TRUNCATE(value, decimal_places)
Numerical Functions (2)

Formatting vs. Truncating

Formatting Rounds the Value

Truncating Cuts off the Value (at the position)

SELECT FORMAT (3.14159, 4);


Ans: 3.1416
SELECT TRUNCATE (3.14159, 4);

Ans: 3.1415

Note: FORMAT is a string function and returns a string!!

TRUNCATE is a mathematical function, and returns a number!!

http://dev.mysql.com/doc/refman/5.6/en/numeric-functions.html
Datatype Conversion

MySQL is not very strict about data types!
SELECT T.LengthSeconds/60 FROM Tracks T;

Lengthseconds is actually an int!

This is not C++/Java!
Nested Functions

Nested Functions are Quite Common in SQL

Nested Functions = Functions within Functions!

For Strings:
SELECT CONCAT(LEFT(M.Firstname, 1), LEFT(M.LastName, 1)) FROM Members M

Dates:
SELECT DAY (Curdate());

Control Flow Functions:
IF(Grade>90, ‘A’, IF(Grade>80, ‘B’, IF(Grade >70, ‘C’,’NC’)));
SELECT Lastname, Birthday, IF(YEAR(Birthday) <1965, ‘old’,
IF(YEAR(Birthday)<1975,’medium’, ‘young’))
FROM Members;
Functions and Column Aliases


You can use column aliases in any of these clauses:

GROUP BY (later)

HAVING (MySQL yes, PostgresSQL no)

ORDER BY
Note: You can’t refer to column aliases in the following clauses:

SELECT

FROM

WHERE
Functions and Column Aliases (2)

Note: If you have a long function expression and want to use it in the WHERE
clause as well, a column alias won’t work!

This will NOT WORK:
SELECT ArtistName, YEAR(EntryDate) AS Y
FROM Artists
WHERE Y > 2000;

You have to manually duplicate (copy and paste) the expression in the WHERE
clause:

This WILL WORK:
SELECT ArtistName, YEAR(EntryDate) AS Y
FROM Artists
WHERE YEAR(EntryDate) > 2000;
Aggregate Functions

Aggregate – to combine (many parts added together)

Aggregate Functions / Non-aggregate Functions

Similarities:

Both take (some type) of input

Both perform operations on the input

Both have a single output

Differences:

Input to aggregate function is a set of data
Aggregate Functions (2)
Column

Non-aggregate function:
Row
SELECT UPPER(LastName)
Row
FROM Members;


Row
Runs once for each row
Aggregate Function:
SELECT COUNT(*)
Column
FROM Members;

All rows are part of input for one function!
Row
Row
Row
Aggregate Functions (3)

Five Main Aggregate Functions:

COUNT(*)

COUNT(expression)


AVG(expression)

MIN(expression)

MAX(expression)

SUM(expression)
Others, but we won’t cover in class (you are not responsible for them)
Aggregate Arguments

Every aggregate function takes a single argument


May be another function, but not another aggregate function!
Argument also may be from a CASE Statement
Aggregate Arguments (2)

Ex:
SELECT SUM(lengthseconds BETWEEN 60 AND 120) FROM Tracks;

This actually produces the same results as:
SELECT COUNT(TrackNum)
FROM Tracks
WHERE Lengthseconds BETWEEN 60 AND 120;

Why? What does the first query do? The second? Why do they give the same
result?
COUNT

COUNT(*)

Counts the number of rows in a table

Excludes NULLS (doesn’t count them)
SELECT COUNT(*) AS ‘Number of Titles’
FROM Titles;


Returns 6
COUNT (expressions)

Counts the number of rows in a table (same)
SELECT COUNT(titleID) AS ‘Number of Titles’
FROM Titles;
More COUNT Examples
SELECT COUNT(*) FROM Artists;
SELECT COUNT(artistID) FROM Artists;
SELECT COUNT(WebAddress) FROM Artists;
SELECT COUNT(*) FROM Artists
WHERE !(WebAddress is NULL);
SELECT COUNT(*) FROM Artists
WHERE WebAddress is NULL;

Note : Why does this give you 0 (zero)?
SELECT COUNT(WebAddress) FROM Artists
WHERE Webaddress IS NULL;
Warnings with COUNT!

Be Careful what you Aggregate!

Ex: Show the number of artists who have recorded in the alternative genre:

Wrong:
SELECT COUNT(A.ArtistID)
FROM Artists A JOIN Titles T ON (A.ArtistID = T.ArtistID)
WHERE T.Genre = ‘Alternative’;

Why? This counts duplicates!

Correct:
SELECT COUNT(DISTINCT A.ArtistID)
FROM Artists A JOIN Titles T ON (A.ArtistID = T.ArtistID)
WHERE T.Genre = ‘Alternative’;
SUM (Aggregate Function)

SUM(expression)

Sums all data under expression

Excludes NULLs (doesn’t count NULL as 0)

Ex: Take a total of all runtimes:
SELECT SUM(Lengthseconds) AS ‘Total Length’
FROM Tracks;

Ex (2): Figure out how long the studio “MakeTrax” has been recording music(for tax
purposes). Present the output in minutes – call it “MakeTraxMinutes”
SELECT SUM(Lengthseconds/60) AS MakeTraxMinutes
FROM Tracks JOIN Titles USING(TitleID) JOIN Studios USING(StudioID)
WHERE StudioName = 'maketrax';
SUM (Aggregate Function) Steps

Beginning Steps:

Query Construction:
1.
Where the studios at?
Studios
1.
What's my order? SELECT, FROM, WHERE
2.
Where the track lengths at?
Tracks
2.
What do I select?
3.
What's between?
4.
So join Studios, titles, tracks.
5.
What do I join on? StudioID TitleID
6.
Then? My results. Lengthseconds.
7.
Do I want it in seconds? No. Minutes: Lengthseconds /
60
8.
What's my filter condition?
The studio name must be Maketrax.
9.
How do I check for studio name?
There's a column called 'studioname'. Therefore Must check
studioname = 'maketrax'

Results: Lengthseconds / 60.
Titles
3.
What do I call it?
MakeTraxMinutes
4.
What do I join from?
Tracks JOIN Titles USING(TitleID) JOIN Studios USING(StudioID)
5.
What do I check my filter condition with?
WHERE clause.
6.
What? studio name
7.
How?
WHERE studioname = 'maketrax'
I also need to sum my results: Results are SELECT
Lengthseconds/60 - So SELECT SUM(Lengthseconds/60)
Final Query:
SELECT SUM(Lengthseconds / 60) AS MakeTraxMinutes
FROM Tracks JOIN Titles USING(TitleID) JOIN Studios USING(StudioID)
WHERE Studioname = 'Maketrax';
AVG (Aggregate Function)

AVG (expression)

Takes the mean (average)

Averages all data under expression

Excludes NULLs (doesn’t count NULL as 0)

Ex: Average all track lengths that are listed:


SELECT AVG(LengthSeconds)

FROM Tracks;
Ex2: What’s the average length of track produced by studio “MakeTrax”?
AVG (Aggregate Function) (2)


Find the average age in years of all members:

SELECT AVG(DATEDIFF(CURDATE(), Birthday))/365.25

FROM Members;
Why?

DATEDIFF – Years between now and birthday


Values are returned in days
Take the AVG of that

Divide by 365.25

365.25? What happens every 4 years?
Excluding Data From Aggregation

To include or exclude items from being aggregated, use the WHERE clause

Ex: Count the number of male members:
SELECT COUNT(*)
FROM Members
WHERE Gender = ‘M’;

Ex: Count the number of female members:
SELECT COUNT(*)
FROM Members
WHERE Gender = ‘F’;
AVG Example
create table AvgDemo(
id int primary key,
name varchar(30) not null,
sex char,
age int);
insert into AvgDemo values(1, "Sue", 'f', null);
insert into AvgDemo values(2, "Fred", 'm', 20);
insert into AvgDemo values(3, "Mary", 'f', 30);
insert into AvgDemo values(4, "Marion", null, 40);
insert into AvgDemo values(5, "Bob", 'm', 50);
insert into AvgDemo values(6, "Jack", 'm', 60);
// in a real data base, we wouldn’t use an “age” field
// -why not?
AVG Example (2)

Lets run:
SELECT AVG(age) FROM avgdemo;

Do we need to run:
SELECT AVG(Age) FROM AvgDemo
WHERE Age is NOT NULL;
SELECT AVG(Age) FROM AvgDemo
WHERE Sex=‘F’;
SELECT AVG(Age) FROM AvgDemo
WHERE Sex=‘M’;
MIN and MAX

MIN(expression)

Returns the minimum value under the expression

Ex: Return the minimum runtime
SELECT MIN(LengthSeconds) AS ‘Shortest Track’
FROM Tracks;

MAX(expression)

Returns the maximum value under the expression

Ex: Return the maximum runtime
SELECT MAX(LengthSeconds) AS ‘Longest Track’
FROM Tracks;
More Aggregate Functions

There are many statistical aggregate functions:
STDDEV_POP(expression)
STDDEV_SAMP(expression)

Ex: SELECT STDDEV_POP(Age) FROM AvgDemo;
VAR_POP(expression)
VAR_SAMP(expression)

There are many, many more…
Aggregate Function Usage

You can use aggregate functions directly in these clauses:

SELECT clause

HAVING clause (later)

ORDER BY clause
Aggregates and the WHERE Clause

The WHERE clause filters data from being aggregated

Filters the table produced by the FROM clause

The WHERE clause filters data BEFORE it is aggregated (counted, summed, …)

So, YOU MAY NOT USE AGGREGATE FUNCTIONS DIRECTLY IN THE WHERE
CLAUSE

We will account for this later…
Aggregates and the WHERE Clause

WRONG!!!!!
SELECT TrackTitle
FROM Tracks
WHERE LengthSeconds > AVG(LengthSeconds);


This actually tries to filter out something it hasn’t calculated yet
(AVG(LengthSeconds))
DO NOT DO THIS!!!!!
How Aggregate Queries Work:


It may be helpful to think of Aggregate Queries as a two-step process:

Step 1: Filter out things you don’t want to aggregate.

Step 2: Aggregate (Count, sum, avg, min, max, etc.)
Because the WHERE Clause operates in the FIRST STEP, you can’t use it in the
second step.

We will address this later…