Transcript Slide 1

SQL Unit 4 Built-in Functions and Calculated Fields Kirk Scott

1

• • • • • •

4.1 Built-in Functions 4.2 Calculated Fields 4.3 The DATE/TIME Data Type and Date Functions 4.4 The Keyword FORMAT and Using the Date Data Type 4.5 String Operations 4.6 The Keywords FIRST, LAST, and TOP

2

4.1 Built-in Functions

3

4.1.1 A Function which Applies to Multiple Rows: COUNT

• • • SQL contains several built-in mathematical functions. What these functions have in common is that they cover or take into account all of the rows in the table. For this reason they can be called aggregate functions. 4

• • • • Syntactically, the usage of these functions is similar. The keyword for them is followed by a pair of parentheses and a field of interest is put into the parentheses. The COUNT function, which was introduced earlier, falls into this class. SELECT COUNT finds the count of all non-null occurrences of a field in a table. 5

• • • • COUNT can be applied to a single field, or to *, which signifies all fields. It can't be applied to a list of fields. If it's applied to a single field, that field can be either numeric or non-numeric. This is an example of COUNT applied to a single field: • • • SELECT COUNT(year) FROM Car 6

4.1.2 Other Aggregate Functions: MAX, MIN, SUM, AVG, VAR, STDEV

• • • More typical examples of aggregate functions are purely mathematical. They can only be applied to a single numeric field. These functions have in common with the COUNT function the fact that null values are ignored when computing them and they produce a single value as a result. 7

• • Here is a list of the aggregate functions which will be covered, along with a brief description of their meaning. Various implementations of SQL may support other mathematical functions, but these are sufficient to introduce the topic: 8

• • • • • • MAX MIN SUM AVG VAR STDEV Find the maximum value of a field in a table.

Find the minimum value of a field in a table.

Find the sum of the values of a field in a table.

Find the average of the values of a field in a table.

Find the variance of the values of a field in a table.

Find the standard deviation of the values of a field in a table.

9

4.1.3 Examples of Using Aggregate Functions

• • • Some examples of the use of the functions follow. All of these queries could be written with a WHERE clause that restricted the set of rows under consideration. For MAX and MIN, more than one record may contain the maximum or minimum value, but a MAX or MIN query will only produce a single value as a result: • • • SELECT MAX(commrate) FROM Salesperson 10

• • • • For MIN, it is important to note that even though the value NULL is considered to come before the lowest value when doing ORDER BY, NULL is not considered a value, so MIN will pull out the lowest actual value, not NULL, if NULL is present in the field.

SELECT MIN(commrate) FROM Salesperson 11

• • • • • The SUM function adds up the values in a column in a table. It ignores null values.

SELECT SUM(salesprice) FROM Carsale 12

• • The three statistical functions all depend in one way or another on dividing a sum by the number of occurrences. Null values are ignored when forming the sum, and any records with null in the field of interest are not included in the count which the sum is divided by. 13

• • • • The critical observation is this: Do not operate under the assumption that null numeric fields are treated as if they contain 0's. They are not. They are simply ignored in these functions.

14

• • • • • • • • SELECT AVG(salesprice) FROM Carsale SELECT VAR(stickerprice) FROM Car SELECT STDEV(dealercost) FROM Car 15

4.1.4 Row Runctions: Functions which do not Aggregate

• • • There is another kind of function in SQL with a use which is syntactically similar to the aggregate functions, namely a keyword followed by a set of parentheses containing a field name or some sort of arithmetic expression. These functions are known as row functions because they apply to the value of the field of each row individually. They do not combine the values in different rows together. 16

• • • Note that the examples given are simple queries, not update queries and they only change how the data is displayed in the query results. These functions could have such an effect on the stored data if they were used in update queries. Here are three functions that follow this model: 17

• • • ABS ROUND Round off a numeric value to a specified number of decimal places.

NZ Find the absolute value of a field in a table.

I call this the New Zealand function. It is actually the "null to zero" function and is named after the initials of the words null and zero.

18

• • • • • The example database doesn't have a field which illustrates the use of ABS very well, because all of the monetary and numeric fields contain only positive values. However, this example illustrates its syntax: SELECT ABS(commrate) AS [Absolute value of commrate] FROM Salesperson 19

• • The example database also doesn't have a field where the use of ROUND can be illustrated very practically. All of the monetary fields will be formatted with two decimal places by default, and the commrate field only holds values of this form: 0.0n. 20

• • • • • If you wanted to make sure that commrates were only displayed with two decimal places, no matter what kinds of values existed, you could do as shown below. The ROUND function takes two parameters, the field it is to be applied to and the number of decimal places to be displayed: SELECT ROUND(commrate, 2) FROM Salesperson 21

• • • • In use, it appears that strictly speaking, ROUND doesn’t affect output; It just affects numerical computation.

In other words, if you had an integer and applied ROUND with two decimal places to it, it would just display the integer value.

It would not display the value in the form n.00.

22

• • • • The data in the Salesperson table do allow for a use of the NZ function which would make a difference. There is one salesperson with a null value for commrate. In the query shown below you specify in the parentheses that null values of commrate should be replaced with the value 0. All other commrates are unchanged by the use of the function: • • • SELECT spno, name, NZ(commrate, 0) FROM Salesperson 23

4.2 Calculated Fields

24

4.2.1 Calculated Fields Using Arithmetic Operators

• • • In SELECT statements it is possible to create mathematical expressions which involve numeric fields, numeric constants, arithmetic operators like +, -, *, and /, and parentheses for grouping. Such expressions are row level functions defined by the user. The expressions apply to the values in a single row and do not aggregate. 25

• • • Expressions of this kind can include the built-in row level functions. If any of the fields in the expression are null, the result of the expression is null. Here is a self-explanatory example of a calculated field: • • • SELECT vin, stickerprice – dealercost AS [markup] FROM Car 26

4.2.2 Division in Calculated Fields and the Keyword ROUND

• • • • Here is an example of a calculated field containing division: SELECT (stickerprice – dealercost) / dealercost AS [Percent markup as decimal] FROM Car 27

• • • Because this example involves division, the results might have many decimal places. This allows for a practical use of the ROUND function.

Notice how the function is called on an arithmetic expression rather than just a single numeric field.

• • • SELECT ROUND((stickerprice – dealercost) / dealercost, 2) FROM Car 28

• • • Division could also be a problem if the dealercost were 0 in any record. In this case the system will return an "#error" message in the results. One approach to this problem would be to have a WHERE clause which eliminated from consideration records with null dealercosts.

29

4.2.3 Examples with a Join and NZ

• • • • • • • Suppose you would like to find the commission earned on every carsale. The salesprice field is in the Carsale table and the commrate field is in the Salesperson table. There is no problem doing a calculated field involving the fields from the different tables in a join query: SELECT salesprice * commrate FROM Carsale, Salesperson WHERE Carsale.spno = Salesperson.spno

30

• • • • • • • Because one of the salespeople has null for a commrate value, if that salesperson sold a car, that carsale would produce a null value as a result for this query. Rather than showing null, since the expected value for a commission is numeric, it may be preferable to explicitly show 0.0. This can be accomplished by using the NZ function: SELECT salesprice * NZ(commrate, 0) FROM Carsale, Salesperson WHERE Carsale.spno = Salesperson.spno

31

• • • It is also possible that all salespeople should have commrates and that an oversight has been made on data entry. In the long run, the data in the Salesperson table should be corrected. In the short run, you may want to produce query results that show that in all cases, at the very least, a commrate of 0.03 applies to a sale. 32

• • • • • This provides an example of using NZ with a value other than zero: SELECT salesprice * NZ(commrate, 0.03) FROM Carsale, Salesperson WHERE Carsale.spno = Salesperson.spno

33

• • •

4.2.4 Row Level Functions and Aggregate Functions Don't Mix

Row level functions and aggregate functions don't mix. Recall that by definition, row level functions apply to single rows in a table, while aggregate functions are based on all of the rows in a table. However, the following query is not syntactically possible in Access.

• • • • THIS DOES NOT WORK: SELECT stickerprice – AVG(stickerprice) FROM Car 34

• • • • Reading the query, it's clear what the person writing it would like to find. Some systems may support this syntax. In Access, other techniques will have to be used to obtain this result. They will be explained later.

35

• • • • On the other hand, here is a query where a calculated field, a row level function does mix successfully with an aggregate function.

SELECT AVG(stickerprice – dealercost) FROM Car 36

• • • It is possible to speculate about why one works and the other one doesn't based on the steps that would have to be taken to calculate them. For the first example, it would be necessary to scan the whole car table, finding the average, and then go back and find the differences. For the second example, row-by-row it would be possible to find the differences, saving those value for use when calculating the average. 37

• • • In other words, for the second example it would only be necessary to scan the table once. It is not so important to grasp this difference It's worthwhile to understand though, that as a general rule you can't assume that you can mix row level functions and aggregate functions.

38

4.3 The DATE/TIME Data Type and Date Functions

39

4.3.1 Important Note Regarding Versions of MS Access

• • Depending on the installation of Microsoft Access on your machine, you may have problems where some of the date related functions don't work. There is one general thing that can be done to at least make sure there are no background problems, and a specific thing that can be done to try and solve the problems with the date functions.

40

• • • The general thing to do is to make sure that the version of the example database that you're using agrees with the version of Microsoft Access that you're using. The example database for this course as posted is in the Microsoft Access 2000 version in case there are people who only have an older version of Access available to them. If you are using a version of Access later than 2000, then it would be useful to save the example database in that version and use it.

41

• • Whether you change versions of the example database or not, when you try to run queries with date functions you may get several different types of error messages. If one of these error messages appears, try the steps given on the following overheads and see if that resolves the problem.

42

• • • • Note, the steps are extremely inconvenient.

This problem has cropped up in the lab and the lab tech found the steps given to be a solution.

No one in their right mind would anticipate this.

Your goal is not to understand any of it, but just to try it if you have problems and see if the steps solve the problem 43

• • • • A. Press CTRL+g. This will open the Microsoft Visual Basic development environment (assuming that it is installed on your system).

B. Take the Tools option in the menu bar.

C. In the drop down menu, select References.

D. A box will pop up containing a list of things. – It may contain a checked box next to this text: – "MISSING: utility.mda".

44

• • • E. Click on the checked box, removing the check mark.

F. Then click OK.

G. Close the Microsoft Visual Basic development environment by clicking on the X in the upper right hand corner of the window.

45

• • • H. Then try to run your query. Hopefully, the date functions will now work.

I. If this solution works, it should be in effect for the duration of the current session using the database. It will have to be repeated every time the database is opened.

46

• • • • If you still can't get all of the date features to work—just fake it. In other words, save what you can in the assignments and don't worry about the rest.

When submitting your assignment, along with the attachment, you can put a note in the email message letting me know that the system wouldn’t let you do some of the queries. Still learn the stuff so that you can do it with paper and pencil on the test.

47

4.3.2 The DATE/TIME Data Type

• • If you were to look at the choices of data types in Microsoft Access's graphical user interface for designing tables, you would discover that there is a unified data type, named date/time, for fields that will contain dates, times, or both. The technical details for this type are that it is 8 bytes long and it can hold time information ranging from years down to seconds. 48

• • • In the example database the Carsale.salesdate

field is an example of this type.

It illustrates the idea that you can limit your attention to the date only and ignore the time component if you want to.

The DATE type has several different aspects. 49

• • • By nature, dates and times can be added or subtracted in order to arrive at lengths of time. However, dates are not generally represented as discrete numbers on a timeline. They are displayed as alphanumeric text, where characters, digits, and punctuation are interpreted in a special way. 50

• • In any implementation of SQL, there will be certain valid formats for entering or representing date and time values, and there will be a variety of ways of formatting this information for display. For the purposes of this section, only dates, and not times, will be dealt with.

51

4.3.3 The DATE, TIME, and NOW Functions

• • • Date fields have both date and time components. There is a set of functions that apply to DATE fields, and the set includes both date and time functions. A subset of the available functions will be covered. 52

• • • • • Here are some functions which can be used to have the system generate a date or time value: DATE() Generate today's date according to the system clock.

TIME() Generate the current time.

NOW() Generate the current date and time.

53

• • • • • This simple query would just give the current date maintained on the computer you're using. The results would consist of as many repetitions of the date as there are records in the Carsale table.

SELECT DATE() FROM Carsale 54

4.3.4 The YEAR, MONTH, DAY, and WEEKDAY Functions

• • • • The YEAR, MONTH, DAY, and WEEKDAY functions make it possible to extract constituent parts of a DATE field. The primary key of the Carsale table is the vin field. In the record with vin equal to '55555' the salesdate field contains the value #8/31/2006#. Here are illustrations of these functions using this record: 55

• • • • • The following query would give the value 8: SELECT MONTH(salesdate) FROM Carsale WHERE vin = '55555' 56

• • • • • The following query would give the value 31: SELECT DAY(salesdate) FROM Carsale WHERE vin = '55555' 57

• • • • • The following query would give the value 2006: SELECT YEAR(salesdate) FROM Carsale WHERE vin = '55555' 58

• • • The following query would give the value 5. In other words, the internal calendar knows that this date was the fifth day of the week, a Thursday.

Notice that just like on a normal (American) calendar, the days of the week are numbered beginning with Sunday.

• • • • • SELECT WEEKDAY(salesdate) FROM Carsale WHERE vin '55555' 59

• • • • • • •

4.3.5 Calculations and Nesting with Date Fields

For the purposes of the following examples, assume that there exists a Person table with a dob (date of birth) field. Suppose that you would like to find a person's age. In theory, the current date minus the dob would give the age. However, the result of the following query would be the difference in days, not years: SELECT DATE() – dob FROM Person 60

• • In order to get the difference in years you can apply the YEAR function to both terms in the expression. This means nesting YEAR and DATE, but this works: • • • SELECT YEAR(DATE()) – YEAR(dob) FROM Person 61

• • • It is also possible to use constants in these expressions. However, a simple, integral value will be treated as a day, not a year. The following query displays dates that are one day later than what are stored in the table: • • • SELECT dob + 1 FROM Person 62

• • If, for example, you wanted to find the date of a person's first birthday, it would be necessary to add one year to the dob. This can be accomplished by calling the YEAR function with a constant value: • • • SELECT dob + YEAR(1) FROM Person 63

• • • At the beginning of this topic an example was given showing how it was possible to simply select the current date. A date expression can be used in the same way. Future dates could be selected using the Carsale table with this query: • • • SELECT DATE() + YEAR(1) FROM Carsale 64

4.4 The Keyword FORMAT and Using the Date Data Type

65

4.4.1 Important Note Regarding Versions of MS Access

• • • Note: This is simply a repetition of the information given earlier.

Depending on the installation of Microsoft Access on your machine, you may have problems where some of the date related functions don't work. There is one general thing that can be done to at least make sure there are no background problems, and a specific thing that can be done to try and solve the problems with the date functions.

66

• • • The general thing to do is to make sure that the version of the example database that you're using agrees with the version of Microsoft Access that you're using. The example database for this course as posted is in the Microsoft Access 2000 version in case there are people who only have an older version of Access available to them. If you are using a version of Access later than 2000, then it would be useful to save the example database in that version and use it.

67

• • Whether you change versions of the example database or not, when you try to run queries with date functions you may get several different types of error messages. If one of these error messages appears, try the following steps and see if that resolves the problem: 68

• • • • A. Press CTRL+g. This will open the Microsoft Visual Basic development environment (assuming that it is installed on your system).

B. Take the Tools option in the menu bar.

C. In the drop down menu, select References.

D. A box will pop up containing a list of things. – It may contain a checked box next to this text: – "MISSING: utility.mda".

69

• • • E. Click on the checked box, removing the check mark.

F. Then click OK.

G. Close the Microsoft Visual Basic development environment by clicking on the X in the upper right hand corner of the window.

70

• • • H. Then try to run your query. Hopefully, the date functions will now work.

I. If this solution works, it should be in effect for the duration of the current session using the database. It will have to be repeated every time the database is opened.

71

• • • • If you still can't get all of the date features to work—just fake it. In other words, save what you can in the assignments and don't worry about the rest.

When submitting your assignment, along with the attachment, you can put a note in the email message letting me know that the system wouldn’t let you do some of the queries. Still learn the stuff so that you can do it with paper and pencil on the test.

72

4.4.2 General Background

• • • In any implementation of SQL, there will be certain valid formats for entering or representing date and time values, and there will be a variety of ways of formatting this information for display. The standard format is month/day/year, like this: #10/02/07#. It is not necessary to use two digits when one is enough. #10/2/07# would also be valid input. 73

• • • • SQL will also accept date values in some other formats. The acceptable formats are related to the kinds of formats which are supported for output. For the purposes of the discussion below, note that formats like these are also supported for input: #10-02-07#, #02-oct-07#. 74

• • Notice that text abbreviations of month names are allowed, and the format with the abbreviation expects the month in the second position. Regardless of how the data is entered, if not otherwise formatted, a query will return date values in mm/dd/yy form in output.

75

4.4.3 Invalid Data Entry into Date Fields

• • • This is kind of a vexed topic. Microsoft Access has some features which are supposed to help catch incorrect date values. Unfortunately, at least one of these features which is supposed to be helpful has the opposite effect.

76

• • • • • • Suppose you tried to enter something obviously wrong, like this. There is no month with the abbreviation 'abc': #02-abc-07#.

Access would warn you that something was wrong.

77

• • • • • • Suppose you tried to enter something obviously wrong, like this. There is no month with 40 days: #40-oct-2007# Access would warn you that something was wrong.

78

• • • • • • Suppose you tried to enter something less obviously wrong, like this. September has only 30 days, not 31: #31-sep-2007# Access would warn you that something was wrong.

79

• • Suppose you tried to enter something obviously wrong, like this. Neither 40 nor 70 can be interpreted as a day of the month: • • • • #40-oct-70# Access would warn you that something was wrong. 80

• • • However, here is the kicker. Suppose you tried to enter something obviously wrong, like this. There is no month with 40 days in it, but 07 would be a valid day of the month: • • • • #40-oct-07# You will not get a warning message. 81

• • • Access will by default assume that what you meant was #07-oct-40#, in other words, the seventh of October, 1940. This guarantees that bad data is going into your table without warning. It is not possible to state too strongly what a bad idea this is.

82

4.4.4 Background on Date Formats

• • • A relational database will accept date input in standard formats, but it is meaningless to think in terms of formatting the input data. Once the system accepts the data, regardless of the input format, all dates are stored in the same way. If a query simply selects a date field, the output will appear in the standard mm/dd/yy year format. 83

• • • It is also possible to use the FORMAT keyword with a SELECT statement to make the output of a date field take on a particular appearance. The query has to specify a format, and this is done using abbreviations for various types of date displays. Here are the standard abbreviations for dates along with their meanings: 84

• • • • • • • • • • mm mmm mmmm dd ddd dddd yy yyyy w ww 2 digit number of month in year 3 letter abbreviation of name of month full month name 2 digit number of day in month 3 letter abbreviation of name of day of week full day name 2 digit year 4 digit year 1 digit number of day in week 2 digit number of week in year (1-52) 85

4.4.5 Using the Keyword FORMAT and Writing Date Formats in Queries

• • • The keyword FORMAT follows the keyword SELECT in a query, and FORMAT works like a function. In other words, it is followed by a pair of parentheses. Inside the parentheses are the name of a date field separated by a comma from a format. 86

• • • The format is a string of characters inside quotation marks. The string of characters typically consists of punctuation marks and standard abbreviations of the components of a date.

When the value is retrieved from the date field, it is displayed in the query results according to the format specified in the string.

87

• • • • Here is an example: SELECT FORMAT(salesdate, 'mm-dd-yyyy') FROM Carsale 88

• • • • • A date entered using the standard format as #10/02/07# would be displayed as follows by this query: 10-02-2007. The letters in the format are interpreted as date abbreviations. The dashes are accepted as given and reproduced on the output. 2 digits are forced for both the month and the day, and 4 digits are forced for the year value.

89

• • • • Here is another example, illustrating another format and the fact that you can select other fields in a query when using date formats: SELECT vin, FORMAT(salesdate, 'dddd, dd mmmm, yyyy') FROM Carsale 90

• • • A date entered using the standard format as #10/02/07# would be displayed as follows by this query: Tuesday, 02 October, 2007. The commas and spaces are reproduced as given in the format, and the system makes use of the internal calendar to determine the name of the day of the week corresponding to the date given.

91

• • • • • • • Monday's child is fair of face Tuesday's child is full of grace Wednesday's child is full of woe Thursday's child has far to go Friday's child is loving and giving Saturday's child works hard for a living But the child who is born on the Sabbath Day Is bonny and blithe and good and gay. 92

• • • You can also include other alphanumeric symbols in a format, which will be reproduced unchanged in the output. However, there are limits to how effective this is. Occurrences of m, d, y, and w that agree with the abbreviations are interpreted as abbreviations regardless of where they might appear in the format. 93

• • • If you have included a word or phrase in the format which contains an abbreviation, the word or phrase will not be reproduced unchanged. Wherever the abbreviation occurs in the word, the corresponding date element will appear in the output. If you try this, you will also find out that the list of abbreviations given above is not complete. 94

• • If a word or phrase contains the letters h and s, you will be reminded that date fields are really date/time fields. These letters appear in the format abbreviations for hours and seconds, and if they appear in a format, values for them will appear in the output.

95

4.5 String Operations

96

4.5.1 Granularity, Design, and Integrity

• • • • The term granularity refers to the concept of the smallest unit of data which can be manipulated in a database. In a theoretical sense, this unit is the field. You can insert, update, and delete field values. You can select field values. You can query on given field values. 97

• • In practice, SQL does allow a finer level of granularity. As seen above, although you can declare a field to be of the DATE type, it consists of constituent parts, and these parts can be manipulated individually. 98

• • • Simple text fields consist of sequences of characters, or strings, and it turns out that there are string functions which make it possible to operate on parts of the contents of text fields. String functions are a type of row level function. They operate on the contents of a single field of a single record at a time.

99

• • • There are a number of string functions, and they can be combined into complex expressions. Only a subset of the functions will be given here, and they will only be illustrated in isolation. The explanation for this is as follows: 100

• • • • • Consider the name fields in the Salesperson and Customer tables, for example. Most of the data in them consists of first and last names, in that order, separated by a space. Suppose you wanted to retrieve and display names in the form last name, comma, first name. It would be possible to construct a string expression that would accomplish this. However, it would be a lot of work to do so. 101

• • The name fields were defined as they were in the example tables for simplicity's sake, but they are admittedly poorly designed. If the tables were redesigned with last, first, and middle name fields, that would be better and it would support queries that produced output in the form last name, comma, first name without the use of string functions.

102

• • • • Another small example where string functions could be used was included in the data in the tables. One of the salespeople has leading blanks before the value in the name field. It would be possible to construct a string expression that would remove these blanks. Once again, this is a use of string functions to overcome a defect, in this case a data integrity problem. 103

• • It is possible that you could inherit a database with design and integrity problems, and in that case it might be useful to learn more complex string functions. However, if the database is your own or you have the authority to make or change the design and enforce or improve the data integrity, doing that would be a better use of your time than learning the details of string functions.

104

4.5.2 Inserting Strings into Output

• • • The first example of the use of string functions involves inserting strings into output. Textual information can be included with the contents of a text field by putting the desired string in quotes and using a plus sign to concatenate the field contents and the string contents. For example: 105

• • • • • SELECT name + ' or current occupant' FROM Customer SELECT spno, 'Mr./Ms. ' + name FROM Salesperson 106

4.5.3 The Keyword MID and Extracting Substrings from Text Fields

• • • • Some string functions refer to the positions of the characters in text field. When doing so, the positions are numbered from 1 through the maximum length of the field. The MID function allows you to specify the name of a text field and the starting and ending positions of interest in the field. Used with SELECT, MID will pull just the desired substring out of the field. 107

• • • • For example, if you only wanted to see the first five characters of the city field in the output of a query, you could do this: SELECT MID(city, 1, 5) FROM Salesperson 108

4.5.4 The Functions UCASE and LCASE

• • • The functions UCASE and LCASE are other examples of simple string functions. UCASE will cause the output from a text field to be displayed as all upper case letters. LCASE will cause the output from a text field to be displayed as all lower case letters. 109

• • • • • • • For example: SELECT UCASE(name) FROM Salesperson SELECT LCASE(name) FROM Salesperson 110

4.6 The Keywords FIRST, LAST, and TOP

111

4.6.1 The Keywords FIRST and LAST

• • • • • When working with a database it is sometimes convenient to refresh your memory about the contents of tables. You might use a simple query to do this: SELECT vin FROM Car 112

• • • In real life, if a set of data is large enough to justify entering it into a database, there will be tables with hundreds to thousands of records or more. Selecting all of the records is overkill for refreshing your memory, and such a query can be time consuming to run. Another approach to refreshing your memory would be to take a look at either the first or last value in the table. 113

• • • • • • • This syntax only allows you to pick one field to look at: SELECT FIRST(vin) FROM Car SELECT LAST(vin) FROM Car 114

• Since there is no guaranteed order of values in a table, either of these examples will pick a value that is simply the result of the order in which records were entered into the table.

115

• • • • • • •

4.6.2 The Keyword TOP; Selecting the Same Field You're Ordering on

The FIRST and LAST functions are not highly useful, but they provide a lead-in to the TOP keyword. With it you can ask to see one or more field from more than one record either at the top or the bottom of a set of results that has been ordered in some way. Here is an example, which will be followed by explanations: SELECT TOP 2 make FROM Car ORDER BY make ASC 116

• • • The general purpose of the query is to show makes when the results have been ordered by make. The query specifies ascending order. Even though the keyword TOP is used, by picking the records at the top, you are picking the records that correspond to the lowest value of the ORDER BY field. 117

• • • If descending order is specified, then TOP really does pick the results according to the top values of the ORDER BY field. There is no keyword BOTTOM. Everything is done with TOP and either ascending or descending order.

118

• • • • • The first line of the query reads SELECT TOP 2 make. This does not mean select the top two records. It means select all occurrences of the top two values of the make field. The lowest value of make in the Car table is BMW. It occurs once. 119

• • • • The next lowest value of make is Chevrolet. It occurs four times. The results of this query will consist of BMW followed by four occurrences of Chevrolet. Null values are treated as the lowest by ORDER BY, so if the make field contained nulls, they would appear at the beginning of the results of this query.

120

4.6.3 The Keyword TOP; Selecting a Different Field than You're Ordering on

• • • • • • • Here is an example where the query results are ordered by a field other than the one selected and the ORDER BY has been changed to DESC. It is easy to state in words what the query means: Find the top two values of make where the records have been sorted in descending order by year. SELECT TOP 2 make FROM Car ORDER BY year DESC 121

• • • A query written in this way leads to these results: With the data in the example table, there are only two cars with the highest year, 2006. The makes of these cars are Toyota and Pontiac, so the query returns these values.

122

• • • • • Suppose the query is rewritten to select the top 3: SELECT TOP 3 make FROM Car ORDER BY year DESC 123

• • • At this point, all hell seems to break loose in the query results. In addition to Toyota and Pontiac you get Chrysler, Dodge, Ford, Chevrolet, Chevrolet, and Toyota again. Obviously, this seems to be more than the top three values for make. 124

• • • The explanation for this is that all of these additional make values come from cars in 2005, the next year down in the sort order. Not only will the same make be shown more than once if that value occurs more than once within the range specified by TOP and ORDER BY; More than the desired number of makes will be included in the results if they are tied in the ORDER BY. 125

• • • Because one more value is requested in TOP, it is necessary to step down by one year in the order. Several different makes have that year. There is no basis for ordering different makes within a single year, so none of them can be excluded.

126

4.6.4 The Keyword TOP; Selecting More than One Field

You can ask to see more than one field at a time with the TOP function. Here are two examples: • • • • • • • • SELECT TOP 2 make, model FROM Car ORDER BY year DESC SELECT TOP 2 make, model FROM Car ORDER BY make, model 127

• • In general, the more fields there are in the SELECT, the fewer duplicates there are likely to be in the results. In these examples, each different pair of make and model would count as a distinct value.

128

4.6.5 The Keyword TOP; Selecting a Percent

• • • • • • The TOP keyword also makes it possible to specify that a percent of the total number of values be shown, rather than a specific number of values. For example: SELECT TOP 10 PERCENT make FROM Car ORDER BY year DESC 129

• • • TOP is probably too complicated to be justified merely as a technique for refreshing your memory about the contents of tables. However, it may be that someone would be interested in the results of a query that could be phrased as "find the top 10 percent of the x values when sorted by y". This is exactly the kind of query that TOP makes possible, and it is not clear how you would do this with WHERE clauses containing conditions combined with AND, OR, and NOT.

130

The End

131