From: Jun 26, 2002 Shelley Doll
Due to popular demand, we're back with the second segment in our series of articles covering the basics of the SQL standard. Thanks to everyone for your feedback! In the last article, we covered some database terminology and looked at the four basic data query types. We also explained the WHERE clause and conditional statements and provided examples of the various query types.
In this article, we'll pick up where we left off with some additional functions and clauses you can use with the basic SELECT data query.
A clause is a statement that modifies results. Clauses are not required statements, but refine what and how data is displayed. The WHERE clause in a query is one example.
A keyword triggers functionality within the database. These are sometimes required with queries, such as INTO and VALUES in the statement “INSERT INTO table_name (column1) VALUES (‘data1’);”. We’ll look at the DISTINCT keyword, which triggers some useful optional functionality.
Functions are built-in statements that apply logic to a result set. We’ll cover several options of this type.
I’ve summarized the most common of these clauses, keywords, and functions below. I’ll explain each one and give examples of usage in the sections that follow.
Basic usage is as follows:
SELECT * FROM Contacts ORDER BY first_name;You are free to use ORDER BY with any select statement that might return multiple rows. You can also use it in conjunction with other clauses: SELECT first_name, last_name FROM Contacts WHERE first_name BETWEEN ‘a’ AND ‘k’ ORDER BY last_name;
You can give multiple columns to sort by. Precedence is given from left to right, so the order in which you list your columns is important.
SELECT * FROM Contacts ORDER BY company, last_name, first_name;Results are displayed in ascending order by default, either numerically or alphabetically. You can change this behavior by including the DESC keyword following any column name in the ORDER BY clause. In the example below, the highest net_amount will be listed first (in descending order). If two or more rows contain the same net_amount value, they will be displayed showing first the entry with the last_name value that comes earliest in the alphabet, because the last_name column is still sorted in ascending order.
SELECT * FROM Sales ORDER BY net_amount DESC, last_name, first_name;After running out of defined column names to sort by, most databases will then sort by the first column in the table and work toward the right. Implementation in this area varies though, so if the sort order is important, you should explicitly define which columns to use.
Another handy thing to note is that with the ORDER BY clause (as with the WHERE clause), the columns you are using to sort the results do not have to be part of the returned result set. The following example is perfectly valid, as long as all referenced columns exist in the table:
SELECT company, first_name, net_amount FROM Sales ORDER BY start_date, last_name;
SELECT DISTINCT company FROM Sales;When you use DISTINCT, it applies to all requested columns. If you want a list of all the salespeople in your table and the companies they represent but not every sales entry, you can use the following statement. Note that this may return several entries from the same company, etc. DISTINCT applies to the entire requested result set. SELECT DISTINCT company, last_name, first_name FROM Sales;
You can still use DISTINCT when narrowing and sorting your results, as with any SELECT statement. To determine what is displayed, the database first establishes whether the refined request matches a row, then applies the DISTINCT function. The ORDER BY clause is always processed after the entire result set has been determined. In the following example, only rows in which the net_amount is greater than 100 will be returned. Since DISTINCT keeps the first encountered row that matches the query result set and discards the rest, the net_amount column referenced in the ORDER BY statement may seemingly yield randomly ordered results.
SELECT DISTINCT company, last_name, first_name FROM Sales WHERE net_amount > 100 ORDER BY company, net_amount;
$keyname = “COUNT(*)”; $resultkey = “AVG(net_amount)”;
SELECT COUNT(*) FROM Sales;
You can also use it to count the number of rows in any result set.
SELECT COUNT(*) FROM Sales WHERE net_amount > 100;
If you want to see how many rows contain non-null values for a particular column, you can use COUNT on that column. Note that this will return the total number of rows unless the database is configured to default to NULL when a field is empty for the column in question. Also, listing more than one column name will cause an error.
SELECT COUNT(company) FROM Sales;COUNT can also be applied to count the number of rows in a DISTINCT result set.
SELECT COUNT(DISTINCT company, last_name) FROM Sales;The COUNT statement is frequently used to determine the number of iterations to use for a FOR loop in your code.
You can limit the scope of this function with clauses.
SELECT AVG(net_amount) FROM Sales WHERE company LIKE ‘%ABCD Co%’;
As with all aggregate functions, the ORDER BY statement will be ignored.
SELECT SUM(net_amount) FROM Sales WHERE net_amount > 100;
The AVG, SUM, MIN, and MAX functions will return an error if you do not specify a column, so you can't use the ‘*’ wildcard.
SELECT MIN(net_amount) FROM Sales WHERE last_name = “Smith”; SELECT MIN(last_name) FROM Sales;
SELECT MAX(net_amount) FROM Sales; SELECT MAX(company) FROM Sales WHERE net_amount > 100;
The MAX function is sometimes used on columns containing an auto-incremented key field to determine what the next entry’s key ID will be. Unless you’re running a nonpublic database, be wary of using this information to insert the next entry, in case another user beats you to the punch.
SELECT company, MAX(net_amount) FROM Sales GROUP BY company;
This gives you the highest net_amount for each company. You can also use this statement while selecting multiple column names and use multiple columns to group the function results.
The following examples illustrate the various ways this will produce results. First, including the GROUP BY clause lets you specify additional columns to display. However, be aware that this example will return the first last_name value encountered in the group; the SUM(net_amount) displayed will be for the entire company and not just rows matching the last name. This is because we're using only the company field to define our group.
SELECT company, last_name, SUM(net_amount) FROM Sales GROUP BY company;
In the above example, the last_name column provides pretty useless information, but you're allowed to request it in your query because of the functionality in the next example. You can create groups defined by multiple columns. This will produce function results for unique rows in the result set created by the combination of all specified GROUP BY columns:
SELECT company, AVG(net_amount), last_name FROM Sales GROUP BY company, last_name;The above example provides the average net_amount for each distinct last name in each company. The order in which you list GROUP BY columns controls the sort order of your results, but the actual function value results will be the same.
Another example shows how you can group results without displaying the grouped columns. This is useful, for example, if you want to show the number of sales per person but keep the names anonymous:
SELECT company, COUNT(sale_id) FROM Sales GROUP BY company, last_name;
The above example will apply the AVG function only to rows that qualified under the WHERE restriction. Note that the WHERE clause must precede the GROUP BY clause. You can also limit the result set returned after the groups are evaluated with the HAVING statement.
SELECT company, AVG(net_amount), FROM Sales WHERE last_name BETWEEN ‘a’ AND ‘m’ GROUP BY company HAVING AVG(net_amount) > 500;
The above statement calculates the average net_amount for each company, evaluating only the values for people whose last names meet the criteria, and displaying only the results that are greater than 500.