Retrieving Data from a Single Table |
This chapter shows you how to
retrieve data from a single table by using the SELECT statement
validate the correctness of a SELECT statement by using the VALIDATE statement.
Note: The examples in this chapter retrieve data from tables that are SAS data sets. However, you can use all of the operations that are described here with SAS data views.
The SELECT statement is the primary tool of PROC SQL. You use it to identify, retrieve, and manipulate columns of data from a table. You can also use several optional clauses within the SELECT statement to place restrictions on a query.
SELECT and FROM Clauses |
The following simple SELECT statement is sufficient to produce a useful result:
select Name from sql.countries;
The SELECT statement must contain a SELECT clause and a FROM clause, both of which are required in a PROC SQL query. This SELECT statement contains
a SELECT clause that lists the Name column
a FROM clause that lists the table in which the Name column resides.
WHERE Clause |
The WHERE clause enables you to restrict the data that you retrieve by specifying a condition that each row of the table must satisfy. PROC SQL output includes only those rows that satisfy the condition. The following SELECT statement contains a WHERE clause that restricts the query output to only those countries that have a population that is greater than 5,000,000 people:
select Name from sql.countries where Population gt 5000000;
ORDER BY Clause |
The ORDER BY clause enables you to sort the output from a table by one or more columns; that is, you can put character values in either ascending or descending alphabetical order, and you can put numerical values in either ascending or descending numerical order. The default order is ascending. For example, you can modify the previous example to list the data by descending population:
select Name from sql.countries where Population gt 5000000 order by Population desc;
GROUP BY Clause |
The GROUP BY clause enables you to break query results into subsets of rows. When you use the GROUP BY clause, you use an aggregate function in the SELECT clause or a HAVING clause to instruct PROC SQL how to group the data. For details about aggregate functions, see Summarizing Data. PROC SQL calculates the aggregate function separately for each group. When you do not use an aggregate function, PROC SQL treats the GROUP BY clause as if it were an ORDER BY clause, and any aggregate functions are applied to the entire table.
The following query uses the SUM function to list the total population of each continent. The GROUP BY clause groups the countries by continent, and the ORDER BY clause puts the continents in alphabetical order:
select Continent, sum(Population) from sql.countries group by Continent order by Continent;
HAVING Clause |
The HAVING clause works with the GROUP BY clause to restrict the groups in a query's results based on a given condition. PROC SQL applies the HAVING condition after grouping the data and applying aggregate functions. For example, the following query restricts the groups to include only the continents of Asia and Europe:
select Continent, sum(Population) from sql.countries group by Continent having Continent in ('Asia', 'Europe') order by Continent;
Ordering the SELECT Statement |
When you construct a SELECT statement, you must specify the clauses in the following order:
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
Note: Only the SELECT and FROM clauses are required.
The PROC SQL SELECT statement and its clauses are discussed in further detail in the following sections.
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.