Retrieving Data from a Single Table |
In addition to selecting columns that are stored in a table, you can create new columns that exist for the duration of the query. These columns can contain text or calculations. PROC SQL writes the columns that you create as if they were columns from the table.
Adding Text to Output |
You can add text to the output by including a string expression, or literal expression, in a query. The following query includes two strings as additional columns in the output:
proc sql outobs=12; title 'U.S. Postal Codes'; select 'Postal code for', Name, 'is', Code from sql.postalcodes;
U.S. Postal Codes Name Code ----------------------------------------------------------- Postal code for Alabama is AL Postal code for Alaska is AK Postal code for American Samoa is AS Postal code for Arizona is AZ Postal code for Arkansas is AR Postal code for California is CA Postal code for Colorado is CO Postal code for Connecticut is CT Postal code for Delaware is DE Postal code for District Of Columbia is DC Postal code for Florida is FL Postal code for Georgia is GA
To prevent the column headings Name and Code from printing, you can assign a label that starts with a special character to each of the columns. PROC SQL does not output the column name when a label is assigned, and it does not output labels that begin with special characters. For example, you could use the following query to suppress the column headings that PROC SQL displayed in the previous example:
proc sql outobs=12; title 'U.S. Postal Codes'; select 'Postal code for', Name label='#', 'is', Code label='#' from sql.postalcodes;
Suppressing Column Headings in Output
U.S. Postal Codes --------------------------------------------------------- Postal code for Alabama is AL Postal code for Alaska is AK Postal code for American Samoa is AS Postal code for Arizona is AZ Postal code for Arkansas is AR Postal code for California is CA Postal code for Colorado is CO Postal code for Connecticut is CT Postal code for Delaware is DE Postal code for District Of Columbia is DC Postal code for Florida is FL Postal code for Georgia is GA
Calculating Values |
You can perform calculations with values that you retrieve from numeric columns. The following example converts temperatures in the SQL.WORLDTEMPS table from Fahrenheit to Celsius:
proc sql outobs=12; title 'Low Temperatures in Celsius'; select City, (AvgLow - 32) * 5/9 format=4.1 from sql.worldtemps;
Note: This example uses the FORMAT attribute to modify the format of the calculated output. See Specifying Column Attributes for more information.
Low Temperatures in Celsius City ------------------------ Algiers 7.2 Amsterdam 0.6 Athens 5.0 Auckland 6.7 Bangkok 20.6 Beijing -8.3 Belgrade -1.7 Berlin -3.9 Bogota 6.1 Bombay 20.0 Bucharest -4.4 Budapest -3.9
Assigning a Column Alias |
By specifying a column alias, you can assign a new name to any column within a PROC SQL query. The new name must follow the rules for SAS names. The name persists only for that query.
When you use an alias to name a column, you can use the alias to reference the column later in the query. PROC SQL uses the alias as the column heading in output. The following example assigns an alias of LowCelsius to the calculated column from the previous example:
proc sql outobs=12; title 'Low Temperatures in Celsius'; select City, (AvgLow - 32) * 5/9 as LowCelsius format=4.1 from sql.worldtemps;
Assigning a Column Alias to a Calculated Column
Low Temperatures in Celsius City LowCelsius ------------------------------ Algiers 7.2 Amsterdam 0.6 Athens 5.0 Auckland 6.7 Bangkok 20.6 Beijing -8.3 Belgrade -1.7 Berlin -3.9 Bogota 6.1 Bombay 20.0 Bucharest -4.4 Budapest -3.9
Referring to a Calculated Column by Alias |
When you use a column alias to refer to a calculated value, you must use the CALCULATED keyword with the alias to inform PROC SQL that the value is calculated within the query. The following example uses two calculated values, LowC and HighC, to calculate a third value, Range:
proc sql outobs=12; title 'Range of High and Low Temperatures in Celsius'; select City, (AvgHigh - 32) * 5/9 as HighC format=5.1, (AvgLow - 32) * 5/9 as LowC format=5.1, (calculated HighC - calculated LowC) as Range format=4.1 from sql.worldtemps;
Note: You can specify a calculated column only in a SELECT clause or a WHERE clause.
Referring to a Calculated Column by Alias
Range of High and Low Temperatures in Celsius City HighC LowC Range --------------------------------------- Algiers 32.2 7.2 25.0 Amsterdam 21.1 0.6 20.6 Athens 31.7 5.0 26.7 Auckland 23.9 6.7 17.2 Bangkok 35.0 20.6 14.4 Beijing 30.0 -8.3 38.3 Belgrade 26.7 -1.7 28.3 Berlin 23.9 -3.9 27.8 Bogota 20.6 6.1 14.4 Bombay 32.2 20.0 12.2 Bucharest 28.3 -4.4 32.8 Budapest 26.7 -3.9 30.6
Note: Because this query sets a numeric format of 4.1 on the HighC, LowC, and Range columns, the values in those columns are rounded to the nearest tenth. As a result of the rounding, some of the values in the HighC and LowC columns do not reflect the range value output for the Range column. When you round numeric data values, this type of error sometimes occurs. If you want to avoid this problem, then you can specify additional decimal places in the format.
Assigning Values Conditionally |
CASE expressions enable you to interpret and change some or all of the data values in a column to make the data more useful or meaningful.
You can use conditional logic within a query by using a CASE expression to conditionally assign a value. You can use a CASE expression anywhere that you can use a column name.
The following table, which is used in the next example, describes the world climate zones (rounded to the nearest degree) that exist between Location 1 and Location 2:
Climate zone | Location 1 | Latitude at Location 1 | Location 2 | Latitude at Location 2 |
---|---|---|---|---|
North Frigid | North Pole | 90 | Arctic Circle | 67 |
North Temperate | Arctic Circle | 67 | Tropic of Cancer | 23 |
Torrid | Tropic of Cancer | 23 | Tropic of Capricorn | -23 |
South Temperate | Tropic of Capricorn | -23 | Antarctic Circle | -67 |
South Frigid | Antarctic Circle | -67 | South Pole | -90 |
In this example, a CASE expression determines the climate zone for each city based on the value in the Latitude column in the SQL.WORLDCITYCOORDS table. The query also assigns an alias of ClimateZone to the value. You must close the CASE logic with the END keyword.
proc sql outobs=12; title 'Climate Zones of World Cities'; select City, Country, Latitude, case when Latitude gt 67 then 'North Frigid' when 67 ge Latitude ge 23 then 'North Temperate' when 23 gt Latitude gt -23 then 'Torrid' when -23 ge Latitude ge -67 then 'South Temperate' else 'South Frigid' end as ClimateZone from sql.worldcitycoords order by City;
Using a Simple CASE Expression
Climate Zones of World Cities City Country Latitude ClimateZone --------------------------------------------------------------------------- Abadan Iran 30 North Temperate Acapulco Mexico 17 Torrid Accra Ghana 5 Torrid Adana Turkey 37 North Temperate Addis Ababa Ethiopia 9 Torrid Adelaide Australia -35 South Temperate Aden Yemen 13 Torrid Ahmenabad India 22 Torrid Algiers Algeria 37 North Temperate Alice Springs Australia -24 South Temperate Amman Jordan 32 North Temperate Amsterdam Netherlands 52 North Temperate
You can also construct a CASE expression by using the CASE-OPERAND form, as in the following example. This example selects states and assigns them to a region based on the value of the Continent column:
proc sql outobs=12; title 'Assigning Regions to Continents'; select Name, Continent, case Continent when 'North America' then 'Continental U.S.' when 'Oceania' then 'Pacific Islands' else 'None' end as Region from sql.unitedstates;
Note: When you use the CASE-OPERAND form of the CASE expression, the conditions must all be equality tests; that is, they cannot use comparison operators or other types of operators, as are used in Using a Simple CASE Expression.
Using a CASE Expression in the CASE-OPERAND Form
Assigning Regions to Continents Name Continent Region ------------------------------------------------------------------------------------------ Alabama North America Continental U.S. Alaska North America Continental U.S. Arizona North America Continental U.S. Arkansas North America Continental U.S. California North America Continental U.S. Colorado North America Continental U.S. Connecticut North America Continental U.S. Delaware North America Continental U.S. District of Columbia North America Continental U.S. Florida North America Continental U.S. Georgia North America Continental U.S. Hawaii Oceania Pacific Islands
Replacing Missing Values |
The COALESCE function enables you to replace missing values in a column with a new value that you specify. For every row that the query processes, the COALESCE function checks each of its arguments until it finds a nonmissing value, then returns that value. If all of the arguments are missing values, then the COALESCE function returns a missing value. For example, the following query replaces missing values in the LowPoint column in the SQL.CONTINENTS table with the words Not Available:
proc sql; title 'Continental Low Points'; select Name, coalesce(LowPoint, 'Not Available') as LowPoint from sql.continents;
Using the COALESCE Function to Replace Missing Values
Continental Low Points Name LowPoint ------------------------------------------------------------------------ Africa Lake Assal Antarctica Not Available Asia Dead Sea Australia Lake Eyre Central America and Caribbean Not Available Europe Caspian Sea North America Death Valley Oceania Not Available South America Valdes Peninsula
The following CASE expression shows another way to perform the same replacement of missing values; however, the COALESCE function requires fewer lines of code to obtain the same results:
proc sql; title 'Continental Low Points'; select Name, case when LowPoint is missing then 'Not Available' else Lowpoint end as LowPoint from sql.continents;
Specifying Column Attributes |
You can specify the following column attributes, which determine how SAS data is displayed:
FORMAT=
INFORMAT=
LABEL=
LENGTH=
The following example assigns a label of State to the Name column and a format of COMMA10. to the Area column:
proc sql outobs=12; title 'Areas of U.S. States in Square Miles'; select Name label='State', Area format=comma10. from sql.unitedstates;
Note: Using the LABEL= keyword is optional. For example, the following two select clauses are the same:
select Name label='State', Area format=comma10. select Name 'State', Area format=comma10.
Areas of U.S. States in Square Miles State Area ----------------------------------------------- Alabama 52,423 Alaska 656,400 Arizona 114,000 Arkansas 53,200 California 163,700 Colorado 104,100 Connecticut 5,500 Delaware 2,500 District of Columbia 100 Florida 65,800 Georgia 59,400 Hawaii 10,900
Copyright © 2008 by SAS Institute Inc., Cary, NC, USA. All rights reserved.