The SQL procedure (PROC SQL) provides numerous arithmetic, statistical, and summary functions to manipulate numeric data. With one numeric data type to represent numeric data, the NUMERIC or NUM column definition is automatically assigned a default length of 8 bytes, even if the column is created with a numeric length less than 8 bytes.
The variety of functions and operators available to PROC SQL can be used to manipulate character data, including words, text, and codes,
using the CHARACTER or CHAR data type. The CHARACTER or CHAR data type allows the use of ASCII or EBCDIC character sets and stores fixed-length
character strings consisting of a maximum of 32K characters. This tip illustrates how columns based on the numeric and character data types are
defined; and how string functions, pattern matching, phonetic matching techniques, and a variety of other techniques are used with numeric and
character data.
PROC SQL users have a number of ways to accomplish their objectives, particularly when the goal is to manipulate data. The SELECT statement is an extremely powerful statement in the SQL language. Its syntax can be somewhat complex because of the number of ways that columns, tables, operators, and functions can be combined into executable statements. There are several types of operators and functions in PROC SQL: (1) comparison operators, (2) logical operators, (3) arithmetic operators, (4) character-string operators, and (5) summary functions. This tip illustrates concatenation, character alignment, and phonetic and pattern-matching character-string operations in manipulating data.
Character-String Operators and Functions
Character-string operators and functions are typically used with character data. Numerous operators are presented to acquaint users
with the power available with the SQL procedure. You'll see a number of operators including string concatenation and character alignment.
Concatenation and Character Alignment
The default alignment for character data is to the left; however, character columns or expressions can also be aligned to the right.
Two functions are available for character alignment: LEFT and RIGHT. The following example combines the concatenation operator ("||")
and the TRIM function with the LEFT function to left-align a character expression while inserting blank spaces and a dash ("-") between two character columns to subset PG-rated movies.
PROC SQL Code
PROC SQL; SELECT LEFT(TRIM(title) || " - " || category) AS Concatenation_Alignment FROM movies WHERE rating = "PG"; QUIT;
Results
Phonetic Matching (Sounds-Like Operator =*)
A technique for finding names that sound alike or have variations in spelling is available in PROC SQL. Although not technically a
function, the sounds-like operator "=*" searches and selects character data based on two expressions: the search value and the
matched value. Anyone who has looked for a last name in a local telephone directory is quickly reminded of the possible phonetic variations.
To illustrate how the sounds-like operator works, we search on the movie title in the MOVIES table, using the character string "Rucky", for
any and all phonetic variations on the movie title "Rocky."
PROC SQL Code
PROC SQL; SELECT title, rating, category FROM movies WHERE title =* "Rucky"; QUIT;
Results
Title Rating Category -------------------------------------------------------- Rocky PG Action Adventure
Finding Patterns in a String (Pattern Matching % and _)
Constructing specific search patterns in string expressions is a simple process with the LIKE predicate. The % acts as a wildcard
character representing any number of characters, including any combination of uppercase or lowercase characters. Combining the LIKE
predicate with the % (percent sign) permits case-sensitive searches and is a popular technique used by savvy SQL programmers to find
patterns in their data. The next example finds patterns in the movie category containing the uppercase character "D" in the first position,
followed by any number of characters in the CATEGORY column.
PROC SQL Code
PROC SQL; SELECT title, rating, category FROM movies WHERE category LIKE 'D%'; QUIT;
Results
Title Rating Category --------------------------------------------------------- Casablanca PG Drama Forrest Gump PG-13 Drama Michael PG-13 Drama Dressed to Kill R Drama Mysteries Ghost PG-13 Drama Romance Titanic PG-13 Drama Romance Silence of the Lambs R Drama Suspense
About the Author
Kirk Paul Lafler is the author of PROC SQL: Beyond the Basics Using SAS, published by SAS Press. He also writes the
popular SAS tips column "Kirk's Korner," which appears regularly in several SAS users group newsletters, and is a frequent speaker at
SAS users group meetings.
His book is available from the online bookstore.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
Type: | Sample |
Topic: | SAS Reference ==> Procedures ==> SQL Internal Administration ==> BBU Non SAS Authors ==> Kirk Paul Lafler |
Date Modified: | 2005-06-01 11:39:51 |
Date Created: | 2005-04-06 14:46:54 |
Product Family | Product | Host | SAS Release | |
Starting | Ending | |||
SAS System | Base SAS | All | n/a | n/a |