COMPRESS Function

Returns a character string with specified characters removed from the original string.

Category: Character
Restriction: I18N Level 0 functions are designed for use with Single Byte Character Sets (SBCS) only.
Tip: The DBCS equivalent function is KCOMPRESS.

Syntax

COMPRESS(<source> <, chars> <, modifiers> )

Optional Arguments

source

specifies a character constant, variable, or expression from which specified characters will be removed.

chars

specifies a character constant, variable, or expression that initializes a list of characters.

By default, the characters in this list are removed from the source argument. If you specify the K modifier in the third argument, then only the characters in this list are kept in the result.
Tip: You can add more characters to this list by using other modifiers in the third argument.
Tip: Enclose a literal string of characters in quotation marks.
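
As an illustration of how the chars argument behaves, the following minimal sketch removes the listed characters by default and keeps only them when the K modifier is given. The variable names (code, removed, kept) and the sample value are assumptions made for this illustration.

```sas
data _null_;
   code = 'AB-12_cd';
   /* Default: characters listed in the second argument are removed. */
   removed = compress(code, '-_');
   /* With the K modifier, only the listed characters are kept. */
   kept = compress(code, '-_', 'k');
   put removed= kept=;
run;
```

With this sample value, removed contains AB12cd and kept contains -_.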

modifier

specifies a character constant, variable, or expression in which each non-blank character modifies the action of the COMPRESS function. Blanks are ignored. The following characters can be used as modifiers:

a or A adds alphabetic characters to the list of characters.
c or C adds control characters to the list of characters.
d or D adds digits to the list of characters.
f or F adds the underscore character and English letters to the list of characters.
g or G adds graphic characters to the list of characters.
h or H adds a horizontal tab to the list of characters.
i or I ignores the case of the characters to be kept or removed.
k or K keeps the characters in the list instead of removing them.
l or L adds lowercase letters to the list of characters.
n or N adds digits, the underscore character, and English letters to the list of characters.
o or O processes the second and third arguments once rather than every time the COMPRESS function is called. Using the O modifier in the DATA step (excluding WHERE clauses), or in the SQL procedure, can make COMPRESS run much faster when you call it in a loop where the second and third arguments do not change.
p or P adds punctuation marks to the list of characters.
s or S adds space characters (blank, horizontal tab, vertical tab, carriage return, line feed, and form feed) to the list of characters.
t or T trims trailing blanks from the first and second arguments.
u or U adds uppercase letters to the list of characters.
w or W adds printable characters to the list of characters.
x or X adds hexadecimal characters to the list of characters.
Tip: If the modifier is a constant, enclose it in quotation marks. Specify multiple constants in a single set of quotation marks. The modifier can also be expressed as a variable or an expression.
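
The modifiers can be combined in a single quoted string. The sketch below shows a few combinations (K with D, P with S, I with an explicit character list, and D with O inside a loop). All variable names and sample values are hypothetical and chosen only for illustration.

```sas
data _null_;
   /* K + D: keep only the digits. */
   phone = '(919) 555-0134';
   digits = compress(phone, , 'kd');

   /* P + S: remove punctuation marks and space characters. */
   greeting = 'Hello, World!';
   letters = compress(greeting, , 'ps');

   /* I: ignore case when matching the characters in the second argument. */
   state = 'Alabama';
   no_a = compress(state, 'a', 'i');

   put digits= letters= no_a=;

   /* D + O: the second and third arguments do not change inside the loop, */
   /* so the O modifier asks COMPRESS to process them only once.           */
   array raw[3] $ 12 _temporary_ ('a1b2', 'c3d4', 'e5f6');
   do i = 1 to dim(raw);
      clean = compress(raw[i], , 'do');
      put clean=;
   end;
run;
```

For these sample values, digits is 9195550134, letters is HelloWorld, no_a is lbm, and the loop writes ab, cd, and ef.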

Details

Length of Returned Variable

In a DATA step, if the COMPRESS function returns a value to a variable that has not previously been assigned a length, then that variable is given the length of the first argument.
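
One way to observe this rule is to check the length of the result variable with the VLENGTH function. This is a minimal sketch; the names src, res, and len are assumptions made for the example.

```sas
data _null_;
   length src $ 20;
   src = 'A B C';
   /* res has no previously assigned length, so it takes its length */
   /* from the first argument of COMPRESS.                          */
   res = compress(src);
   len = vlength(res);
   put res= len=;
run;
```

Under the rule described above, len is expected to be 20, the declared length of src, even though the compressed value ABC occupies only three characters.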

The Basics

The COMPRESS function allows null arguments. A null argument is treated as a string that has a length of zero.
Based on the number of arguments, the COMPRESS function works as follows:

Number of Arguments: only the first argument, source
Result: The argument has all blanks removed. If the argument is completely blank, then the result is a string with a length of zero. If you assign the result to a character variable with a fixed length, then the value of that variable will be padded with blanks to fill its defined length.

Number of Arguments: the first two arguments, source and chars
Result: All characters that appear in the second argument are removed from the result.

Number of Arguments: three arguments, source, chars, and modifier(s)
Result: The K modifier (specified in the third argument) determines whether the characters in the second argument are kept or removed from the result.

The COMPRESS function compiles a list of characters to keep or remove, comprising the characters in the second argument plus any types of characters that are specified by the modifiers. For example, the D modifier specifies digits. Both of the following function calls remove digits from the result:
COMPRESS(source, "1234567890");
COMPRESS(source, , "d");
To remove digits and plus or minus signs, you can use either of the following function calls:
COMPRESS(source, "1234567890+-");
COMPRESS(source, "+-", "d");
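
The equivalence of the two forms can be checked in a short DATA step. This is a sketch only; the sample string and the variable names r1, r2, and same are assumptions made for the illustration.

```sas
data _null_;
   source = 'Tel: +1-919-555-0134';
   /* Explicit list: digits plus the plus and minus signs. */
   r1 = compress(source, '1234567890+-');
   /* Same list built from "+-" and the D modifier. */
   r2 = compress(source, '+-', 'd');
   /* same is 1 when both calls return the same value. */
   same = (r1 = r2);
   put r1= r2= same=;
run;
```

Both calls leave only Tel: from the sample string, so same is 1.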

Examples

Example 1: Compressing Blanks

SAS Statement:
a='AB C D ';
b=compress(a);
put b;

Result:
----+----1
ABCD

Example 2: Compressing Lowercase Letters

SAS Statement:
x='123-4567-8901 B 234-5678-9012 c';
y=compress(x,'ABCD','l');
put y;

Result:
----+----1----+----2----+----3
123-4567-8901  234-5678-9012

Example 3: Compressing Space Characters

SAS Statement:
x='1    2    3    4    5';
y=compress(x,,'s');
put y;

Result:
----+----1
12345

Example 4: Keeping Characters in the List

SAS Statement:
x='Math A English B Physics A';
y=compress(x,'ABCD','k');
put y;

Result:
----+----1
ABA

Example 5: Compressing a String and Returning a Length of 0

SAS Statement:
x=' ';
l=lengthn(compress(x));
put l;

Result:
----+----1
0