COMPRESS= Data Set Option

Specifies the compression of rows in an output table.

Valid in: DATA and PROC steps
Category: Data Set Control
Restriction: Use with output data sets only.
Supports: SAS data set

Syntax

COMPRESS= NO | YES | CHAR | BINARY

Syntax Description

NO

specifies that the rows in a newly created data set are uncompressed (fixed-length records).

YES | CHAR

specifies that the rows in a newly created data set are compressed (variable-length records) by SAS using RLE (Run Length Encoding). RLE compresses rows by reducing repeated consecutive characters (including blanks) to two-byte or three-byte representations.

Alias: ON
Tip: Use this compression algorithm for character data.

BINARY

specifies that the rows in a newly created data set are compressed (variable-length records) by SAS using RDC (Ross Data Compression). RDC combines run-length encoding and sliding-window compression to compress the file.

Tip: This method is highly effective for compressing medium to large blocks of binary data (numeric columns). Because the compression function operates on a single record at a time, the record length must be several hundred bytes or larger for compression to be effective.
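As an illustration of choosing between the two algorithms, the following minimal sketch creates one output table with each setting. The table names work.notes (assumed to hold mostly character data) and work.readings (assumed to be a wide table of numeric columns) are hypothetical and are not part of this reference.

```sas
/* Hypothetical input: work.notes holds mostly character columns, */
/* so RLE (COMPRESS=CHAR, equivalent to COMPRESS=YES) is requested. */
data work.notes_c(compress=char);
   set work.notes;
run;

/* Hypothetical input: work.readings has long rows of numeric     */
/* columns, so RDC (COMPRESS=BINARY) is requested instead.        */
data work.readings_b(compress=binary);
   set work.readings;
run;
```

Which setting gives the smaller file depends on the data, so it can be worth creating the table both ways and comparing the resulting file sizes.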

Details

Compressing a file reduces the number of bytes that are required to represent each row. The advantages include reduced storage requirements and fewer I/O operations to read from or write to the data during processing. However, reading a compressed file requires more CPU resources because each row must be uncompressed, and in some situations the resulting file is larger rather than smaller.
Use the COMPRESS= data set option to compress an individual file. Specify the option for output data sets only, that is, data sets named in the DATA statement of a DATA step or in the OUT= option of a SAS procedure.
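The two placements described above can be sketched as follows. The table work.sales and the column region are hypothetical names used only for illustration.

```sas
/* COMPRESS= on a data set named in the DATA statement. */
data work.sales_c(compress=yes);
   set work.sales;          /* hypothetical input table */
run;

/* COMPRESS= on a data set named in the OUT= option of a procedure. */
proc sort data=work.sales out=work.sales_sorted(compress=yes);
   by region;               /* hypothetical sort column */
run;
```

In both steps the option is attached to the output data set, not to the input data set that is read.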
After a file is compressed, the setting is a permanent attribute of the file, which means that to change the setting, you must re-create the file. For example, to uncompress a file, run a DATA step that copies the compressed file and specify COMPRESS=NO for the output data set.
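A minimal sketch of re-creating a file to remove compression, assuming a hypothetical compressed table named work.sales_c:

```sas
/* Copy the compressed table into a new, uncompressed table. */
data work.sales_u(compress=no);
   set work.sales_c;        /* hypothetical compressed input */
run;

/* Optional check: in Base SAS, the PROC CONTENTS attribute */
/* listing includes a Compressed field for the data set.    */
proc contents data=work.sales_u;
run;
```

The copy is written to a new name here so that the compressed original is kept until the result has been checked.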

Comparisons

The COMPRESS= data set option overrides the COMPRESS= option in the LIBNAME statement, the COMPRESS= connection string option, and the COMPRESS= system option.
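The precedence can be sketched with the system option, assuming a hypothetical input table work.src. The same pattern would apply to a default set in the LIBNAME statement or in a connection string.

```sas
/* Session-wide default: compress newly created data sets. */
options compress=yes;

/* No data set option is given, so the system option applies. */
data work.a;
   set work.src;            /* hypothetical input table */
run;

/* The data set option overrides the system option, */
/* so this table is created uncompressed.           */
data work.b(compress=no);
   set work.src;
run;
```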
When you create a compressed SAS data set, you can also specify REUSE=YES (as a data set option or connection string option) in order to track and reuse space. With REUSE=YES, new rows are inserted into the space that is freed when other rows are updated or deleted. When the default REUSE=NO is in effect, new rows are appended to the end of the existing file.
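A minimal sketch of combining the two options when the table is created, assuming a hypothetical input table work.orders:

```sas
/* Create a compressed table that tracks and reuses freed space. */
data work.orders_c(compress=yes reuse=yes);
   set work.orders;         /* hypothetical input table */
run;
```

Both options are given together on the output data set because REUSE= is specified at the time the compressed data set is created.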