
SAS System Options

COMPRESS= System Option



Specifies the type of compression of observations to use for output SAS data sets.
Valid in: configuration file, SAS invocation, OPTIONS statement, SAS System Options window
Category: Files: SAS Files; System administration: Performance
PROC OPTIONS GROUP= SASFILES, PERFORMANCE
Restriction: The TAPE engine does not support the COMPRESS= system option.


Syntax

COMPRESS=NO | YES | CHAR | BINARY


Syntax Description

NO

specifies that the observations in a newly created SAS data set are uncompressed (fixed-length records).

Alias: N, OFF

YES | CHAR

specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RLE (Run Length Encoding). RLE compresses observations by reducing repeated consecutive characters (including blanks) to two-byte or three-byte representations.

Alias: Y, ON
Tip: Use this compression algorithm for character data.

Note: COMPRESS=CHAR is accepted by Version 7 and later versions.

BINARY

specifies that the observations in a newly created SAS data set are compressed (variable-length records) by SAS using RDC (Ross Data Compression). RDC combines run-length encoding and sliding-window compression to compress the file.

Tip: This method is highly effective for compressing medium to large (several hundred bytes or larger) blocks of binary data (numeric variables). Because the compression function operates on a single record at a time, the record length needs to be several hundred bytes or larger for effective compression.

Operating Environment Information: The syntax that is shown here applies to the OPTIONS statement. On the command line or in a configuration file, the syntax is specific to your operating environment. For details, see the SAS documentation for your operating environment.
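As a rough illustration of the two algorithms, the sketch below writes the same generated data once with COMPRESS=CHAR and once with COMPRESS=BINARY. The data set names, variable names, and row counts are made up for illustration. The size change depends on the data and the operating environment, so no figures are shown here.

```sas
/* Hypothetical test data: one long, mostly blank character variable
   and 40 numeric variables in each observation. */
options compress=char;            /* RLE */

data work.demo_char;
  length comment $ 200;
  array x{40} 8;                  /* creates numeric variables x1-x40 */
  do id = 1 to 10000;
    comment = 'short text';       /* padded with blanks to 200 bytes */
    do i = 1 to 40;
      x{i} = mod(id, 7);
    end;
    output;
  end;
  drop i;
run;

options compress=binary;          /* RDC */

/* Same observations, written again under the new setting */
data work.demo_binary;
  set work.demo_char;
run;
```

After each step, SAS typically writes a note to the log that states whether compression decreased or increased the size of the data set, which allows the two settings to be compared for a given kind of data. The COMPRESS=BINARY setting remains in effect for the rest of the session unless it is changed again.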


Details

Compressing a file is a process that reduces the number of bytes required to represent each observation. Advantages of compressing a file include reduced storage requirements for the file and fewer I/O operations necessary to read from or write to the data during processing. However, more CPU resources are required to read a compressed file (because of the overhead of uncompressing each observation), and there are situations when the resulting file size might increase rather than decrease, for example, when observations are short or contain few repeated values, so that the overhead of compression outweighs the savings.

Use the COMPRESS= system option to compress all output data sets that are created during a SAS session. Use the option only when you are creating SAS data files (member type DATA). You cannot compress SAS views, because they contain no data.
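A minimal sketch of session-wide use, assuming a Base SAS session with a writable WORK library. The macro variable name saved_compress and the data set work.survey are hypothetical. The GETOPTION function with the KEYWORD argument returns the current setting in COMPRESS=value form, so the earlier value can be restored afterward.

```sas
/* Remember the current setting, for example COMPRESS=NO */
%let saved_compress = %sysfunc(getoption(compress, keyword));

/* Every SAS data file created from here on is compressed with RLE */
options compress=yes;

data work.survey;
  length answer $ 100;
  do respondent = 1 to 5000;
    answer = 'n/a';               /* mostly trailing blanks */
    output;
  end;
run;

/* Put the earlier setting back */
options &saved_compress;

/* The compression setting is listed among the data set attributes */
proc contents data=work.survey;
run;
```

Restoring the saved value keeps later steps in the same session from inheriting a setting that was intended for only one part of the program.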

Once a file is compressed, the setting is a permanent attribute of the file, which means that to change the setting, you must re-create the file. That is, to uncompress a file, specify COMPRESS=NO for a DATA step that copies the compressed file.
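The re-creation step can be sketched as follows. The data set names work.packed and work.unpacked are hypothetical, and the example uses the COMPRESS= data set option (see Comparisons) so that the session-wide setting is left unchanged.

```sas
/* Build a small compressed data set to start from */
data work.packed(compress=yes);
  length filler $ 80;
  do n = 1 to 1000;
    filler = ' ';
    output;
  end;
run;

/* Re-create the file without compression by copying it in a DATA step */
data work.unpacked(compress=no);
  set work.packed;
run;

/* Check the compression attribute of the new copy */
proc contents data=work.unpacked;
run;
```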

Note: For the COPY procedure, the default value CLONE uses the compression attribute from the input data set for the output data set. If the engine for the input data set does not support the compression attribute, then PROC COPY uses the current value of the COMPRESS= system option. For more information about CLONE and NOCLONE, see the COPY statement. This interaction does not apply when using SAS/SHARE or SAS/CONNECT.


Comparisons

The COMPRESS= system option can be overridden by the COMPRESS= option in the LIBNAME statement and the COMPRESS= data set option.
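The three levels can be sketched together as shown below. The libref scratch and the data set names are hypothetical. To avoid inventing a directory path, the sketch assigns a second libref to the WORK directory, which assumes that WORK uses a directory-based engine.

```sas
/* Session level: no compression by default */
options compress=no;

/* Library level: hypothetical libref that points at the WORK directory */
libname scratch "%sysfunc(pathname(work))" compress=yes;

/* No data set option: the LIBNAME setting (YES) overrides the system option */
data scratch.lib_level;
  length text $ 120;
  do i = 1 to 2000;
    text = 'abc';
    output;
  end;
run;

/* Data set option: BINARY overrides both the LIBNAME and system settings */
data scratch.ds_level(compress=binary);
  set scratch.lib_level;
run;

/* Release the extra libref; the files remain in the WORK directory */
libname scratch clear;
```

Running PROC CONTENTS on each data set is one way to confirm which setting took effect.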

The data set option POINTOBS=YES, which is the default, specifies that a compressed data set can be processed with random access (by observation number) rather than only with sequential access. With random access, you can specify an observation number in the FSEDIT procedure and use the POINT= option in the SET and MODIFY statements.
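A short sketch of random access to a compressed data set, with hypothetical data set and variable names. POINTOBS=YES is written out only to make the dependency visible; it is the default.

```sas
/* Compressed data set that still allows access by observation number */
data work.big(compress=yes pointobs=yes);
  length text $ 50;
  do id = 1 to 100;
    text = 'row';
    output;
  end;
run;

/* Read observation 42 directly with POINT= */
data work.pick;
  obsnum = 42;                    /* POINT= variable, not written to the output */
  set work.big point=obsnum;
  output;
  stop;                           /* POINT= never reaches end of file, so stop explicitly */
run;
```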

When you create a compressed file, you can also specify REUSE=YES (as a data set option or system option) in order to track and reuse space. With REUSE=YES, new observations are inserted in space freed when other observations are updated or deleted. When the default REUSE=NO is in effect, new observations are appended to the existing file.

POINTOBS=YES and REUSE=YES are mutually exclusive; they cannot be used together. REUSE=YES takes precedence over POINTOBS=YES: if you set REUSE=YES, SAS automatically sets POINTOBS=NO.
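The interaction can be checked with a small sketch such as the following, in which work.recycled and its variables are hypothetical.

```sas
/* Compressed data set that tracks and reuses freed space */
data work.recycled(compress=yes reuse=yes);
  length note $ 60;
  do k = 1 to 200;
    note = 'placeholder';
    output;
  end;
run;

/* List the data set attributes, including the compression-related ones */
proc contents data=work.recycled;
run;
```

For a compressed data set, the PROC CONTENTS attribute listing should report the compression setting together with the space-reuse and point-to-observations settings, which makes it possible to confirm that POINTOBS=NO is in effect once REUSE=YES is set.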

The TAPE engine does not support the COMPRESS= system option, but the engine does support the COMPRESS= data set option.

The XPORT engine does not support compression.


See Also

Data Set Options:

COMPRESS= Data Set Option

POINTOBS= Data Set Option

REUSE= Data Set Option

Statements:

LIBNAME Statement

System Option:

REUSE= System Option

Compressing Data Files in SAS Language Reference: Concepts
