Data Representation

Using the LENGTH Statement to Save Storage Space

When SAS writes a numeric variable to a SAS data set, it writes the number in IBM double-precision floating-point format (as described in SAS Language Reference: Concepts). In this format, 8 bytes are required for storing a number in a SAS data set with full precision. However, you can use the LENGTH statement in the DATA step to specify that you want to store a particular numeric variable in fewer bytes.

Using the LENGTH statement can greatly reduce the amount of space that is required for storing your data. For example, if you were storing a series of test scores whose values could range from 0 to 100, you could use numeric variables with a length of 2 bytes. Doing so would save 6 bytes of storage per variable for each observation in your data set.
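The saving described above is simple arithmetic, and the following Python sketch (not SAS code) works it through. The function name and the observation and variable counts are hypothetical, chosen only for illustration, and the estimate ignores any data set overhead such as headers or compression.

```python
def bytes_saved(observations, variables, length, full_length=8):
    """Estimate bytes saved by storing numeric variables in `length` bytes.

    Assumption: every shortened variable would otherwise occupy the full
    8 bytes, and data set overhead (headers, paging) is ignored.
    """
    return observations * variables * (full_length - length)


# Hypothetical data set: 1,000,000 observations, 5 test-score variables,
# each stored in 2 bytes instead of 8.
print(f"{bytes_saved(observations=1_000_000, variables=5, length=2):,} bytes")
```

Each shortened variable saves 6 bytes per observation, so the total grows linearly with both the number of observations and the number of shortened variables.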

However, you must use the LENGTH statement cautiously in order to avoid losing significant data. One byte is always used to store the exponent and the sign. The remaining bytes are used for the mantissa. When you store a numeric variable in fewer than 8 bytes, the least significant digits of the mantissa are truncated. If the part of the mantissa that is truncated contains any nonzero digits, then precision is lost.
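The effect of truncation can be modeled outside SAS. The following Python sketch assumes the conventional IBM hexadecimal double-precision layout (one byte holding the sign bit and a base-16 exponent with a bias of 64, followed by a 7-byte mantissa) and keeps only the first bytes of the encoded value. The function names are hypothetical, and the sketch is an approximation of the behavior described here, not the SAS implementation.

```python
from fractions import Fraction


def to_ibm_bytes(value):
    """Encode a number as 8 bytes of IBM hexadecimal floating point.

    Assumes the exponent stays within the 7-bit range of the format.
    """
    x = Fraction(value)
    if x == 0:
        return bytes(8)
    sign = 0x80 if x < 0 else 0x00
    x = abs(x)
    exponent = 0
    # Normalize so that 1/16 <= x < 1 (first hex digit of mantissa nonzero).
    while x >= 1:
        x /= 16
        exponent += 1
    while x < Fraction(1, 16):
        x *= 16
        exponent -= 1
    mantissa = int(x * 2**56)  # keep 56 mantissa bits, truncating the rest
    return bytes([sign | (exponent + 64)]) + mantissa.to_bytes(7, "big")


def from_ibm_bytes(raw):
    """Decode 2 to 8 bytes; missing trailing mantissa bytes count as zero."""
    raw = raw + bytes(8 - len(raw))
    sign = -1 if raw[0] & 0x80 else 1
    exponent = (raw[0] & 0x7F) - 64
    mantissa = int.from_bytes(raw[1:], "big")
    return sign * Fraction(mantissa, 2**56) * Fraction(16) ** exponent


def stored_value(value, length):
    """Value that survives when only the first `length` bytes are kept."""
    return from_ibm_bytes(to_ibm_bytes(value)[:length])


print(stored_value(100, 2))         # 100: fits in the 1-byte mantissa
print(stored_value(257, 2))         # 256: the low-order digit is truncated
print(float(stored_value(0.1, 3)))  # slightly less than 0.1
```

In this model, an integer survives truncation only if all of its nonzero hexadecimal digits fit in the mantissa bytes that are kept, while a fraction such as 0.1 has nonzero digits throughout the mantissa and therefore loses precision at any shortened length.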

Use the LENGTH statement to shorten only those variables whose values are always integers. Fractional numbers lose precision if they are truncated. In addition, you must ensure that every value of your variable can be represented exactly in the number of bytes that you specify. Use the following table to determine the largest integer that can be stored in numeric variables of various lengths:

Variable Length and Largest Exact Integer

Length in Bytes   Significant Digits Retained   Largest Integer Represented Exactly
2                 2                             256
3                 4                             65,536
4                 7                             16,777,216
5                 9                             4,294,967,296
6                 12                            1,099,511,627,776
7                 14                            281,474,976,710,656
8                 16                            72,057,594,037,927,936
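The limits in this table appear to follow a simple rule: one byte holds the sign and exponent, so a variable of length n has 8 × (n − 1) mantissa bits, and every integer up to 2 raised to that power can be stored exactly. The Python sketch below (not SAS code, with a hypothetical function name) computes the limits under that assumption.

```python
def largest_exact_integer(length):
    """Largest integer N such that every integer from 0 through N is exact.

    Assumption: one byte is used for sign and exponent, and the remaining
    (length - 1) bytes hold the mantissa.
    """
    mantissa_bits = 8 * (length - 1)
    return 2 ** mantissa_bits


for length in range(2, 9):
    print(length, f"{largest_exact_integer(length):,}")
```

Comparing the maximum value that a variable can take against this limit is a quick way to choose the shortest safe length.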

When you use the OUTREP option of the LIBNAME statement to create a SAS data set that is written in a data representation other than one that is native to SAS on z/OS, the information in the preceding table does not apply. The largest integer that can be represented exactly is generally smaller.

Note: No warning is issued when the length that you specify in the LENGTH statement results in truncated data.

For more information, see LENGTH Statement: z/OS.
