
Working with Numeric Variables

Storing Numeric Variables Efficiently

The data sets shown in this section are very small, but data sets are often very large. If you have a large data set, you may need to think about the storage space that your data set occupies. There are ways to save space when you store numeric variables in SAS data sets.

Note: The SAS documentation for your operating environment provides information about storing numeric variables whose values are limited to 1 or 0 in the minimum number of bytes used by SAS (either 2 or 3 bytes, depending on your operating environment).

By default, SAS uses 8 bytes of storage in a data set for each numeric variable. Therefore, storing the variables for each observation in the earlier data set MORETOUR requires 75 bytes:

   56 bytes for numeric variables
      (8 bytes per variable * 7 numeric variables)
   11 bytes for Country
    8 bytes for Vendor
   __________________________
   75 bytes for all variables

When numeric variables contain only integers (whole numbers), you can often shorten them in the data set being created. For example, a length of 4 bytes accurately stores all integers up to at least 2,000,000.

Note: Under some operating environments, the largest integer that 4 bytes can store accurately is much greater. For more information, refer to the documentation provided by the vendor for your operating environment.
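One way to check the 4-byte limit in your own operating environment is to store a few integers in a shortened variable and read them back. The following is a minimal sketch, not part of the original example: the data set INTTEST and the variables Original, Stored, and Difference are hypothetical names. The value 2,097,153 is chosen on the assumption that 4 bytes hold integers exactly only up to 2,097,152 under Windows and UNIX; other environments may store it without loss.

```sas
/* Hypothetical test data set: Stored is kept in 4 bytes,        */
/* Original keeps the default length of 8 bytes.                 */
data inttest;
   length Stored 4;
   input Original;
   Stored = Original;
   datalines;
2000000
2097153
;
run;

/* Reading the data set back expands Stored to 8 bytes in the    */
/* program data vector, so anything lost in storage appears as   */
/* a nonzero Difference in the log.                              */
data _null_;
   set inttest;
   Difference = Original - Stored;
   put Original= comma12. Stored= comma12. Difference=;
run;
```

A Difference of 0 for a value means that 4 bytes stored it accurately in your environment.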

To change the number of bytes used for each variable, use a LENGTH statement.

A LENGTH statement contains the names of the variables followed by the number of bytes to be used for their storage. For numeric variables, the LENGTH statement affects only the data set being created; it does not affect the program data vector. The following program changes the storage space for all numeric variables that are in the data set SHORTER:

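options pagesize=60 linesize=80 pageno=1 nodate;
data shorter;
   set mylib.populartours;
   /* Store all seven numeric variables in 4 bytes instead of the default 8 */
   length Nights AirCost LandCost RoundAir TotalCostR
          CostSum RoundSum 4;
   /* Round the air fare to the nearest 50 and the total cost to the nearest 100 */
   RoundAir = round(AirCost,50);
   TotalCostR = round(AirCost + LandCost,100);
   /* SUM ignores missing values, unlike the + operator */
   CostSum = sum(AirCost,LandCost);
   RoundSum = round(sum(AirCost,LandCost),100);
run;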

By calculating the storage space that is needed for the variables in each observation of SHORTER, you can see how the LENGTH statement changes the amount of storage space used:

   28 bytes for numeric variables
      (4 bytes per variable in the LENGTH statement * 7 numeric variables)
   11 bytes for Country
    8 bytes for Vendor
   __________________________
   47 bytes for all variables
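Rather than adding up the bytes by hand, you can ask SAS to report the stored length of each variable. As a minimal sketch, assuming the data set SHORTER was created in the WORK library by the program above, the CONTENTS procedure lists each variable with its type and length:

```sas
/* List the variables in SHORTER with their attributes.          */
/* The Len column of the output shows the number of bytes that   */
/* each variable occupies in the data set.                       */
proc contents data=shorter;
run;
```

The seven numeric variables named in the LENGTH statement should show a length of 4, while Country and Vendor keep their character lengths of 11 and 8.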

Because the 7 numeric variables in SHORTER are shortened by the LENGTH statement, the storage space for the variables in each observation is reduced from 75 bytes to 47 bytes, a saving of more than one third.

CAUTION:
Be careful in shortening the length of numeric variables if your variable values are not integers.

Fractional numbers lose precision permanently if they are truncated. In general, use the LENGTH statement to truncate values only when disk space is limited. Use the default length of 8 bytes to store variables containing fractions.
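The effect of truncating a fraction can be seen by storing the same value at two lengths and comparing the results after they are read back. This is an illustrative sketch only: the data set FRACTEST and the variables ShortVal and FullVal are hypothetical names, and 0.1 is used because it has no exact binary representation.

```sas
/* Store 0.1 once in 4 bytes and once in the default 8 bytes.    */
data fractest;
   length ShortVal 4 FullVal 8;
   ShortVal = 0.1;
   FullVal  = 0.1;
run;

/* In the program data vector both variables are 8 bytes again,  */
/* but the digits dropped from ShortVal when it was written to   */
/* the data set are not restored.                                */
data _null_;
   set fractest;
   if ShortVal = FullVal then put 'Values match.';
   else put 'Values differ: ' ShortVal= best32. FullVal= best32.;
run;
```

The comparison is expected to report that the values differ, which is why a shortened variable that holds fractions can produce surprising results in later comparisons and calculations.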
