Summing Series

Simple cumulative sums are easy to compute using SAS sum statements. The following statements show how to compute the running sum of variable X in data set A, adding XSUM to the data set.

   data a;
      set a;
      xsum + x;
   run;

The SAS sum statement automatically retains the variable XSUM and initializes it to 0, and the sum statement treats missing values as 0. The sum statement is equivalent to using a RETAIN statement that initializes the variable to 0 together with the SUM function. The previous example could also be written as follows:

   data a;
      set a;
      retain xsum 0;
      xsum = sum( xsum, x );
   run;
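
As a minimal sketch of how the sum statement handles missing values, the following statements build a small test data set and accumulate it. The data set names TEST and TESTSUM and the data values are illustrative assumptions, not part of the preceding examples.

   data test;
      input x @@;
      datalines;
   1 2 . 4
   ;

   data testsum;
      set test;
      xsum + x;   /* a missing X adds nothing to the running sum */
   run;

XSUM takes the values 1, 3, 3, and 7. The missing third value of X leaves the running sum unchanged.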

You can also use the EXPAND procedure to compute summations. For example:

   proc expand data=a out=a method=none;
      convert x=xsum / transform=( sum );
   run;

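Like differencing, summation can be done at different lags and can be repeated to produce higher-order sums. To compute sums over observations separated by lags greater than 1, use the LAG and SUM functions together, and use a RETAIN statement that initializes the summation variable to zero. Because the summation variable is retained, it still holds the previous observation's value when the LAG function is called, and the LAGn function returns the value that its argument had n calls earlier. Applying LAGn to the summation variable therefore reaches back n+1 observations: LAG produces a two-period sum, and LAG3 produces a four-period sum.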

For example, the following statements add the variable XSUM2 to data set A. XSUM2 contains the sum of every other observation, with even-numbered observations containing a cumulative sum of values of X from even observations, and odd-numbered observations containing a cumulative sum of values of X from odd observations.

   data a;
      set a;
      retain xsum2 0;
      xsum2 = sum( lag( xsum2 ), x );
   run;
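
The alternating pattern can be checked on a short series. The following sketch assumes a hypothetical data set TEST that contains the values 1 through 6 and writes the result to a hypothetical data set TEST2.

   data test;
      do x = 1 to 6;
         output;
      end;
   run;

   data test2;
      set test;
      retain xsum2 0;
      /* LAG returns the value XSUM2 held two observations back */
      xsum2 = sum( lag( xsum2 ), x );
   run;

XSUM2 takes the values 1, 2, 4, 6, 9, and 12. The odd-numbered observations accumulate 1, 1+3, and 1+3+5, and the even-numbered observations accumulate 2, 2+4, and 2+4+6.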

Assuming that A is a quarterly data set, the following statements compute running sums of X for each quarter. XSUM4 contains the cumulative sum of X for all observations for the same quarter as the current quarter. Thus, for a first-quarter observation, XSUM4 contains a cumulative sum of current and past first-quarter values.

   data a;
      set a;
      retain xsum4 0;
      xsum4 = sum( lag3( xsum4 ), x );
   run;
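
As an illustration, the following sketch builds two years of quarterly test data and applies the same statements. The data set name QTEST, the starting date, and the values of X are assumptions chosen only to make the result easy to check.

   data qtest;
      do i = 0 to 7;
         date = intnx( 'qtr', '01jan2020'd, i );   /* first day of each quarter */
         x = i + 1;
         output;
      end;
      format date yyq6.;
      drop i;
   run;

   data qtest;
      set qtest;
      retain xsum4 0;
      xsum4 = sum( lag3( xsum4 ), x );
   run;

XSUM4 takes the values 1, 2, 3, and 4 for the first year and 6, 8, 10, and 12 for the second year. For example, the second first-quarter value is 1+5=6.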

To compute higher-order sums, repeat the preceding process and sum the summation variable. For example, the following statements compute the first and second summations of X:

   data a;
      set a;
      xsum + x;
      x2sum + xsum;
   run;
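
A short check of the first and second summations, assuming a hypothetical data set TEST that contains the values 1 through 4:

   data test;
      do x = 1 to 4;
         output;
      end;
   run;

   data test;
      set test;
      xsum + x;        /* first summation: running sum of X */
      x2sum + xsum;    /* second summation: running sum of XSUM */
   run;

XSUM takes the values 1, 3, 6, and 10, and X2SUM takes the values 1, 4, 10, and 20.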

The following statements compute the second-order four-period sum of X:

   data a;
      set a;
      retain xsum4 x2sum4 0;
      xsum4 = sum( lag3( xsum4 ), x );
      x2sum4 = sum( lag3( x2sum4 ), xsum4 );
   run;

You can also use PROC EXPAND to compute cumulative statistics and moving window statistics. See Chapter 15: The EXPAND Procedure, for details.
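
As a hedged sketch of that approach, the following statements request a cumulative sum and a four-period moving sum of X. The output data set name B and the variable names XCUSUM and XMOVSUM are hypothetical, and the sketch assumes that the CUSUM and MOVSUM transformation operations behave as described in the PROC EXPAND documentation.

   proc expand data=a out=b method=none;
      convert x=xcusum  / transform=( cusum );      /* cumulative sum */
      convert x=xmovsum / transform=( movsum 4 );   /* four-period moving sum */
   run;

METHOD=NONE is used, as in the earlier PROC EXPAND example, so that the series is transformed without being interpolated.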