The PANEL Procedure

LAG, ZLAG, XLAG, SLAG, or CLAG Statement

  • LAG var$_1$( lag$_1$ lag$_2$ … lag$_T$ ) , …, var$_N$( lag$_1$ lag$_2$ … lag$_T$ ) < / OUT= SAS-data-set > ;

Creating lags of variables in a panel setting is generally a tedious process that requires many DATA step statements. The PANEL procedure enables you to generate lags of any series without carrying values across the boundary between one cross section's series and the next. The LAG statement is a data set generation tool: it writes the lagged series to an output data set and does not itself estimate a model. To use the data created by a LAG statement, you must make a subsequent PROC PANEL call. You can specify more than one LAG statement in each call to PROC PANEL.
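For comparison, the following is a minimal sketch of the manual DATA step approach that the LAG statement is meant to replace. It is an illustration rather than a description of what PROC PANEL does internally. It assumes a data set A that is sorted by a cross section identifier i and a time identifier t, that has no gaps in t, and that contains a variable X1; the output data set name A_manual and the counter n are hypothetical.

   data A_manual;
      set A;
      by i;
      /* n counts observations within the current cross section */
      if first.i then n = 0;
      n + 1;
      /* call the LAG functions on every row so that their queues stay in step */
      X1_1 = lag1(X1);
      X1_3 = lag3(X1);
      /* discard values that were carried over from the previous cross section */
      if n <= 1 then X1_1 = .;
      if n <= 3 then X1_3 = .;
      drop n;
   run;

Each additional variable and lag length needs its own pair of statements, which is why the manual approach grows quickly.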

You must specify the OUT= option in the LAG statement. The output data set includes all variables in the input data set, plus the lagged series, which are named according to the convention var_lag. The LAG statement tends to generate many missing values, because the first observations of each cross section have no earlier values to draw on. This can be problematic, because the number of usable observations diminishes as the lag length increases. PROC PANEL therefore offers the following alternatives, which can be used instead of LAG with otherwise identical syntax:

  • CLAG var$_1$( lag$_1$ lag$_2$ … lag$_T$ ) , …, var$_N$( lag$_1$ lag$_2$ … lag$_T$ ) < / OUT= SAS-data-set > ;

replaces missing values with the cross section mean for that variable in that cross section. Missing values are replaced only if they are in the generated (lagged) series. Missing values in the original variables are not changed.

  • SLAG var$_1$( lag$_1$ lag$_2$ … lag$_T$ ) , …, var$_N$( lag$_1$ lag$_2$ … lag$_T$ ) < / OUT= SAS-data-set > ;

replaces missing values with the time mean for that variable in that time period. Missing values are replaced only if they are in the generated (lagged) series. Missing values in the original variables are not changed.

  • XLAG var$_1$( lag$_1$ lag$_2$ … lag$_T$ ) , …, var$_N$( lag$_1$ lag$_2$ … lag$_T$ ) < / OUT= SAS-data-set > ;

replaces missing values with the overall mean for that variable. Missing values are replaced only if they are in the generated (lagged) series. Missing values in the original variables are not changed.

  • ZLAG var$_1$( lag$_1$ lag$_2$ … lag$_T$ ) , …, var$_N$( lag$_1$ lag$_2$ … lag$_T$ ) < / OUT= SAS-data-set > ;

replaces missing values with 0 for that variable. Missing values are replaced only if they are in the generated (lagged) series. Missing values in the original variables are not changed.

Assume that data set A has been sorted by cross section and by time period within cross section (or that the FLATDATA statement has been specified) and that the variables are Y, X1, X2, and X3. The following PROC PANEL statements generate a series with lags 1 and 3 of the X1 variable; lags 3, 6, and 9 of the X2 variable; and lag 2 of the X3 variable.

   proc panel data=A;
      id i t;
      lag X1(1 3) X2(3 6 9) X3(2) / out=A_lag;
   run;
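
Because the LAG statement only creates a data set, the lagged series must be used in a separate call. The following sketch shows one way such a follow-up call might look. It assumes that the lagged variables in A_lag are named X1_1, X1_3, X2_3, and X3_2 according to the var_lag convention; the choice of regressors and of the FIXONE option (a one-way fixed-effects model) is purely illustrative.

   proc panel data=A_lag;
      id i t;
      /* lagged regressor names are assumed from the var_lag convention */
      model Y = X1_1 X1_3 X2_3 X3_2 / fixone;
   run;

Check the variable names in the output data set (for example, with PROC CONTENTS) before you write the MODEL statement.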

If you want zeros instead of missing values in the lagged series, specify the following:

   proc panel data=A;
      id i t;
      zlag X1(1 3) X2(3 6 9) X3(2) / out=A_zlag;
   run;

Similarly, you can specify XLAG to replace with overall means, SLAG to replace with time means, and CLAG to replace with cross section means.
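
As a sketch of the remaining variants, the following calls mirror the earlier examples with CLAG and SLAG. The output data set names A_clag and A_slag are hypothetical, and the input data set A and its variables are the ones assumed above.

   /* replace missing lagged values with cross section means */
   proc panel data=A;
      id i t;
      clag X1(1 3) X2(3 6 9) X3(2) / out=A_clag;
   run;

   /* replace missing lagged values with time means */
   proc panel data=A;
      id i t;
      slag X1(1 3) X2(3 6 9) X3(2) / out=A_slag;
   run;

An XLAG call has the same form, with xlag in place of clag or slag.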