The X12 Procedure

Example 37.6 User-Defined Regressors

This example demonstrates the use of the USERVAR= option in the REGRESSION statement to include user-defined regressors in the regARIMA model. The user-defined regressors must be defined as nonmissing values for the span of the series being modeled plus any backcast or forecast values. Suppose you have the data set SALESDATA with 132 monthly observations beginning in January of 1949.

title 'Data Set to be Seasonally Adjusted';
data salesdata;  
   set sashelp.air(obs=132);
run;  

Because the regARIMA model forecasts one year ahead, you must define the regressor for 144 observations that start in January of 1949. You can construct a simple length-of-month regressor by using the following DATA step:

title 'User-defined Regressor for Data to be Seasonally Adjusted';
data regressors(keep=date LengthOfMonth); 
   set sashelp.air;
   LengthOfMonth = INTNX('MONTH',date,1) - date;
run;  
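
As a quick check of the INTNX arithmetic above, the following sketch evaluates the same expression for two individual months. It is illustrative only and not part of the original example; it writes to the SAS log and creates no data set. Because the dates in SASHELP.AIR fall on the first day of each month, subtracting DATE from the first day of the following month gives the number of days in the month.

```sas
data _null_;
   /* FEB1959 and FEB1960 as SAS date values (first day of each month) */
   do date = '01FEB1959'd, '01FEB1960'd;
      /* first day of the next month minus first day of this month */
      LengthOfMonth = intnx('month', date, 1) - date;
      put date= monyy7. LengthOfMonth=;
   end;
run;
```

The log should show a length of 28 for FEB1959 and 29 for FEB1960. If the date values could fall in the middle of a month, subtracting INTNX('MONTH',date,0) instead of DATE would still return the full month length.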

In this example, the two data sets are merged so that they can be used together as input to PROC X12. The BY statement aligns the regressors with the time series by the time ID variable DATE. You can also use the AUXDATA= data set to input user-defined regressors; see Example 37.11 for more information.

title 'Data Set Containing Series and Regressors';
data datain; 
   merge regressors salesdata;
   by date;
run;  
proc print data=datain(firstobs=121);
run;

The last 24 observations of the input data set are displayed in Output 37.6.1. The regressor variable is defined for one year (12 observations) beyond the span of the time series to be seasonally adjusted.

Output 37.6.1: PROC X12 Input Data Set with User-Defined Regressor

Data Set Containing Series and Regressors

Obs DATE LengthOfMonth AIR
121 JAN59 31 360
122 FEB59 28 342
123 MAR59 31 406
124 APR59 30 396
125 MAY59 31 420
126 JUN59 30 472
127 JUL59 31 548
128 AUG59 31 559
129 SEP59 30 463
130 OCT59 31 407
131 NOV59 30 362
132 DEC59 31 405
133 JAN60 31 .
134 FEB60 29 .
135 MAR60 31 .
136 APR60 30 .
137 MAY60 31 .
138 JUN60 30 .
139 JUL60 31 .
140 AUG60 31 .
141 SEP60 30 .
142 OCT60 31 .
143 NOV60 30 .
144 DEC60 31 .
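
One way to confirm that the regressor is nonmissing over the full 144 months, while the series itself is missing in the forecast year, is to count missing values before calling PROC X12. This check is an optional addition, not part of the original example.

```sas
/* Count nonmissing (N) and missing (NMISS) values in the merged data set */
proc means data=datain n nmiss;
   var LengthOfMonth AIR;
run;
```

With the data described above, LengthOfMonth should report N=144 and NMISS=0, and AIR should report N=132 and NMISS=12.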


The DATAIN data set is now ready to be used as input to PROC X12. The DATE= variable and the user-defined regressors are automatically excluded from the variables to be seasonally adjusted.

title 'regARIMA Model with User-defined Regressor';
proc x12 data=datain date=DATE interval=MONTH plots=none;
   transform function=log;
   regression uservar=LengthOfMonth / usertype=lom;
   automdl;
   x11;
   output out=out a1 d11;
run;

The parameter estimates for the regARIMA model are shown in Output 37.6.2.

Output 37.6.2: PROC X12 Output for User-Defined Regression Parameter

regARIMA Model with User-defined Regressor

The X12 Procedure

Regression Model Parameter Estimates
For Variable AIR
Type Parameter NoEst Estimate Standard Error t Value Pr > |t|
User Defined LengthOfMonth Est 0.04683 0.01834 2.55 0.0119

Exact ARMA Maximum Likelihood Estimation
For Variable AIR
Parameter Lag Estimate Standard Error t Value Pr > |t|
Nonseasonal MA 1 0.33678 0.08506 3.96 0.0001
Seasonal MA 12 0.54078 0.07726 7.00 <.0001


Another way to include user-defined regressors in the regARIMA model is to keep the full series in the input data set and limit the modeled span with the SPAN= option in the PROC X12 statement. The following user-defined regressor is similar to the one defined previously, except that this length-of-month regressor is mean adjusted by subtracting 30.4375, the average number of days in a month. Using a zero-mean regressor prevents the regressor from altering the level of the series. In this instance, the series to be seasonally adjusted, AIR, and the regression variable, LengthOfMonth, have nonmissing observations at all time periods in the data set DATAIN.

title 'User-defined Regressor for Data to be Seasonally Adjusted, Mean Adjusted';
data datain(keep=date AIR LengthOfMonth); 
   set sashelp.air;
   LengthOfMonth = INTNX('MONTH',date,1) - date - 30.4375;
run;  
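
The constant 30.4375 is the average month length over a four-year leap cycle, (3 x 365 + 366) / 48. The sketch below is an optional check rather than part of the original example. It computes the sample mean of the unadjusted regressor in the REGRESSORS data set created earlier, so that the constant can be compared with the mean over this particular span.

```sas
/* Sample mean of the unadjusted length-of-month regressor (144 months) */
proc means data=regressors n sum mean maxdec=4;
   var LengthOfMonth;
run;
```

The span from January 1949 through December 1960 contains three leap years (1952, 1956, and 1960), so the sum is 4383 days and the mean is 30.4375. The adjusted regressor therefore has zero mean over the full 144 months.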

Because the default forecast period is one year ahead, the span of the series must be limited to one year before the end of the regression variable definition to forecast using the regression variable LengthOfMonth:

title 'regARIMA Model with Zero-Mean User-defined Regressor';
proc x12 data=datain date=DATE interval=MONTH span=(,DEC1959) plots=none;
   transform function=log;
   regression uservar=LengthOfMonth / usertype=lom;
   automdl;
   x11;
   output out=outzm a1 d11;
run;
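
For comparison, the first approach (missing values in the forecast year) can be reproduced from this same data set with a short DATA step. This is a sketch for illustration; the data set name DATAIN_MISSING is hypothetical and does not appear in the original example.

```sas
/* Blank out the series after December 1959 but keep the regressor,      */
/* so the regressor extends 12 months past the last nonmissing AIR value */
data datain_missing;
   set datain;
   if date > '01DEC1959'd then AIR = .;
run;
```

Passing DATAIN_MISSING to PROC X12 without the SPAN= option would be expected to model the same months as SPAN=(,DEC1959), but the forecast output would then have no actual values to compare against.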

The parameter estimates for the regARIMA model that are estimated using a zero-mean regressor are shown in Output 37.6.3.

Output 37.6.3: PROC X12 Output for Zero-Mean User-Defined Regression Parameter

regARIMA Model with Zero-Mean User-defined Regressor

The X12 Procedure

Regression Model Parameter Estimates
For Variable AIR
Type Parameter NoEst Estimate Standard Error t Value Pr > |t|
User Defined LengthOfMonth Est 0.04683 0.01834 2.55 0.0119

Exact ARMA Maximum Likelihood Estimation
For Variable AIR
Parameter Lag Estimate Standard Error t Value Pr > |t|
Nonseasonal MA 1 0.33678 0.08506 3.96 0.0001
Seasonal MA 12 0.54078 0.07726 7.00 <.0001


Specifying USERTYPE=LOM causes the regression effect to be removed from the seasonally adjusted series. The effect of the mean of the regression variable on the seasonally adjusted series can be seen by examining the plots of the original series and the seasonally adjusted series.

title 'regARIMA Model with Non-Zero-Mean User-Defined Regressor';
proc sgplot data=out;
   series x=date y=air_A1 / name = "A1" markers
                              markerattrs=(color=red symbol='asterisk')
                              lineattrs=(color=red);
   series x=date y=air_D11 / name= "D11" markers
                               markerattrs=(symbol='circle')
                               lineattrs=(color=blue);
   yaxis label='Original and Seasonally Adjusted Time Series';
run;
title 'regARIMA Model with Zero-Mean User-Defined Regressor';
proc sgplot data=outzm;
   series x=date y=air_A1 / name = "A1" markers
                              markerattrs=(color=red symbol='asterisk')
                              lineattrs=(color=red);
   series x=date y=air_D11 / name= "D11" markers
                               markerattrs=(symbol='circle')
                               lineattrs=(color=blue);
   yaxis label='Original and Seasonally Adjusted Time Series';
run;

The graph of the original and seasonally adjusted series in Output 37.6.4 shows that the level of the seasonally adjusted series has been altered by the user-defined regressor. The graph of the original and seasonally adjusted series in Output 37.6.5 shows that the level of the seasonally adjusted series is the same as that of the original series because the user-defined regressor has zero mean.
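
A rough sense of the size of the level change can be obtained from the estimate in Output 37.6.2. The calculation below assumes that, under the log transformation, the regression effect enters the series multiplicatively as exp(estimate x regressor). That is the usual interpretation of a regARIMA model on the log scale, but it is an assumption here, not something the example states.

```sas
data _null_;
   estimate = 0.04683;      /* LengthOfMonth estimate from Output 37.6.2 */
   meanLOM  = 30.4375;      /* average month length in days              */
   /* multiplicative factor implied for an average-length month */
   factorRaw  = exp(estimate * meanLOM);
   /* same factor when 30.4375 is subtracted from the regressor */
   factorZero = exp(estimate * (meanLOM - 30.4375));
   put factorRaw= 8.2 factorZero= 8.2;
run;
```

Under that assumption, the unadjusted regressor implies a factor of roughly 4.16 for an average month, whereas the zero-mean regressor implies a factor of 1. This is consistent with a level change appearing only when the regressor is not mean adjusted.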

Output 37.6.4: Plot of Original and Seasonally Adjusted Data


Output 37.6.5: Plot of Original and Seasonally Adjusted Data (Zero-Mean Regressor)


When actual values are available for the forecast periods, information about forecast error is available in the output. Output 37.6.6 shows the table Forecasts and Standard Errors of the Transformed Data on the Original Scale for a series with missing values in the forecast period. Output 37.6.7 shows the same table for a series with actual values in the forecast period; it adds the Data, Forecast Error, and t Value columns. Thus, if actual values are available for the forecast period, it is preferable to use the SPAN= option to limit the span of the series.

Output 37.6.6: PROC X12 Forecasts for Series Extended with Missing Values

Forecasts and Standard Errors of the Transformed Data
On the Original scale
For Variable AIR
Date Forecast Standard Error 95% Confidence Limits
JAN1960 419.600 14.85053 391.509 449.705
FEB1960 416.480 19.05188 380.826 455.472
MAR1960 466.697 22.66762 424.402 513.208
APR1960 454.468 24.53242 408.951 505.051
MAY1960 473.876 27.91366 422.353 531.684
JUN1960 547.601 34.74893 483.769 619.855
JUL1960 623.318 42.20549 546.139 711.405
AUG1960 631.731 45.30824 549.231 726.623
SEP1960 527.221 39.81839 455.011 610.890
OCT1960 462.774 36.63020 396.605 539.984
NOV1960 407.155 33.64286 346.608 478.277
DEC1960 452.702 38.91914 382.913 535.212


Output 37.6.7: PROC X12 Forecasts for Series with Actual Values in Forecast Periods

Forecasts and Standard Errors of the Transformed Data
On the Original scale
For Variable AIR
Date Data Forecast Forecast Error Standard Error t Value 95% Confidence Limits
JAN1960 417.000 419.600 -2.600 14.85053 -0.18 391.509 449.705
FEB1960 391.000 416.480 -25.480 19.05188 -1.34 380.826 455.472
MAR1960 419.000 466.697 -47.697 22.66762 -2.10 424.402 513.208
APR1960 461.000 454.468 6.532 24.53242 0.27 408.951 505.051
MAY1960 472.000 473.876 -1.876 27.91366 -0.07 422.353 531.684
JUN1960 535.000 547.601 -12.601 34.74893 -0.36 483.769 619.855
JUL1960 622.000 623.318 -1.318 42.20549 -0.03 546.139 711.405
AUG1960 606.000 631.731 -25.731 45.30824 -0.57 549.231 726.623
SEP1960 508.000 527.221 -19.221 39.81839 -0.48 455.011 610.890
OCT1960 461.000 462.774 -1.774 36.63020 -0.05 396.605 539.984
NOV1960 390.000 407.155 -17.155 33.64286 -0.51 346.608 478.277
DEC1960 432.000 452.702 -20.702 38.91914 -0.53 382.913 535.212
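
The extra columns in Output 37.6.7 appear to follow directly from the others: the printed values are consistent with Forecast Error = Data - Forecast and t Value = Forecast Error / Standard Error. The following sketch recomputes them for the first three rows. The data set name FCST_CHECK and the variable names are hypothetical, and the formulas are inferred from the printed values rather than taken from the procedure documentation.

```sas
data fcst_check;
   /* Date, actual value, forecast, and standard error from Output 37.6.7 */
   input Date :monyy7. Actual Forecast StdErr;
   format Date monyy7.;
   ForecastError = Actual - Forecast;      /* actual minus forecast       */
   tValue = ForecastError / StdErr;        /* error in standard-error units */
   datalines;
JAN1960 417 419.600 14.85053
FEB1960 391 416.480 19.05188
MAR1960 419 466.697 22.66762
;
run;

proc print data=fcst_check noobs;
   format ForecastError 8.3 tValue 6.2;
run;
```

The printed errors should be -2.600, -25.480, and -47.697, with t values of -0.18, -1.34, and -2.10, matching the corresponding rows of the table.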