An inherent problem with the X-11 method is the revision of the seasonal factor estimates as new data become available. The X-11 method uses a set of centered moving averages to estimate the seasonal components. These moving averages apply symmetric weights to all observations except those at the beginning and end of the series, where asymmetric weights must be applied. These asymmetric weights can produce poor estimates of the seasonal factors, which in turn can cause large revisions when new data become available.
While large revisions to seasonally adjusted values are not common, they can happen. When they do, they undermine the credibility of the X-11 seasonal adjustment method.
A method to address this problem was developed at Statistics Canada (Dagum, 1980, 1982a). This method, known as X-11-ARIMA, applies an ARIMA model to the original data (after adjustments, if any) to forecast the series one or more years ahead. The extended series is then seasonally adjusted, which allows symmetric weights to be applied to the end of the original data. This method was tested against a large number of Canadian economic series and was found to greatly reduce the amount of revision as new data were added.
The X-11-ARIMA method is available in PROC X11 through the use of the ARIMA statement. The ARIMA statement extends the original series either with a user-specified ARIMA model or by an automatic selection process in which the best model from a set of five predefined ARIMA models is used.
The following example illustrates the use of the ARIMA statement. The ARIMA statement does not contain a user-specified model, so the best model is chosen by the automatic selection process. Forecasts from this best model are then used to extend the original series by one year. The partial listing in Figure 37.4 shows parameter estimates and model diagnostics for the ARIMA model chosen by the automatic selection process.
    proc x11 data=sales;
       monthly date=date;
       var sales;
       arima;
    run;
Figure 37.4: X-11-ARIMA Model Selection
Monthly Retail Sales Data (in $1000)

Conditional Least Squares Estimation

Parameter | Estimate | Approx. Std Error | t Value | Lag |
---|---|---|---|---|
MU | 0.0001728 | 0.0009596 | 0.18 | 0 |
MA1,1 | 0.3739984 | 0.0893427 | 4.19 | 1 |
MA1,2 | 0.0231478 | 0.0892154 | 0.26 | 2 |
MA2,1 | 0.5727914 | 0.0790835 | 7.24 | 12 |

Conditional Least Squares Estimation

Statistic | Value | Note |
---|---|---|
Variance Estimate | 0.0014313 | |
Std Error Estimate | 0.0378326 | |
AIC | -482.2412 | \* |
SBC | -470.7404 | \* |
Number of Residuals | 131 | |

\* Does not include log determinant

Criteria Summary for Model 2: (0,1,2)(0,1,1)s, Log Transform

Criterion | Value | Requirement |
---|---|---|
Box-Ljung Chi-square | 22.03 with 21 df, Prob = 0.40 | Criteria prob > 0.05 |
Test for over-differencing: sum of MA parameters | 0.57 | Must be < 0.90 |
MAPE, Last Three Years | 2.84 | Must be < 15.00 % |
MAPE, Last Year | 3.04 | |
MAPE, Next to Last Year | 1.96 | |
MAPE, Third from Last Year | 3.51 | |
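
For reference, the selected model and the MAPE criterion from the listing can be written out explicitly. This is a sketch under stated assumptions: it assumes the common convention of writing moving-average polynomials with minus signs and of treating MU as the constant term of the differenced, log-transformed series. The symbols $y_t$ (the original series), $B$ (the backshift operator), and $a_t$ (the residual) are notation introduced here, not taken from the listing, and the estimates are rounded.

```latex
% Model 2: (0,1,2)(0,1,1)_{12} with log transform, using the rounded estimates
% MU, MA1,1, MA1,2 and MA2,1 from the listing (sign convention assumed).
$$
(1 - B)(1 - B^{12}) \log y_t
  = 0.00017 + (1 - 0.374\,B - 0.023\,B^{2})(1 - 0.573\,B^{12})\, a_t
$$

% The three-year MAPE agrees with the simple average of the yearly values.
$$
\frac{3.04 + 1.96 + 3.51}{3} = \frac{8.51}{3} \approx 2.84 < 15.00
$$
```

Under this reading, the model passes all three criteria shown: the Box-Ljung probability of 0.40 exceeds 0.05, the over-differencing statistic of 0.57 is below 0.90, and the three-year MAPE of 2.84 is below 15.00 %.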
Table D11 (final seasonally adjusted series) is now constructed using symmetric weights on observations at the end of the actual data. This should result in better estimates of the seasonal factors and, thus, smaller revisions in Table D11 as more data become available.
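As a complement to the automatic selection shown above, a specific model can be requested in the ARIMA statement. The following is a minimal sketch, not output-verified here, that asks for the same (0,1,2)(0,1,1)s model with a log transform that the automatic process selected. It assumes the MODEL=, TRANSFORM=, and FORECAST= options of the ARIMA statement as described in the PROC X11 syntax reference; check the option names and defaults against that reference before relying on them.

```sas
/* Sketch: user-specified model instead of automatic selection.        */
/* q=2, sq=1     : two nonseasonal and one seasonal MA parameter       */
/* dif=1, sdif=1 : one regular and one seasonal difference             */
/* transform=log : fit the model to the logged series                  */
/* forecast=1    : extend the series by one year before adjustment     */
proc x11 data=sales;
   monthly date=date;
   var sales;
   arima model=( q=2 sq=1 dif=1 sdif=1 ) transform=log forecast=1;
run;
```

Because the model is given explicitly, no selection among the predefined models takes place; the specified model is simply estimated and used to extend the series.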