# What's New in SAS/ETS 9.0 and 9.1

## Overview

New procedures in SAS/ETS include the following:

• The experimental ENTROPY procedure provides Generalized Maximum Entropy estimation for linear systems of equations.
• The QLIM procedure analyzes univariate and multivariate models where dependent variables take discrete values or values in a limited range.
• The TIMESERIES procedure analyzes time-stamped transactional data with respect to time and accumulates the data into a time series format.
• The UCM procedure provides estimation for Unobserved Component Models, also referred to as Structural Models.

Several new financial and date, time, and datetime functions have been added.

The new experimental SASEHAVR interface engine is now available to SAS/ETS for Windows users for accessing economic and financial data residing in a HAVER ANALYTICS Data Link Express (DLX) database.

New features have been added to the following SAS/ETS components:

• PROC ARIMA
• PROC EXPAND
• PROC MDC
• PROC MODEL
• PROC VARMAX
• PROC X12
• Time Series Forecasting System

## Financial Functions

SAS/ETS now provides new financial functions. They are described in detail in Chapter 4, "SAS Macros and Functions."

CUMIPMT
Returns the cumulative interest paid on a loan between the start period and the end period.
CUMPRINC
Returns the cumulative principal paid on a loan between the start period and the end period.
IPMT
Returns the interest payment for a given period for an investment based on periodic, constant payments and a constant interest rate.
PMT
Returns the periodic payment for a constant payment loan or the periodic saving for a future balance.
PPMT
Returns the payment on the principal for an investment for a given period.
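
The relationship among these quantities can be illustrated with the standard level-payment annuity formula. The following Python sketch is not the SAS implementation and does not reproduce the argument order or sign conventions of the SAS functions; the function names `pmt`, `amortization_schedule`, and `cumulative` are hypothetical, and payments are assumed to fall at the end of each period.

```python
def pmt(rate, nper, principal):
    """Level payment that repays `principal` over `nper` periods at `rate` per period."""
    if rate == 0:
        return principal / nper
    return principal * rate / (1 - (1 + rate) ** -nper)


def amortization_schedule(rate, nper, principal):
    """Return (period, interest part, principal part) for every payment."""
    payment = pmt(rate, nper, principal)
    balance = principal
    rows = []
    for period in range(1, nper + 1):
        interest = balance * rate          # interest accrued on the open balance
        princ = payment - interest         # remainder of the payment reduces the balance
        balance -= princ
        rows.append((period, interest, princ))
    return rows


def cumulative(rows, start, end, column):
    """Sum column 1 (interest) or column 2 (principal) over periods start..end."""
    return sum(row[column] for row in rows if start <= row[0] <= end)


# Assumed example: 1200 borrowed for 12 periods at 1% per period.
rows = amortization_schedule(0.01, 12, 1200.0)
total_interest = cumulative(rows, 1, 12, 1)
total_principal = cumulative(rows, 1, 12, 2)
```

In this sketch the interest and principal parts of each payment add up to the level payment, and the cumulative principal over the full term equals the amount borrowed.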

## Date, Time, and Datetime Functions

SAS/ETS now provides the following new date, time, and datetime functions. See Chapter 3, "Date Intervals, Formats, and Functions," for more details.

INTFMT
Returns a recommended format given a date, time, or datetime interval.
INTCINDEX
Returns the cycle index given a date, time, or datetime interval and value.
INTCYCLE
Returns the date, time, or datetime interval at the next higher seasonal cycle given a date, time, or datetime interval.
INTINDEX
Returns the seasonal index given a date, time, or datetime interval and value.
INTSEAS
Returns the length of the seasonal cycle given a date, time, or datetime interval.
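
As a rough illustration of what "seasonal cycle" and "seasonal index" mean for an interval, the following Python sketch uses ordinary calendar conventions (12 months or 4 quarters per year). The lookup table and the function `seasonal_index` are hypothetical and cover only two intervals; they are not the SAS functions and make no claim about how SAS handles other intervals.

```python
from datetime import date

# Hypothetical lookup: interval -> (next higher seasonal cycle, periods per cycle)
SEASONAL_CYCLES = {
    "MONTH": ("YEAR", 12),
    "QTR": ("YEAR", 4),
}


def seasonal_index(interval, d):
    """Position of date `d` within its seasonal cycle (1-based)."""
    if interval == "MONTH":
        return d.month
    if interval == "QTR":
        return (d.month - 1) // 3 + 1
    raise ValueError("interval not covered by this sketch: " + interval)


cycle, length = SEASONAL_CYCLES["MONTH"]
december_index = seasonal_index("MONTH", date(2003, 12, 1))
august_quarter = seasonal_index("QTR", date(2003, 8, 15))
```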

## SASEHAVR Engine

The experimental SASEHAVR interface engine gives Windows users random access to economic and financial data residing in a HAVER ANALYTICS Data Link Express (DLX) database. You can limit the range of data that is read from the time series and specify a desired conversion frequency. Specifying start dates in the LIBNAME statement is recommended because it saves resources when you process large databases or a large number of observations. You can further subset your data by using the WHERE, KEEP, or DROP statements in your DATA step. You can use the SQL procedure to create a view of your resulting SAS data set.

## ARIMA Procedure

The OUTLIER statement of the ARIMA procedure has become production in SAS System 9. A new ID option that provides date labels to the discovered outliers has been added.

In the presence of embedded missing values, the default white noise test of residuals now uses the statistic proposed by Stoffer and Toloi (1992), which is more appropriate for such data.
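
For orientation, the following Python sketch computes the standard Ljung-Box portmanteau statistic for a series without missing values. It is the complete-data statistic only; the Stoffer and Toloi (1992) adjustment for missing data is not reproduced here, and the function names are hypothetical.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a complete series."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(x, max_lag):
    """Q = n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(x)
    r = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(rk * rk / (n - k) for k, rk in enumerate(r, start=1))


# Assumed example: a strongly alternating series is far from white noise.
alternating = [1.0, -1.0] * 10
q1 = ljung_box_q(alternating, 1)
```

Large values of Q relative to the chi-square reference distribution indicate that the residuals are not white noise.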

The default forecasting algorithm when the data have embedded missing values and the model has multiple orders of differencing for the dependent series has been slightly modified. This modification usually improves the statistical properties of the forecasts.

## ENTROPY Procedure

The new experimental ENTROPY procedure implements a parametric method of linear estimation based on Generalized Maximum Entropy.

Often the statistical-economic model of interest is ill-posed or underdetermined for the observed data, for example when limited data is available or acquiring data is costly. For the general linear model this can imply that high degrees of collinearity exist among explanatory variables or that there are more parameters to estimate than observations to estimate them with. These conditions lead to high variances or non-estimability for traditional GLS estimates.

The principle of maximum entropy, at the base of the ENTROPY procedure, is the foundation for an estimation methodology that is characterized by its robustness to ill-conditioned designs and its ability to fit overparameterized models.

Generalized Maximum Entropy (GME) is a means of selecting among probability distributions so as to choose the distribution that maximizes the uncertainty or uniformity remaining in the distribution, subject to information already known about the distribution itself. Information takes the form of data or moment constraints in the estimation procedure. PROC ENTROPY creates a GME distribution for each parameter in the linear model, based upon support points supplied by the user. The mean of each distribution is used as the estimate of the parameter. Estimates tend to be biased, as they are a type of shrinkage estimate, but they typically have smaller variances than their OLS counterparts, making them more desirable from a mean squared error viewpoint.
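
The idea of choosing the most uniform distribution over user-supplied support points, subject to a constraint, can be illustrated on a single parameter. The following Python sketch finds the maximum entropy probabilities on a set of support points whose mean equals a given value; it is not the GME estimator of PROC ENTROPY (which solves for all parameters and error terms jointly from the data), and the function name `maxent_weights` and the example support points are assumptions.

```python
import math


def maxent_weights(support, target_mean):
    """Maximum entropy probabilities on `support` with mean `target_mean`.

    The solution has the exponential form p_k proportional to exp(lam * z_k);
    lam is found by bisection because the mean is increasing in lam.
    """
    if not min(support) < target_mean < max(support):
        raise ValueError("target_mean must lie strictly inside the support")

    def tilted(lam):
        shift = max(lam * z for z in support)       # guards against overflow
        w = [math.exp(lam * z - shift) for z in support]
        total = sum(w)
        return [v / total for v in w]

    def mean(lam):
        return sum(p * z for p, z in zip(tilted(lam), support))

    lam_lo, lam_hi = -1.0, 1.0
    while mean(lam_lo) > target_mean:
        lam_lo *= 2
    while mean(lam_hi) < target_mean:
        lam_hi *= 2
    for _ in range(200):                            # fixed-length bisection
        mid = 0.5 * (lam_lo + lam_hi)
        if mean(mid) < target_mean:
            lam_lo = mid
        else:
            lam_hi = mid
    return tilted(0.5 * (lam_lo + lam_hi))


support = [-10.0, 0.0, 10.0]                        # assumed support points
uniform = maxent_weights(support, 0.0)
shifted = maxent_weights(support, 2.0)
estimate = sum(p * z for p, z in zip(shifted, support))
```

With no information pulling the mean away from the center of the support, the weights are uniform; a constraint shifts the weights only as far as needed, which is the source of the shrinkage behavior described above.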

PROC ENTROPY can be used to fit simultaneous systems of linear regression models, Markov models, and seemingly unrelated regression models, as well as to solve pure inverse problems and unordered multinomial choice problems. Bounds and restrictions on parameters can be specified, and Wald, likelihood ratio, and Lagrange multiplier tests can be computed. Prior information can also be supplied to enhance estimates and data.

## EXPAND Procedure

The EXPAND procedure has several new transformation operators: moving product, moving rank, moving geometric mean, sequence operators, fractional differencing, Hodrick-Prescott filtering, and scaling.
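
Two of these operators are simple to state in code. The following Python sketch computes a moving product and a moving geometric mean over a trailing window; the trailing alignment and the shortened windows at the start of the series are assumptions made for illustration and may differ from the conventions of PROC EXPAND.

```python
import math


def moving_product(x, window):
    """Product of the current value and up to window-1 preceding values."""
    return [math.prod(x[max(0, i - window + 1): i + 1]) for i in range(len(x))]


def moving_geometric_mean(x, window):
    """Geometric mean over the same trailing window (values must be positive)."""
    out = []
    for i in range(len(x)):
        chunk = x[max(0, i - window + 1): i + 1]
        out.append(math.exp(sum(math.log(v) for v in chunk) / len(chunk)))
    return out


products = moving_product([1, 2, 3, 4], 2)
geo_means = moving_geometric_mean([1.0, 4.0, 16.0], 2)
```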

The EXPAND procedure has a new option for creating time series graphics. The PLOT= option enables you to graph the input, output, and transformed time series.

## MDC Procedure

The RESTRICT statement now has a new syntax and supports linear restrictions.

The new BOUNDS statement enables you to specify simple boundary constraints on the parameter estimates. You can use both the BOUNDS statement and the RESTRICT statement to impose boundary constraints; however, the BOUNDS statement provides a simpler syntax for specifying these kinds of constraints.

## MODEL Procedure

Simulated Method of Moments (SMM) estimation is now available as an option in the FIT statement. This method is appropriate for estimating models in which integrals appear in the objective function and these integrals can be approximated by simulation. Such integrals can arise for various reasons, for example, transformation of a latent model into an observable model, missing data, random coefficients, or heterogeneity. A typical use of SMM is in estimating stochastic volatility models in finance, where only the stock return is observable, while the volatility process is not and needs to be integrated out of the likelihood function. The simulation method can be used with all the estimation methods in PROC MODEL except Full Information Maximum Likelihood (FIML). Simulated Generalized Method of Moments (SGMM) is the default estimation method.
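
The core idea, replacing an expectation by an average over simulated draws and then matching it to the sample moment, can be shown on a toy model. The following Python sketch is not the algorithm used by PROC MODEL; the model y = exp(theta * z) with standard normal z, the single moment, the grid search, and all names are assumptions chosen to keep the example short.

```python
import math
import random


def simulated_moment(theta, shocks):
    """Monte Carlo approximation of E[exp(theta * z)] using fixed draws."""
    return sum(math.exp(theta * e) for e in shocks) / len(shocks)


def smm_estimate(data, shocks, grid):
    """Pick the grid value whose simulated mean is closest to the sample mean."""
    target = sum(data) / len(data)
    return min(grid, key=lambda th: (target - simulated_moment(th, shocks)) ** 2)


rng = random.Random(0)
true_theta = 0.5
# "Observed" data generated from the toy model with an unobserved shock.
data = [math.exp(true_theta * rng.gauss(0, 1)) for _ in range(2000)]
# Simulation draws are generated once and reused for every candidate theta,
# so the objective is a smooth function of theta.
shocks = [rng.gauss(0, 1) for _ in range(2000)]
grid = [i / 100 for i in range(0, 201)]
theta_hat = smm_estimate(data, shocks, grid)
```

Reusing the same simulation draws across candidate parameter values is a common design choice in simulation-based estimation because it keeps the objective from jumping randomly between evaluations.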

Heteroscedastic Corrected Covariance Matrix Estimators (HCCME) have been implemented. The HCCME= option selects which correction is applied to the covariance matrix.

Instrumental variables can now be specified for specific equations rather than for all equations. This is done with expanded syntax on the INSTRUMENT statement.

## QLIM Procedure

The new QLIM procedure analyzes univariate and multivariate limited dependent variable models where dependent variables take discrete values or dependent variables are observed only in a limited range of values. This procedure includes logit, probit, tobit, selection, and multivariate models. The multivariate model can contain discrete choice and limited endogenous variables as well as continuous endogenous variables.

The QLIM procedure supports the following models:

• linear regression model with heteroscedasticity
• probit with heteroscedasticity
• logit with heteroscedasticity
• tobit (censored and truncated) with heteroscedasticity
• Box-Cox regression with heteroscedasticity
• bivariate probit
• bivariate tobit
• sample selection and switching regression models
• multivariate limited dependent variables

## TIMESERIES Procedure

The new TIMESERIES procedure analyzes time-stamped transactional data with respect to time and accumulates the data into a time series format. The procedure can perform trend and seasonal analysis on the transactions. Once the transactional data are accumulated, time domain and frequency domain analysis can be performed on the resulting time series. The procedure produces numerous graphical results related to time series analysis.
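
Accumulation of time-stamped transactions into an equally spaced series can be sketched as follows. This Python example is only an illustration of the concept: the monthly interval, summation as the accumulation method, and the use of `None` for months without transactions are assumptions, and the function name is hypothetical.

```python
from datetime import date


def accumulate_monthly(transactions):
    """Sum (date, amount) transactions into a gap-free monthly series."""
    totals = {}
    for stamp, amount in transactions:
        key = (stamp.year, stamp.month)
        totals[key] = totals.get(key, 0.0) + amount
    if not totals:
        return []
    first, last = min(totals), max(totals)
    series = []
    year, month = first
    while (year, month) <= last:
        # Months with no transactions are kept as missing (None).
        series.append(((year, month), totals.get((year, month))))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return series


transactions = [
    (date(2003, 1, 5), 10.0),
    (date(2003, 1, 20), 5.0),
    (date(2003, 3, 2), 7.0),
]
monthly = accumulate_monthly(transactions)
```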

## UCM Procedure

The new UCM procedure, experimental in SAS System 9, is production in SAS 9.1. You can use this procedure to analyze and forecast equally spaced univariate time series data using Unobserved Components Models (UCM).

UCMs can be regarded as regression models in which, apart from the usual regression variables, the model includes components such as trend, seasonals, and cycles. In the time series literature, UCMs are also referred to as Structural Models. The different components in a UCM can be modeled separately and customized to represent salient features of a given time series. The analysis provides separate in-sample and out-of-sample estimates (forecasts) of these component series. In particular, model-based seasonal decomposition and seasonal adjustment of the dependent series are easily available. The distribution of errors in the model is assumed to be Gaussian, and the model parameters are estimated by maximizing the Gaussian likelihood. The UCM procedure can handle missing values in the dependent series.
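
The simplest unobserved components model is the local level model, in which the series is a slowly moving level plus noise. The following Python sketch runs a Kalman filter for that model and skips the update step at missing observations. It is an illustration only: the variances are supplied by the caller rather than estimated by maximum likelihood, the large initial variance is a crude stand-in for a diffuse prior, and the function name is hypothetical.

```python
def local_level_filter(y, var_eps, var_eta, a0=0.0, p0=1e7):
    """Filtered level estimates for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t.

    Entries of `y` equal to None are treated as missing values.
    """
    a, p = a0, p0
    filtered = []
    for obs in y:
        if obs is not None:
            f = p + var_eps            # variance of the one-step prediction error
            k = p / f                  # Kalman gain
            a = a + k * (obs - a)      # update the level with the new observation
            p = p * (1 - k)
        filtered.append(a)
        p = p + var_eta                # predict: the level follows a random walk
    return filtered


constant = local_level_filter([5.0] * 10, var_eps=1.0, var_eta=0.0)
with_gap = local_level_filter([1.0, None, 3.0], var_eps=1.0, var_eta=1.0)
```

At a missing observation the filtered level is carried forward unchanged while its uncertainty grows, which is how state space methods accommodate gaps in the dependent series.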

The domains of applicability of PROC UCM and PROC ARIMA are virtually identical; however, decomposition of a series into features such as trend, seasonals, and cycles is more convenient in PROC UCM. A seasonal decomposition of a time series can also be obtained using other procedures, for example, PROC X12. However, these seasonal decompositions generally do not take into account regression and other effects and are not model based. The seasonal decomposition in PROC UCM is based on a comprehensive model, providing all the advantages of model diagnostics.

## VARMAX Procedure

The VARMAX procedure now provides the following features:

• The ECTREND option is available in the ECM=( ) option of the MODEL statement to fit the VECM(p) with a restriction on the drift. The ECTREND option is ignored when either the NSEASON or NOINT option is specified.
• You can now use the DFTEST option at multiple lags. For example, DFTEST=(DLAG=(1)(12)) provides the Dickey-Fuller regular unit root test and seasonal unit root test. If the TREND= option is specified, the seasonal unit root test is not available.
• The DYNAMIC option is added to the PRINT=( ) option. This representation displays the contemporaneous relationships among the components of the vector time series.
• The CORRX, CORRY, COVPE, COVX, COVY, DECOMPOSE, IARR, IMPULSE, IMPULSX, PARCOEF, PCANCORR, and PCORR options of the PRINT=( ) option now accept an optional number in parentheses. For example, you can use CORRX or CORRX(number). The options print results for the number of lags specified by number. The default is the number of lags specified by the LAGMAX= option.
• The subset BVAR model is now available.
• The statistics for the one lagged coefficient matrix are removed in the ECM.
• The last columns of the BETA and ALPHA are removed in the COINTTEST option when the NOINT option is not specified.
• Long variable names are now available in the model parameter estimation table.
• The schematic representation of the estimates that shows the significance of the parameters is now available.
• Two new ODS Tables, ParameterGraph and GARCHParameterGraph, are added.

Many ODS table names have been changed.

## X12 Procedure

The X12 procedure default behavior has changed with regard to missing leading and trailing values. Previously the default was not to trim leading/trailing missing values from the series. This made it difficult to process multiple series within a data set when the series had differing spans. Now the default is to trim leading and trailing missing values. The new NOTRIMMISS option provides the old default behavior; when NOTRIMMISS is specified, PROC X12 will automatically generate missing value regressors for any missing value within the span of the series, including leading and trailing missing values.
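
The difference between leading or trailing missing values and embedded ones can be made concrete with a short Python sketch. It only illustrates the trimming idea described above, with `None` standing in for a missing value; it is not PROC X12 code, and the function name is hypothetical.

```python
def trim_missing(series):
    """Drop leading and trailing missing values; keep embedded ones."""
    start, end = 0, len(series)
    while start < end and series[start] is None:
        start += 1
    while end > start and series[end - 1] is None:
        end -= 1
    return series[start:end]


trimmed = trim_missing([None, None, 1.0, None, 2.0, None])
all_missing = trim_missing([None, None])
```

After trimming, each series in a data set keeps its own span, and only missing values inside that span remain to be handled by the model.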

The following statements and options are new:

• The AUTOMDL statement uses the TRAMO method based on the work of Gomez and Maravall (1997a and 1997b) to automatically select the ARIMA part of a regARIMA model for the time series.
• The OUTLIER statement automatically detects additive, level shift, and temporary change outliers in the time series. After the outliers are identified, the appropriate regression variables are incorporated into the model.
• The MAXITER and TOL options of the ESTIMATE statement provide additional control over the convergence of the nonlinear estimation.
• The ITPRINT and PRINTERR options of the ESTIMATE statement enable you to examine the iteration history of the nonlinear estimation.
• The FINAL and FORCE options of the X11 statement enable you to control the final seasonally adjusted series. The FINAL option specifies whether outlier, level shift, and temporary change effects should be removed from the final seasonally adjusted series. The FORCE option specifies whether the yearly totals of the final seasonally adjusted series match the totals of the original series.

## Time Series Forecasting System

Enhancements to this graphical point-and-click system provide new kinds of forecasting models, better ways to customize lists of models, greater flexibility in sharing projects over a network, and support for graphical and tabular Web reports:

• The Factored ARIMA Model Specification window provides a general purpose interface for specifying ARIMA models. You can specify any number of factors and select the AR and MA lags to include in each factor. This makes it easy to model series with unusual and/or multiple seasonal cycles.
• Improvements to the Model Selection List Editor window enable you to open alternate model lists included with the software as well as user-defined model lists. You can create a new model list, open an existing model list, modify it, use it to replace the current list, append it to the current list, save it in a catalog, assign it to a project, or assign it as a user default list for newly created projects. Several new ARIMA and dynamic regression model lists are provided. You can combine these into large sets for automatic model selection and select from them to create the best set of candidate models for your data.
• Project options are no longer stored exclusively in the SASUSER library. You can use any path to which you have write access by assigning the libname TSFSUSER. The system prompts you for this path if you do not have write access to the SASUSER library.
• The Series Viewer and Model Viewer support saving graphs and tables via the Output Delivery System (ODS). Select the "Use Output Delivery System" option in the Save As dialog to create HTML pages and corresponding GIF files. You can access and organize these using the ODS Results window, display them automatically in your browser (depending on your results preferences settings), or publish them via the Internet or an intranet. You can also create other forms of output by providing your own ODS statements.
• The Time Series Viewer and Time Series Forecasting System can now be started from a macro submitted from the Program Editor or the Enhanced Editor. The FORECAST and TSVIEW macros accept the same arguments as the FORECAST and TSVIEW commands. You can use the FORECAST macro to generate and submit any number of independent unattended forecasting runs from a DATA step program.
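
A factored ARIMA specification multiplies several lag polynomials together, which is what makes multiple seasonal cycles easy to express. The following Python sketch expands a product of autoregressive factors into a single polynomial in the backshift operator B. It is an illustration of the algebra only; the dictionary representation, the sign convention (each factor is 1 minus its listed coefficients), and the function name are assumptions rather than anything taken from the Forecasting System.

```python
def multiply_factors(factors):
    """Expand a product of factors of the form (1 - sum of c_lag * B**lag).

    Each factor is a dict {lag: coefficient}; the result maps each power of B
    to its coefficient in the expanded polynomial.
    """
    product = {0: 1.0}
    for factor in factors:
        poly = {0: 1.0}
        for lag, coef in factor.items():
            poly[lag] = -coef
        expanded = {}
        for lag1, c1 in product.items():
            for lag2, c2 in poly.items():
                expanded[lag1 + lag2] = expanded.get(lag1 + lag2, 0.0) + c1 * c2
        product = expanded
    return product


# Assumed example: (1 - 0.5 B)(1 - 0.3 B**12) for monthly data.
expanded = multiply_factors([{1: 0.5}, {12: 0.3}])
```

The cross term at lag 13 appears automatically in the expansion, which is why specifying the factors separately is more economical than listing every lag of the expanded polynomial.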

## References

Gomez, V. and A. Maravall (1997a), "Program TRAMO and SEATS: Instructions for the User, Beta Version," Banco de Espana.

Gomez, V. and A. Maravall (1997b), "Guide for Using the Programs TRAMO and SEATS, Beta Version," Banco de Espana.

Stoffer, D. and Toloi, C. (1992), "A Note on the Ljung-Box-Pierce Portmanteau Statistic with Missing Data," Statistics & Probability Letters 13, 391-396.