Enhancements in SAS/ETS 9.2 Software

Many SAS/ETS procedures now produce graphical output using the SAS Output Delivery System. This output is produced when you turn on ODS graphics with the following ODS statement:

   ods graphics on;

Several procedures now support the PLOTS= option to control the graphical output produced. (See the chapters for individual SAS/ETS procedures for details on the plots supported.)
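
As a minimal sketch of how the two pieces fit together, the following statements enable ODS Graphics and then request plots from a single procedure. The choice of PROC ARIMA, the SASHELP.AIR sample data set, and the PLOTS=ALL request are illustrative assumptions; check the chapter for each procedure to confirm which plot requests it accepts.

   ods graphics on;

   /* SASHELP.AIR (variables DATE and AIR) is used only as sample data */
   proc arima data=sashelp.air plots=all;
      identify var=air(1,12);
   run;
   quit;

   ods graphics off;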

With SAS 9.2, SAS/ETS offers three new modules:

AUTOREG Procedure

Two new features have been added to the AUTOREG procedure.

An alternative test for stationarity, proposed by Kwiatkowski, Phillips, Schmidt, and Shin (KPSS), is implemented. The null hypothesis for this test is a stationary time series, which is a natural choice for many applications. Bartlett and quadratic spectral kernels for estimating long-run variance can be used. Automatic bandwidth selection is an option.

Corrected Akaike information criterion (AICC) is implemented. This modification of AIC corrects for small-sample bias. Along with the corrected Akaike information criterion, the mean absolute error (MAE) and mean absolute percentage error (MAPE) are now included in the summary statistics.
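
A minimal sketch of requesting the KPSS test might look like the following. The data set WORK.SERIES and the variable Y are hypothetical, and the kernel and bandwidth suboptions are omitted; see the AUTOREG chapter for their exact syntax.

   /* WORK.SERIES and the variable Y are hypothetical names */
   proc autoreg data=work.series;
      model y = / stationarity=(kpss);
   run;

The AICC, MAE, and MAPE values described above are expected in the summary statistics of the fitted model and need no additional option.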

COUNTREG Procedure

Often the data being analyzed take the form of nonnegative integer (count) values. The new COUNTREG procedure implements count data models that take this discrete nature of the data into consideration. The dependent variable in these models is a count that represents various discrete events (such as number of accidents, number of doctor visits, or number of children). The conditional mean of the dependent variable is a function of various covariates. Typically, you are interested in estimating the probability of the number of event occurrences using maximum likelihood estimation. The COUNTREG procedure supports the following types of models:
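
As one illustration, a Poisson regression for a count response might be specified as in the following sketch. The data set WORK.VISITS and the variables NVISITS, AGE, and INCOME are hypothetical.

   /* WORK.VISITS and its variables are hypothetical names */
   proc countreg data=work.visits;
      model nvisits = age income / dist=poisson;
   run;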

DATASOURCE Procedure

PROC DATASOURCE now supports the newest Compustat Industrial Universal Character Annual and Quarterly data by providing the new filetypes CSAUCY3 for annual data and CSQUCY3 for quarterly data.

New ESM Procedure

The ESM (Exponential Smoothing Models) procedure provides a quick way to generate forecasts for many time series or transactional data in one step. All parameters associated with the forecast model are optimized based on the data.
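
The following sketch shows what a one-step forecasting run might look like. It assumes the SASHELP.AIR sample data set (variables DATE and AIR); the output data set name WORK.AIRFCST, the 12-period horizon, and the choice of the Winters model are illustrative only.

   /* Forecast 12 months ahead; smoothing weights are optimized from the data */
   proc esm data=sashelp.air out=work.airfcst lead=12;
      id date interval=month;
      forecast air / model=winters;
   run;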

MODEL Procedure

The t copula and the normal mixture copula have been added to the MODEL procedure. Both copulas support asymmetric parameters. The copula is used to modify the correlation structure of the model residuals for simulation.

Starting with SAS 9.2, the MODEL procedure stores MODEL files in SAS data sets using an XML-like format instead of in SAS catalogs. This makes MODEL files more readily extendable in the future and enables Java-based applications to read the MODEL files directly. The new format stores more information in each MODEL file, which enables some features that are not available when the catalog format is used.

The MODEL procedure continues to read and write old-style catalog MODEL files, and model files created by previous releases of SAS/ETS continue to work, so you should experience no direct impact from this change.

The CMPMODEL= option can be used in an OPTIONS statement to modify the behavior of the MODEL procedure when reading and writing MODEL files. The values allowed are CMPMODEL= BOTH | XML | CATALOG. For example, the following statement restores the previous behavior:

options cmpmodel=catalog;

The CMPMODEL= option defaults to BOTH in SAS 9.2; this option is intended for transitional use while customers become accustomed to the new file format. If CMPMODEL=BOTH, the MODEL procedure writes both formats; when loading model files, PROC MODEL attempts to load the XML version first and the CATALOG version second (if the XML version is not found). If CMPMODEL=XML, the MODEL procedure reads and writes only the XML format. If CMPMODEL=CATALOG, only the catalog format is used.
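
For comparison, a sketch of the opposite setting follows: it asks the MODEL procedure to use only the new XML-based format and then stores a small model. The model file name WORK.DEMAND and the one-equation model are hypothetical and serve only to show where the setting takes effect.

   options cmpmodel=xml;

   /* WORK.DEMAND is a hypothetical name for the stored MODEL file */
   proc model outmodel=work.demand;
      parms a b;
      y = a + b*x;
   run;
   quit;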

PANEL Procedure

The PANEL procedure expands the estimation capability of the TSCSREG procedure in the time-series cross-sectional framework. The new methods include between estimators, pooled estimators, and dynamic panel estimators that use the GMM method. Creating lags of variables in a panel setting is simplified by the LAG statement. Because the presence of heteroscedasticity can result in inefficient and biased estimates of the variance-covariance matrix in the OLS framework, several methods that produce heteroscedasticity-corrected covariance matrices (HCCME) are added. The new RESTRICT statement specifies linear restrictions on the parameters. New ODS Graphics plots simplify model development by providing visual analytical tools.
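
A minimal sketch of two of the new estimators follows. The data set WORK.FIRMS, the identifiers FIRM and YEAR, and the variables Y, X1, and X2 are hypothetical; the data are assumed to be sorted by cross section and then by time period.

   /* WORK.FIRMS is hypothetical and assumed sorted by FIRM and YEAR */
   proc panel data=work.firms;
      id firm year;
      model y = x1 x2 / pooled;   /* pooled estimator */
      model y = x1 x2 / btwng;    /* between-groups estimator */
   run;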

QLIM Procedure

Stochastic frontier models are now available in the QLIM procedure. Specification of these models allows for random shocks of production or cost along with technological or cost inefficiencies. The nonnegative error-term component that represents technological or cost inefficiencies has half-normal, exponential, or truncated normal distributions.
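
As a sketch, a production frontier with a half-normal inefficiency term might be specified as follows. The data set WORK.PLANTS and the variables LOUTPUT, LLABOR, and LCAPITAL are hypothetical; see the QLIM chapter for the full list of frontier options.

   /* WORK.PLANTS and its variables are hypothetical names */
   proc qlim data=work.plants;
      model loutput = llabor lcapital;
      endogenous loutput ~ frontier(type=half production);
   run;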

SASECRSP Engine

The SASECRSP interface now supports reading of CRSP stock, indices, and combined stock/indices databases by using a variety of keys, not just CRSP's primary key PERMNO.

In addition, SASECRSP can now read the CRSP/Compustat Merged (CCM) database and fully supports cross-database access, enabling you to access the CCM database by CRSP's main identifiers PERMNO and PERMCO, as well as to access the CRSP Stock databases by Compustat's GVKEY identifier.

A list of other new features follows:

SASEFAME Engine

The SASEFAME interface enables you to access and process financial and economic time series data that resides in a FAME database. SASEFAME for SAS 9.2 supports Windows, Solaris, AIX, Linux, Linux Opteron, and HP-UX hosts. You can now use the SAS windowing environment to view FAME data and use the SAS viewtable commands to navigate your FAME database. You can select the time span of data by specifying a range of dates in the RANGE= option. You can use an input SAS data set with a WHERE clause to specify selection of variables based on BY variables, such as tickers or issues stored in a FAME string case series. You can use a FAME crosslist to perform selection based on the crossproduct of two FAME namelists. The new FAMEOUT= option now supports the following classes and types of data series objects: FORMULA, TIME, BOOLEAN, CASE, DATE, and STRING.
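
The following sketch shows how a date range might be given in the LIBNAME statement. The libref FAMEDB, the physical path, and the FAME database name TRAINING are placeholders, and the statements require a host with FAME installed.

   /* The path is a placeholder for the directory that holds your FAME database */
   libname famedb sasefame 'path-to-fame-database-directory'
      range='01jan2000'd - '31dec2005'd;

   /* TRAINING is a hypothetical FAME database name */
   data work.subset;
      set famedb.training;
   run;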

It is easy to use a SAS input data set with the INSET= option to create a specific view of your FAME data. Multiple views can be created by using multiple LIBNAME statements with customized options tailored to the unique view that you want to create. See Selecting Time Series Using CROSSLIST= Option with INSET= and WHERE=TICK in Chapter 33, The SASEFAME Interface Engine.

The INSET variables define the BY variables that enable you to view cross sections or slices of your data. When used in conjunction with the WHERE clause and the CROSSLIST= option, SASEFAME can show any or all of your BY groups in the same view or in multiple views. The INSET= option is invalid without a WHERE clause that specifies the BY variables you want to use in your view, and it must be used with the CROSSLIST= option.

The CROSSLIST= option provides a more efficient means of selecting cross sections of financial time series data. This option can be used without using the INSET= option. There are two methods for performing the crosslist selection function. The first method uses two FAME namelists, and the second method uses one namelist and one BY group specified in the WHERE= clause of the INSET= option. See Selecting Time Series Using CROSSLIST= Option with a FAME Namelist of Tickers in Chapter 33, The SASEFAME Interface Engine.

The FAMEOUT= option provides efficient selection of the class and type of the FAME data series objects you want in your SAS output data set. The possible values for fame_data_object_class_type are FORMULA, TIME, BOOLEAN, CASE, DATE, and STRING. If the FAMEOUT= option is not specified, numeric time series are output to the SAS data set. FAMEOUT=CASE defaults to case series of numeric type, so if you want another type of case series in your output, then you must specify it. Scalar data objects are not supported. See Reading Other FAME Data Objects with the FAMEOUT= Option in Chapter 33, The SASEFAME Interface Engine.

SASEHAVR Engine

The SASEHAVR interface engine is now production, giving Windows users random access to economic and financial data that resides in a Haver Analytics Data Link Express (DLX) database. You can now use the SAS windowing environment to view HAVERDLX data and use the SAS viewtable commands to navigate your Haver database. You can use the SQL procedure to create a view of your resulting SAS data set. You can limit the range of data that is read from the time series and specify a desired conversion frequency. Start dates are recommended in the LIBNAME statement to help you save resources when processing large databases or when processing a large number of observations. You can further subset your data by using the WHERE, KEEP, or DROP statements in your DATA step. New options are provided for more efficient subsetting by time series variables, groups, or sources. You can force the aggregation of all variables selected to be of the frequency specified by the FREQ= option if you also specify the FORCE=FREQ option. Aggregation is supported only from a more frequent time interval to a less frequent time interval, such as from weekly to monthly.
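
A sketch of a LIBNAME statement that combines a start date with forced aggregation follows. The libref HAVDATA, the physical path, and the database name HAVERD are placeholders, and the statements require a Windows host with access to a Haver DLX database.

   /* The path is a placeholder for the directory that holds your DLX database */
   libname havdata sasehavr 'path-to-haver-dlx-directory'
      freq=monthly start=20000101 force=freq;

   /* HAVERD is a hypothetical Haver database name */
   data work.monthly;
      set havdata.haverd;
   run;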

A list of other new features follows:

New SIMILARITY Procedure (Experimental)

The new SIMILARITY procedure provides similarity analysis between two time series and other sequentially ordered numeric data. The SIMILARITY procedure computes similarity measures between an input sequence and target sequence, as well as similarity measures that "slide" the target sequence with respect to the input sequence. The "slides" can be by observation index (sliding-sequence similarity measures) or by seasonal index (seasonal-sliding-sequence similarity measures).
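
As a minimal sketch, the following step compares one input sequence with one target sequence. The data set WORK.SERIES, the variables DATE, X, and Y, the summary data set WORK.SIMSUMMARY, and the absolute-deviation measure are illustrative assumptions; because the procedure is experimental, its syntax might change.

   /* WORK.SERIES with variables DATE, X, and Y is hypothetical */
   proc similarity data=work.series out=_null_ outsum=work.simsummary;
      id date interval=month;
      input x;
      target y / measure=absdev;
   run;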

UCM Procedure

The following features are new to the UCM procedure:

VARMAX Procedure

The VARMAX procedure now enables independent (exogenous) variables with their distributed lags to influence dependent (endogenous) variables in various models, such as VARMAX, BVARX, VECMX, BVECMX, and GARCH-type multivariate conditional heteroscedasticity models.

Multivariate GARCH Models: New GARCH Statement

Multivariate GARCH modeling is now a production feature of VARMAX.

To enable greater flexibility in specifying multivariate GARCH models, the new GARCH statement has been added to the VARMAX procedure. With the addition of the GARCH statement, the GARCH= option is no longer supported on the MODEL statement.

The OUTHT= option can be specified in the GARCH statement to write the estimated conditional covariance matrix to an output data set. See GARCH Statement in Chapter 29, The VARMAX Procedure, for details.
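
A sketch of the GARCH statement with the OUTHT= option follows. The data set WORK.RETURNS, the variables Y1 and Y2, the model orders, and the output data set WORK.CONDCOV are hypothetical.

   /* WORK.RETURNS, Y1, Y2, and WORK.CONDCOV are hypothetical names */
   proc varmax data=work.returns;
      model y1 y2 / p=1;
      garch q=1 outht=work.condcov;
   run;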

The VARMAX Model

The VARMAX procedure provides modeling of a VARMAX(p,q,s) process.

If the Kalman filtering method is used to estimate the parameters of the VARMAX(p,q,s) model, the state-space vector has a large dimension, which makes the computation costly in time and memory. For convenience, parameter estimation for the VARMAX(p,q,s) model uses a two-stage method: it first estimates the deterministic terms and exogenous parameters, and then maximizes the log-likelihood function of the VARMA(p,q) model.

Some examples of VARMAX modeling are:


   model y1 y2 = x1 / q=1;
   nloptions tech=qn;

   model y1 y2 = x1 / p=1 q=1 xlag=1 nocurrentx;
   nloptions tech=qn;
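
These statements are fragments that belong inside a PROC VARMAX step. A complete step built around the second example might look like the following sketch, in which the data set WORK.MACRO and the variables Y1, Y2, and X1 are hypothetical.

   /* WORK.MACRO and its variables are hypothetical names */
   proc varmax data=work.macro;
      model y1 y2 = x1 / p=1 q=1 xlag=1 nocurrentx;
      nloptions tech=qn;
   run;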

The BVARX Model

Bayesian modeling allows independent (exogenous) variables with their distributed lags. For example:
   model y1 y2 = x1 / p=2 prior=(theta=0.2 lambda=5);

The VECMX Model

Vector error correction modeling now allows independent (exogenous) variables with their distributed lags. For example:
   model y1 y2 = x1 / p=2 ecm=(rank=1);

The BVECMX Model

Bayesian vector error correction modeling allows independent (exogenous) variables with their distributed lags. For example:
   model y1 y2 = x1 / p=2 prior=(theta=0.2 lambda=5)  ecm=(rank=1);

The VARMAX-GARCH Model

VARMAX modeling now supports an error term that has a GARCH-type multivariate conditional heteroscedasticity model. For example:
   model y1 y2 = x1 / p=1 q=1;
   garch q=1;

New Printing Control Options

The PRINT= option can be used in the MODEL statement to control the results printed. See the description of the PRINT= option in Chapter 29, The VARMAX Procedure, for details.

X12 Procedure

The X12 procedure has many new statements and options. Many of the new features are related to regARIMA modeling, which is used to extend the series to be seasonally adjusted. New experimental input and output data sets have been added that describe the time series model fit to the series.

The following miscellaneous statements and options are new:

The NOINT option on the AUTOMDL statement suppresses the fitting of a constant term in automatically identified models.

The following tables are now available through the OUTPUT statement: A7, A9, A10, C20, D1, and D7.

The TABLES statement enables you to display some tables that represent intermediate calculations in the X11 method and that are not displayed by default.
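
The following sketch combines two of these features: it suppresses the constant term during automatic model identification and writes some of the newly available tables to a data set. It assumes the SASHELP.AIR sample data set (variables DATE and AIR); the log transformation, the output data set name WORK.ADJUSTED, and the particular tables requested are illustrative choices.

   proc x12 data=sashelp.air date=date;
      var air;
      transform function=log;
      automdl noint;                           /* no constant term in identified models */
      x11;
      output out=work.adjusted a1 d1 d7 d11;   /* D1 and D7 are newly available */
   run;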

The following statements and options related to the regression component of regARIMA modeling are new:

The following new experimental options specify input and output data sets that describe the time series model: