Enhancements in SAS/ETS® 9.22

Overview

SAS/ETS 9.22 is the latest release and introduces the new experimental SEVERITY and TIMEID procedures and a new experimental interactive application called the SAS/ETS Model Editor. Significant enhancements have also been made to a number of existing procedures and to the data access engines. The following are highlights of this new release.

The SEVERITY Procedure

The new SEVERITY procedure fits models for statistical distributions of the severity (magnitude) of events. The magnitude of events can be modeled as a random variable with a continuous parametric probability distribution. The procedure uses the maximum likelihood method to fit multiple specified distributions and identifies the best model based on a specified model-selection criterion.
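The fit-and-select idea can be sketched outside SAS. The following Python illustration is not PROC SEVERITY syntax and not its implementation. It assumes two candidate distributions whose maximum likelihood estimates have closed forms (exponential and lognormal) and assumes AIC as the selection criterion. The function names and the sample losses are hypothetical.

```python
import math

def fit_exponential(losses):
    """Closed-form MLE for the exponential distribution: scale = sample mean."""
    n = len(losses)
    scale = sum(losses) / n
    loglik = -n * math.log(scale) - sum(losses) / scale
    return {"name": "exponential", "params": {"scale": scale}, "loglik": loglik, "k": 1}

def fit_lognormal(losses):
    """Closed-form MLE for the lognormal distribution (variance uses divisor n)."""
    logs = [math.log(x) for x in losses]
    n = len(logs)
    mu = sum(logs) / n
    sigma2 = sum((v - mu) ** 2 for v in logs) / n
    loglik = -sum(logs) - 0.5 * n * math.log(2 * math.pi * sigma2) - 0.5 * n
    return {"name": "lognormal", "params": {"mu": mu, "sigma2": sigma2}, "loglik": loglik, "k": 2}

def aic(fit):
    """Akaike information criterion: smaller is better (assumed criterion)."""
    return 2 * fit["k"] - 2 * fit["loglik"]

def select_best(losses, fitters=(fit_exponential, fit_lognormal)):
    """Fit every candidate distribution and return the one with the smallest AIC."""
    fits = [fitter(losses) for fitter in fitters]
    return min(fits, key=aic)

# Hypothetical loss magnitudes, tightly clustered around 1.
losses = [1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7]
best = select_best(losses)
print(best["name"], round(aic(best), 3))
```

Distributions without closed-form estimates would need a numerical optimizer in place of the two fitting functions.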

The procedure is delivered with a set of predefined models for several commonly used distributions, including the Burr, exponential, gamma, inverse Gaussian, lognormal, Pareto, generalized Pareto, and Weibull distributions. You can also extend the procedure to fit any continuous parametric distribution.

Exogenous variables can be specified for fitting a model that has a scale parameter. The exogenous variables are modeled such that their linear combination uses a specified link function to affect the scale parameter. The regression coefficients that are associated with the variables in the linear combination are estimated along with the parameters of the distribution.

Censoring and truncation can be specified for each observed value of the response variable. Global values can also be specified to override the individual values that are associated with each observed value.

The TIMEID Procedure

The new TIMEID procedure analyzes the sequence of ID values in a SAS data set to identify the time interval between observations. PROC TIMEID then verifies that the observations in the data set represent a properly spaced time series. Specified time intervals and alignments can be used to evaluate a data set's time ID values in terms of the distributions of duplicated values, alignment offsets, and the gaps between adjacent observations. The time interval's width, shift, and alignment can be inferred from a time ID variable. When either the interval or its alignment is specified, this information is used to guide the process of inferring the remaining quantity. When multiple BY groups are present, detailed diagnostics for each BY group are reported in addition to summarized diagnostic information that applies to all BY groups in the data set.
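The kind of diagnostic described above can be pictured with a small Python sketch. This is an illustration only, not PROC TIMEID syntax or its algorithm. It assumes date-valued IDs and fixed-width intervals measured in days, and it takes the most frequent positive spacing as the inferred interval. Calendar intervals such as months, as well as shift and alignment, are not handled. All names and sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

def diagnose_time_ids(ids):
    """Infer the spacing in days, then count duplicated IDs and gaps."""
    ordered = sorted(ids)
    spacings = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    positive = [d for d in spacings if d > 0]
    if not positive:
        raise ValueError("need at least two distinct time ID values")
    # Assumption: the most frequent positive spacing is the interval width.
    interval = Counter(positive).most_common(1)[0][0]
    return {
        "interval_days": interval,
        "duplicates": sum(1 for d in spacings if d == 0),
        "gaps": sum(1 for d in positive if d > interval),
        "off_interval": sum(1 for d in positive if d % interval != 0),
    }

# Hypothetical weekly series with one duplicated ID and one missing week.
ids = [date(2010, 1, 4), date(2010, 1, 11), date(2010, 1, 11),
       date(2010, 1, 18), date(2010, 2, 1), date(2010, 2, 8)]
report = diagnose_time_ids(ids)
print(report)
```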

The SAS/ETS Model Editor

The new SAS/ETS Model Editor provides a convenient and interactive graphical user interface that enables you to define, fit, and simulate nonlinear statistical models using the MODEL procedure. The SAS/ETS Model Editor consists of a fitted model wizard, a program editor panel, and a model template.

The fitted model wizard enables you to create and define the equation statements, variables, parameters, constraints, and fit options on a step-by-step basis. You can apply a fitted model to any specific market data. The program editor panel enables you to write programming code to define your model and create additional dialog boxes that explicitly specify properties associated with your model. Model templates are commonly used models that can be applied to a wide variety of data. They enable you to define the equation statements, variables, parameters, and constraints that you need in the programming code.

AUTOREG Procedure

The AUTOREG procedure has many new enhancements including three new models and eleven new tests.

The three new models are asymmetric GARCH models: quadratic GARCH, threshold GARCH, and power GARCH. These models are implemented to measure the impact of news on future volatility. Power GARCH also accounts for the long-memory property of volatility.

The new tests include the following:

The AUTOREG procedure now supports the CLASS statement, and the MODEL statement now supports the use of CLASS variables and interaction terms as predictors. A CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.
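The dummy-regressor expansion can be illustrated with a few lines of Python. This sketch is not SAS code. It assumes one 0/1 column per level with levels taken in sorted order, and the function name and sample values are hypothetical. It does not show how the procedure orders levels or handles the resulting collinearity with an intercept.

```python
def class_dummies(values):
    """Return the sorted levels and one 0/1 indicator column per level."""
    levels = sorted(set(values))
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows

# Hypothetical classification variable with three levels.
region = ["north", "south", "north", "east"]
levels, design = class_dummies(region)
print(levels)
for row in design:
    print(row)
```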

Other enhancements to the AUTOREG procedure include the following:

COUNTREG Procedure

The COUNTREG procedure now supports the CLASS, FREQ, WEIGHT, and NLOPTIONS statements.

The CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.

The FREQ statement specifies a variable whose values indicate the number of cases that are represented by each observation. That is, the procedure treats each observation as if it had appeared n times in the input data set, where n is the value of the FREQ variable.
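The effect of a frequency variable can be checked on a toy example. The Python sketch below is illustrative only and is not COUNTREG code. It assumes an intercept-only Poisson model, whose maximum likelihood estimate of the mean is the sample mean, and shows that using frequencies gives the same estimate as physically repeating each observation. The names and data are hypothetical.

```python
def expand_by_freq(counts, freqs):
    """Repeat each observation n times, where n is its frequency value."""
    return [y for y, n in zip(counts, freqs) for _ in range(n)]

def poisson_mean_mle(counts, freqs=None):
    """MLE of the Poisson mean; frequencies act as repetition counts."""
    if freqs is None:
        freqs = [1] * len(counts)
    return sum(n * y for y, n in zip(counts, freqs)) / sum(freqs)

counts = [0, 1, 3]   # hypothetical observed counts
freqs = [2, 3, 1]    # hypothetical FREQ values

with_freq = poisson_mean_mle(counts, freqs)
expanded = poisson_mean_mle(expand_by_freq(counts, freqs))
print(with_freq, expanded)
```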

The WEIGHT statement specifies a variable whose values supply weights for each observation in the data set. These weights control the importance given to the data observations in fitting the model.

The NLOPTIONS statement enables you to specify options for the subsystem that is used for the nonlinear optimization.

MDC Procedure

The MDC procedure now supports the CLASS and TEST statements.

The CLASS statement enables you to declare classification variables for use as explanatory effects in a model. When a CLASS variable is used as a predictor in the MODEL statement, the procedure automatically creates a dummy regressor that corresponds to each discrete value or level of the CLASS variable.

The TEST statement enables you to test linear equality restrictions on the parameters. Three tests are available: Wald, Lagrange multiplier, and likelihood ratio.

QLIM Procedure

The QLIM procedure now supports the WEIGHT statement. A WEIGHT statement identifies a variable to supply weights for each observation in the data set. By default, the weights are normalized so that they add up to the sample size. If the NONORMALIZE option is used, the actual weights are used without normalization.
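The default normalization described above amounts to rescaling the weights by the sample size divided by their sum. A minimal Python sketch of that arithmetic follows. It is not QLIM code, and the function name and its `normalize` argument are hypothetical stand-ins for the default behavior and the NONORMALIZE option.

```python
def effective_weights(weights, normalize=True):
    """Rescale weights to sum to the sample size, or return them unchanged."""
    if not normalize:
        return list(weights)
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]

raw = [1.0, 2.0, 5.0]                       # hypothetical weights
scaled = effective_weights(raw)             # sums to the sample size, 3
unscaled = effective_weights(raw, normalize=False)
print(scaled, sum(scaled))
```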

The OUTPUT statement now provides the TE1 and TE2 options, which output technical efficiency measures for each producer in stochastic frontier models.

SASEFAME Engine

The SASEFAME interface engine provides a seamless interface between Fame and SAS data to enable SAS users to access and process time series, case series, and formulas that reside in a Fame database. The following enhancements have been made to the SASEFAME access engine for Fame databases:

SASEHAVR Engine

The SASEHAVR interface engine is a seamless interface between Haver and SAS data processing that enables SAS users to read economic and financial time series data that reside in a Haver Analytics DLX (Data Link Express) database. The following enhancements have been made to the SASEHAVR access engine for Haver Analytics databases:

TIMESERIES Procedure

The TIMESERIES procedure now includes new facilities for performing singular spectrum analysis (SSA), new functionality for performing Fourier spectrum analysis, and new capabilities for native database accumulation of data for a time series.

Singular Spectrum Analysis

Singular spectrum analysis is a technique for decomposing a time series into additive components and categorizing these components based on the magnitudes of their contributions. SSA uses a single parameter, the window length, to quantify patterns in a time series without relying on preconceived notions about the structure of the time series. The window length represents the maximum lag considered in the analysis and corresponds to the dimensionality of the principal component analysis on which the SSA is based.

In addition to SSA output options, an SSA statement has been added to explicitly control the window length parameter and the grouping of SSA series components.
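To make the role of the window length concrete, the following Python sketch extracts only the leading SSA component of a series. It is an illustration under simplifying assumptions and not the PROC TIMESERIES implementation: the series is embedded into a trajectory matrix with one column per lag in the window, the dominant direction is found by power iteration instead of a full singular value decomposition, and the component is mapped back to a series by diagonal averaging. All names and the sample series are hypothetical.

```python
import math

def trajectory_matrix(series, window):
    """Embed the series: each row holds `window` consecutive values."""
    rows = len(series) - window + 1
    return [list(series[i:i + window]) for i in range(rows)]

def leading_component(series, window, iterations=200):
    """Return the leading SSA component and its share of the total variation."""
    x = trajectory_matrix(series, window)
    v = [1.0 / math.sqrt(window)] * window
    # Power iteration on X'X for the dominant right singular vector.
    for _ in range(iterations):
        xv = [sum(a * b for a, b in zip(row, v)) for row in x]
        w = [sum(row[j] * s for row, s in zip(x, xv)) for j in range(window)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    total = sum(a * a for row in x for a in row)
    share = sum(s * s for s in scores) / total if total else 0.0
    # Diagonal averaging: average the rank-one matrix over entries with equal i + j.
    sums = [0.0] * len(series)
    counts = [0] * len(series)
    for i, s in enumerate(scores):
        for j in range(window):
            sums[i + j] += s * v[j]
            counts[i + j] += 1
    return [t / c for t, c in zip(sums, counts)], share

# Hypothetical series: a level of 10 plus a period-6 oscillation.
series = [10.0 + math.sin(2 * math.pi * t / 6) for t in range(24)]
component, share = leading_component(series, window=6)
print(round(share, 4))
```

Further components could be obtained by repeating the procedure on the residual trajectory matrix, and grouping would then combine selected components before diagonal averaging.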

Fourier Spectrum Analysis

PROC TIMESERIES now offers functionality similar to that available in PROC SPECTRA for analyzing periodograms of time series data. Periodograms and spectral density estimates can now be computed and displayed as ODS graphs.
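For readers unfamiliar with the periodogram, the following Python sketch computes one directly from the discrete Fourier transform of a mean-centered series. It is an illustration, not the PROC TIMESERIES or PROC SPECTRA computation. Scaling conventions differ between packages, and this sketch assumes the squared modulus of the transform divided by the series length. The function name and the test series are hypothetical.

```python
import cmath
import math

def periodogram(x):
    """Return (frequency, ordinate) pairs for Fourier frequencies 2*pi*k/n."""
    n = len(x)
    mean = sum(x) / n
    centered = [value - mean for value in x]
    result = []
    for k in range(1, n // 2 + 1):
        omega = 2 * math.pi * k / n
        dft = sum(v * cmath.exp(-1j * omega * t) for t, v in enumerate(centered))
        result.append((omega, abs(dft) ** 2 / n))  # assumed 1/n scaling
    return result

# Hypothetical series: a pure cosine completing 3 cycles in 24 observations.
x = [math.cos(2 * math.pi * 3 * t / 24) for t in range(24)]
pgram = periodogram(x)
peak_k = max(range(len(pgram)), key=lambda i: pgram[i][1]) + 1
print(peak_k, round(pgram[peak_k - 1][1], 6))
```

The peak falls at the Fourier frequency that matches the cycle in the data, which is the pattern a periodogram plot is meant to reveal.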

Database Accumulation

For Teradata-based input data sets, aggregation and accumulation can be performed using native facilities in the database server. Most ACCUMULATE= options specified in the ID and VAR statements can be performed by the database server.

X12 Procedure

Many new features have been added to the X12 procedure. You can now generate six new tables and five new diagnostic plots with the CHECK statement, four new tables with the OUTPUT statement, three new tables with the TABLES statement, and five new tables through ODS. New auxiliary variables, _NAME_, Transform, Adjust, Regressors, Diff, and Sdiff, have also been added to ACF and PACF data sets that are available through ODS OUTPUT. The new variables help you identify the source of the data when multiple ACFs and PACFs are calculated.

There are two new enhancements to the X11 statement: The SIGMALIM option enables you to specify the upper and lower sigma limits that are used to identify and decrease the weight of extreme irregular values in the internal seasonal adjustment computations. The TYPE option controls which factors are removed from the original series to produce the seasonally adjusted series and also the final trend cycle.

The X12 statement has five new enhancements. The OUTSTAT= option specifies the optional output data set that contains the summary statistics related to each seasonally adjusted series. The data set is sorted by the BY-group variables, if any, and by series names. The PERIODOGRAM option enables you to specify that the periodogram rather than the spectrum of the series be plotted in the G tables and plots. The PLOTS= option controls the plots that are produced through ODS Graphics. The SPECTRUMSERIES option specifies the table name of the series that is used in the spectrum of the original series. The experimental AUXDATA= option specifies an auxiliary input data set that can contain user-defined variables specified in the INPUT statement, the USERVAR= option of the REGRESSION statement, or the USERDEFINED statement. The AUXDATA= option is useful when user-defined regressors are used for multiple time series data sets or multiple BY groups.

Finally, the IDENTIFY statement now supports a MAXLAG option, which specifies the maximum number of lags for the sample ACF and PACF that are associated with model identification.

Other Highlights