The TRANSREG Procedure
The TRANSREG (transformation regression) procedure fits linear models, optionally with smooth, spline, Box-Cox, and other nonlinear transformations of the variables. You can use PROC TRANSREG to fit a curve through a scatter plot or fit multiple curves, one for each level of a classification variable. You can also constrain the functions to be parallel or monotone or have the same intercept. PROC TRANSREG can be used to code experimental designs and classification variables prior to their use in other analyses.
The TRANSREG procedure fits many types of linear models, including the following:
ordinary regression and ANOVA
metric and nonmetric conjoint analysis (Green and Wind 1975; de Leeuw, Young, and Takane 1976)
linear models with Box-Cox (1964) transformations of the dependent variables
regression with a smooth (Reinsch 1967), spline (de Boor 1978; van Rijckevorsel 1982), monotone spline (Winsberg and Ramsay 1980), or penalized B-spline (Eilers and Marx 1996) fit function
metric and nonmetric vector and ideal point preference mapping (Carroll 1972)
simple, multiple, and multivariate regression with variable transformations (Young, de Leeuw, and Takane 1976; Winsberg and Ramsay 1980; Breiman and Friedman 1985)
redundancy analysis (Stewart and Love 1968) with variable transformations (Israels 1984)
canonical correlation analysis with variable transformations (van der Burg and de Leeuw 1983)
response surface regression (Meyers 1976; Khuri and Cornell 1987) with variable transformations
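As a rough illustration of the Box-Cox item in the list above, the following Python sketch picks the power parameter for a dependent variable by a grid search over the textbook Box and Cox (1964) profile log likelihood in a one-predictor regression. It is not PROC TRANSREG code and makes no claim about the procedure's internals. The function names (boxcox, residual_ss, profile_loglik), the grid, and the data are all made up for the example.

```python
import math

def boxcox(value, lam):
    """Box-Cox power transformation of a positive value."""
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1.0) / lam

def residual_ss(x, t):
    """Residual sum of squares from the least-squares line of t on x."""
    n = len(x)
    mean_x, mean_t = sum(x) / n, sum(t) / n
    slope = (sum((xi - mean_x) * (ti - mean_t) for xi, ti in zip(x, t))
             / sum((xi - mean_x) ** 2 for xi in x))
    return sum((ti - mean_t - slope * (xi - mean_x)) ** 2
               for xi, ti in zip(x, t))

def profile_loglik(x, y, lam):
    """Profile log likelihood of lam, up to an additive constant."""
    n = len(y)
    sse = residual_ss(x, [boxcox(yi, lam) for yi in y])
    # The second term is the log Jacobian of the transformation.
    return (-0.5 * n * math.log(sse / n)
            + (lam - 1.0) * sum(math.log(yi) for yi in y))

# Made-up data: log(y) is linear in x plus small fixed perturbations.
x = list(range(1, 11))
noise = [0.02, -0.01, 0.01, -0.02, 0.0, 0.02, -0.02, 0.01, -0.01, 0.0]
y = [math.exp(0.5 + 0.3 * xi + ei) for xi, ei in zip(x, noise)]

grid = [k / 4 for k in range(-4, 5)]  # -1.0, -0.75, ..., 1.0
best_lambda = max(grid, key=lambda lam: profile_loglik(x, y, lam))
print(best_lambda)
```

Because the toy response was generated on the log scale, the grid search selects a power of 0, which corresponds to the log transformation.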
The data set can contain variables measured on nominal, ordinal, interval, and ratio scales (Siegel 1956). You can specify any mix of these variable types for the dependent and independent variables. PROC TRANSREG can do the following:
transform nominal variables by scoring the categories to minimize squared error (Fisher 1938), or treat nominal variables as classification variables
transform ordinal variables by monotonically scoring the ordered categories so that order is weakly preserved (adjacent categories can be merged) and squared error is minimized. Ties can be optimally untied or left tied (Kruskal 1964). Ordinal variables can also be transformed to ranks.
transform interval- and ratio-scale variables linearly or nonlinearly with spline (de Boor 1978; van Rijckevorsel 1982), monotone spline (Winsberg and Ramsay 1980), penalized B-spline (Eilers and Marx 1996), smooth (Reinsch 1967), or Box-Cox (Box and Cox 1964) transformations. In addition, logarithmic, exponential, power, logit, and arcsine (inverse trigonometric sine) transformations are available.
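The monotone scoring of ordinal variables described in the list above can be sketched with the pool-adjacent-violators algorithm, a standard way to compute a least-squares fit under an order constraint. The Python below is a minimal sketch under stated assumptions, not PROC TRANSREG's implementation. The function names and data are hypothetical, and ties are left tied (every observation in a category receives the same score).

```python
def pool_adjacent_violators(means, weights):
    """Weighted least-squares nondecreasing fit to a sequence of means."""
    blocks = []  # [weighted mean, total weight, number of categories pooled]
    for mean, weight in zip(means, weights):
        blocks.append([mean, weight, 1])
        # Merge backward while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

def monotone_scores(x, y):
    """Score the ordered categories of x to track y with minimal squared error."""
    cats = sorted(set(x))
    counts = {c: 0 for c in cats}
    totals = {c: 0.0 for c in cats}
    for xi, yi in zip(x, y):
        counts[xi] += 1
        totals[xi] += yi
    means = [totals[c] / counts[c] for c in cats]
    fitted = pool_adjacent_violators(means, [counts[c] for c in cats])
    return dict(zip(cats, fitted))

# Made-up data: the mean of y dips between categories 2 and 3.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.0, 2.0, 4.0, 5.0, 3.0, 4.0, 6.0, 7.0]
scores = monotone_scores(x, y)
print(scores)
```

In this toy data the raw category means are 1.5, 4.5, 3.5, and 6.5. Categories 2 and 3 violate the ordering, so they are merged and share the score 4.0, which is the weak order preservation the list item describes.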
Transformations produced by the PROC TRANSREG multiple regression algorithm, requesting spline transformations, are often similar to transformations produced by the ACE smooth regression method of Breiman and Friedman (1985). However, ACE does not explicitly optimize a loss function (de Leeuw 1986), while PROC TRANSREG explicitly minimizes a squared-error criterion.
PROC TRANSREG extends the ordinary general linear model by providing optimal variable transformations that are iteratively derived. PROC TRANSREG iterates until convergence, alternating two major steps: finding least-squares estimates of the model parameters given the current scoring of the data, and finding least-squares estimates of the scoring parameters given the current set of model parameters. This is called the method of alternating least squares (Young 1981).
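The two alternating steps can be sketched on a deliberately small problem. The Python below is an illustrative toy, not the procedure's actual algorithm: it assumes a model with one numeric covariate z and one nominal variable x whose categories are scored freely. The function name als_fit, the stopping rule, and the data are hypothetical. Each step is a least-squares solution for one block of parameters with the other block held fixed, so the squared error cannot increase from one iteration to the next.

```python
def als_fit(y, z, x, max_iter=200, tol=1e-12):
    """Alternating least squares for y ~ b * (z - mean(z)) + score[x]."""
    n = len(y)
    z_mean = sum(z) / n
    zc = [zi - z_mean for zi in z]       # centered covariate
    cats = sorted(set(x))
    scores = {c: 0.0 for c in cats}      # initial scoring of the categories
    b = 0.0
    history = []                         # squared error after each iteration
    for _ in range(max_iter):
        # Step 1: least-squares slope given the current category scores.
        b = (sum(zi * (yi - scores[xi]) for yi, zi, xi in zip(y, zc, x))
             / sum(zi * zi for zi in zc))
        # Step 2: least-squares category scores given the current slope
        # (the mean partial residual within each category).
        for c in cats:
            resid = [yi - b * zi for yi, zi, xi in zip(y, zc, x) if xi == c]
            scores[c] = sum(resid) / len(resid)
        loss = sum((yi - b * zi - scores[xi]) ** 2
                   for yi, zi, xi in zip(y, zc, x))
        converged = bool(history) and history[-1] - loss < tol
        history.append(loss)
        if converged:
            break
    return b, scores, history

# Made-up data: slope near 2, with a different offset in each category.
x = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
z = [1, 2, 3, 2, 3, 4, 3, 4, 6]
y = [2.1, 3.8, 6.1, 8.9, 11.2, 12.9, 7.0, 9.1, 12.9]
b, scores, history = als_fit(y, z, x)
print(round(b, 4), len(history))
```

Because this toy model is linear in all of its parameters, the iterations converge to the ordinary least-squares solution. The alternation earns its keep when the scoring step is constrained, for example to be monotone or a spline, and no single closed-form solution exists.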
For more background on alternating least-squares optimal scaling methods and transformation regression methods, see Young, de Leeuw, and Takane (1976), Winsberg and Ramsay (1980), Young (1981), Gifi (1990), Schiffman, Reynolds, and Young (1981), van der Burg and de Leeuw (1983), Israels (1984), Breiman and Friedman (1985), and Hastie and Tibshirani (1986). (These are just a few of the many relevant sources.)
There have been a number of enhancements to PROC TRANSREG with this release, and a few changes. The most notable enhancement is the addition of graphical displays produced through ODS Graphics. Fit, transformation, residual, and other plots are now created when you enable ODS Graphics. For more information, see the sections "Fitting a Curve through a Scatter Plot," "Box-Cox Transformations," "Using Splines and Knots," "Linear and Nonlinear Regression Functions," "Simultaneously Fitting Two Regression Functions," "Smoothing Splines," and "ODS Graphics." For examples, see Example 91.1, Example 91.2, Example 91.3, and Example 91.6.
The PBSPLINE transformation (penalized B-spline) is new with this release. It can be used to fit curves through a scatter plot with an automatic selection of the smoothing parameter. See the sections "Fitting a Curve through a Scatter Plot" and "Penalized B-Splines," and see Example 91.3 for an example.
How the results of the SMOOTH transformation are used in PROC TRANSREG has changed with this release. In particular, some aspects of the syntax along with the coefficients, degrees of freedom, and predicted values have changed. See the section "Smoothing Splines Changes and Enhancements" for more information about the changes, and see the NSR algorithm option for a way to restore the old functionality.
Also, the iteration history table is not printed in certain models, such as those with smoothing splines and penalized B-splines, where it is known that no iterations are necessary. See the section "Iteration History Changes and Enhancements" for more information.
New options include the following:
ORTHOGONAL t-option – requests an orthogonal-contrast coding
STANDORTH t-option – requests a standardized-orthogonal coding
PLOTS= PROC statement option – ODS Graphics selection
NSR a-option – no smoothing spline model restrictions
TSUFFIX= a-option – shortens transformed variable labels
PBSPLINE transform – penalized B-splines
New options for penalized B-splines include the following:
AIC t-option – Akaike’s information criterion
AICC t-option – corrected AIC
CV t-option – cross validation criterion
GCV t-option – generalized cross validation criterion
RANGE t-option – LAMBDA= specifies a range, not a list
SBC t-option – Schwarz’s Bayesian criterion
Changed options include the following:
LAMBDA= t-option – smoothing parameter list or range, now used with PBSPLINE
EVENLY= t-option – evenly spaced interior knots, now creates evenly spaced exterior knots as well
SMOOTH transform – new df calculations, printed output
SSPLINE transform – new df calculations, just like SMOOTH
Copyright © SAS Institute, Inc. All Rights Reserved.